Solr Cloud 8.5.1 - HDFS and Erasure Coding

2020-09-16 Thread Joe Obernberger
Anyone use Solr with Erasure Coding on HDFS?  Is that supported? Thank you -Joe

Re: Solr with HDFS configuration example running in production/dev

2020-08-24 Thread Joe Obernberger
) ... 50 more Attaching it as a file too. Thanks! On Wed, Aug 19, 2020 at 9:37 PM Joe Obernberger mailto:joseph.obernber...@gmail.com>> wrote: Your exception didn't come across - can you paste it in? -Joe On 8/19/2020 10:50 AM, Prashant Jyoti wrote: > You're right

Re: Solr with HDFS configuration example running in production/dev

2020-08-19 Thread Joe Obernberger
Your exception didn't come across - can you paste it in? -Joe On 8/19/2020 10:50 AM, Prashant Jyoti wrote: You're right Andrew. Even I read about that. But there's a use case for which we want to configure the said case. Are you also aware of what feature we are moving towards instead of

Re: Creating 100000 dynamic fields in solr

2020-05-11 Thread Joe Obernberger
Could you use a multi-valued field for user in each of your products? So productA would have a field User that is a list of all the users that have productA.  Then you could do a search like: user:User1 AND Product_A_cost:[5 TO 10] user:(User1 User5...) AND Product_B_cost:[0 TO 40] -Joe On
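A minimal sketch of the suggested modelling, with hypothetical host, collection, and field names following the thread's examples; it adds a multiValued user field via the Schema API and then filters on user plus a cost range:

curl -X POST -H 'Content-type:application/json' \
  --data-binary '{"add-field":{"name":"user","type":"string","stored":true,"indexed":true,"multiValued":true}}' \
  http://host:9100/solr/products/schema
# products that User1 has, priced between 5 and 10
curl 'http://host:9100/solr/products/select' --data-urlencode 'q=user:User1 AND Product_A_cost:[5 TO 10]'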

Re: Delete on 8.5.1

2020-04-30 Thread Joe Obernberger
Hi All - while I'm still getting the error, it does appear to work (still gives the error - but a search of the data then shows less results - so the delete is working).  In some cases, it may be necessary to run the query several times. -Joe On 4/29/2020 9:03 AM, Joe Obernberger wrote: Hi

Re: Delete on 8.5.1

2020-04-29 Thread Joe Obernberger
(SolrSearcher.java:60) On 4/28/2020 11:50 AM, Joe Obernberger wrote: Hi all - I'm running this query on solr cloud 8.5.1 with the index on HDFS: curl http://enceladus:9100/solr/PROCESSOR_LOGS/update?commit=true -H "Connect-Type: text/xml" --data-binary 'StartTime:[2020-01-01T01:02:43Z T

Delete on 8.5.1

2020-04-28 Thread Joe Obernberger
Hi all - I'm running this query on solr cloud 8.5.1 with the index on HDFS: curl http://enceladus:9100/solr/PROCESSOR_LOGS/update?commit=true -H "Connect-Type: text/xml" --data-binary 'StartTime:[2020-01-01T01:02:43Z TO 2020-04-25T00:00:00Z]' getting this response:   1   500   54091  
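For reference, a minimal delete-by-query sketch in the form being discussed (the archive appears to have stripped the XML markup from the quoted command); host, collection, and date range are taken from the thread, and the body is wrapped in the delete/query XML elements that the update handler expects:

curl 'http://enceladus:9100/solr/PROCESSOR_LOGS/update?commit=true' \
  -H 'Content-Type: text/xml' \
  --data-binary '<delete><query>StartTime:[2020-01-01T01:02:43Z TO 2020-04-25T00:00:00Z]</query></delete>'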

Query confusion - solr cloud 8.2.0

2020-04-07 Thread Joe Obernberger
I'm running the following query: id:COLLECT2601697594_T496 AND (person:[80 TO 100]) That returns 1 hit. The following query also returns the same hit: id:COLLECT2601697594_T496 AND ((POP16_Rez1:blue_Sky AND POP16_Sc1:[80 TO 100]) OR (POP16_Rez2:blue_Sky AND POP16_Sc2:[80 TO 100]) OR

Re: SolrCloud 8.2.0 - adding a field

2020-04-01 Thread Joe Obernberger
Nevermind - I see that I need to specify an existing collection, not a schema.  There is no collection called UNCLASS - only a schema. -Joe On 4/1/2020 4:52 PM, Joe Obernberger wrote: Hi All - I'm trying this: curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field"
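A hedged sketch of the corrected call: the same add-field payload, but addressed to an existing collection's Schema API endpoint rather than to the schema name (the collection name below is hypothetical):

curl -X POST -H 'Content-type:application/json' \
  --data-binary '{"add-field":{"name":"Para450","type":"text_general","stored":false,"indexed":true,"docValues":false,"multiValued":false}}' \
  http://ursula.querymasters.com:9100/solr/MY_COLLECTION/schema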

SolrCloud 8.2.0 - adding a field

2020-04-01 Thread Joe Obernberger
Hi All - I'm trying this: curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field":{"name":"Para450","type":"text_general","stored":"false","indexed":"true","docValues":"false","multiValued":"false"}}' http://ursula.querymasters.com:9100/api/cores/UNCLASS/schema This

Re: Solr 8.2.0 - Schema issue

2020-03-10 Thread Joe Obernberger
[5] in submitted tasks"}} I don't see anything in the logs. -Joe On 3/6/2020 1:43 PM, Joe Obernberger wrote: Thank you Erick - I have no record of that, but will absolutely give the API RELOAD a shot!  Thank you! -Joe On 3/6/2020 10:26 AM, Erick Erickson wrote: Didn’t we talk ab
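For reference, the Collections API RELOAD call being suggested takes this form (host and collection name are placeholders):

curl 'http://host:9100/solr/admin/collections?action=RELOAD&name=MY_COLLECTION'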

Re: Solr 8.2.0 - Schema issue

2020-03-06 Thread Joe Obernberger
, Erick On Mar 6, 2020, at 05:34, Joe Obernberger wrote: Hi All - any ideas on this? Anything I can try? Thank you! -Joe On 2/26/2020 9:01 AM, Joe Obernberger wrote: Hi All - I have several solr collections all with the same schema. If I add a field to the schema and index

Re: Solr 8.2.0 - Schema issue

2020-03-06 Thread Joe Obernberger
Hi All - any ideas on this?  Anything I can try? Thank you! -Joe On 2/26/2020 9:01 AM, Joe Obernberger wrote: Hi All - I have several solr collections all with the same schema.  If I add a field to the schema and index it into the collection on which I added the field, it works fine

Solr 8.2.0 - Schema issue

2020-02-26 Thread Joe Obernberger
schema), I get an error that the field doesn't exist. If I restart the cluster, this problem goes away and I can add a document with the new field to any solr collection that has the schema.  Any work-arounds that don't involve a restart? Thank you! -Joe Obernberger

Split Shard - HDFS Index - Solr 7.6.0

2020-02-10 Thread Joe Obernberger
1755371094",     "exception": {         "msg": "not enough free disk space to perform index split on node sys-hadoop-1:9100_solr, required: 306.76734546013176, available: 16.772361755371094",         "rspCode": 500     },     "status": {         "state": "failed",         "msg": "found [] in failed tasks"     } } -Joe Obernberger

NoClassDefFoundError - Faceting on 8.2.0

2020-02-05 Thread Joe Obernberger
! -Joe Obernberger

Re: native Thread - solr 8.2.0

2019-12-10 Thread Joe Obernberger
y jstack; you can even force GC to release that native stack space. Then, rewrite the app, or reduce heap to enforce GC. On Tue, Dec 10, 2019 at 9:44 AM Shawn Heisey wrote: On 12/9/2019 2:23 PM, Joe Obernberger wrote: Getting this error on some of the nodes in a solr cloud during heav

native Thread - solr 8.2.0

2019-12-09 Thread Joe Obernberger
Getting this error on some of the nodes in a solr cloud during heavy indexing: null:org.apache.solr.common.SolrException: Server error writing document id COLLECT20005437492077_activemq:queue:PAXTwitterExtractionQueue to the index at

Re: Solr 8.2.0 - Unable to write response

2019-11-01 Thread Joe Obernberger
Thank you Shawn. What I'm trying to get for my application is the commitTimeMSec. I use that value to build up an alias of solr collections.  Is there a better way? -Joe On 11/1/2019 10:17 AM, Shawn Heisey wrote: On 11/1/2019 7:20 AM, Joe Obernberger wrote: Hi All - getting this error from
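A hedged sketch of the COLSTATUS call in question, following the 8.x Collections API (host is a placeholder; the collection name is taken from the error in the post below):

curl 'http://host:9100/solr/admin/collections?action=COLSTATUS&collection=UNCLASS_2019_1_18_36&wt=json'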

Solr 8.2.0 - Unable to write response

2019-11-01 Thread Joe Obernberger
Hi All - getting this error from only one server in a 45 node cluster when calling COLSTATUS.  Any ideas? 2019-11-01 13:17:32.556 INFO  (qtp694316372-44709) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections params={name=UNCLASS_2019_1_18_36&action=COLSTATUS&wt=javabin&version=2} status=0

Solr 8.2 - Added Field - can't facet using alias

2019-10-11 Thread Joe Obernberger
Hi All, I've added a field with: curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field":{"name":"FaceCluster","type":"plongs","stored":false,"multiValued":true,"indexed":true}}' http://miranda:9100/solr/UNCLASS_2019_8_5_36/schema It returned success.  In the UI, when I
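A hedged sketch of the kind of facet query being attempted against the alias (the alias name is hypothetical; rows=0 just suppresses the document list):

curl 'http://miranda:9100/solr/MY_ALIAS/select?q=*:*&rows=0&facet=true&facet.field=FaceCluster'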

8.2.0 - REPLACENODE

2019-09-30 Thread Joe Obernberger
Hi All - I just ran the REPLACENODE command on a cluster with 5 nodes in it.  I ran the command async, and it failed with: { "responseHeader":{ "status":0, "QTime":11}, "Operation replacenode caused
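For reference, a hedged sketch of an async REPLACENODE call of the kind described (node names and the async request id are placeholders):

curl 'http://host:9100/solr/admin/collections?action=REPLACENODE&sourceNode=oldhost:9100_solr&targetNode=newhost:9100_solr&async=replace-1'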

Re: auto scaling question - solr 8.2.0

2019-09-26 Thread Joe Obernberger
) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:555) ... 6 more At this point, no nodes are hosting one of the collections. -Joe On 9/26/2019 1:32 PM, Joe Obernberger wrote: Hi all - I have a 4 node cluster for test, and created several solr collections with 2 shards and 2 replicas each

auto scaling question - solr 8.2.0

2019-09-26 Thread Joe Obernberger
Hi all - I have a 4 node cluster for test, and created several solr collections with 2 shards and 2 replicas each. I'd like the global policy to be to not place more than one replica of the same shard on the same node.  I did this with this curl command: curl -X POST -H
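A minimal sketch of such a cluster policy via the autoscaling API available in 8.2 (host is a placeholder); the single rule below says no node may hold more than one replica of any given shard:

curl -X POST -H 'Content-Type: application/json' \
  --data-binary '{"set-cluster-policy": [{"replica": "<2", "shard": "#EACH", "node": "#ANY"}]}' \
  http://host:9100/solr/admin/autoscaling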

HDFS Shard Split

2019-09-16 Thread Joe Obernberger
Hi All - added a couple more solr nodes to an existing solr cloud cluster where the index is in HDFS.  When I try to split a shard, I get an error saying there is not enough disk space.  It looks like it is looking on the local file system, and not in HDFS. "Operation splitshard caused

Re: Clustering error - Solr 8.2

2019-08-30 Thread Joe Obernberger
more details in the logfiles? It could also be that a parameter is configured with a different default? Try also to change the Solr version in solrconfig.xml to a higher one, e.g. 8.0.0 On 29.08.2019 at 16:12, Joe Obernberger wrote: Thank you Erick. I'm upgrading from 7.6.0 and as far

Re: Clustering error - Solr 8.2

2019-08-29 Thread Joe Obernberger
On Aug 28, 2019, at 5:18 PM, Joe Obernberger wrote: Hi All - trying to use clustering with SolrCloud 8.2, but getting this error: "msg":"Error from server at null: org.apache.solr.search.SyntaxError: Query Field 'features' is not a valid field name", The URL, I'm using is: ht

Clustering error - Solr 8.2

2019-08-28 Thread Joe Obernberger
Hi All - trying to use clustering with SolrCloud 8.2, but getting this error: "msg":"Error from server at null: org.apache.solr.search.SyntaxError: Query Field 'features' is not a valid field name", The URL, I'm using is: http://solrServer:9100/solr/DOCS/select?q=*%3A*=/clustering=true=true

Re: Solr on HDFS

2019-08-02 Thread Joe Obernberger
e use it. Thanks Kyle On Fri, 2 Aug 2019 at 08:58, Joe Obernberger wrote: Thank you. No, while the cluster is using Cloudera for HDFS, we do not use Cloudera to manage the solr cluster. If it is a configuration/architecture issue, what can I do to fix it? I'd like a system where servers can

Re: Solr on HDFS

2019-08-02 Thread Joe Obernberger
, recreate and index the affected collection, while you work your other issues. On Aug 1, 2019, at 16:40, Joe Obernberger wrote: Been using Solr on HDFS for a while now, and I'm seeing an issue with redundancy/reliability. If a server goes down, when it comes back up, it will never recover

Solr on HDFS

2019-08-01 Thread Joe Obernberger
Been using Solr on HDFS for a while now, and I'm seeing an issue with redundancy/reliability.  If a server goes down, when it comes back up, it will never recover because of the lock files in HDFS. That solr node needs to be brought down manually, the lock files deleted, and then brought back
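A hedged sketch of the manual cleanup described, assuming the indexes live under /solr in HDFS (adjust the path); it lists any leftover write.lock files and removes them before the node is restarted:

hdfs dfs -ls -R /solr | awk '{print $NF}' | grep 'write.lock$'
hdfs dfs -ls -R /solr | awk '{print $NF}' | grep 'write.lock$' | xargs -r -n1 hdfs dfs -rm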

Solrj + Aliases

2019-07-25 Thread Joe Obernberger
Hi All - I've created an alias, but when I try to index to the alias using CloudSolrClient, I get 'Collection not Found: TestAlias'.  Can you not use an alias name to index to with CloudSolrClient?  This is with SolrCloud 8.1. Thanks! -Joe

ShardSplit with HDFS

2019-06-19 Thread Joe Obernberger
SolrCloud version 7.6.0 on 4 nodes.  Thank you! -Joe Obernberger

Re: Solr 7.6.0 - won't elect leader

2019-05-30 Thread Joe Obernberger
. Yours might be different. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On May 30, 2019, at 5:47 AM, Joe Obernberger wrote: More info - looks like a zookeeper node got deleted somehow. NoNode for /collections/UNCLASS_30DAYS/leaders/shard31/leader I

Re: Solr 7.6.0 - won't elect leader

2019-05-30 Thread Joe Obernberger
)     at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:1328)     ... 9 more Can I manually enter information for the leader? How would I get that? -Joe On 5/30/2019 8:39 AM, Joe Obernberger wrote: Hi All - I have a 40 node cluster that has been running great for a long while, but it all came down due to OOM.  I adjusted

Solr 7.6.0 - won't elect leader

2019-05-30 Thread Joe Obernberger
Hi All - I have a 40 node cluster that has been running great for a long while, but it all came down due to OOM.  I adjusted the parameters and restarted, but one shard with 3 replicas (all NRT) will not elect a leader.  I see messages like: 2019-05-30 12:35:30.597 INFO 

Schema API Version 2 - 7.6.0

2019-05-22 Thread Joe Obernberger
is correct - true?  Thank you! -Joe Obernberger

Re: High CPU usage with Solr 7.7.0

2019-02-27 Thread Joe Obernberger
Just to add to this.  We upgraded to 7.7.0 and saw very large CPU usage on multi core boxes - sustained in the 1200% range.  We then switched to 7.6.0 (no other configuration changes) and the problem went away. We have a 40 node cluster and all 40 nodes had high CPU usage with 3 indexes

Re: Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
Reverted back to 7.6.0 - same settings, but now I do not encounter the large CPU usage. -Joe On 2/12/2019 12:37 PM, Joe Obernberger wrote: Thank you Shawn.  Yes, I used the settings off of your site. I've restarted the cluster and the CPU usage is back up again. Looking at it now, it doesn't

Re: Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
you For the gceasy.io site - that is very slick!  I'll use that in the future.  I can try using the standard settings, but again - at this point it doesn't look GC related to me? -Joe On 2/12/2019 11:35 AM, Shawn Heisey wrote: On 2/12/2019 7:35 AM, Joe Obernberger wrote: Yesterday, we

Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
Yesterday, we upgraded our 40 node cluster from solr 7.6.0 to solr 7.7.0.  This morning, all the nodes are using 1200+% of CPU. It looks like it's in garbage collection.  We did reduce our HDFS cache size from 11G to 6G, but other than that, no other parameters were changed. Top shows: top -

Solr 7.1 nodes shutting down

2018-08-10 Thread Joe Obernberger
Hi All - having an issue that seems to be related to the machine being under a high CPU load.  Occasionally a node will fall out of the solr cloud cluster.  It will be using 200% CPU and show the following exception: 2018-08-10 15:36:43.416 INFO  (qtp1908316405-203450) [c:models s:shard3

Re: CloudSolrClient URL Too Long

2018-07-13 Thread Joe Obernberger
Shawn - thank you!  That works great.  Stupid huge searches here I come! -Joe On 7/12/2018 4:46 PM, Shawn Heisey wrote: On 7/12/2018 12:48 PM, Joe Obernberger wrote: Hi - I'm using SolrCloud 7.3.1 and calling a search from Java using: org.apache.solr.client.solrj.response.QueryResponse

CloudSolrClient URL Too Long

2018-07-12 Thread Joe Obernberger
Hi - I'm using SolrCloud 7.3.1 and calling a search from Java using: org.apache.solr.client.solrj.response.QueryResponse response = CloudSolrClient.query(ModifiableSolrParams) If the ModifiableSolrParams are long, I get an error: Bad Message 414reason: URI Too Long I have the maximum number
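A hedged illustration of the usual workaround for 414 errors: send the parameters as a POST body rather than on the URL (in SolrJ this is typically the query overload that takes SolrRequest.METHOD.POST). Host, collection, and query below are placeholders:

curl 'http://host:9100/solr/MY_COLLECTION/select' \
  --data-urlencode 'q=id:(doc1 OR doc2 OR doc3)' \
  --data-urlencode 'rows=100'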

Re: Can't recover - HDFS

2018-07-03 Thread Joe Obernberger
)     at java.lang.Thread.run(Thread.java:748) Thank you very much for the help! -Joe On 7/2/2018 8:32 PM, Shawn Heisey wrote: On 7/2/2018 1:40 PM, Joe Obernberger wrote: Hi All - having this same problem again with a large index in HDFS.  A replica needs to recover, and it just spins retrying over

Re: Solr 7.1.0 - NoNode for /collections

2018-07-02 Thread Joe Obernberger
Just to add to this - looks like the only valid replica that is remaining is a TLOG type, and I suspect that is why it no longer has a leader.  Poop. -Joe On 7/2/2018 7:54 PM, Joe Obernberger wrote: Hi - On startup, I'm getting the following error.  The shard had 3 replicas, but none

Solr 7.1.0 - NoNode for /collections

2018-07-02 Thread Joe Obernberger
Hi - On startup, I'm getting the following error.  The shard had 3 replicas, but none are selected as the leader.  I deleted one and added a new one back, but that had no effect, and at times the calls would time out.  I was having the same issue with another shard on the same collection and

Can't recover - HDFS

2018-07-02 Thread Joe Obernberger
Hi All - having this same problem again with a large index in HDFS.  A replica needs to recover, and it just spins retrying over and over again.  Any ideas?  Is there an adjustable timeout? Screenshot: http://lovehorsepower.com/images/SolrShot1.jpg Thank you! -Joe Obernberger

Exception writing to index; possible analysis error - 7.3.1 - HDFS

2018-06-22 Thread Joe Obernberger
I'm getting an error on some of the nodes in my solr cloud cluster under heavy indexing load.  Once the error happens, that node just repeatedly gets this error over and over and will no longer index documents until a restart.  I believe the root cause of the error is: File

Re: Solr 7 + HDFS issue

2018-06-12 Thread Joe Obernberger
4.liv",   "_4jqm.cfe",           "_4jqm.cfs",   "_4jqm.si",   "_4jqm_1.liv",   "_4jqn.cfe",   "_4jqn.cfs",   "_4jqn.si",   "_4jqn_2.liv",   "_4jqr

Solr 7 + HDFS issue

2018-06-11 Thread Joe Obernberger
We are seeing an issue on our Solr Cloud 7.3.1 cluster where replication starts and pegs network interfaces so aggressively that other tasks cannot talk.  We will see it peg a bonded 2GB interface.  In some cases the replication fails over and over until it finally succeeds and the replica

Re: Solr OOM Crashes / JVM tuning advice

2018-04-11 Thread Joe Obernberger
Just as a side note, when Solr goes OOM and kills itself, and if you're running HDFS, you are guaranteed to have write.lock files left over.  If you're running lots of shards/replicas, you may have many files that you need to go into HDFS and delete before restarting. -Joe On 4/11/2018

Solr7.1.0 - deleting collections when using HDFS

2018-04-10 Thread Joe Obernberger
Hi All - I've noticed that if I delete a collection that is stored in HDFS, the files/directory in HDFS remain.  If I then try to recreate the collection with the same name, I get an error about unable to open searcher.  If I then remove the directory from HDFS, the error remains due to files

Re: Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-05 Thread Joe Obernberger
ode it is creating a matrix with a row for each document and a column for each feature. This can get large quite quickly. By choosing fewer features you can make this matrix much smaller. Its fairly easy to make the train function work on a random sample of the training set on each iteration rather then the en

Re: Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-05 Thread Joe Obernberger
, Joe Obernberger wrote: Thank you Shawn - sorry so long to respond, been playing around with this a good bit.  It is an amazing capability.  It looks like it could be related to certain nodes in the cluster not responding quickly enough.  In one case, I got the concurrent.ExecutionException

Re: Largest number of indexed documents used by Solr

2018-04-05 Thread Joe Obernberger
50 billion per day?  Wow!  How large are these documents? We have a cluster with one large collection that contains 2.4 billion documents spread across 40 machines using HDFS for the index.  We store our data inside of HBase, and in order to re-index data we pull from HBase and index with

Re: Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-05 Thread Joe Obernberger
(RetryExec.java:89)     at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)     at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)     at org.apache.http.impl.clie

Re: Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-02 Thread Joe Obernberger
from http://vesta:9100/solr/MODEL1024_1522696624083_shard20_replica_n75 reporting any issues? When you go to that url is it back up and running? Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Apr 2, 2018 at 3:55 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Hi All

Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-02 Thread Joe Obernberger
ache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)     ... 1 more Last Check: 4/2/2018, 3:47:15 PM Thank you! -Joe Obernberger

Re: Frequency of Full reindex on SolrCloud

2018-01-02 Thread Joe Obernberger
Almost never.  I would only run a re-index for newer versions (such as 6.5.2 to 7.2) that have a required feature or schema changes such as changing the type of an existing field (int to string for example).  Not sure what you mean by 'every delta', but I would assume you just mean new data? 

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-11 Thread Joe Obernberger
replicas can take over leadership if the leader goes down so they must have an up-to-date-after-last-index-sync set of tlogs. At least that's my current understanding... Best, Erick On Fri, Dec 8, 2017 at 12:01 PM, Joe Obernberger <joseph.obernber...@gmail.com> wrote: Anyone have any though

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-08 Thread Joe Obernberger
Anyone have any thoughts on this?  Will TLOG replicas use less network bandwidth? -Joe On 12/4/2017 12:54 PM, Joe Obernberger wrote: Hi All - this same problem happened again, and I think I partially understand what is going on.  The part I don't know is what caused any of the replicas

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-04 Thread Joe Obernberger
very". The smoking gun here is that there are no errors on the follower, just the notification that the leader put it into recovery. There are other variations on the theme, it all boils down to when communications fall apart replicas go into recovery. Best, Erick On Wed, Nov 22, 2

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-27 Thread Joe Obernberger
.  Anyone else run into this? Thanks. -Joe On 11/27/2017 11:28 AM, Joe Obernberger wrote: Thank you Erick.  Right now, we have our autoCommit time set to 180 (30 minutes), and our autoSoftCommit set to 12.  The thought was that with HDFS we want less frequent, but larger operations

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-27 Thread Joe Obernberger
ot;. The smoking gun here is that there are no errors on the follower, just the notification that the leader put it into recovery. There are other variations on the theme, it all boils down to when communications fall apart replicas go into recovery. Best, Erick On Wed, Nov 22, 2017 at 11:02

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-22 Thread Joe Obernberger
: On 11/22/2017 6:44 AM, Joe Obernberger wrote: Right now, we have a relatively small block cache due to the requirements that the servers run other software.  We tried to find the best balance between block cache size, and RAM for programs, while still giving enough for local FS cache.  This came out

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-22 Thread Joe Obernberger
isk of finally losing the data quite a bit. So I would try looking into the code and figure out what the problem is here and maybe compare the state in HDFS and ZK with a shard that works. regards, Hendrik On 21.11.2017 23:57, Joe Obernberger wrote: Hi Hendrik - the shards in question ha

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-22 Thread Joe Obernberger
looking into the code and figure out what the problem is here and maybe compare the state in HDFS and ZK with a shard that works. regards, Hendrik On 21.11.2017 23:57, Joe Obernberger wrote: Hi Hendrick - the shards in question have three replicas.  I tried restarting each one (one by one)

FORCELEADER not working - solr 6.6.1

2017-11-21 Thread Joe Obernberger
Hi All - sorry for the repeat, but I'm at a complete loss on this.  I have a collection with 100 shards and 3 replicas each.  6 of the shards will not elect a leader.  I've tried the FORCELEADER command, but nothing changes. The log shows 'Force leader attempt 1.  Waiting 5 secs for an active
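For reference, the FORCELEADER command form being attempted (host, collection, and shard names are placeholders):

curl 'http://host:9100/solr/admin/collections?action=FORCELEADER&collection=MY_COLLECTION&shard=shard1'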

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
so far was those lock files and if you delete and recreate collections/cores and it sometimes happens that the data was not cleaned up in HDFS and then causes a conflict. Hendrik On 21.11.2017 21:07, Joe Obernberger wrote: We've never run an index this size in anything but HDFS, so I have

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
and it sometimes happens that the data was not cleaned up in HDFS and then causes a conflict. Hendrik On 21.11.2017 21:07, Joe Obernberger wrote: We've never run an index this size in anything but HDFS, so I have no comparison.  What we've been doing is keeping two main collections - all data

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
ctions/cores and it sometimes happens that the data was not cleaned up in HDFS and then causes a conflict. Hendrik On 21.11.2017 21:07, Joe Obernberger wrote: We've never run an index this size in anything but HDFS, so I have no comparison. What we've been doing is keeping two main collections

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
lly recovers pretty well. regards, Hendrik On 21.11.2017 20:12, Joe Obernberger wrote: We set the hard commit time long because we were having performance issues with HDFS, and thought that since the block size is 128M, having a longer hard commit made sense.  That was our hypothesis anyw

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
Keeper and then delete all lock files that belong to the node that I'm starting. regards, Hendrik On 21.11.2017 14:07, Joe Obernberger wrote: Hi All - we have a system with 45 physical boxes running solr 6.6.1 using HDFS as the index. The current index size is about 31TBytes. With 3x replication that

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
logic to my Solr start up script which scans the log files in HDFS and compares that with the state in ZooKeeper and then delete all lock files that belong to the node that I'm starting. regards, Hendrik On 21.11.2017 14:07, Joe Obernberger wrote: Hi All - we have a system with 45 physical boxes

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
ory factory has some parameters to tweak that are a mystery to me... All in all, this is behavior that I find mystifying. Best, Erick On Tue, Nov 21, 2017 at 5:07 AM, Joe Obernberger <joseph.obernber...@gmail.com> wrote: Hi All - we have a system with 45 physical boxes running solr 6.6.1 us

Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
Hi All - we have a system with 45 physical boxes running solr 6.6.1 using HDFS as the index.  The current index size is about 31TBytes.  With 3x replication that takes up 93TBytes of disk. Our main collection is split across 100 shards with 3 replicas each.  The issue that we're running into

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Joe Obernberger
Very nice article - thank you!  Is there a similar article available when the index is on HDFS?  Sorry to hijack!  I'm very interested in how we can improve cache/general performance when running with HDFS. -Joe On 9/18/2017 11:35 AM, Erick Erickson wrote: This is suspicious too. Each

Re: Machine Learning for search

2017-08-23 Thread Joe Obernberger
to classify the whole result set. In this scenario the search engine ranking will already be returning relevant candidate documents and the model is only used to get a better ordering of the top docs. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Aug 22, 2017 at 12:32 PM, Joe Obernberger

Machine Learning for search

2017-08-22 Thread Joe Obernberger
Hi All - One of the really neat features of solr 6 is the ability to create machine learning models (information gain) and then use those models as a query.  If I want a user to be able to execute a query for the text Hawaii and use a machine learning model related to weather data, how can I

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
:00.93 java Note that the OS didn't actually give PID 29566 80G of memory, it actually gave it 275m. Right? Thanks again! -Joe On 8/18/2017 4:15 PM, Shawn Heisey wrote: On 8/18/2017 1:05 PM, Joe Obernberger wrote: Thank you Shawn. Please see: http://www.lovehorsepower.com/Vesta for screen

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
f the app - just using it to see the monitoring isn't half as useful. On Fri, Aug 18, 2017 at 3:31 PM, Joe Obernberger <joseph.obernber...@gmail.com <mailto:joseph.obernber...@gmail.com>> wrote: Hi Walter - I see what you are saying, but the machine is not activel

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
see a server with 100Gb of memory and processes (java and jsvc) using 203Gb of virtual memory. Hmm. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On Aug 18, 2017, at 12:05 PM, Joe Obernberger <joseph.obernber...@gmail.com> wrote: Thank you

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
usage stayed low for a while, but then eventually comes up to ~800% where it will stay. Please let me know if there is other information that I can provide, or what I should be looking for in the GC logs. Thanks! -Joe On 8/18/2017 2:25 PM, Shawn Heisey wrote: On 8/18/2017 10:37 AM, Joe

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
. On Fri, Aug 18, 2017 at 12:37 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Indexing about 15 million documents per day across 100 shards on 45 servers. Up until about 350 million documents, each of the solr instances was taking up about 1 core (100% CPU). Recently, they all

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
spent? It can be very helpful for debugging this sort of problem. On Fri, Aug 18, 2017 at 12:37 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Indexing about 15 million documents per day across 100 shards on 45 servers. Up until about 350 million documents, each of the solr ins

Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
Indexing about 15 million documents per day across 100 shards on 45 servers. Up until about 350 million documents, each of the solr instances was taking up about 1 core (100% CPU). Recently, they all jumped to 700%. Is this normal? Anything that I can check for? I don't see anything

Re: Classify stream expression questions

2017-08-14 Thread Joe Obernberger
mber of records to process I would recommend batch processing. This blog explains the parallel batch framework: http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-para llel-etl-and.html Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Aug 14, 2017 at 7:53 PM, Joe Obernberger < jo

Classify stream expression questions

2017-08-14 Thread Joe Obernberger
Hi All - I'm using the classify stream expression and the results returned are always limited to 1,000. Where do I specify the number to return? The stream expression that I'm using looks like:

Re: Solr 6.6.0 - Indexing errors

2017-07-18 Thread Joe Obernberger
n ex) { System.out.println("Error writting: "+ex); } } } Then I copied the files to the 45 servers and restarted solr 6.6.0 on each. It came back up OK, and it has been indexing all night long. -Joe On 7/17/2017 3:15 PM, Erick Eri

Short Circuit Reads -

2017-07-18 Thread Joe Obernberger
Hi All - does SolrCloud support using Short Circuit Reads when using HDFS? Thanks! -Joe

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
ok at one Solr log from each shard to see whether this is an issue. Best, Erick On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger <joseph.obernber...@gmail.com> wrote: So far we've indexed about 46 million documents, but over the weekend, these errors started coming up. I would expect that

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
y you'll just to look at one Solr log from each shard to see whether this is an issue. Best, Erick On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger <joseph.obernber...@gmail.com> wrote: So far we've indexed about 46 million documents, but over the weekend, these errors started coming up. I w

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
. This would confirm if there is basic issue with indexing / cluster setup. On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Some more info: When I stop all the indexers, in about 5-10 minutes the cluster goes all green. When I start just one i

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
? Thank you! -Joe On 7/17/2017 8:36 AM, Joe Obernberger wrote: We've been indexing data on a 45 node cluster with 100 shards and 3 replicas, but our indexing processes have been stopping due to errors. On the server side the error is "Error logging add". Stack trace: 2017-07-17 12

Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
We've been indexing data on a 45 node cluster with 100 shards and 3 replicas, but our indexing processes have been stopping due to errors. On the server side the error is "Error logging add". Stack trace: 2017-07-17 12:29:24.057 INFO (qtp985934102-5161548) [c:UNCLASS s:shard58

Re: Auto commit Error - Solr Cloud 6.6.0 with HDFS

2017-07-14 Thread Joe Obernberger
) at java.lang.Thread.run(Thread.java:748) The whole log can be found here: http://lovehorsepower.com/solr.log the GC log is here: http://lovehorsepower.com/solr_gc.log.3.current -Joe On 7/12/2017 9:25 AM, Shawn Heisey wrote: On 7/12/2017 7:14 AM, Joe Obernberger wrote: Started up a 6.6.0

Re: NullPointerException on openStreams

2017-07-14 Thread Joe Obernberger
nctionName("count", CountMetric.class) .withFunctionName("facet", FacetStream.class) .withFunctionName("sum", SumMetric.class) .withFunctionName("unique", UniqueStream.class) .withFunctionName("uniq", UniqueMetric.class) .w

Solr 6.6.0 - Deleting Collections - HDFS

2017-07-14 Thread Joe Obernberger
When I delete a collection, it is gone from the GUI, but the directory is not removed from HDFS. The directory is empty, but the entry is still there. Is this expected? As shown below all the MODEL1007_* collections have been deleted. hadoop fs -du -s -h /solr6.6.0/* 3.3 G 22.7 G
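A hedged sketch of removing the leftover directories by hand, using the path prefix from the listing above (MODEL1007_* matches the deleted collections; -skipTrash bypasses the HDFS trash):

hadoop fs -rm -r -skipTrash /solr6.6.0/MODEL1007_*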

Re: NullPointerException on openStreams

2017-07-13 Thread Joe Obernberger
.withFunctionName("facet", FacetStream.class) .withFunctionName("sum", SumMetric.class) .withFunctionName("unique", UniqueStream.class) .withFunctionName("uniq", UniqueMetric.class) .withFunctionName("innerJoin", InnerJoinStr
