Re: SolrCloud indexing triggers merges and timeouts

2019-07-12 Thread Rahul Goswami
Upon further investigation on this issue, I see the below log lines during the indexing process: 2019-06-06 22:24:56.203 INFO (qtp1169794610-5652) [c:UM_IndexServer_MailArchiv_Spelle_66AC8340-4734-438A-9D1D-A84B659B1623 s:shard22 r:core_node87

Re: SolrCloud indexing triggers merges and timeouts

2019-07-05 Thread Rahul Goswami
Shawn, Erick, Thank you for the explanation. The merge scheduler params make sense now. Thanks, Rahul On Wed, Jul 3, 2019 at 11:30 AM Erick Erickson wrote: > Two more tidbits to add to Shawn’s explanation: > > There are heuristics built in to ConcurrentMergeScheduler. > From the Javadocs: > *

Re: SolrCloud indexing triggers merges and timeouts

2019-07-03 Thread Erick Erickson
Two more tidbits to add to Shawn’s explanation: There are heuristics built in to ConcurrentMergeScheduler. From the Javadocs: If it's an SSD, {@code maxThreadCount} is set to {@code max(1, min(4, cpuCoreCount/2))}, otherwise 1. Note that detection only currently works on Linux; other
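The heuristic quoted from the Javadocs can be worked through with a small sketch. This is only an illustration of the quoted formula, not Lucene's code: the function name `default_max_thread_count` and the explicit `is_ssd` flag are assumptions made for the example (the quoted text says SSD detection currently works only on Linux, so the flag is passed in rather than detected).

```python
def default_max_thread_count(cpu_core_count: int, is_ssd: bool) -> int:
    """Illustrate the quoted heuristic: max(1, min(4, cpuCoreCount/2)) on SSD, else 1.

    Hypothetical helper for illustration only; it is not part of Lucene or Solr.
    """
    if not is_ssd:
        # Anything not detected as an SSD gets a single merge thread.
        return 1
    # Integer division mirrors Java's int arithmetic in the quoted formula.
    return max(1, min(4, cpu_core_count // 2))


# Worked values for a few machine sizes.
for cores in (1, 2, 6, 16):
    print(cores, default_max_thread_count(cores, is_ssd=True))
```

Under this formula the value never exceeds 4, however many CPU cores are available, and it is always 1 when the index is not treated as being on an SSD.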

Re: SolrCloud indexing triggers merges and timeouts

2019-07-03 Thread Shawn Heisey
On 7/2/2019 10:53 PM, Rahul Goswami wrote: Hi Shawn, Thank you for the detailed suggestions. Although, I would like to understand the maxMergeCount and maxThreadCount params better. The documentation

Re: SolrCloud indexing triggers merges and timeouts

2019-07-02 Thread Rahul Goswami
Hi Shawn, Thank you for the detailed suggestions. Although, I would like to understand the maxMergeCount and maxThreadCount params better. The documentation mentions that maxMergeCount : The maximum number

Re: SolrCloud indexing triggers merges and timeouts

2019-06-13 Thread Shawn Heisey
On 6/6/2019 9:00 AM, Rahul Goswami wrote: *OP Reply* : Total 48 GB per node... I couldn't see another software using a lot of memory. I am honestly not sure about the reason for change of directory factory to SimpleFSDirectoryFactory. But I was told that with mmap at one point we started to see

Re: SolrCloud indexing triggers merges and timeouts

2019-06-12 Thread Rahul Goswami
Updating the thread with further findings: So it turns out that the nodes hosting Solr are VMs with virtual disks. Additionally, a Windows system process (the infamous PID 4) is hogging a lot of disk. This is indicated by disk response times in excess of 100 ms and a disk drive queue length of 5

Re: SolrCloud indexing triggers merges and timeouts

2019-06-06 Thread Rahul Goswami
Thank you for your responses. Please find additional details about the setup below: We are using Solr 7.2.1 > I have a solrcloud setup on Windows server with below config: > 3 nodes, > 24 shards with replication factor 2 > Each node hosts 16 cores. 16 CPU cores, or 16 Solr cores? The info may

Re: SolrCloud indexing triggers merges and timeouts

2019-06-05 Thread Shawn Heisey
On 6/5/2019 9:39 AM, Rahul Goswami wrote: I have a solrcloud setup on Windows server with below config: 3 nodes, 24 shards with replication factor 2 Each node hosts 16 cores. 16 CPU cores, or 16 Solr cores? The info may not be all that useful either way, but just in case, it should be

Re: SolrCloud indexing triggers merges and timeouts

2019-06-05 Thread Walter Underwood
Yes, set Xmx and Xms the same. We run an 8 GB heap for all our clusters. Unless you are doing some really memory-intensive stuff like faceting, 8 GB should be fine. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jun 5, 2019, at 1:05 PM, Gus Heck

Re: SolrCloud indexing triggers merges and timeouts

2019-06-05 Thread Gus Heck
Probably not a solution, but something I notice off the bat... generally you want Xmx and Xms set to the same value so the JVM doesn't have to spend time asking for more and more memory, and also to reduce the chance that the memory is not available by the time Solr needs it. On Wed, Jun 5, 2019,

Re: SolrCloud indexing

2018-04-16 Thread Erick Erickson
ment. > > Regards, > Moshe Recanati > CTO > Mobile + 972-52-6194481 > Skype: recanati > > More at: www.kmslh.com | LinkedIn | FB > > > -Original Message- > From: Shawn Heisey <apa...@elyograg.org> > Sent: Sunday, April 15, 2018 8:23 PM >

RE: SolrCloud indexing

2018-04-15 Thread Moshe Recanati | KMS
Sunday, April 15, 2018 8:23 PM To: solr-user@lucene.apache.org Subject: Re: SolrCloud indexing On 4/15/2018 1:22 AM, Moshe Recanati | KMS wrote: > > We’re using SolrCloud as part of our product solution for High > Availability. > > During upgrade of a version we need to run full index

RE: SolrCloud indexing

2018-04-15 Thread Moshe Recanati | KMS
r-user <solr-user@lucene.apache.org> Subject: Re: SolrCloud indexing I think you're saying you want to prove out the upgrade in some kind of test setup then switch live traffic. What's commonly used for that is collection aliasing. You just create a new collection and populate it and check i

Re: SolrCloud indexing

2018-04-15 Thread Erick Erickson
I think you're saying you want to prove out the upgrade in some kind of test setup then switch live traffic. What's commonly used for that is collection aliasing. You just create a new collection and populate it and check it out. When you're satisfied that it's doing what you want, use the
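The aliasing approach described above can be sketched as a call to the Collections API `CREATEALIAS` action. The snippet below only builds the request URL and does not contact a server. The base URL, the alias name `products`, and the collection name `products_v2` are hypothetical values chosen for the example.

```python
from urllib.parse import urlencode


def create_alias_url(base_url: str, alias: str, collection: str) -> str:
    """Build a Collections API CREATEALIAS request URL (no network call is made)."""
    params = urlencode({
        "action": "CREATEALIAS",
        "name": alias,              # the name clients keep querying
        "collections": collection,  # the collection the alias should point to
    })
    return f"{base_url}/admin/collections?{params}"


# Hypothetical names: once the newly built collection checks out,
# point the alias at it and live traffic follows.
url = create_alias_url("http://localhost:8983/solr", "products", "products_v2")
print(url)
```

Because clients address the alias rather than a concrete collection, the switch needs no client-side change.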

Re: SolrCloud indexing

2018-04-15 Thread Shawn Heisey
On 4/15/2018 1:22 AM, Moshe Recanati | KMS wrote: We’re using SolrCloud as part of our product solution for High Availability. During upgrade of a version we need to run full index build on our Solr data. What are you upgrading?  If it's Solr, you should pause/stop indexing while you

Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the same nodes possible?

2017-08-30 Thread Susheel Kumar
? In my case, I am quite sure that the collections will > never be queried simultaneously. So will the "running but idle" collection > slow me down? > > Johannes > > -Original Message- > From: Susheel Kumar [mailto:susheel2...@gmail.com] > Sent: Wed

Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the same nodes possible?

2017-08-30 Thread Johannes Knaus
om] Sent: Wednesday, 30 August 2017 17:36 To: solr-user@lucene.apache.org Subject: Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the same nodes possible? Yes, absolutely. You can create as many collections as you need (like you would create tables in the relational world). On Wed,

Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the same nodes possible?

2017-08-30 Thread Susheel Kumar
Yes, absolutely. You can create as many collections as you need (like you would create tables in the relational world). On Wed, Aug 30, 2017 at 10:13 AM, Johannes Knaus wrote: > I have a working SolrCloud-Setup with 38 nodes with a collection spanning > over these nodes with 2

Re: SolrCloud indexing

2015-05-12 Thread Bill Au
Thanks for the reply. Actually in our case we want the timestamp to be populated locally on each node in the SolrCloud cluster. We want to see if there is any delay in the document being distributed within the cluster. Just want to confirm that the timestamp can be used for that purpose. Bill

Re: SolrCloud indexing

2015-05-09 Thread Shawn Heisey
On 5/9/2015 8:41 PM, Bill Au wrote: Is the behavior of document being indexed independently on each node in a SolrCloud cluster new in 5.x or is that true in 4.x also? If the document is indexed independently on each node, then if I query the document from each node directly, a timestamp

Re: SolrCloud indexing

2015-05-09 Thread Bill Au
Is the behavior of document being indexed independently on each node in a SolrCloud cluster new in 5.x or is that true in 4.x also? If the document is indexed independently on each node, then if I query the document from each node directly, a timestamp could hold different values since the

Re: SolrCloud indexing

2015-05-08 Thread Vincenzo D'Amore
I have just added a comment to the CWiki. Thanks again for your prompt answer Erick. Best, Vincenzo On Fri, May 8, 2015 at 12:39 AM, Erick Erickson erickerick...@gmail.com wrote: bq: ...forwards the index notation to itself and any replicas... That's just odd phrasing. All that means is

Re: SolrCloud indexing

2015-05-07 Thread Vincenzo D'Amore
Thanks Shawn. Just to make the picture clearer, I'm trying to understand why a 3 node SolrCloud cluster and an old-style Solr server take the same time to index the same documents. But the wiki says: If the machine is a leader, SolrCloud determines which shard the document should go to,

Re: SolrCloud indexing

2015-05-07 Thread Vincenzo D'Amore
Thanks Erick. I'm not sure I got your answer. Let me try to recap: when the raw document has to be indexed, it will be forwarded to the shard leader. The shard leader indexes the document for that shard, and then forwards the indexed document to any replicas. I just want to be sure that when the raw document is

Re: SolrCloud indexing

2015-05-07 Thread Erick Erickson
bq: ...forwards the index notation to itself and any replicas... That's just odd phrasing. All that means is that the document is sent through the indexing process on the leader and all followers for a shard and is indexed independently on each. This is as opposed to the old master/slave situation

Re: SolrCloud indexing

2015-05-07 Thread Shawn Heisey
On 5/7/2015 3:04 AM, Vincenzo D'Amore wrote: Thanks Erick. I'm not sure I got your answer. I try to recap, when the raw document has to be indexed, it will be forwarded to shard leader. Shard leader indexes the document for that shard, and then forwards the indexed document to any replicas.

Re: SolrCloud indexing

2015-05-05 Thread Erick Erickson
bq: Does it mean that all the indexing is done by the leaders in one node? no. The raw document is forwarded from the leader to the replica and it's indexed on all the nodes. The leader has a little bit of extra work to do routing the docs, but that's it. Shouldn't be a problem with 3 shards.
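The flow described above (the leader routes the raw document, and every replica indexes it independently) can be modelled with a toy sketch. Nothing here is Solr code: the `Replica` and `Shard` classes, the `route` function, and the use of `zlib.crc32` as the routing hash are stand-ins assumed for illustration, and the "indexing" step is just lowercasing and splitting the text.

```python
import zlib


class Replica:
    """Toy replica that builds its own index entry from the raw document."""

    def __init__(self, name: str):
        self.name = name
        self.index = {}

    def index_document(self, raw_doc: dict) -> None:
        # Each replica analyzes the raw document itself; no prebuilt index is copied.
        self.index[raw_doc["id"]] = raw_doc["text"].lower().split()


class Shard:
    """Toy shard whose first replica acts as the leader."""

    def __init__(self, name: str, replica_count: int):
        self.replicas = [Replica(f"{name}_replica{i}") for i in range(1, replica_count + 1)]
        self.leader = self.replicas[0]

    def add(self, raw_doc: dict) -> None:
        # The raw document goes to the leader and every other replica alike.
        for replica in self.replicas:
            replica.index_document(dict(raw_doc))


def route(doc_id: str, shards: list) -> Shard:
    # Stand-in hash (crc32) chosen for the sketch; it is not Solr's router.
    return shards[zlib.crc32(doc_id.encode("utf-8")) % len(shards)]


shards = [Shard(f"shard{i}", replica_count=2) for i in range(1, 4)]
doc = {"id": "doc-1", "text": "Hello SolrCloud"}
target = route(doc["id"], shards)
target.add(doc)
```

In this model the leader's only extra work is the routing step, which matches the point made in the message: indexing cost is paid on every replica, not only on the leader.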

Re: solrcloud indexing completed event

2014-07-01 Thread Giovanni Bricconi
Thank you Erick, Fortunately I can modify the data feeding process to start my post-indexing tasks. 2014-06-30 22:13 GMT+02:00 Erick Erickson erickerick...@gmail.com: The paradigm is different. In SolrCloud when a client sends an indexing request to any node in the system, when the

Re: solrcloud indexing completed event

2014-06-30 Thread Erick Erickson
The paradigm is different. In SolrCloud when a client sends an indexing request to any node in the system, when the response comes back all the nodes (leaders, followers, etc) have _all_ received the update and processed it. So you don't have to care in the same way. As far as different segments,

RE: SolrCloud Indexing question

2013-08-07 Thread Kalyan Kuram
Thank you so much for the suggestion. Is the same recommended for querying too? I found it very slow when I query using CloudSolrServer. Kalyan Date: Tue, 6 Aug 2013 13:25:37 -0600 From: s...@elyograg.org To: solr-user@lucene.apache.org Subject: Re: SolrCloud Indexing question On 8/6/2013

Re: SolrCloud Indexing question

2013-08-06 Thread Shawn Heisey
On 8/6/2013 12:55 PM, Kalyan Kuram wrote: Hi All, I need suggestion on how to send indexing commands to 2 different Solr servers. Basically I want to mirror my index. Here is the scenario: I have 2 clusters, each cluster has one master and 2 slaves with external ZooKeeper in the front. I need suggestion

RE: SolrCloud indexing blocks if node is recovering

2012-11-06 Thread Markus Jelsma
https://issues.apache.org/jira/browse/SOLR-4038 Still trying to gather the logs -Original message- From:Mark Miller markrmil...@gmail.com Sent: Sat 03-Nov-2012 14:17 To: Markus Jelsma markus.jel...@openindex.io Cc: solr-user@lucene.apache.org Subject: Re: SolrCloud indexing

RE: SolrCloud indexing blocks if node is recovering

2012-11-03 Thread Markus Jelsma
@lucene.apache.org Subject: Re: SolrCloud indexing blocks if node is recovering Doesn't sound right. Still have the logs? - Mark On Fri, Nov 2, 2012 at 9:45 AM, Markus Jelsma markus.jel...@openindex.io wrote: Hi, We just tested indexing some million docs from Hadoop to a 10 node 2 rep

Re: SolrCloud indexing blocks if node is recovering

2012-11-03 Thread Mark Miller
next monday. I assume you're not too interested in the OOM machine but all surrounding nodes that blocked instead? -Original message- From:Mark Miller markrmil...@gmail.com Sent: Sat 03-Nov-2012 03:14 To: solr-user@lucene.apache.org Subject: Re: SolrCloud indexing blocks if node

Re: SolrCloud indexing blocks if node is recovering

2012-11-02 Thread Mark Miller
Doesn't sound right. Still have the logs? - Mark On Fri, Nov 2, 2012 at 9:45 AM, Markus Jelsma markus.jel...@openindex.io wrote: Hi, We just tested indexing some million docs from Hadoop to a 10 node 2 rep SolrCloud cluster with this week's trunk. One of the nodes gave an OOM but indexing

Re: SolrCloud indexing question

2012-04-20 Thread Jamie Johnson
my understanding is that you can send your updates/deletes to any shard and they will be forwarded to the leader automatically. That being said your leader will always be the place where the index happens and then distributed to the other replicas. On Fri, Apr 20, 2012 at 7:54 AM, Darren Govoni

Re: SolrCloud indexing question

2012-04-20 Thread Darren Govoni
Gotcha. Now does that mean if I have 5 threads all writing to a local shard, will that shard piggyback those index requests onto a SINGLE connection to the leader? Or will they spawn 5 connections from the shard to the leader? I really hope the former; the latter won't scale well. On Fri,

Re: SolrCloud indexing question

2012-04-20 Thread Jamie Johnson
I believe the SolrJ code round robins which server the request is sent to and as such probably wouldn't send to the same server in your case, but if you had an HttpSolrServer for instance and were pointing to only one particular instance my guess would be that would be 5 separate requests from the