Re: 2 Async exceptions during distributed update issue...

2019-11-19 Thread Erick Erickson
core_node30, shard=s:shard15} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: 2 Async exceptions during distributed update: 10.YYY.40.62:8983 failed to respond 10.YYY.40.62:8983 failed to respond 2019-11-14T17:45:48,821 - ERROR [qtp8762

2 Async exceptions during distributed update issue...

2019-11-19 Thread Fiz N
rd15} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: 2 Async exceptions during distributed update: 10.YYY.40.62:8983 failed to respond 10.YYY.40.62:8983 failed to respond 2019-11-14T17:45:48,821 - ERROR [qtp876213901-202:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard7_replica1, node_name=
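The exception message names the hosts that did not answer, which may differ from the node that logged the error. A minimal sketch of pulling those hosts out of a log line follows; the `failed_hosts` helper and its regular expression are assumptions written for this excerpt, not part of Solr.

```python
import re
from collections import Counter

# Hypothetical pattern: a host:port token followed by "failed to respond".
FAILED_RE = re.compile(r"(\S+:\d+) failed to respond")


def failed_hosts(log_line):
    """Count how often each host:port is reported as not responding."""
    return Counter(FAILED_RE.findall(log_line))


# Excerpt taken from the log snippet above.
line = (
    "2 Async exceptions during distributed update: "
    "10.YYY.40.62:8983 failed to respond 10.YYY.40.62:8983 failed to respond"
)
hosts = failed_hosts(line)
```

Running this over a whole log file would show whether the failures cluster on a single remote host.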

Fwd: 2 Async exceptions during distributed update

2019-11-18 Thread Fiz N
there are 64 cores that are sending batches concurrently. *I am getting the following error when the batch size is 1000. Please advise.*

Re: 2 Async exceptions during distributed update

2019-11-16 Thread Fiz N
there are 64 cores that are sending batches concurrently. *I am getting the following error when the batch size is 1000. Please advise.*

Re: 2 Async exceptions during distributed update

2019-11-16 Thread Jörn Franke
following error when the batch size is 1000. Please advise.* 2019-11-14T17:45:44,646 - ERROR [qtp876213901-1169:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard15_replica1, node_name=n:10.YYY.4

Re: 2 Async exceptions during distributed update

2019-11-16 Thread Jörn Franke
worked fine without error and if the batch size is 1000 the following error occurs. there are 64 cores that are sending batches concurrently. *I am getting the following error when the batch size is 1000. Plea
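One way to test whether batch size is the trigger is to split the documents into smaller batches before sending them. A minimal sketch, assuming the documents are held in a Python list: `make_batches` and the size of 250 are hypothetical, and the actual send call depends on the client library in use, so it is left out.

```python
def make_batches(docs, batch_size):
    """Yield consecutive slices of docs, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


docs = [{"id": str(i)} for i in range(1000)]  # illustrative documents only
batches = list(make_batches(docs, 250))       # 250 is an arbitrary smaller size
```

With many senders running concurrently, the batch size and the number of senders both affect the load on the receiving nodes, so it may help to vary one at a time.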

Re: 2 Async exceptions during distributed update

2019-11-16 Thread Fiz N
s 1000 the following error occurs. there are 64 cores that are sending batches concurrently. *I am getting the following error when the batch size is 1000. Please advise.*

Re: 2 Async exceptions during distributed update

2019-11-15 Thread Fiz N
sending batches concurrently. *I am getting the following error when the batch size is 1000. Please advise.* 2019-11-14T17:45:44,646 - ERROR [qtp876213901-1169:SolrException@YYY] - {collection=c:

Re: 2 Async exceptions during distributed update

2019-11-14 Thread Jörn Franke
the following error when the batch size is 1000. Please advise.* 2019-11-14T17:45:44,646 - ERROR [qtp876213901-1169:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard15_replica1, node_name=n:10.YYY.40.81:8983_solr, replica=r:core_node30, shard=s:shard1

2 Async exceptions during distributed update

2019-11-14 Thread Fiz N
ucts, core=x:ducts_shard15_replica1, node_name=n:10.YYY.40.81:8983_solr, replica=r:core_node30, shard=s:shard15} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: 2 Async exceptions during distributed update: 10.YYY.40.62:8983 failed to respond 10

Re: Async exceptions during distributed update

2018-05-14 Thread Jay Potharaju
Adding some more context to my last email. Solr: 6.6.3; 2 nodes, 3 shards each; no replication. Can someone answer the following questions? 1) Any ideas on why the following errors keep happening? AFAIK the streaming Solr client error is because of timeouts when connecting to other nodes. Async

Re: Async exceptions during distributed update

2018-05-13 Thread Jay Potharaju
Hi, I restarted both my Solr servers but I am seeing the async error again. In the older 5.x version of SolrCloud, Solr would normally recover gracefully in case of network errors, but Solr 6.6.3 does not seem to be doing that. At this time I am not doing only a small percentage of deletebyquery

Re: Async exceptions during distributed update

2018-05-09 Thread Emir Arnautović
Hi Jay, A network blip might be the cause, but also the consequence of this issue. Maybe you can try avoiding DBQ while indexing and see if it is the cause. You can do a thread dump on “the other” node and see if there are blocked threads; that can give you more clues about what’s going on. Thanks,

Re: Async exceptions during distributed update

2018-05-08 Thread Jay Potharaju
Hi Emir, I was seeing this error as long as the indexing was running. Once I stopped the indexing the errors also stopped. Yes, we do monitor both hosts & solr but have not seen anything out of the ordinary except for a small network blip. In my experience solr generally recovers after a network

Re: Async exceptions during distributed update

2018-05-08 Thread Emir Arnautović
Hi Jay, This is low ingestion rate. What is the size of your index? What is heap size? I am guessing that this is not a huge index, so I am leaning toward what Shawn mentioned - some combination of DBQ/merge/commit/optimise that is blocking indexing. Though, it is strange that it is happening

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
I have about 3-5 updates per second. > On May 7, 2018, at 5:02 PM, Shawn Heisey wrote: > >> On 5/7/2018 5:05 PM, Jay Potharaju wrote: >> There are some deletes by query. I have not had any issues with DBQ, >> currently have 5.3 running in production. > > Here's the big

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
Thanks for explaining that Shawn! Emir, I use php library called solarium to do updates/deletes to solr. The request is sent to any of the available nodes in the cluster. > On May 7, 2018, at 5:02 PM, Shawn Heisey wrote: > >> On 5/7/2018 5:05 PM, Jay Potharaju wrote: >>

Re: Async exceptions during distributed update

2018-05-07 Thread Shawn Heisey
On 5/7/2018 5:05 PM, Jay Potharaju wrote: > There are some deletes by query. I have not had any issues with DBQ, > currently have 5.3 running in production. Here's the big problem with DBQ.  Imagine this sequence of events with these timestamps: 13:00:00: A commit for change visibility happens.

Re: Async exceptions during distributed update

2018-05-07 Thread Emir Arnautović
How many concurrent updates can be sent? Do you always send updates to the same node? Do you use solrj? Emir On Tue, May 8, 2018, 1:02 AM Jay Potharaju wrote: > The updates are pushed in real time not batched. No complex analysis and > everything is committed using

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
There are some deletes by query. I have not had any issues with DBQ, currently have 5.3 running in production. Thanks Jay Potharaju On Mon, May 7, 2018 at 4:02 PM, Jay Potharaju wrote: > The updates are pushed in real time not batched. No complex analysis and >

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
The updates are pushed in real time not batched. No complex analysis and everything is committed using autocommit settings in solr. Thanks Jay Potharaju On Mon, May 7, 2018 at 4:00 PM, Emir Arnautović < emir.arnauto...@sematext.com> wrote: > How do you send documents? Large batches? Complex

Re: Async exceptions during distributed update

2018-05-07 Thread Emir Arnautović
How do you send documents? Large batches? Complex analysis? Do you send all batches to the same node? How do you commit? Do you delete by query while indexing? Emir On Tue, May 8, 2018, 12:30 AM Jay Potharaju wrote: > I didn't see any OOM errors in the logs on either of

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
I didn't see any OOM errors in the logs on either of the nodes. I saw a GC pause of 1 second on the box that was throwing the error, but nothing on the other node. Any other recommendations? Thanks Jay Potharaju On Mon, May 7, 2018 at 9:48 AM, Jay Potharaju wrote: >

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
Ah thanks for explaining that! Thanks Jay Potharaju On Mon, May 7, 2018 at 9:45 AM, Emir Arnautović < emir.arnauto...@sematext.com> wrote: > Node A receives batch of documents to index. It forwards documents to > shards that are on the node B. Node B is having issues with GC so it takes > a

Re: Async exceptions during distributed update

2018-05-07 Thread Emir Arnautović
Node A receives a batch of documents to index. It forwards documents to shards that are on node B. Node B is having issues with GC so it takes a while to respond. Node A sees it as a read timeout and reports it in its logs. So the issue is on node B, not node A. Emir
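The forwarding behaviour described in this reply can be illustrated with a small simulation. This is a sketch only, not Solr code: `node_b_index`, `node_a_forward`, the pause length and the timeout value are all hypothetical names and numbers chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def node_b_index(docs, pause_seconds):
    """Hypothetical stand-in for node B: a GC-like pause delays the reply."""
    time.sleep(pause_seconds)
    return len(docs)


def node_a_forward(docs, pause_seconds, read_timeout):
    """Hypothetical stand-in for node A: forward docs and wait for node B."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(node_b_index, docs, pause_seconds)
        try:
            indexed = future.result(timeout=read_timeout)
            return f"node A: node B indexed {indexed} docs"
        except TimeoutError:
            # The error is reported by node A, but the slow party is node B.
            return "node A: node B failed to respond"


if __name__ == "__main__":
    print(node_a_forward(["doc1", "doc2"], pause_seconds=0.0, read_timeout=1.0))
    print(node_a_forward(["doc1", "doc2"], pause_seconds=0.3, read_timeout=0.05))
```

The second call times out even though node A itself is healthy, which is why the GC logs worth checking are the ones on the node that did not respond.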

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
Yes, the nodes are well balanced. I am just using these boxes for indexing the data and they are not serving any traffic at this time. The error indicates it is having errors on the shards that are hosted on the box and not on the other box. I will check GC logs to see if there were any issues.

Re: Async exceptions during distributed update

2018-05-07 Thread Emir Arnautović
Hi Jay, My first guess would be that there was some major GC on the other box so it did not respond in time. Are your nodes well balanced - do they serve an equal amount of data? Thanks, Emir

Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
Hi, I am seeing the following lines in the error log. My setup has 2 nodes in the solrcloud cluster, each node has 3 shards with no replication. From the error log it seems like all the shards on this box are throwing async exception errors. Other node in the cluster does not have any errors in