Re: 2 Async exceptions during distributed update issue...
First, please don’t use “schemaless” mode (the add-unknown-fields-to-the-schema chain in your solrconfig) while load testing. Solr does quite a bit of work when it discovers an unknown field, and that will cause some instability under heavy load. Second, when you send a large batch to Solr, the update may simply take longer than the timeout. There are several timeouts you can increase; see the “solr.xml” section of the ref guide.

Best,
Erick

> On Nov 19, 2019, at 12:29 PM, Fiz N wrote:
>
> Hello Solr Experts,
>
> Just wanted to follow up on my question. Would appreciate help on this.
>
> SOLR Version : 6.6.2
> OS – Linux 3.1.2
> JDK – 1.8
>
> Shard – 16 – All are active.
> Xms – 16 gb
> Xmx – 16 gb
> Host has 64 cores.
>
> Attaching the complete updateRequestProcessorChain in a file, along with screenshots of physical memory and CPU.
>
> There are multiple threads sending products to Solr. With a batch size of 50 or 100 per thread it works fine without error; with a batch size of 1000 the following error occurs.
>
> I am getting the following error when the batch size is 1000. Please advise.
> 2019-11-14T17:45:44,646 - ERROR [qtp876213901-1169:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard15_replica1, node_name=n:10.YYY.40.81:8983_solr, replica=r:core_node30, shard=s:shard15} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: 2 Async exceptions during distributed update:
> 10.YYY.40.62:8983 failed to respond
> 10.YYY.40.62:8983 failed to respond
>
> 2019-11-14T17:45:48,821 - ERROR [qtp876213901-202:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard7_replica1, node_name=n:10.YYY.40.81:8983_solr, replica=r:core_node29, shard=s:shard7} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: 10.YYY.40.81:8983 failed to respond
>
> 2019-11-14T19:36:11,599 - ERROR [updateExecutor-2-thread-176-processing-x:ducts_shard3_replica2 http:10.YYY.40.68:8983//solr//ducts_shard11_replica2 r:core_node26 n:10.YYY.40.68:8983_solr c:ducts s:shard3:StreamingSolrClients$ErrorReportingConcurrentUpdateSolrClient@131] - {ConcurrentUpdateSolrClient.url=http://10.YYY.40.68:8983/solr/ducts_shard11_replica2, collection=c:ducts, core=x:ducts_shard3_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node26, shard=s:shard3} - error
> org.apache.http.NoHttpResponseException: 10.YYY.40.68:8983 failed to respond
>     at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143) ~[httpclient-4.4.1.jar:4.4.1]
>     at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[httpclient-4.4.1.jar:4.4.1]
>     at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261) ~[httpcore-4.4.1.jar:4.4.1]
>
> 2019-11-14T19:36:14,567 - ERROR [updateExecutor-2-thread-189-processing-x:ducts_shard11_replica2 http:10.YYY.40.62:8983//solr//ducts_shard2_replica1 r:core_node25 n:10.YYY.40.68:8983_solr c:ducts s:shard11:StreamingSolrClients$ErrorReportingConcurrentUpdateSolrClient@131] - {ConcurrentUpdateSolrClient.url=http://10.YYY.40.62:8983/solr/ducts_shard2_replica1, collection=c:ducts, core=x:ducts_shard11_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node25, shard=s:shard11} - error
> java.net.SocketException: Broken pipe (Write failed)
>     at java.net.SocketOutputStream.socketWrite0(Native Method) ~[?:1.8.0_232]
>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) ~[?:1.8.0_232]
>     at java.net.SocketOutputStream.write(SocketOutputStream.java:155) ~[?:1.8.0_232]
>
> 2019-11-14T19:36:38,851 - ERROR [qtp876213901-542:SolrException@159] - {collection=c:ducts, core=x:ducts_shard11_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node25, shard=s:shard11} - null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Broken pipe (Write failed)
>     at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:972)
>     at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1911)
>
> Thanks
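For reference, the timeouts Erick mentions live in the <solrcloud> section of solr.xml. A minimal sketch of the relevant settings (the values here are illustrative placeholders, not recommendations):

```xml
<solr>
  <solrcloud>
    <!-- connect and socket-read timeouts (ms) used when one node
         forwards distributed updates to another -->
    <int name="distribUpdateConnTimeout">60000</int>
    <int name="distribUpdateSoTimeout">600000</int>
    <!-- ZooKeeper session timeout (ms) -->
    <int name="zkClientTimeout">30000</int>
  </solrcloud>
</solr>
```

On the schemaless point: in the stock data-driven configs the add-unknown-fields chain is gated on the update.autoCreateFields property, so it can usually be switched off without editing solrconfig.xml (e.g. bin/solr config -c <collection> -p 8983 -action set-user-property -property update.autoCreateFields -value false). Verify against your own solrconfig.xml, since update chains are often customized.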
2 Async exceptions during distributed update issue...
Hello Solr Experts,

Just wanted to follow up on my question. Would appreciate help on this.

SOLR Version : 6.6.2
OS – Linux 3.1.2
JDK – 1.8

Shard – 16 – All are active.
Xms – 16 gb
Xmx – 16 gb
Host has 64 cores.

Attaching the complete updateRequestProcessorChain in a file, along with screenshots of physical memory and CPU.

There are multiple threads sending products to Solr. With a batch size of 50 or 100 per thread it works fine without error; with a batch size of 1000 the following error occurs.

*I am getting the following error when the batch size is 1000. Please advise.*

2019-11-14T17:45:44,646 - ERROR [qtp876213901-1169:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard15_replica1, node_name=n:10.YYY.40.81:8983_solr, replica=r:core_node30, shard=s:shard15} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: 2 Async exceptions during distributed update:
10.YYY.40.62:8983 failed to respond
10.YYY.40.62:8983 failed to respond

2019-11-14T17:45:48,821 - ERROR [qtp876213901-202:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard7_replica1, node_name=n:10.YYY.40.81:8983_solr, replica=r:core_node29, shard=s:shard7} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: 10.YYY.40.81:8983 failed to respond

2019-11-14T19:36:11,599 - ERROR [updateExecutor-2-thread-176-processing-x:ducts_shard3_replica2 http:10.YYY.40.68:8983//solr//ducts_shard11_replica2 r:core_node26 n:10.YYY.40.68:8983_solr c:ducts s:shard3:StreamingSolrClients$ErrorReportingConcurrentUpdateSolrClient@131] - {ConcurrentUpdateSolrClient.url=http://10.YYY.40.68:8983/solr/ducts_shard11_replica2, collection=c:ducts, core=x:ducts_shard3_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node26, shard=s:shard3} - error
org.apache.http.NoHttpResponseException: 10.YYY.40.68:8983 failed to respond
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143) ~[httpclient-4.4.1.jar:4.4.1]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[httpclient-4.4.1.jar:4.4.1]
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261) ~[httpcore-4.4.1.jar:4.4.1]

2019-11-14T19:36:14,567 - ERROR [updateExecutor-2-thread-189-processing-x:ducts_shard11_replica2 http:10.YYY.40.62:8983//solr//ducts_shard2_replica1 r:core_node25 n:10.YYY.40.68:8983_solr c:ducts s:shard11:StreamingSolrClients$ErrorReportingConcurrentUpdateSolrClient@131] - {ConcurrentUpdateSolrClient.url=http://10.YYY.40.62:8983/solr/ducts_shard2_replica1, collection=c:ducts, core=x:ducts_shard11_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node25, shard=s:shard11} - error
java.net.SocketException: Broken pipe (Write failed)
    at java.net.SocketOutputStream.socketWrite0(Native Method) ~[?:1.8.0_232]
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) ~[?:1.8.0_232]
    at java.net.SocketOutputStream.write(SocketOutputStream.java:155) ~[?:1.8.0_232]

2019-11-14T19:36:38,851 - ERROR [qtp876213901-542:SolrException@159] - {collection=c:ducts, core=x:ducts_shard11_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node25, shard=s:shard11} - null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Broken pipe (Write failed)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:972)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1911)

Thanks

[Attachment: the updateRequestProcessorChain config. It includes a field-name-mutating pattern [^\w-\.], ParseDateFieldUpdateProcessorFactory date formats (yyyy-MM-dd'T'HH:mm:ss.SSSZ, yyyy-MM-dd HH:mm:ss, and variants), and AddSchemaFieldsUpdateProcessorFactory type mappings: java.lang.Boolean → booleans, java.util.Date → tdates, java.lang.Long/java.lang.Integer → tlongs, java.lang.Number → tdoubles, default → strings.]
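Since batches of 50 and 100 succeed and batches of 1000 fail, one way to work around the issue from the client side is to cap the batch size and retry on connection failures. A minimal sketch, assuming a plain JSON update endpoint at the usual /solr/<collection>/update path; the function and parameter names are illustrative, not from the thread:

```python
import json
import time
import urllib.request
from typing import Iterator, List


def chunked(docs: List[dict], size: int) -> Iterator[List[dict]]:
    """Split a document list into batches of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def post_batch(base_url: str, collection: str, batch: List[dict],
               retries: int = 3, backoff_s: float = 2.0) -> None:
    """POST one JSON batch to Solr's update handler, retrying on I/O errors."""
    url = f"{base_url}/solr/{collection}/update?commitWithin=10000"
    body = json.dumps(batch).encode("utf-8")
    for attempt in range(1, retries + 1):
        req = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"})
        try:
            with urllib.request.urlopen(req, timeout=120) as resp:
                resp.read()
            return
        except OSError:  # covers "failed to respond" / broken-pipe style errors
            if attempt == retries:
                raise
            time.sleep(backoff_s * attempt)


def index_all(base_url: str, collection: str, docs: List[dict],
              batch_size: int = 100) -> int:
    """Index docs in capped batches; returns the number of batches sent."""
    sent = 0
    for batch in chunked(docs, batch_size):
        post_batch(base_url, collection, batch)
        sent += 1
    return sent
```

Keeping batch_size at 100 (the size that already worked) isolates whether the 1000-document batches alone are pushing the distributed update past its socket timeouts.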
Fwd: 2 Async exceptions during distributed update
Thanks, Jörn, for getting back to me. Please find attached the memory/CPU utilization and the complete update processor chain. Please let me know your thoughts.

On Sat, Nov 16, 2019 at 9:09 AM Jörn Franke wrote:
> Can you please provide the whole update chain? Below is the directoryFactory and not the update processor chain.
Re: 2 Async exceptions during distributed update
Can you please provide the whole update chain? Below is the directoryFactory and not the update processor chain.

> Am 15.11.2019 um 17:34 schrieb Fiz N:
>
> Thanks for your response.
>
> <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
>
> attaching the screenshot of physical memory and cpu.
> Please let me know your thoughts on the below issue.
Re: 2 Async exceptions during distributed update
No screenshot attached. The Apache mail servers filter attachments. Can you please provide an external link?

On Fri, Nov 15, 2019 at 5:34 PM Fiz N wrote:
> Thanks for your response.
>
> <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
>
> attaching the screenshot of physical memory and cpu.
> Please let me know your thoughts on the below issue.
Re: 2 Async exceptions during distributed update
Hi Solr Experts,

Do you have any thoughts on the below issue?

On Fri, Nov 15, 2019 at 11:33 AM Fiz N wrote:
> Thanks for your response.
>
> <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
>
> attaching the screenshot of physical memory and cpu.
> Please let me know your thoughts on the below issue.
Re: 2 Async exceptions during distributed update
Thanks for your response.

<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

attaching the screenshot of physical memory and cpu. Please let me know your thoughts on the below issue.

On Fri, Nov 15, 2019 at 2:18 AM Jörn Franke wrote:
> Do you use an update processor factory? What does it look like?
>
> What is the physical memory size and CPU?
> What do you mean by “there are 64 cores sending concurrently”? Does the application have 64 threads that send those updates concurrently?
Re: 2 Async exceptions during distributed update
Do you use an update processor factory? What does it look like? What are the physical memory size and CPU? And what do you mean by "there are 64 cores sending concurrently" — does the application have 64 threads that send those updates concurrently?

> On 15.11.2019, at 02:14, Fiz N wrote:
>
> Hi Solr Experts,
>
> SOLR Version: 6.6.2
> OS: Linux 3.1.2
> JDK: 1.8
>
> Shard: 16 — all are active.
> Xms: 16 GB
> Xmx: 16 GB
>
> Schema fields count: 91
> Dynamic fields: 83
>
> There are multiple threads sending products to Solr. Tested with batch sizes of 50 and 100 per thread, it worked fine without error; with a batch size of 1000 the following error occurs. There are 64 cores that are sending batches concurrently.
>
> *I am getting the following error when the batch size is 1000. Please advise.*
>
> [quoted error logs and stack traces trimmed; they are identical to the original post below]
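Regarding the update processor chain question: a minimal explicit chain for solrconfig.xml, without the schemaless add-unknown-fields processors Erick warned about, looks roughly like this (the chain name is illustrative):

```xml
<updateRequestProcessorChain name="basic" default="true">
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```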
2 Async exceptions during distributed update
Hi Solr Experts,

SOLR Version: 6.6.2
OS: Linux 3.1.2
JDK: 1.8

Shard: 16 — all are active.
Xms: 16 GB
Xmx: 16 GB

Schema fields count: 91
Dynamic fields: 83

There are multiple threads sending products to Solr. Tested with batch sizes of 50 and 100 per thread, it worked fine without error; with a batch size of 1000 the following error occurs. There are 64 cores that are sending batches concurrently.

*I am getting the following error when the batch size is 1000. Please advise.*

2019-11-14T17:45:44,646 - ERROR [qtp876213901-1169:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard15_replica1, node_name=n:10.YYY.40.81:8983_solr, replica=r:core_node30, shard=s:shard15} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: 2 Async exceptions during distributed update:
10.YYY.40.62:8983 failed to respond
10.YYY.40.62:8983 failed to respond

2019-11-14T17:45:48,821 - ERROR [qtp876213901-202:SolrException@YYY] - {collection=c:ducts, core=x:ducts_shard7_replica1, node_name=n:10.YYY.40.81:8983_solr, replica=r:core_node29, shard=s:shard7} - org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: 10.YYY.40.81:8983 failed to respond

2019-11-14T19:36:11,599 - ERROR [updateExecutor-2-thread-176-processing-x:ducts_shard3_replica2 http:10.YYY.40.68:8983//solr//ducts_shard11_replica2 r:core_node26 n:10.YYY.40.68:8983_solr c:ducts s:shard3:StreamingSolrClients$ErrorReportingConcurrentUpdateSolrClient@131] - {ConcurrentUpdateSolrClient.url=http://10.YYY.40.68:8983/solr/ducts_shard11_replica2, collection=c:ducts, core=x:ducts_shard3_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node26, shard=s:shard3} - error
org.apache.http.NoHttpResponseException: 10.YYY.40.68:8983 failed to respond
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143) ~[httpclient-4.4.1.jar:4.4.1]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[httpclient-4.4.1.jar:4.4.1]
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261) ~[httpcore-4.4.1.jar:4.4.1]

2019-11-14T19:36:14,567 - ERROR [updateExecutor-2-thread-189-processing-x:ducts_shard11_replica2 http:10.YYY.40.62:8983//solr//ducts_shard2_replica1 r:core_node25 n:10.YYY.40.68:8983_solr c:ducts s:shard11:StreamingSolrClients$ErrorReportingConcurrentUpdateSolrClient@131] - {ConcurrentUpdateSolrClient.url=http://10.YYY.40.62:8983/solr/ducts_shard2_replica1, collection=c:ducts, core=x:ducts_shard11_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node25, shard=s:shard11} - error
java.net.SocketException: Broken pipe (Write failed)
    at java.net.SocketOutputStream.socketWrite0(Native Method) ~[?:1.8.0_232]
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) ~[?:1.8.0_232]
    at java.net.SocketOutputStream.write(SocketOutputStream.java:155) ~[?:1.8.0_232]

2019-11-14T19:36:38,851 - ERROR [qtp876213901-542:SolrException@159] - {collection=c:ducts, core=x:ducts_shard11_replica2, node_name=n:10.YYY.40.68:8983_solr, replica=r:core_node25, shard=s:shard11} - null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Broken pipe (Write failed)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:972)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1911)

Thanks
Fiz
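A common way around the batch-size problem described above is to chunk the document stream client-side before posting, so each update request stays well under the inter-node timeouts. A minimal sketch; `post_batch` is a hypothetical stand-in for whatever client call actually sends one batch to Solr (SolrJ, pysolr, an HTTP POST to /update, etc.):

```python
def chunked(docs, size):
    """Yield successive batches of at most `size` documents."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]


def index_all(docs, post_batch, batch_size=100):
    """Post `docs` to Solr in batches of `batch_size`.

    `post_batch` is a hypothetical callable that sends one batch;
    it is not implemented here. Returns the number of batches sent.
    """
    sent = 0
    for batch in chunked(docs, batch_size):
        post_batch(batch)
        sent += 1
    return sent
```

With a batch size of 100, the 1000-document batches that trigger the errors above would go out as ten smaller requests per thread instead.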
Re: Async exceptions during distributed update
Adding some more context to my last email:

Solr: 6.6.3
2 nodes, 3 shards each, no replication.

Can someone answer the following questions:

1) Any ideas on why the following errors keep happening? AFAIK the StreamingSolrClients error is because of timeouts when connecting to other nodes. Async errors are also network related, as explained earlier in the thread by Emir. There were no network issues, but the error has come back and is filling up my logs.
2) Is anyone using Solr 6.6.3 in production, and what has their experience been so far?
3) Is there any good documentation or blog post that explains the inner workings of SolrCloud networking?

Thanks
Jay

ERROR org.apache.solr.update.StreamingSolrClients
org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update

> On May 13, 2018, at 9:21 PM, Jay Potharaju wrote:
> [quote history trimmed; the earlier messages appear in full below]
Re: Async exceptions during distributed update
Hi,
I restarted both my Solr servers but I am seeing the async error again. In older 5.x versions of SolrCloud, Solr would normally recover gracefully from network errors, but Solr 6.6.3 does not seem to be doing that. At this time I am doing only a small percentage of deleteByQuery operations; it is mostly indexing of documents. I have not noticed any network blip like last time. Any suggestions, or is anyone else having the same issue on Solr 6.6.3?

I am again seeing the following two errors back to back:

ERROR org.apache.solr.update.StreamingSolrClients
org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Read timed out

Thanks
Jay

On Wed, May 9, 2018 at 12:34 AM Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> Hi Jay,
> Network blip might be the cause, but also the consequence of this issue. Maybe you can try avoiding DBQ while indexing and see if it is the cause. You can do a thread dump on “the other” node and see if there are blocked threads; that can give you more clues about what’s going on.
>
> Thanks,
> Emir
> [rest of quote history trimmed; the earlier messages appear in full below]
Re: Async exceptions during distributed update
Hi Jay,
Network blip might be the cause, but also the consequence of this issue. Maybe you can try avoiding DBQ while indexing and see if it is the cause. You can do a thread dump on “the other” node and see if there are blocked threads; that can give you more clues about what’s going on.

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 8 May 2018, at 17:53, Jay Potharaju wrote:
> [quote history trimmed; the earlier messages appear in full below]
Re: Async exceptions during distributed update
Hi Emir, I was seeing this error as long as the indexing was running. Once I stopped the indexing the errors also stopped. Yes, we do monitor both hosts & solr but have not seen anything out of the ordinary except for a small network blip. In my experience solr generally recovers after a network blip and there are a few errors for streaming solr client...but have never seen this error before. Thanks Jay Thanks Jay Potharaju On Tue, May 8, 2018 at 12:56 AM, Emir Arnautović < emir.arnauto...@sematext.com> wrote: > Hi Jay, > This is low ingestion rate. What is the size of your index? What is heap > size? I am guessing that this is not a huge index, so I am leaning toward > what Shawn mentioned - some combination of DBQ/merge/commit/optimise that > is blocking indexing. Though, it is strange that it is happening only on > one node if you are sending updates randomly to both nodes. Do you monitor > your hosts/Solr? Do you see anything different at the time when timeouts > happen? > > Thanks, > Emir > -- > Monitoring - Log Management - Alerting - Anomaly Detection > Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > > > > > On 8 May 2018, at 03:23, Jay Potharaju wrote: > > > > I have about 3-5 updates per second. > > > > > >> On May 7, 2018, at 5:02 PM, Shawn Heisey wrote: > >> > >>> On 5/7/2018 5:05 PM, Jay Potharaju wrote: > >>> There are some deletes by query. I have not had any issues with DBQ, > >>> currently have 5.3 running in production. > >> > >> Here's the big problem with DBQ. Imagine this sequence of events with > >> these timestamps: > >> > >> 13:00:00: A commit for change visibility happens. > >> 13:00:00: A segment merge is triggered by the commit. > >> (It's a big merge that takes exactly 3 minutes.) > >> 13:00:05: A deleteByQuery is sent. > >> 13:00:15: An update to the index is sent. > >> 13:00:25: An update to the index is sent. > >> 13:00:35: An update to the index is sent. > >> 13:00:45: An update to the index is sent. 
> >> 13:00:55: An update to the index is sent. > >> 13:01:05: An update to the index is sent. > >> 13:01:15: An update to the index is sent. > >> 13:01:25: An update to the index is sent. > >> {time passes, more updates might be sent} > >> 13:03:00: The merge finishes. > >> > >> Here's what would happen in this scenario: The DBQ and all of the > >> update requests sent *after* the DBQ will block until the merge > >> finishes. That means that it's going to take up to three minutes for > >> Solr to respond to those requests. If the client that is sending the > >> request is configured with a 60 second socket timeout, which inter-node > >> requests made by Solr are by default, then it is going to experience a > >> timeout error. The request will probably complete successfully once the > >> merge finishes, but the connection is gone, and the client has already > >> received an error. > >> > >> Now imagine what happens if an optimize (forced merge of the entire > >> index) is requested on an index that's 50GB. That optimize may take 2-3 > >> hours, possibly longer. A deleteByQuery started on that index after the > >> optimize begins (and any updates requested after the DBQ) will pause > >> until the optimize is done. A pause of 2 hours or more is a BIG > problem. > >> > >> This is why deleteByQuery is not recommended. > >> > >> If the deleteByQuery were changed into a two-step process involving a > >> query to retrieve ID values and then one or more deleteById requests, > >> then none of that blocking would occur. The deleteById operation can > >> run at the same time as a segment merge, so neither it nor subsequent > >> update requests will have the significant pause. From what I > >> understand, you can even do commits in this scenario and have changes be > >> visible before the merge completes. I haven't verified that this is the > >> case. > >> > >> Experienced devs: Can we fix this problem with DBQ? 
On indexes with a > >> uniqueKey, can DBQ be changed to use the two-step process I mentioned? > >> > >> Thanks, > >> Shawn > >> > >
Re: Async exceptions during distributed update
Hi Jay,
This is a low ingestion rate. What is the size of your index? What is the heap size? I am guessing that this is not a huge index, so I am leaning toward what Shawn mentioned - some combination of DBQ/merge/commit/optimise that is blocking indexing. Though, it is strange that it is happening on only one node if you are sending updates randomly to both nodes. Do you monitor your hosts/Solr? Do you see anything different at the time when the timeouts happen?

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 8 May 2018, at 03:23, Jay Potharaju wrote:
>
> I have about 3-5 updates per second.
>
> [quoted copy of Shawn's DBQ explanation trimmed; it appears in full below]
Re: Async exceptions during distributed update
I have about 3-5 updates per second.

> On May 7, 2018, at 5:02 PM, Shawn Heisey wrote:
>
>> On 5/7/2018 5:05 PM, Jay Potharaju wrote:
>> There are some deletes by query. I have not had any issues with DBQ,
>> currently have 5.3 running in production.
>
> Here's the big problem with DBQ. Imagine this sequence of events with
> these timestamps:
>
> 13:00:00: A commit for change visibility happens.
> 13:00:00: A segment merge is triggered by the commit.
> (It's a big merge that takes exactly 3 minutes.)
> 13:00:05: A deleteByQuery is sent.
> 13:00:15: An update to the index is sent.
> 13:00:25: An update to the index is sent.
> 13:00:35: An update to the index is sent.
> 13:00:45: An update to the index is sent.
> 13:00:55: An update to the index is sent.
> 13:01:05: An update to the index is sent.
> 13:01:15: An update to the index is sent.
> 13:01:25: An update to the index is sent.
> {time passes, more updates might be sent}
> 13:03:00: The merge finishes.
>
> Here's what would happen in this scenario: The DBQ and all of the update requests sent *after* the DBQ will block until the merge finishes. That means it's going to take up to three minutes for Solr to respond to those requests. If the client that is sending the request is configured with a 60 second socket timeout, which inter-node requests made by Solr are by default, then it is going to experience a timeout error. The request will probably complete successfully once the merge finishes, but the connection is gone, and the client has already received an error.
>
> Now imagine what happens if an optimize (forced merge of the entire index) is requested on an index that's 50GB. That optimize may take 2-3 hours, possibly longer. A deleteByQuery started on that index after the optimize begins (and any updates requested after the DBQ) will pause until the optimize is done. A pause of 2 hours or more is a BIG problem.
>
> This is why deleteByQuery is not recommended.
>
> If the deleteByQuery were changed into a two-step process involving a query to retrieve ID values and then one or more deleteById requests, then none of that blocking would occur. The deleteById operation can run at the same time as a segment merge, so neither it nor subsequent update requests will have the significant pause. From what I understand, you can even do commits in this scenario and have changes be visible before the merge completes. I haven't verified that this is the case.
>
> Experienced devs: Can we fix this problem with DBQ? On indexes with a uniqueKey, can DBQ be changed to use the two-step process I mentioned?
>
> Thanks,
> Shawn
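Shawn's two-step alternative can be sketched as below. This is a hedged sketch, not Solr API code: `search_ids(query, start, rows)` and `delete_by_ids(ids)` are hypothetical stand-ins for the real client calls (e.g. a /select query returning uniqueKey values, and a deleteById request):

```python
def delete_matching(query, search_ids, delete_by_ids, page_size=500):
    """Replace deleteByQuery with query-then-deleteById.

    First page through all matching uniqueKey values, then issue
    deleteById in batches, so the visibility of the deletes never
    affects the paging. Both callables are hypothetical stand-ins.
    Returns the number of IDs deleted.
    """
    ids = []
    start = 0
    while True:
        page = search_ids(query, start, page_size)
        if not page:
            break
        ids.extend(page)
        start += len(page)
    for i in range(0, len(ids), page_size):
        delete_by_ids(ids[i:i + page_size])
    return len(ids)
```

Unlike DBQ, each deleteById request can proceed while a segment merge is running, so subsequent updates are not blocked behind it.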
Re: Async exceptions during distributed update
Thanks for explaining that, Shawn!

Emir, I use the PHP library Solarium to do updates/deletes to Solr. The request is sent to any of the available nodes in the cluster.

> On May 7, 2018, at 5:02 PM, Shawn Heisey wrote:
> [...]
Re: Async exceptions during distributed update
On 5/7/2018 5:05 PM, Jay Potharaju wrote:
> There are some deletes by query. I have not had any issues with DBQ,
> currently have 5.3 running in production.

Here's the big problem with DBQ. Imagine this sequence of events with these timestamps:

13:00:00: A commit for change visibility happens.
13:00:00: A segment merge is triggered by the commit.
          (It's a big merge that takes exactly 3 minutes.)
13:00:05: A deleteByQuery is sent.
13:00:15: An update to the index is sent.
13:00:25: An update to the index is sent.
13:00:35: An update to the index is sent.
13:00:45: An update to the index is sent.
13:00:55: An update to the index is sent.
13:01:05: An update to the index is sent.
13:01:15: An update to the index is sent.
13:01:25: An update to the index is sent.
{time passes, more updates might be sent}
13:03:00: The merge finishes.

Here's what would happen in this scenario: the DBQ and all of the update requests sent *after* the DBQ will block until the merge finishes. That means it's going to take up to three minutes for Solr to respond to those requests. If the client sending the request is configured with a 60-second socket timeout, which inter-node requests made by Solr are by default, then it is going to experience a timeout error. The request will probably complete successfully once the merge finishes, but the connection is gone, and the client has already received an error.

Now imagine what happens if an optimize (forced merge of the entire index) is requested on an index that's 50GB. That optimize may take 2-3 hours, possibly longer. A deleteByQuery started on that index after the optimize begins (and any updates requested after the DBQ) will pause until the optimize is done. A pause of 2 hours or more is a BIG problem.

This is why deleteByQuery is not recommended.

If the deleteByQuery were changed into a two-step process involving a query to retrieve ID values and then one or more deleteById requests, then none of that blocking would occur. The deleteById operation can run at the same time as a segment merge, so neither it nor subsequent update requests will have the significant pause. From what I understand, you can even do commits in this scenario and have the changes be visible before the merge completes. I haven't verified that this is the case.

Experienced devs: can we fix this problem with DBQ? On indexes with a uniqueKey, can DBQ be changed to use the two-step process I mentioned?

Thanks,
Shawn
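[Editor's note: Shawn's two-step alternative can be sketched as below. This is a minimal illustration, not Solr's API: `fetch_page` and `delete_by_id` are hypothetical stand-ins for whatever your client provides (paged /select queries and deleteById update requests in Solarium or SolrJ). For very large result sets, Solr's cursorMark deep paging would be preferable to start/rows offsets.]

```python
def delete_by_query_safely(fetch_page, delete_by_id, query, page_size=500):
    """Two-step replacement for deleteByQuery.

    fetch_page(query, start, rows) -> list of uniqueKey values (one page)
    delete_by_id(ids)              -> issue a deleteById for those ids
    """
    # Step 1: collect every matching uniqueKey up front, so paging is
    # not disturbed by the deletes themselves.
    ids, start = [], 0
    while True:
        page = fetch_page(query, start, page_size)
        if not page:
            break
        ids.extend(page)
        start += page_size
    # Step 2: deleteById in batches; unlike deleteByQuery, these do not
    # block behind an in-progress segment merge.
    for i in range(0, len(ids), page_size):
        delete_by_id(ids[i:i + page_size])
    return len(ids)
```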
Re: Async exceptions during distributed update
How many concurrent updates can be sent? Do you always send updates to the same node? Do you use solrj?

Emir

On Tue, May 8, 2018, 1:02 AM Jay Potharaju wrote:
> The updates are pushed in real time not batched. No complex analysis and
> everything is committed using autocommit settings in solr.
>
> Thanks
> Jay Potharaju
> [...]
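[Editor's note: when the "Read timed out" errors above cannot be avoided at the source, the inter-node timeouts can be raised in solr.xml, per the ref guide's solr.xml section. The fragment below is a hedged sketch; the millisecond values are placeholders, not recommendations.]

```xml
<!-- solr.xml (fragment) - illustrative values only -->
<solrcloud>
  <!-- socket / connect timeouts for distributed update requests (ms) -->
  <int name="distribUpdateSoTimeout">600000</int>
  <int name="distribUpdateConnTimeout">60000</int>
</solrcloud>

<shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
  <!-- timeouts for inter-node query traffic (ms) -->
  <int name="socketTimeout">600000</int>
  <int name="connTimeout">60000</int>
</shardHandlerFactory>
```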
Re: Async exceptions during distributed update
There are some deletes by query. I have not had any issues with DBQ, currently have 5.3 running in production.

Thanks
Jay Potharaju

On Mon, May 7, 2018 at 4:02 PM, Jay Potharaju wrote:
> The updates are pushed in real time not batched. No complex analysis and
> everything is committed using autocommit settings in solr.
> [...]
Re: Async exceptions during distributed update
The updates are pushed in real time not batched. No complex analysis and everything is committed using autocommit settings in solr.

Thanks
Jay Potharaju

On Mon, May 7, 2018 at 4:00 PM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> How do you send documents? Large batches? Complex analysis? Do you send all
> batches to the same node? How do you commit? Do you delete by query while
> indexing?
>
> Emir
> [...]
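[Editor's note: "committed using autocommit settings" typically refers to a solrconfig.xml section like the following. This is a hedged sketch; the maxTime values are placeholders to be tuned, not recommendations.]

```xml
<!-- solrconfig.xml (fragment) - placeholder values -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit: flushes to disk, but does not open a new searcher -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit: controls when changes become visible to searches -->
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>
</updateHandler>
```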
Re: Async exceptions during distributed update
How do you send documents? Large batches? Complex analysis? Do you send all batches to the same node? How do you commit? Do you delete by query while indexing?

Emir

On Tue, May 8, 2018, 12:30 AM Jay Potharaju wrote:
> I didn't see any OOM errors in the logs on either of the nodes. I saw GC
> pause of 1 second on the box that was throwing error ...but nothing on the
> other node. Any other recommendations?
>
> Thanks
> Jay Potharaju
> [...]
Re: Async exceptions during distributed update
I didn't see any OOM errors in the logs on either of the nodes. I saw a GC pause of 1 second on the box that was throwing the error ...but nothing on the other node. Any other recommendations?

Thanks
Jay Potharaju

On Mon, May 7, 2018 at 9:48 AM, Jay Potharaju wrote:
> Ah thanks for explaining that!
> [...]
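[Editor's note: one quick way to quantify the GC pauses Jay mentions is to scan the JVM GC log for long stop-the-world pauses. The sketch below assumes the JDK 8-style output produced by -XX:+PrintGCApplicationStoppedTime (an assumption; adjust the regex to your GC log flags), which matches the Solr 6.6 / JDK 8 setups in this thread.]

```python
import re

# Matches JDK 8 -XX:+PrintGCApplicationStoppedTime lines (assumed format).
_PAUSE = re.compile(r"Total time for which application threads were stopped: "
                    r"([0-9.]+) seconds")

def long_pauses(gc_log_text, threshold_s=0.5):
    """Return all stop-the-world pause durations (seconds) >= threshold_s."""
    return [float(m.group(1)) for m in _PAUSE.finditer(gc_log_text)
            if float(m.group(1)) >= threshold_s]

# Hypothetical sample of two pause lines for illustration:
sample = (
    "2018-05-07T09:40:01.123+0000: 2.345: Total time for which application "
    "threads were stopped: 0.0031415 seconds\n"
    "2018-05-07T09:41:02.456+0000: 63.678: Total time for which application "
    "threads were stopped: 1.0421234 seconds\n"
)
```

A one-second pause, as reported here, is already long enough to matter when inter-node requests carry tight socket timeouts.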
Re: Async exceptions during distributed update
Ah thanks for explaining that!

Thanks
Jay Potharaju

On Mon, May 7, 2018 at 9:45 AM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:

> Node A receives a batch of documents to index. It forwards the documents to
> shards that are on node B. Node B is having issues with GC, so it takes a
> while to respond. Node A sees that as a read timeout and reports it in its
> logs. So the issue is on node B, not node A.
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
> [earlier quoted messages and stack trace trimmed; see the full thread below]
Re: Async exceptions during distributed update
Node A receives a batch of documents to index. It forwards the documents to
shards that are on node B. Node B is having issues with GC, so it takes a
while to respond. Node A sees that as a read timeout and reports it in its
logs. So the issue is on node B, not node A.

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 7 May 2018, at 18:39, Jay Potharaju wrote:
>
> Yes, the nodes are well balanced. I am just using these boxes for indexing
> the data; they are not serving any traffic at this time. The errors are on
> the shards that are hosted on this box and not on the other box.
> I will check the GC logs to see if there were any issues.
>
> Thanks
> Jay Potharaju
>
> [earlier quoted messages and stack trace trimmed]
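The failure mode described above - a large distributed update timing out while the remote node sits in a GC pause - is commonly worked around on the client side by indexing in smaller batches and retrying a sub-batch on a transient failure. A minimal, Solr-independent sketch of that pattern; the class and method names (`BatchRetrySketch`, `BatchSender`, `sendAll`) are illustrative, not SolrJ API:

```java
import java.util.List;

/**
 * Hedged sketch (not SolrJ API): split a large update batch into smaller
 * sub-batches and retry each sub-batch on a transient failure such as a
 * read timeout, instead of sending one huge request.
 */
public class BatchRetrySketch {

    /** Stand-in for a client call that may fail transiently (e.g. a Solr update). */
    public interface BatchSender {
        void send(List<String> docs);
    }

    /**
     * Sends docs in sub-batches of batchSize, retrying each sub-batch up to
     * maxRetries times with linear backoff. Returns the number of docs sent.
     */
    public static int sendAll(List<String> docs, int batchSize, int maxRetries,
                              BatchSender sender) {
        int sent = 0;
        for (int i = 0; i < docs.size(); i += batchSize) {
            List<String> sub = docs.subList(i, Math.min(i + batchSize, docs.size()));
            int attempt = 0;
            while (true) {
                try {
                    sender.send(sub);          // may throw, e.g. on read timeout
                    sent += sub.size();
                    break;
                } catch (RuntimeException e) {
                    if (++attempt > maxRetries) {
                        throw e;               // give up on this sub-batch
                    }
                    try {
                        Thread.sleep(10L * attempt); // simple linear backoff
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                    }
                }
            }
        }
        return sent;
    }
}
```

With a real SolrJ client, `sender.send` would wrap something like `client.add(...)`; the point is that a 1000-document batch becomes several smaller requests, each of which is cheap to retry when one replica is briefly unresponsive.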
Re: Async exceptions during distributed update
Yes, the nodes are well balanced. I am just using these boxes for indexing
the data; they are not serving any traffic at this time. The errors are on
the shards that are hosted on this box and not on the other box.
I will check the GC logs to see if there were any issues.

Thanks
Jay Potharaju

On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:

> Hi Jay,
> My first guess would be that there was some major GC on the other box, so it
> did not respond in time. Are your nodes well balanced - do they serve equal
> amounts of data?
>
> Thanks,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
> [earlier quoted message and stack trace trimmed]
Re: Async exceptions during distributed update
Hi Jay,
My first guess would be that there was some major GC on the other box, so it
did not respond in time. Are your nodes well balanced - do they serve equal
amounts of data?

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 7 May 2018, at 18:11, Jay Potharaju wrote:
>
> [original message and stack trace quoted in full; trimmed - the message
> appears in full below]
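Checking the GC logs for a pause that lines up with the read timeout can be done mechanically: JDK 8-style GC logging prints a `real=` wall-clock time per collection, and anything around a second or more on the remote node is enough to trip a short read timeout on the forwarding node. A hedged sketch (on Solr 6.x, GC logging is normally enabled via `GC_LOG_OPTS` in `solr.in.sh` with flags like `-XX:+PrintGCDetails -Xloggc:...`; the one-second threshold here is an assumption, and `GcPauseScan` is an illustrative name):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hedged sketch: scan JDK 8-style GC log lines for long pauses. Each
 * collection line ends with something like "real=4.31 secs"; pauses above
 * the threshold are the ones likely to cause read timeouts on peers.
 */
public class GcPauseScan {
    // Matches the wall-clock pause time, e.g. "real=4.31".
    private static final Pattern REAL = Pattern.compile("real=([0-9]+\\.[0-9]+)");

    /** Returns the pause times (in seconds) above the threshold, in log order. */
    public static List<Double> longPauses(List<String> gcLogLines, double thresholdSecs) {
        List<Double> hits = new ArrayList<>();
        for (String line : gcLogLines) {
            Matcher m = REAL.matcher(line);
            while (m.find()) {
                double secs = Double.parseDouble(m.group(1));
                if (secs > thresholdSecs) {
                    hits.add(secs);
                }
            }
        }
        return hits;
    }
}
```

Feeding it the lines of the remote node's `gc.log` and correlating the timestamps of any hits with the `Read timed out` errors on this node would confirm or rule out Emir's guess.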
Async exceptions during distributed update
Hi,
I am seeing the following lines in the error log. My setup has 2 nodes in the
SolrCloud cluster; each node has 3 shards with no replication. From the error
log it seems like all the shards on this box are throwing async exception
errors. The other node in the cluster does not have any errors in the logs.
Any suggestions on how to tackle this error?

Solr setup
Solr: 6.6.3
2 nodes: 3 shards each

ERROR org.apache.solr.servlet.HttpSolrCall [test_shard3_replica1] ?
null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
Async exception during distributed update: Read timed out
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:972)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1911)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:78)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:534)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.lang.Thread.run(Unknown Source)

Thanks
Jay