[ https://issues.apache.org/jira/browse/SOLR-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145553#comment-16145553 ]
Amrit Sarkar edited comment on SOLR-11278 at 8/30/17 12:00 PM:
---------------------------------------------------------------

[~varunthacker] [~erickerickson] :: This is on Solr version 6.3.

Yes, you are right. This solution is not foolproof. I have been able to narrow down one problem, but there is no solution yet:

{code}
[junit4] 2> 93761 ERROR (recoveryExecutor-5-thread-1-processing-n:127.0.0.1:55637_solr x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) [n:127.0.0.1:55637_solr c:cdcr-target s:shard1 r:core_node1 x:cdcr-target_shard1_replica1] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: Index fetch failed :
[junit4] 2>   at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:540)
[junit4] 2>   at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:251)
[junit4] 2>   at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
[junit4] 2>   at org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:758)
[junit4] 2>   at org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:713)
[junit4] 2>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[junit4] 2>   at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
[junit4] 2>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[junit4] 2>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[junit4] 2>   at java.lang.Thread.run(Thread.java:745)
[junit4] 2> Caused by: java.lang.NullPointerException
[junit4] 2>   at org.apache.solr.handler.IndexFetcher.openNewSearcherAndUpdateCommitPoint(IndexFetcher.java:753)
[junit4] 2>   at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:517)
[junit4] 2>   ... 9 more
[junit4] 2>
{code}

The problem is with *{{target.shutdown}}*.
During target shutdown, the target does not shut down properly:

{code}
[junit4] 2> 67512 INFO (coreCloseExecutor-49-thread-1) [n:127.0.0.1:55637_solr c:cdcr-target s:shard1 r:core_node1 x:cdcr-target_shard1_replica1] o.a.s.h.CdcrRequestHandler Solr core is being closed - shutting down CDCR handler @ cdcr-target:shard1
[junit4] 2> 67515 WARN (updateExecutor-4-thread-1-processing-n:127.0.0.1:55637_solr x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) [n:127.0.0.1:55637_solr c:cdcr-target s:shard1 r:core_node1 x:cdcr-target_shard1_replica1] o.a.s.h.CdcrRequestHandler Bootstrap was interrupted
[junit4] 2> java.lang.InterruptedException
[junit4] 2>   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
[junit4] 2>   at java.util.concurrent.FutureTask.get(FutureTask.java:191)
[junit4] 2>   at org.apache.solr.handler.CdcrRequestHandler.lambda$handleBootstrapAction$0(CdcrRequestHandler.java:644)
[junit4] 2>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[junit4] 2>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[junit4] 2>   at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
[junit4] 2>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[junit4] 2>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[junit4] 2>   at java.lang.Thread.run(Thread.java:745)
{code}

The core then gets reloaded and throws the NPE. I am digging into this properly, but I have a feeling this is a machine-specific issue, since we did not see many of these in the Jenkins failures, as Varun mentioned.

> CdcrBootstrapTest failing in branch_6_6
> ---------------------------------------
>
>                 Key: SOLR-11278
>                 URL: https://issues.apache.org/jira/browse/SOLR-11278
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: CDCR
>            Reporter: Amrit Sarkar
>            Assignee: Varun Thacker
>         Attachments: SOLR-11278.patch, test_results
>
>
> I ran beast for 10 rounds:
> ant beast -Dtestcase=CdcrBootstrapTest -Dtests.multiplier=2 -Dtests.slow=true -Dtests.locale=vi -Dtests.timezone=Asia/Yekaterinburg -Dtests.asserts=true -Dtests.file.encoding=US-ASCII -Dbeast.iters=10
> and am seeing the following failure:
> {code}
> [beaster] [01:37:16.282] FAILURE 153s | CdcrBootstrapTest.testBootstrapWithSourceCluster <<<
> [beaster]    > Throwable #1: java.lang.AssertionError: Document mismatch on target after sync expected:<2000> but was:<1000>
> {code}
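
For reference, the two traces in the comment come from different threads. The {{InterruptedException}} is raised at 67515 in an {{updateExecutor}} thread blocked in {{FutureTask.get}}, while the NPE is raised at 93761 inside {{BootstrapCallable.call}} on a {{recoveryExecutor}} thread. In plain {{java.util.concurrent}} terms, interrupting a thread that waits on {{Future.get()}} does not by itself stop the task it is waiting for. The sketch below is not Solr code and does not claim to be the cause of, or the fix for, this failure. All class, method, and variable names in it are hypothetical. It only illustrates the general pattern of a waiter cancelling its worker when the waiter is interrupted.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Solr.
class BootstrapInterruptSketch {

    // Returns true if the worker was interrupted after the waiter was interrupted.
    static boolean workerStopsWhenWaiterIsInterrupted() throws InterruptedException {
        ExecutorService workerPool = Executors.newSingleThreadExecutor();
        ExecutorService waiterPool = Executors.newSingleThreadExecutor();
        CountDownLatch workerStarted = new CountDownLatch(1);
        CountDownLatch waiterStarted = new CountDownLatch(1);
        CountDownLatch workerStopped = new CountDownLatch(1);

        // Stand-in for a long-running bootstrap (for example, a slow index fetch).
        Callable<Boolean> bootstrap = () -> {
            workerStarted.countDown();
            try {
                Thread.sleep(60_000);
                return true;
            } catch (InterruptedException e) {
                workerStopped.countDown(); // the worker observed the cancellation
                throw e;
            }
        };
        Future<Boolean> work = workerPool.submit(bootstrap);

        // Stand-in for the thread that waits for the bootstrap result.
        Runnable waiter = () -> {
            waiterStarted.countDown();
            try {
                work.get();
            } catch (InterruptedException e) {
                // Without this cancel, the worker would keep running after
                // the waiter has given up.
                work.cancel(true);
                Thread.currentThread().interrupt(); // preserve interrupt status
            } catch (ExecutionException e) {
                // Not expected in this sketch.
            }
        };
        waiterPool.submit(waiter);

        workerStarted.await();
        waiterStarted.await();
        waiterPool.shutdownNow(); // simulates a shutdown that interrupts the waiter

        boolean stopped = workerStopped.await(5, TimeUnit.SECONDS);
        workerPool.shutdownNow();
        return stopped;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + workerStopsWhenWaiterIsInterrupted());
    }
}
```

Whether {{CdcrRequestHandler}} already cancels its bootstrap future on interruption, and whether that has any bearing on the NPE, would need to be checked against the actual source.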