[ 
https://issues.apache.org/jira/browse/SOLR-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145553#comment-16145553
 ] 

Amrit Sarkar edited comment on SOLR-11278 at 8/30/17 12:00 PM:
---------------------------------------------------------------

[~varunthacker] [~erickerickson] :: This is on Solr Version: 6.3

Yes, you are right. This solution is not foolproof.

I have been able to narrow down one problem, but have no solution yet:

{code}
   [junit4]   2> 93761 ERROR (recoveryExecutor-5-thread-1-processing-n:127.0.0.1:55637_solr x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) [n:127.0.0.1:55637_solr c:cdcr-target s:shard1 r:core_node1 x:cdcr-target_shard1_replica1] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: Index fetch failed : 
   [junit4]   2>        at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:540)
   [junit4]   2>        at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:251)
   [junit4]   2>        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
   [junit4]   2>        at org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:758)
   [junit4]   2>        at org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:713)
   [junit4]   2>        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   [junit4]   2>        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
   [junit4]   2>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   [junit4]   2>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> Caused by: java.lang.NullPointerException
   [junit4]   2>        at org.apache.solr.handler.IndexFetcher.openNewSearcherAndUpdateCommitPoint(IndexFetcher.java:753)
   [junit4]   2>        at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:517)
   [junit4]   2>        ... 9 more
   [junit4]   2> 
{code}

The problem is with *{{target.shutdown}}*: the target does not shut down 
properly.

{code}
   [junit4]   2> 67512 INFO  (coreCloseExecutor-49-thread-1) [n:127.0.0.1:55637_solr c:cdcr-target s:shard1 r:core_node1 x:cdcr-target_shard1_replica1] o.a.s.h.CdcrRequestHandler Solr core is being closed - shutting down CDCR handler @ cdcr-target:shard1
   [junit4]   2> 67515 WARN  (updateExecutor-4-thread-1-processing-n:127.0.0.1:55637_solr x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) [n:127.0.0.1:55637_solr c:cdcr-target s:shard1 r:core_node1 x:cdcr-target_shard1_replica1] o.a.s.h.CdcrRequestHandler Bootstrap was interrupted
   [junit4]   2> java.lang.InterruptedException
   [junit4]   2>        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
   [junit4]   2>        at java.util.concurrent.FutureTask.get(FutureTask.java:191)
   [junit4]   2>        at org.apache.solr.handler.CdcrRequestHandler.lambda$handleBootstrapAction$0(CdcrRequestHandler.java:644)
   [junit4]   2>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   [junit4]   2>        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   [junit4]   2>        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
   [junit4]   2>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   [junit4]   2>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:745)
{code}

The core then gets reloaded and throws the NPE. I am digging into this 
properly, but I have a feeling this is a machine-specific issue, as we didn't 
see many of these in the Jenkins failures mentioned by Varun.
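
One general {{java.util.concurrent}} property may be relevant to the two traces 
above: interrupting a thread that is blocked in {{Future.get()}} only stops that 
thread from waiting; it does not cancel the task behind the future. The 
standalone sketch below illustrates just that property. It is not Solr code, 
the class and method names are hypothetical, and whether this is what actually 
happens between the update executor and the bootstrap callable in 
{{CdcrRequestHandler}} is an unverified assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; none of these names come from Solr.
class InterruptedWaiterDemo {

    /**
     * Returns true when the waiter saw InterruptedException, the task was
     * still running at that point, and the task later ran to completion.
     */
    static boolean taskOutlivesInterruptedWaiter() {
        ExecutorService workerPool = Executors.newSingleThreadExecutor();
        CountDownLatch taskStarted = new CountDownLatch(1);
        CountDownLatch waiterStarted = new CountDownLatch(1);
        CountDownLatch releaseTask = new CountDownLatch(1);
        AtomicBoolean taskFinished = new AtomicBoolean(false);
        AtomicBoolean waiterInterrupted = new AtomicBoolean(false);

        // Stand-in for long-running work submitted to one executor.
        Future<?> task = workerPool.submit(() -> {
            taskStarted.countDown();
            try {
                releaseTask.await();          // simulate work still in progress
                taskFinished.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stand-in for a different thread that only waits for the result.
        Thread waiter = new Thread(() -> {
            waiterStarted.countDown();
            try {
                task.get();
            } catch (InterruptedException e) {
                waiterInterrupted.set(true);  // the waiter stops waiting...
            } catch (ExecutionException e) {
                throw new IllegalStateException(e);
            }
        });

        try {
            waiter.start();
            taskStarted.await();
            waiterStarted.await();
            waiter.interrupt();               // e.g. the waiter's own pool being shut down
            waiter.join();

            boolean stillRunning = !task.isDone(); // ...but the task was not cancelled
            releaseTask.countDown();
            workerPool.shutdown();
            boolean terminated = workerPool.awaitTermination(5, TimeUnit.SECONDS);
            return terminated && waiterInterrupted.get() && stillRunning && taskFinished.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            workerPool.shutdownNow();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("task outlived interrupted waiter: " + taskOutlivesInterruptedWaiter());
    }
}
```

If something like this is in play, one option to evaluate would be calling 
{{Future.cancel(true)}} on the underlying task when the waiter is interrupted; 
whether that fits the CDCR bootstrap code has not been checked here.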


> CdcrBootstrapTest failing in branch_6_6
> ---------------------------------------
>
>                 Key: SOLR-11278
>                 URL: https://issues.apache.org/jira/browse/SOLR-11278
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: CDCR
>            Reporter: Amrit Sarkar
>            Assignee: Varun Thacker
>         Attachments: SOLR-11278.patch, test_results
>
>
> I ran beast for 10 rounds:
> ant beast -Dtestcase=CdcrBootstrapTest -Dtests.multiplier=2 -Dtests.slow=true -Dtests.locale=vi -Dtests.timezone=Asia/Yekaterinburg -Dtests.asserts=true -Dtests.file.encoding=US-ASCII -Dbeast.iters=10
> and am seeing the following failure:
> {code}
>   [beaster] [01:37:16.282] FAILURE  153s | CdcrBootstrapTest.testBootstrapWithSourceCluster <<<
>   [beaster]    > Throwable #1: java.lang.AssertionError: Document mismatch on target after sync expected:<2000> but was:<1000>
> {code}


