[
https://issues.apache.org/jira/browse/SOLR-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537479#comment-16537479
]
Steve Rowe commented on SOLR-12412:
-----------------------------------
Policeman Jenkins found a reproducing seed
[https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-MacOSX/734/] for test failures
that {{git bisect}} blames on commit {{fddf35c}} on this issue:
{noformat}
Checking out Revision 80eb5da7393dd25c8cb566194eb9158de212bfb2 (refs/remotes/origin/branch_7x)
[...]
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestPullReplica -Dtests.method=testKillLeader -Dtests.seed=89003455250E12D2 -Dtests.slow=true -Dtests.locale=lg -Dtests.timezone=America/Rainy_River -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] FAILURE 60.4s J1 | TestPullReplica.testKillLeader <<<
[junit4] > Throwable #1: java.lang.AssertionError: Replica core_node4 not up to date after 10 seconds expected:<1> but was:<0>
[junit4] > at __randomizedtesting.SeedInfo.seed([89003455250E12D2:C016C0E147B58684]:0)
[junit4] > at org.apache.solr.cloud.TestPullReplica.waitForNumDocsInAllReplicas(TestPullReplica.java:542)
[junit4] > at org.apache.solr.cloud.TestPullReplica.doTestNoLeader(TestPullReplica.java:490)
[junit4] > at org.apache.solr.cloud.TestPullReplica.testKillLeader(TestPullReplica.java:309)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] > at java.base/java.lang.reflect.Method.invoke(Method.java:564)
[junit4] > at java.base/java.lang.Thread.run(Thread.java:844)
[...]
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestPullReplica -Dtests.method=testRemoveAllWriterReplicas -Dtests.seed=89003455250E12D2 -Dtests.slow=true -Dtests.locale=lg -Dtests.timezone=America/Rainy_River -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] FAILURE 24.6s J1 | TestPullReplica.testRemoveAllWriterReplicas <<<
[junit4] > Throwable #1: java.lang.AssertionError: Replica core_node4 not up to date after 10 seconds expected:<1> but was:<0>
[junit4] > at __randomizedtesting.SeedInfo.seed([89003455250E12D2:1A0EA86E31F0FB7B]:0)
[junit4] > at org.apache.solr.cloud.TestPullReplica.waitForNumDocsInAllReplicas(TestPullReplica.java:542)
[junit4] > at org.apache.solr.cloud.TestPullReplica.doTestNoLeader(TestPullReplica.java:490)
[junit4] > at org.apache.solr.cloud.TestPullReplica.testRemoveAllWriterReplicas(TestPullReplica.java:303)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] > at java.base/java.lang.reflect.Method.invoke(Method.java:564)
[junit4] > at java.base/java.lang.Thread.run(Thread.java:844)
[...]
[junit4] 2> NOTE: test params are: codec=HighCompressionCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=8218, maxDocsPerChunk=6, blockSize=10), termVectorsFormat=CompressingTermVectorsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=8218, blockSize=10)), sim=RandomSimilarity(queryNorm=true): {}, locale=lg, timezone=America/Rainy_River
[junit4] 2> NOTE: Mac OS X 10.11.6 x86_64/Oracle Corporation 9 (64-bit)/cpus=3,threads=1,free=262884464,total=536870912
{noformat}
> Leader should give up leadership when IndexWriter.tragedy occur
> ---------------------------------------------------------------
>
> Key: SOLR-12412
> URL: https://issues.apache.org/jira/browse/SOLR-12412
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Cao Manh Dat
> Assignee: Cao Manh Dat
> Priority: Major
> Attachments: SOLR-12412.patch, SOLR-12412.patch
>
>
> When a leader hits some kind of unrecoverable exception (e.g.
> CorruptedIndexException), the shard goes into the readable state and a human
> has to intervene. In that case, it would be best if the leader gave up
> its leadership and let another replica become the leader.