[ 
https://issues.apache.org/jira/browse/SOLR-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089202#comment-14089202
 ] 

Mark Miller commented on SOLR-6213:
-----------------------------------

This is pretty much a currently expected result / limitation of the system if 
no replica can become leader. They will keep retrying until stack overflow.

> StackOverflowException in Solr cloud's leader election
> ------------------------------------------------------
>
>                 Key: SOLR-6213
>                 URL: https://issues.apache.org/jira/browse/SOLR-6213
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.0, 4.10
>            Reporter: Dawid Weiss
>            Priority: Critical
>
> This is what's causing test hangs (at least on FreeBSD, LUCENE-5786), 
> possibly on other machines too. The problem is stack overflow from looped 
> calls in:
> {code}
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)
>   > 
> org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
>   > 
> org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
>   > org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)
>   > 
> org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)
> {code}
> These routines attempt to log information to loggers, which in turn attempts 
> to serialize messages back to the master (test process). When the stack is 
> exhausted the serialization process fails and breaks the communication with 
> the master test node.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to