[ https://issues.apache.org/jira/browse/SOLR-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803172#comment-14803172 ]
Jessica Cheng Mallet commented on SOLR-8069:
--------------------------------------------

We have definitely seen this as well, even after the commit for SOLR-7109 added a ZooKeeper multi transaction to ZkController.markShardAsDownIfLeader, which is supposed to predicate setting the LiR node on the setter still having the same election znode it thinks it has when it's a leader.

Hmmm, reading the code now, I'm not sure it's doing exactly the right thing, since it calls getLeaderSeqPath, which just takes the current ElectionContext from electionContexts, which isn't necessarily the one the node had when it decided to mark someone else down, right?

[~shalinmangar] thoughts?

> Leader Initiated Recovery can put the replica with the latest data into LIR
> and a shard will have no leader even on restart.
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-8069
>                 URL: https://issues.apache.org/jira/browse/SOLR-8069
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>
> I've seen this twice now. Need to work on a test.
> When some issues hit all the replicas at once, you can end up in a situation
> where the rightful leader was put, or put itself, into LIR. Even on restart,
> this rightful leader won't take leadership and you have to manually clear the
> LIR nodes.
> It seems that if all the replicas participate in election on startup, LIR
> should just be cleared.
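
To make the concern about getLeaderSeqPath concrete, here is a minimal, self-contained simulation of the suspected race. It is only a sketch: FakeZk, Node, and every method name below are hypothetical stand-ins invented for illustration, not Solr's code and not the ZooKeeper client API. It assumes the guard is "create the LiR node only if my election znode still exists", and compares reading the election path at write time against using the path captured when the node decided to mark a replica down.

```python
class FakeZk:
    """In-memory stand-in for a ZooKeeper ensemble (hypothetical, not a real client)."""

    def __init__(self):
        self.nodes = set()

    def create(self, path):
        self.nodes.add(path)

    def delete(self, path):
        self.nodes.discard(path)

    def multi_check_then_create(self, check_path, create_path):
        """Mimic a multi transaction: create create_path only if check_path exists."""
        if check_path not in self.nodes:
            return False
        self.nodes.add(create_path)
        return True


class Node:
    """A replica taking part in leader election (hypothetical model)."""

    def __init__(self, zk, name):
        self.zk = zk
        self.name = name
        self.seq = 0
        self.election_path = None  # plays the role of the *current* election context

    def join_election(self):
        self.seq += 1
        self.election_path = f"/election/{self.name}-n_{self.seq}"
        self.zk.create(self.election_path)

    def lose_session_and_rejoin(self):
        # The old election znode disappears and a new one is registered.
        self.zk.delete(self.election_path)
        self.join_election()

    def mark_down_lookup_at_write(self, replica):
        # Suspected problem: the guard uses whatever election path is current now.
        return self.zk.multi_check_then_create(self.election_path, f"/lir/{replica}")

    def mark_down_captured(self, replica, path_at_decision):
        # Alternative: the guard uses the path held when the decision was made.
        return self.zk.multi_check_then_create(path_at_decision, f"/lir/{replica}")


def run_scenario():
    zk = FakeZk()
    leader = Node(zk, "core_node1")
    leader.join_election()

    # The leader decides to mark replicas down while holding this election znode.
    path_at_decision = leader.election_path

    # Before the write happens, it loses its session and rejoins the election.
    leader.lose_session_and_rejoin()

    stale_write_succeeded = leader.mark_down_lookup_at_write("core_node2")
    captured_write_succeeded = leader.mark_down_captured("core_node3", path_at_decision)
    return stale_write_succeeded, captured_write_succeeded


print(run_scenario())  # (True, False)
```

In this model the lookup-at-write variant still creates the LiR node after the node has lost and re-acquired its election znode, while the captured-path variant is rejected. Whether the real code path behaves this way would need to be confirmed against ZkController itself.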