[ 
https://issues.apache.org/jira/browse/SOLR-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14875931#comment-14875931
 ] 

Jessica Cheng Mallet commented on SOLR-8069:
--------------------------------------------

bq. I think of course it is. It's valid for the leader and only the leader to 
set anyone as down.

It's definitely only valid for the leader to set anyone down, but that doesn't 
mean the leader should set someone down based on a leadership decision that may 
already be stale. This is the only place I'm unsure about.

bq. I don't see an easy way to do that in this case. Almost all the solutions 
that fit with the code have the exact same holes / races.

If we're willing to make more changes, one way I can see this working is to 
write the election node path as a prop in the leader znode (which is now written 
via a zk transaction from your other commit). Then, base the isLeader logic in 
DistributedUpdateProcessor on reading the leader znode, and record the election 
node path at that point as well. Finally, when setting LIR, predicate the ZK 
transaction on the election node path that was read at the beginning of 
DistributedUpdateProcessor.
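
To make the proposal concrete, here is a minimal in-memory sketch of the idea. It does not use the real ZooKeeper or Solr APIs: FakeZk, the znode paths, and both helper methods are hypothetical names invented for illustration. With a real client, I'd expect the conditional write to map onto a multi() that combines an existence check on the election node (something like Op.check(path, -1)) with the LIR write, but that mapping is an assumption and is not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// In-memory stand-in used only to illustrate the predicate; NOT the ZooKeeper or Solr API.
final class LirPredicateSketch {

    /** Minimal fake znode store mapping path -> data. */
    static final class FakeZk {
        private final Map<String, String> nodes = new HashMap<>();

        synchronized void create(String path, String data) { nodes.put(path, data); }

        synchronized void delete(String path) { nodes.remove(path); }

        synchronized String read(String path) { return nodes.get(path); }

        /** Atomically writes data to writePath only if mustExistPath still exists. */
        synchronized boolean writeIfExists(String mustExistPath, String writePath, String data) {
            if (!nodes.containsKey(mustExistPath)) {
                return false; // predicate failed: nothing is written
            }
            nodes.put(writePath, data);
            return true;
        }
    }

    // Hypothetical path; here the leader znode's data is simply the election node path.
    static final String LEADER_ZNODE = "/leader";

    /**
     * Start of update processing: decide leadership by reading the leader znode
     * and remember the election node path recorded there. Returns null if the
     * leader znode does not name our election node.
     */
    static String readElectionPathIfLeader(FakeZk zk, String myElectionPath) {
        String recorded = zk.read(LEADER_ZNODE);
        return Objects.equals(recorded, myElectionPath) ? recorded : null;
    }

    /**
     * Later: mark a replica as down, but only if the election node observed at
     * the start still exists, i.e. the leadership the decision was based on is
     * still current.
     */
    static boolean markReplicaDown(FakeZk zk, String electionPathSeenAtStart, String replica) {
        if (electionPathSeenAtStart == null) {
            return false; // we were never leader for this request
        }
        return zk.writeIfExists(electionPathSeenAtStart, "/lir/" + replica, "down");
    }
}
```

The point of the sketch is that the LIR write is tied to the leadership observed when the request started, so a node that has lost its election node in the meantime cannot put a replica into LIR based on the old decision.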

> Leader Initiated Recovery can put the replica with the latest data into LIR 
> and a shard will have no leader even on restart.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-8069
>                 URL: https://issues.apache.org/jira/browse/SOLR-8069
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>         Attachments: SOLR-8069.patch, SOLR-8069.patch
>
>
> I've seen this twice now. Need to work on a test.
> When some issues hit all the replicas at once, you can end up in a situation 
> where the rightful leader was put or put itself into LIR. Even on restart, 
> this rightful leader won't take leadership and you have to manually clear the 
> LIR nodes.
> It seems that if all the replicas participate in election on startup, LIR 
> should just be cleared.


