Timothy Potter created SOLR-6236:
------------------------------------

             Summary: Need an optional fallback mechanism for selecting a 
leader when all replicas are in leader-initiated recovery.
                 Key: SOLR-6236
                 URL: https://issues.apache.org/jira/browse/SOLR-6236
             Project: Solr
          Issue Type: Improvement
          Components: SolrCloud
            Reporter: Timothy Potter


Offshoot from discussion in SOLR-6235, key points are:

Tim: In ElectionContext, when running shouldIBeLeader, the node will choose not 
to be the leader if it is in LIR. However, this could leave the shard with no 
leader at all. My thinking there is that the state is bad enough that we would 
need manual intervention to clear one of the LIR znodes to allow a replica to 
get past this point. But maybe we can do better here?

Shalin: Good question. With careful use of minRf, the user can retry operations 
and maintain consistency even if we arbitrarily elect a leader in this case. 
But most people won't use minRf and care more about availability than 
consistency. For them there should be a way to get out of this mess easily. We 
could have a collection property (a boolean plus a timeout value) to force-elect 
a leader even if all replicas were in LIR. What do you think?

Mark: Indeed, it's a current limitation that you can have all nodes in a shard 
thinking they cannot be leader, even when all of them are available. This is 
not required by the distributed model we have at all; it's just a consequence 
of the initial implementation being overly restrictive - if all known replicas 
are participating, you should be able to get a leader. So I'm not sure this 
case should be optional. But if not all known replicas are participating and 
you still want to force a leader, that should be optional - I think it should 
default to false though. I think the system should default to reasonable data 
safety in these cases.
How best to solve this, I'm not quite sure, but happy to look at a patch. How 
do you plan on monitoring and taking action? Via the Overseer? It seems tricky 
to do it from the replicas.

Tim: We have a similar issue where a replica attempting to be the leader needs 
to wait a while to see other replicas before declaring itself the leader, see 
ElectionContext around line 200:
int leaderVoteWait = cc.getZkController().getLeaderVoteWait();
if (!weAreReplacement) {
  waitForReplicasToComeUp(weAreReplacement, leaderVoteWait);
}
So one quick idea might be to have the code that checks if it's in LIR see if 
all replicas are in LIR and if so, wait out the leaderVoteWait period and check 
again. If all are still in LIR, then move on with becoming the leader (in the 
spirit of availability).
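
That quick idea could be sketched roughly as below. This is purely a 
hypothetical illustration, not actual Solr code: the class, method names, and 
the way LIR state is sampled are all invented, and the real implementation 
would consult ZooKeeper rather than an in-memory set.

```java
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch of "wait out leaderVoteWait and re-check LIR";
// none of these names exist in Solr.
public class ShouldForceLeaderSketch {

  /**
   * Decide whether a replica that is itself in LIR should still go ahead
   * and become leader, in the spirit of availability.
   *
   * @param replicasInLir samples the set of replicas currently in LIR
   * @param allReplicas   every known replica of the shard
   * @param waitOutVote   waits out the leaderVoteWait period (injected so
   *                      the sketch is testable without actually sleeping)
   */
  public static boolean shouldForceLeader(Supplier<Set<String>> replicasInLir,
                                          Set<String> allReplicas,
                                          Runnable waitOutVote) {
    // If some replica is NOT in LIR, it can become leader normally,
    // so this replica should not force anything.
    if (!replicasInLir.get().containsAll(allReplicas)) {
      return false;
    }
    // All replicas are in LIR: wait out the leaderVoteWait period.
    waitOutVote.run();
    // Still all in LIR after waiting? Then move on with becoming leader.
    return replicasInLir.get().containsAll(allReplicas);
  }
}
```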

{quote}
But if not all known replicas are participating and you still want to force a 
leader, that should be optional - I think it should default to false though. I 
think the system should default to reasonable data safety in these cases.
{quote}
Shalin: That's the same case as the leaderVoteWait situation and we do go ahead 
after that amount of time even if all replicas aren't participating. Therefore, 
I think that we should handle it the same way. But to help people who care 
about consistency over availability, there should be a configurable property 
which bans this auto-promotion completely.
In any case, we should switch to coreNodeName instead of coreName and open an 
issue to improve the leader election part.
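
The configurable ban on auto-promotion might look something like the guard 
below. Again a hypothetical sketch: the property name 
disableAutoLeaderPromotion is invented here, not an actual Solr setting.

```java
import java.util.Map;

// Hypothetical guard for Shalin's proposal: auto-promotion stays enabled
// by default (matching how leaderVoteWait already proceeds without all
// replicas), but consistency-minded users can ban it per collection.
public class AutoPromotionGuardSketch {

  // Invented property name, not defined by Solr.
  static final String DISABLE_KEY = "disableAutoLeaderPromotion";

  /** Returns true unless the collection explicitly bans auto-promotion. */
  public static boolean autoPromotionAllowed(Map<String, String> collectionProps) {
    return !Boolean.parseBoolean(collectionProps.getOrDefault(DISABLE_KEY, "false"));
  }
}
```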



--
This message was sent by Atlassian JIRA
(v6.2#6252)
