runzhiwang opened a new pull request #383:
URL: https://github.com/apache/incubator-ratis/pull/383


   ## What changes were proposed in this pull request?
   
   **What's the problem?**
   
   For example, there are 3 servers, s1, s2, and s3, and s1 is the leader. When
   a split-brain happens, s2 is elected as the new leader, but s1 still thinks it
   is the leader. If s2 has already processed a write request, a client that
   reads from s1 gets stale data.
   
   **How to fix?**
   
   Give the leader a lease that it maintains through the normal heartbeat
   mechanism. Once a majority of the cluster has acknowledged the leader's
   heartbeats, the leader extends its lease to `start + election timeout`.
   Followers should not time out before then, so no new leader can be elected
   and no split-brain can happen before `start + election timeout`. This
   guarantee needs the pre-vote feature and must also account for the
   transferLeadership feature; both are discussed below.
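
A minimal sketch of the lease bookkeeping described above, not the code in this patch. `LeaderLease`, its methods, and the use of caller-supplied millisecond timestamps are all hypothetical names and choices made for illustration. It also assumes that `start` is the send time of the heartbeat being acknowledged, and it ignores clock drift.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical leader-lease tracker; not part of the Ratis API. */
final class LeaderLease {
  private final int clusterSize;
  private final long electionTimeoutMs;
  // Send time of the latest heartbeat that each follower has acknowledged.
  private final Map<String, Long> ackedSendTimeMs = new HashMap<>();

  LeaderLease(int clusterSize, long electionTimeoutMs) {
    if (clusterSize < 2) {
      throw new IllegalArgumentException("sketch assumes at least 2 servers");
    }
    this.clusterSize = clusterSize;
    this.electionTimeoutMs = electionTimeoutMs;
  }

  /** Record that a follower acknowledged a heartbeat sent at sendTimeMs. */
  void onHeartbeatAck(String followerId, long sendTimeMs) {
    ackedSendTimeMs.merge(followerId, sendTimeMs, Math::max);
  }

  /**
   * Latest send time acknowledged by enough followers to form a majority
   * together with the leader itself, or -1 if there is no such majority yet.
   */
  long leaseStartMs() {
    // A majority is clusterSize / 2 + 1 servers; the leader counts as one.
    int followersNeeded = clusterSize / 2;
    List<Long> times = new ArrayList<>(ackedSendTimeMs.values());
    if (times.size() < followersNeeded) {
      return -1;
    }
    times.sort(Collections.reverseOrder());
    return times.get(followersNeeded - 1);
  }

  /** The lease holds strictly before start + election timeout. */
  boolean isValid(long nowMs) {
    long start = leaseStartMs();
    return start >= 0 && nowMs < start + electionTimeoutMs;
  }
}
```

With 3 servers and a 1000 ms election timeout, a single acknowledgement from s2 for a heartbeat sent at time 100 makes the lease valid until time 1100. A leader would check `isValid(now)` before serving a read locally.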
   
![image](https://user-images.githubusercontent.com/51938049/103255180-4d581280-49c3-11eb-855e-ccd972755a4b.png)
   
   Why is the pre-vote feature needed?
   
   As the image shows, s1 is the leader but cannot connect to s2. When s1
   receives an acknowledgement from s3, it extends its lease to
   `start + election timeout`. Suppose s1 then becomes isolated from all servers
   before `start + election timeout`. s2 may time out, start an election, and
   become leader immediately with the vote from s3, so both s1 and s2 consider
   themselves leader before `start + election timeout`. With the pre-vote
   feature, when s2 requests a vote, s3 checks whether s1's leadership is still
   valid; since it is, s3 rejects the vote request from s2, and only one leader
   exists.
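
The check that s3 performs can be sketched as follows. This is an illustration under assumptions, not the Ratis pre-vote implementation: `VoteGuard` and its methods are hypothetical, and "s1's leadership is still valid" is approximated as "this server heard from the leader less than one election timeout ago". A real vote handler would also compare terms and log positions, which are left out here.

```java
/** Hypothetical follower-side guard; not part of the Ratis API. */
final class VoteGuard {
  private final long electionTimeoutMs;
  // Time of the last heartbeat received from the current leader, -1 if none.
  private long lastLeaderContactMs = -1;

  VoteGuard(long electionTimeoutMs) {
    this.electionTimeoutMs = electionTimeoutMs;
  }

  /** Called whenever a heartbeat from the current leader arrives. */
  void onHeartbeatFromLeader(long nowMs) {
    lastLeaderContactMs = nowMs;
  }

  /**
   * Grant a vote only if the leader may no longer hold its lease, that is,
   * if no heartbeat arrived within the last election timeout.
   */
  boolean mayGrantVote(long nowMs) {
    return lastLeaderContactMs < 0
        || nowMs - lastLeaderContactMs >= electionTimeoutMs;
  }
}
```

In the scenario above, s3 received a heartbeat from s1 shortly before s2 timed out, so `mayGrantVote` returns false and s2 cannot collect a majority while s1 may still believe it holds the lease.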
   
   
![image](https://user-images.githubusercontent.com/51938049/103255778-9c9f4280-49c5-11eb-8ae6-4a6fa035b2ad.png)
   
   How to address transferLeadership?
   
   For example, s1 is the leader and extends its lease to
   `start + election timeout` when it receives acknowledgements from s2 and s3.
   Before `start + election timeout`, an admin may call transferLeadership(s2).
   If s1 becomes isolated from all servers after sending
   StartLeaderElectionRequest to s2, s2 starts an election and becomes leader
   immediately with the vote from s3, so both s1 and s2 consider themselves
   leader before `start + election timeout`.
   
   So s1 should step down to follower when it sends StartLeaderElectionRequest
   to s2.
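
One way to picture this ordering is the sketch below. It is an assumption-laden illustration, not the Ratis code path: `LeadershipTransfer`, `Role`, and the `sendStartLeaderElection` callback are hypothetical stand-ins for the real server state and RPC. The only point it makes is that the old leader gives up its role before the request leaves, so it cannot serve lease-based reads afterwards.

```java
import java.util.function.Consumer;

/** Hypothetical old-leader state during a transfer; not the Ratis API. */
final class LeadershipTransfer {
  enum Role { LEADER, FOLLOWER }

  private Role role = Role.LEADER;
  private final long leaseExpiryMs;

  LeadershipTransfer(long leaseExpiryMs) {
    this.leaseExpiryMs = leaseExpiryMs;
  }

  /**
   * Step down first, then send the request to the target, so there is no
   * window in which this server still acts as leader after the target
   * has been told to start an election.
   */
  void transferLeadership(String target, Consumer<String> sendStartLeaderElection) {
    role = Role.FOLLOWER;
    sendStartLeaderElection.accept(target);
  }

  /** Lease-based reads are allowed only for a leader with an unexpired lease. */
  boolean canServeRead(long nowMs) {
    return role == Role.LEADER && nowMs < leaseExpiryMs;
  }
}
```

After `transferLeadership("s2", ...)` returns, `canServeRead` is false even though the lease time has not yet passed.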
   
   @szetszwo Could you help review this proposal?
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/RATIS-1273
   
   ## How was this patch tested?
   
   TODO
   


