hanm commented on pull request #1690:
URL: https://github.com/apache/zookeeper/pull/1690#issuecomment-847186137


   > @hanm Maybe it would be inappropriate for me to ask the following 
questions under this PR. Now I am learning ZAB 1.0 and prepare to write spec. 
Could you please let me ask about details in it?
   > 
   > Currently I have several problems about ZAB 1.0.
   > 
   > 1. For some detailed industrial implementation (like message with type 
PING, connection between leader and followers), do I need to implement these in 
spec?
   
   It depends on whether the particular message type is critical to revealing 
important details of the protocol. Messages like NEWLEADER and UPTODATE are an 
integral part of the protocol, so I don't think we can abstract them away. PING, 
though, I think we can omit for now.
   
   > 2. I see before sending NEWLEADER, leader will choose a best message from 
SNAP,TRUNC,DIFF to send according to the corresponding follower's state. Do I 
need to implement these, or abstract this part and just send NEWLEADER to sync 
with followers?
   
   I am inclined to abstract this part away using a "RECOVERY_SYNC" operation. 
In fact, that's what the original pre-1.0 protocol did, where a leader always 
syncs learners with the full history. We can go more fine-grained later. This 
will hopefully reduce the state space.
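
   A minimal sketch of such an abstraction, in Python rather than TLA+ and 
purely for illustration. The names `Server` and `recovery_sync` are 
hypothetical; they come from neither ZooKeeper nor Zab.tla. The sketch assumes 
a history is an ordered list of (epoch, counter) pairs and that syncing means 
copying the leader's full history, whatever state the learner is in.

```python
# Illustrative model only; these names are not ZooKeeper or Zab.tla identifiers.
from dataclasses import dataclass, field


@dataclass
class Server:
    # Ordered list of (epoch, counter) transaction ids, oldest first.
    history: list = field(default_factory=list)
    synced: bool = False


def recovery_sync(leader: Server, learner: Server) -> None:
    """Stand-in for SNAP/TRUNC/DIFF: replace the learner's history wholesale."""
    learner.history = list(leader.history)  # copy, so later appends stay local
    learner.synced = True


leader = Server(history=[(1, 1), (1, 2), (2, 1)])
behind = Server(history=[(1, 1)])                    # only missing entries
diverged = Server(history=[(1, 1), (1, 2), (1, 3)])  # has an entry the leader lacks
empty = Server()                                     # has nothing at all

# One operation covers all three cases, so the spec needs a single sync step.
for learner in (behind, diverged, empty):
    recovery_sync(leader, learner)
```

   Because every learner ends up with the same history regardless of where it 
started, the model needs one sync transition instead of three.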
   
   > 3. I don't understand whether variable 'loracle' is more like a global 
variable that has the latest leader ID and every server can reach it, or 
'loracle' is a local variable like 'votedFor' in Raft. (Actually I used the 
latter in Zab.tla, which corresponds to 'leaderOracle'. And I think it has no 
effect on correctness no matter which one is used.)
   
   ZAB 1.0 and its implementation do not use this "loracle". Instead, a 
server always starts a leader election and gets the latest leader ID from the 
leader election results. If we want an abstraction here, I would tend to think 
of "loracle" as a global variable.
   
   > 4. The last problem what I want to ask is that about recovery, because I 
didn't see notes about this part in paper and the link.
   >    I saw how the leader handles when a new server wants to join them, but 
I didn't see how a restarted server finds the latest leader. If 'loracle' is 
the former in question 3, I can understand this part. In Zab.tla, I use the 
method where the follower broadcasts messages to ask other servers what their 
local oracle is, and update its oracle when receiving the same oracle and epoch 
from a quorum(I used this method because I saw servers recover like this in 
View-stamped Replication).
   
   As mentioned earlier, a restarted (or newly joined) server finds the latest 
leader through leader election.
   
   >    A follower must have received corresponding PROPOSAL when receiving 
COMMIT, if the follower is the initial server that joins Q. But I think this 
may not always be true when the follower is one that joins Q midway. So I want 
to know how follower handles to catch up state when receiving COMMIT 
corresponding to a transaction that does not exist in its local history. (You could 
see this condition in the image at the bottom of README. When I wrote spec for 
Zab pre-1.0, I chose to let followers keep re-sending CEPOCH to obtain latest 
transactions until the conflict described above disappears.)
   
   The "follower handles to catch up" part of the protocol is the recovery part 
(SNAP, DIFF, TRUNC). A follower will not start broadcast until the recovery 
phase is finished. The invariant here is that once recovery has finished, the 
follower should have the latest history of the quorum / leader, so any COMMIT 
it receives has a corresponding entry in its history. This differs from Paxos 
or Raft, where there is no dedicated recovery phase.
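
   The invariant can be sketched as a small executable model, again in Python 
and purely as an illustration. The `Follower` class and its method names are 
hypothetical, and the sketch assumes that recovery hands the follower the 
leader's full history, including proposals not yet committed.

```python
# Illustrative model only; not ZooKeeper code.
class Follower:
    def __init__(self):
        self.history = []      # proposals known to this follower, in order
        self.committed = []    # proposals this follower has committed
        self.recovered = False

    def finish_recovery(self, leader_history):
        """End of the recovery phase: adopt the leader's history."""
        self.history = list(leader_history)
        self.recovered = True

    def on_proposal(self, zxid):
        if not self.recovered:
            raise RuntimeError("broadcast messages are not handled before recovery")
        self.history.append(zxid)

    def on_commit(self, zxid):
        if not self.recovered:
            raise RuntimeError("broadcast messages are not handled before recovery")
        # The invariant: every COMMIT refers to an entry already in the history.
        assert zxid in self.history, "COMMIT without a matching PROPOSAL"
        self.committed.append(zxid)


# A follower that joins midway: the leader has already proposed 1, 2 and 3.
late = Follower()
late.finish_recovery([1, 2, 3])
late.on_commit(3)      # proposed before the follower joined, still found
late.on_proposal(4)
late.on_commit(4)
```

   In this model a follower that joins midway never sees a COMMIT for an 
unknown transaction, because it takes no part in broadcast until recovery has 
given it the history.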
   
   > 
   > I am very sorry if my problems bother you. Thank you for the various 
feedbacks and suggestions you have provided before!
   
   No worries, I am not super active in the community these days, but I will 
answer questions on the issues that I am involved in (with undefined SLA).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

