reddycharan commented on issue #1747: (WIP) Set ConnectionExpired Listener to 
MetadataClientDriver in AR
URL: https://github.com/apache/bookkeeper/pull/1747#issuecomment-428381295
 
 
   @eolivelli @sijie 
   
   **Issue Summary:** I added a test case that demonstrates the need for this fix. As 
you can see in this test case, without this fix, when the ZK session of the AR 
running the Auditor expires, another AR becomes Auditor and starts running the 
Auditor, as expected. But unexpectedly, the previous Auditor also continues to run.
   
   If the AR is **not the Auditor**, then because of the following ElectionWatcher - 
https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/AuditorElector.java#L224
 when the ZK session expires, AuditorElector shuts down, and hence 
AutoRecoveryMain shuts down because of 
https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/AutoRecoveryMain.java#L230.
   
   If the AR is **the Auditor**, then we do not set an ElectionWatcher in 
AuditorElector, so the path described above does not execute. The AR and its 
Auditor continue to run.
   
   But the ‘myVote’ ephemeral node of the Auditor’s AR is deleted when the ZK 
session expires, so some other AR gets this event and becomes Auditor.
   
   So effectively two ARs are running Auditors now :( 
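   
   To illustrate the sequence above, here is a minimal toy model. It does not use 
the real BookKeeper or ZooKeeper classes: `ToyAr`, `ToyElection` and the 
"lowest vote sequence wins" rule are hypothetical stand-ins chosen only to show how 
losing the ephemeral vote, with nothing stopping the old Auditor, leaves two Auditors running.

```java
import java.util.TreeMap;

public class TwoAuditorsSketch {

    /** Toy AR (hypothetical): keeps running its Auditor unless told to stop. */
    static class ToyAr {
        final String name;
        boolean runningAuditor = false;

        ToyAr(String name) {
            this.name = name;
        }
    }

    /** Toy election (assumption): ephemeral votes keyed by sequence, lowest wins. */
    static class ToyElection {
        private final TreeMap<Integer, ToyAr> votes = new TreeMap<>();
        private int nextSeq = 0;

        int vote(ToyAr ar) {
            int seq = nextSeq++;
            votes.put(seq, ar);
            electLowest();
            return seq;
        }

        /** Session expiry deletes the ephemeral vote; the old Auditor is never stopped. */
        void expireSession(int seq) {
            votes.remove(seq);
            electLowest();
        }

        private void electLowest() {
            if (!votes.isEmpty()) {
                votes.firstEntry().getValue().runningAuditor = true;
            }
        }
    }

    /** Runs the scenario and returns how many ARs are running an Auditor at the end. */
    static int countAuditorsAfterExpiry() {
        ToyElection election = new ToyElection();
        ToyAr first = new ToyAr("ar-1");
        ToyAr second = new ToyAr("ar-2");
        int firstVote = election.vote(first); // ar-1 becomes Auditor
        election.vote(second);                // ar-2 waits behind ar-1
        election.expireSession(firstVote);    // ar-1's vote disappears, ar-2 takes over
        int count = 0;
        if (first.runningAuditor) {
            count++;
        }
        if (second.runningAuditor) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Auditors running after expiry: " + countAuditorsAfterExpiry());
    }
}
```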
   
   In summary, just as we do for a non-Auditor AR, we should shut down the AR 
in the Auditor case as well. 
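   
   As a rough sketch of that idea (not the actual change in this PR), the AR can 
register a callback that shuts it down when the session expires, regardless of 
its role. Everything here is hypothetical and self-contained: 
`SessionExpiredListener`, `ToyMetadataDriver` and `ToyAutoRecovery` are 
illustrative names, not BookKeeper APIs.

```java
import java.util.ArrayList;
import java.util.List;

public class AuditorExpirySketch {

    /** Hypothetical callback standing in for a session-expired listener. */
    interface SessionExpiredListener {
        void onSessionExpired();
    }

    /** Toy stand-in for the component that owns the ZK session. */
    static class ToyMetadataDriver {
        private final List<SessionExpiredListener> listeners = new ArrayList<>();

        void addSessionExpiredListener(SessionExpiredListener listener) {
            listeners.add(listener);
        }

        /** Simulates the ZK session expiring and notifies every listener. */
        void expireSession() {
            for (SessionExpiredListener listener : listeners) {
                listener.onSessionExpired();
            }
        }
    }

    /** Toy AR that is acting as Auditor; it only tracks whether it is running. */
    static class ToyAutoRecovery {
        private boolean running = true;

        ToyAutoRecovery(ToyMetadataDriver driver, boolean registerListener) {
            if (registerListener) {
                // With the listener, expiry shuts the AR down even when it is the Auditor.
                driver.addSessionExpiredListener(this::shutdown);
            }
        }

        void shutdown() {
            running = false;
        }

        boolean isRunning() {
            return running;
        }
    }

    /** Returns true if the old Auditor's AR is still running after session expiry. */
    static boolean auditorStillRunsAfterExpiry(boolean registerListener) {
        ToyMetadataDriver driver = new ToyMetadataDriver();
        ToyAutoRecovery ar = new ToyAutoRecovery(driver, registerListener);
        driver.expireSession();
        return ar.isRunning();
    }

    public static void main(String[] args) {
        System.out.println("without listener, still running: "
                + auditorStillRunsAfterExpiry(false));
        System.out.println("with listener, still running: "
                + auditorStillRunsAfterExpiry(true));
    }
}
```

   The listener is attached to the session owner rather than to the election 
path, so it fires whether or not an ElectionWatcher was ever set.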
   
   To be honest, I never fully grasped the semantics, guarantees, and invariants 
of the ZooKeeperClient wrapper class, especially regarding ‘connectRetryPolicy’. 
@athanatos has some strong concerns about how things (watchers, ephemeral 
nodes) are handled in ZooKeeperClient when the connection is reestablished. Also, 
I frequently see people making incorrect assumptions about what ZooKeeperClient offers.
   
   @jvrao @codingwangqi fyi

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
