gianm opened a new pull request #8177: Update to Curator 4.2.0, ZooKeeper 
3.4.14.
URL: https://github.com/apache/incubator-druid/pull/8177
 
 
   Other than generally wanting to use the latest Curator and ZK, this change 
is motivated by an outage I encountered last night. I was debugging a cluster 
that was acting bizarrely, and in the end it turned out to have 
two overlords that both thought they were leader. Shortly before they both 
gained leadership, the ZK quorum was unavailable for about 20 seconds. It 
doesn't look like Druid itself was doing anything particularly wrong: logs 
indicated the overlords weren't ignoring `stopBeingLeader` calls or anything 
like that.
   
   For these reasons, I believe the cause of the outage was 
https://issues.apache.org/jira/browse/CURATOR-498. A comment on that issue 
indicates the bug could cause two `LeaderLatch` users to become leaders at once: 
https://issues.apache.org/jira/browse/CURATOR-498?focusedCommentId=16732419&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16732419.
   
   The bug was fixed in Curator 4.2.0.
