gianm commented on PR #16425:
URL: https://github.com/apache/druid/pull/16425#issuecomment-2104905648

   Just reviewed the lists. By downgrading Curator from 5.5 to 5.3 we do lose 
various fixes. These sound like they could be important:
   
   - [[CURATOR-504](https://issues.apache.org/jira/browse/CURATOR-504)] - Race 
conditions in LeaderLatch after reconnecting to ensemble
   - [[CURATOR-638](https://issues.apache.org/jira/browse/CURATOR-638)] - 
Curator disconnect from zookeeper when IPs change [seems especially relevant to 
k8s environments]
   - [[CURATOR-644](https://issues.apache.org/jira/browse/CURATOR-644)] - CLONE 
- Race conditions in LeaderLatch after reconnecting to ensemble [a live-lock 
issue; we believe the fix for this bug introduced the split-brain problem, so 
rolling back would reintroduce the live-lock issue]
   - [[CURATOR-649](https://issues.apache.org/jira/browse/CURATOR-649)] - 
Background exception was not retry-able or retry gave up [robustness]
   
   With regard to downgrading Curator to 5.3 in Druid 30, I think we should be 
especially careful about these issues, in particular CURATOR-638 and its 
possible impact on k8s environments.
   
   Alternative approaches include:
   
   - Stay with Curator 5.5 and work around this bug, for example by closing and 
recreating our LeaderLatch whenever the ZK session changes.
   - Stay with Curator 5.5 and _don't_ work around this bug; wait for the 
release of Curator 5.7, which will hopefully include a fix.
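
   To make the first alternative concrete, here is a minimal sketch of the 
close-and-recreate idea. It is not Druid's code and does not call Curator: 
`Latch` is a hypothetical stand-in for the `start()`/`close()` pair on 
Curator's `LeaderLatch`, and `LatchRecreator`, `onReconnected`, and 
`currentLatch` are names invented for illustration. It assumes the caller can 
read the current ZK session ID after a reconnect (for example from a 
connection state listener) and passes it in.

```java
import java.util.function.Supplier;

/**
 * Sketch only: replaces the leader latch whenever the ZK session ID changes.
 * All names here are hypothetical; real code would wrap Curator's LeaderLatch.
 */
class LatchRecreator {
    /** Hypothetical stand-in for the two LeaderLatch methods this sketch needs. */
    interface Latch {
        void start() throws Exception;
        void close() throws Exception;
    }

    private final Supplier<Latch> latchFactory;
    private Latch current;
    private long lastSessionId;

    LatchRecreator(Supplier<Latch> latchFactory, long initialSessionId) throws Exception {
        this.latchFactory = latchFactory;
        this.lastSessionId = initialSessionId;
        this.current = latchFactory.get();
        this.current.start();
    }

    /**
     * Call after a reconnect with the session ID now in use.
     * Returns true if the session changed and the latch was recreated.
     */
    synchronized boolean onReconnected(long sessionId) throws Exception {
        if (sessionId == lastSessionId) {
            // Same session: the existing latch is still tied to live state, keep it.
            return false;
        }
        lastSessionId = sessionId;
        Latch old = current;
        try {
            // Drop any leadership state tied to the old session.
            old.close();
        } finally {
            // Always come back with a fresh latch, even if close() failed.
            current = latchFactory.get();
            current.start();
        }
        return true;
    }

    synchronized Latch currentLatch() {
        return current;
    }
}
```

   The session ID comparison is there so that a plain reconnect within the 
same session does not needlessly give up leadership; only a new session 
triggers the swap. Anything holding a reference to the old latch would need to 
re-read `currentLatch()` after a swap.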


