I discussed this with Onur offline. `getAfterNodeExists` in 
`CheckedEphemeral` is indeed there to handle the case where a zk connection 
loss happens. After digging through both the zookeeper and kafka code, we think 
it is safe to remove the extra complexity of `controllerNodeExistsHandler` in 
this PR once `/controller` creation and the `/controller_epoch` update are atomic.

So the logic will be:
1). Try to create `/controller_epoch` if it does not exist.
2). Read `/controller_epoch` from zk.
3). Atomically create `/controller` and update `/controller_epoch`.
4). If 3) throws NodeExistsException, read `/controller`. If the controller id 
in zk equals the current broker id and the controller epoch in zk equals the 
expected epoch, finish controller election successfully; otherwise, throw 
ControllerMovedException.
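
For illustration, the four steps above could be sketched roughly as follows in Java. This is only a sketch under stated assumptions, not the code in the PR: `FakeZk`, `ElectionSketch.elect`, `loseFirstResponse`, and both exception classes are hypothetical in-memory stand-ins rather than the real ZooKeeper client or Kafka classes. It also assumes the expected epoch is the value read in step 2 plus one, and that the epoch node starts at 0.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for ZooKeeper's NodeExistsException and
// Kafka's ControllerMovedException.
class NodeExistsException extends RuntimeException {}

class ControllerMovedException extends RuntimeException {
    ControllerMovedException(String msg) { super(msg); }
}

// Hypothetical in-memory stand-in for zk; not the real client API.
class FakeZk {
    private final Map<String, Integer> nodes = new HashMap<>();
    // When true, the next atomic write is applied but its reply is "lost",
    // so the caller observes NodeExistsException as if a retry had run.
    boolean loseFirstResponse = false;

    synchronized void createIfAbsent(String path, int value) {
        nodes.putIfAbsent(path, value);
    }

    // Assumes the node exists; a missing node would fail on unboxing.
    synchronized int read(String path) {
        return nodes.get(path);
    }

    // Emulates one all-or-nothing transaction: create /controller and
    // set /controller_epoch together, or change nothing.
    synchronized void createControllerAndSetEpoch(int brokerId, int newEpoch) {
        if (nodes.containsKey("/controller")) throw new NodeExistsException();
        nodes.put("/controller", brokerId);
        nodes.put("/controller_epoch", newEpoch);
        if (loseFirstResponse) {
            loseFirstResponse = false;
            throw new NodeExistsException();
        }
    }
}

class ElectionSketch {
    static final int INITIAL_EPOCH = 0; // assumption for this sketch

    // Returns the new controller epoch if this broker wins the election.
    static int elect(FakeZk zk, int brokerId) {
        zk.createIfAbsent("/controller_epoch", INITIAL_EPOCH);    // step 1
        int expectedEpoch = zk.read("/controller_epoch") + 1;     // step 2
        try {
            zk.createControllerAndSetEpoch(brokerId, expectedEpoch); // step 3
        } catch (NodeExistsException e) {                          // step 4
            int owner = zk.read("/controller");
            int epoch = zk.read("/controller_epoch");
            if (owner != brokerId || epoch != expectedEpoch) {
                throw new ControllerMovedException(
                    "controller is broker " + owner + " at epoch " + epoch);
            }
            // Our own earlier write went through; treat it as success.
        }
        return expectedEpoch;
    }
}
```

In this sketch, setting `loseFirstResponse` simulates the connection-loss case: the write is applied, the caller still sees NodeExistsException, and step 4 recognizes its own broker id and expected epoch. A second broker calling `elect` afterwards gets ControllerMovedException.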

[ Full content available at: https://github.com/apache/kafka/pull/5101 ]