andrekramer1 commented on issue #11070:
URL: https://github.com/apache/pulsar/issues/11070#issuecomment-936457985


   Further investigation found that the first ZooKeeper pod created for the 
Kubernetes StatefulSet does not respond to the readiness/liveness probe. The 
probe sends the "ruok" command, and the server replies by closing the 
connection (as ZooKeeper is not up and running). Because the first pod never 
becomes ready, the second and third replicas are never created. Somehow 
ZooKeeper has stopped responding while initializing / creating a quorum. This 
can be confirmed by setting the enabled flag on the ZooKeeper readiness and 
liveness probes to false in the helm chart. With the probes disabled, I 
managed to initialize a 3-node cluster.
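   
   For anyone who wants to reproduce the probe behaviour by hand, here is a 
minimal sketch of a "ruok" check. It is not the helm chart's actual probe 
script; the function name `zk_ruok`, the default port 2181 and the timeout 
are assumptions made for illustration. It sends "ruok" and treats anything 
other than an "imok" reply (including the connection simply being closed, as 
described above) as not ready.

```python
import socket


def zk_ruok(host: str, port: int = 2181, timeout: float = 2.0) -> bool:
    """Return True only if the server answers "ruok" with "imok".

    Hypothetical helper for illustration; host, port and timeout are
    assumptions, not values taken from the Pulsar helm chart.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            while True:
                data = sock.recv(64)
                if not data:
                    # Peer closed the connection. An empty reply overall
                    # means the server hung up without answering.
                    break
                chunks.append(data)
    except OSError:
        # Connection refused, reset or timed out: treat as not ready.
        return False
    return b"".join(chunks) == b"imok"


if __name__ == "__main__":
    print("ready" if zk_ruok("localhost") else "not ready")
```

   Run against the first pod while it is stuck, this should report "not 
ready", matching what the probe sees. It assumes "ruok" is permitted by the 
server's four-letter-word whitelist.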
   
   I created a debug branch of ZooKeeper, modified to respond to ruok and 
other client requests even when not fully initialized. With these changes it 
is also possible to bring up the ZooKeeper nodes and the Pulsar cluster with 
the probes enabled. The branch is here: 
https://github.com/andrekramer1/zookeeper/tree/early-ruok 
   
   It would be possible to create a pull request from this, but the 
implications of allowing client connections while ZooKeeper is initializing 
would need to be considered. Hopefully the change list can help fix this issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

