lhotari commented on pull request #14088:
URL: https://github.com/apache/pulsar/pull/14088#issuecomment-1027220070
> If you pipe the command to nc, the input stream is closed instantly, which
leads to nc terminating in certain conditions even before the server is able to
send the reply. This leads to an empty output, which then leads to a failed
health check. This behavior seems to differ between versions of nc
(OpenBSD, Linux). Since the cause of the problem is a race condition, the "-q 1"
option makes nc wait one second before it terminates, so the server is able to
send the reply. This behavior is reproducible on certain Kubernetes clusters
with small nodes and seems to be fixed with this change.
>
> for run in {1..10}; do echo ruok | nc localhost 2181; done
> => imokimokimokimokimok
>
> for run in {1..10}; do echo ruok | nc -q 1 localhost 2181; done
> => imokimokimokimokimokimokimokimokimokimok
Thanks for the explanation, @frederic-kneier.
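To make the failure mode concrete, here is a minimal sketch of how a probe could evaluate the reply. The names `is_imok` and `zk_probe` are hypothetical and not taken from the PR, and the sketch assumes an nc variant that supports `-q`, which not every build does.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the reply is exactly "imok".
# An empty reply (nc exited before the server answered) fails the check.
is_imok() {
  [ "$1" = "imok" ]
}

# Hypothetical probe: sends "ruok" and checks the reply.
# Assumption: the installed nc supports -q (wait N seconds after EOF on stdin).
zk_probe() {
  reply="$(echo ruok | nc -q 1 localhost 2181)"
  is_imok "$reply"
}
```

With this shape, the race described in the quote shows up as `zk_probe` returning non-zero because `reply` is empty, even though the server itself is healthy.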
Btw, I've been struggling with the ZooKeeper probes, and this has been
causing some instability in https://github.com/apache/pulsar-helm-chart . Some
attempts to improve the situation:
https://github.com/apache/pulsar-helm-chart/pull/220
https://github.com/apache/pulsar-helm-chart/pull/214
https://github.com/apache/pulsar-helm-chart/pull/202
I just wonder if the value for `-w` should be more than 1?
I checked the Bitnami ZooKeeper Helm chart, and there the probe timeout value
is passed to the `-w` parameter.
https://github.com/bitnami/charts/blob/b86be50209134b8a2967ce5335c647c4b9ca1759/bitnami/zookeeper/templates/statefulset.yaml#L350-L356
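A rough sketch of what passing the probe timeout to `-w` could look like in a probe script. This is not copied from the Bitnami chart. `PROBE_TIMEOUT`, `probe_timeout` and `zk_probe_w` are hypothetical names, and the fallback of 1 second is an assumption based on the value discussed here.

```shell
#!/bin/sh
# Hypothetical helper: prints a usable timeout in seconds.
# Falls back to 1 if the value is empty, zero, or not a plain integer.
probe_timeout() {
  case "${1:-}" in
    ''|*[!0-9]*|0) echo 1 ;;
    *) echo "$1" ;;
  esac
}

# Hypothetical probe: PROBE_TIMEOUT would be set from the probe's configured
# timeout. -w sets nc's timeout for connecting and for idle connections.
zk_probe_w() {
  t="$(probe_timeout "${PROBE_TIMEOUT:-}")"
  echo ruok | nc -w "$t" localhost 2181 | grep -q imok
}
```

Tying `-w` to the probe timeout keeps the two values from drifting apart, so nc is not cut off earlier than the probe itself would allow.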