Lost connection to zookeeper and lead to supervisor restart

Ryan Chan Mon, 12 May 2014 08:27:46 -0700

This morning the supervisor and nimbus connections to ZooKeeper started having
problems (they are in the same subnet of an AWS VPC; I am not sure of the root
cause). From the supervisor log:


http://pastebin.com/CVMTrfuQ

The Storm supervisor died at this line:

2014-05-11 00:11:07 b.s.util [INFO] Halting process: ("Error when processing an event")

My Python supervisord automatically restarts the Storm supervisor, and after
the restart I saw the following in the log:

 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started

Memory usage also increased substantially, and the topology was not running
at all. Some time later I killed all the Java processes and everything went
back to normal. (I am using the Netty transport.)

Any idea?
Thanks.
