Thanks for bringing this up, Marc. Yes, I can confirm your problem. And I can confirm this for both supervisor-3.0b2 (beta) and supervisor-3.0 (stable release).
I summarized the details here:
https://github.com/miguno/puppet-zookeeper/issues/1

After some back-and-forth testing I learned that the correct fix (or maybe I should call it a workaround, as it could be a bug in supervisord actually) involves the use of /both/:

- `stopasgroup` must be set to true
- `trap "kill -- -$$" EXIT` must be added to /usr/bin/zookeeper-server

Using only one of the two is not enough. See the link above for what breaks in each case if you still try.

The problem is fixed in puppet-zookeeper 1.0.4 and in the latest (master/trunk) version of Wirbelsturm.

Hope this helps!
Michael

On 03/19/2014 07:18 PM, Marc Vaillant wrote:
> Hi Michael,
>
> Thanks very much for your hard work on this, your puppet scripts have
> been very helpful. We are having a specific issue with supervision of
> zookeeper and I wonder if you have encountered something similar or if
> we are doing something wrong. Even with the stopasgroup=true
> supervisord option, there still seems to be a problem with orphaned
> child processes when the parent process (zookeeper-server script) goes
> down from an external event. Although running "supervisorctl stop
> zookeeper" will take down the zookeeper-server script and its child
> processes, issuing a "killall zookeeper-server" will take down only the
> script, leaving the child processes running. This sends supervisord
> into an infinite loop of attempting to restart zookeeper, but failing
> because the child processes are still alive and occupying the required
> ports.
>
> A fix we've found (refer to the last answer here
> http://stackoverflow.com/questions/9090683/supervisord-stopping-child-processes)
> is to put
>
>     trap "kill -- -$$" EXIT
>
> at the top of the zookeeper-server script. However, it seems like the
> stopasgroup=true setting was designed to handle this case. I know that
> stopasgroup was part of the 3.0b2 (05.28.2013) release of supervisord. We
> are using the 3.0 (07.30.2013) release from your RPM
> https://github.com/miguno/wirbelsturm-rpm-supervisord so I believe it
> should be available.
>
> Thanks,
> Marc
>
>
> On Mon, Mar 17, 2014 at 09:02:11PM +0100, Michael G. Noll wrote:
>> Hi everyone,
>>
>> I have released a tool called Wirbelsturm
>> (https://github.com/miguno/wirbelsturm) that allows you to perform local
>> and remote deployments of Storm. It's also a small way of saying a big
>> "thank you" to the Storm community.
>>
>> Wirbelsturm uses Vagrant for creating and managing machines, and Puppet
>> for provisioning the machines once they're up and running. You can also
>> use Ansible to interact with deployed machines. Deploying Storm is but
>> one example, of course -- you can deploy other software with Wirbelsturm
>> as well (e.g. Graphite, Kafka, Redis, ZooKeeper).
>>
>> I also wrote a quick intro and behind-the-scenes blog post at [1], which
>> covers, for instance, the motivation behind building Wirbelsturm and
>> lessons learned along the way (read: mistakes made :-P).
>>
>> Enjoy!
>> Michael
>>
>>
>> [1]
>> http://www.michael-noll.com/blog/2014/03/17/wirbelsturm-one-click-deploy-storm-kafka-clusters-with-vagrant-puppet/
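
For reference, the supervisord half of the two-part fix described at the top of this message might look roughly like the fragment below. This is only a sketch and not the actual puppet-zookeeper template: the program name `zookeeper` is taken from the "supervisorctl stop zookeeper" command quoted above, any arguments to the script are left out, and all other settings a real section would need are omitted. The second half of the fix, the `trap "kill -- -$$" EXIT` line in /usr/bin/zookeeper-server, is still required alongside it.

```ini
; Sketch of a supervisord program section (assumed layout, not copied
; from puppet-zookeeper). Only stopasgroup is the setting discussed here.
[program:zookeeper]
; Wrapper script named in this thread; its arguments are intentionally omitted.
command=/usr/bin/zookeeper-server
; Send the stop signal to the whole process group, not just the wrapper script.
stopasgroup=true
```

With only this setting, "supervisorctl stop zookeeper" cleans up the child processes, but an external kill of the wrapper script does not, which is why the trap line is needed as well.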
