Thanks for bringing this up, Marc.

Yes, I can confirm the problem. It occurs with both supervisor-3.0b2
(beta) and supervisor-3.0 (stable release).

I summarized the details here:
https://github.com/miguno/puppet-zookeeper/issues/1

After some back-and-forth testing I learned that the correct fix (or
rather a workaround, since this could actually be a bug in supervisord)
requires /both/ of the following:

- `stopasgroup` must be set to true
- `trap "kill -- -$$" EXIT` must be added to /usr/bin/zookeeper-server

Using only one of the two is not enough.  See the link above for what
breaks in each case if you still try.
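
For illustration, here is a minimal sketch of what the trap line does.
It is not the actual zookeeper-server script: "fake-zookeeper-server",
the marker file, and the background subshell that stands in for the
ZooKeeper JVM are all made up for this demo. It also assumes bash (which
runs the EXIT trap when the script is terminated by SIGTERM) and the
`setsid` utility, used here to make the wrapper the leader of its own
process group, which is what `kill -- -$$` relies on.

```shell
#!/bin/sh
# Sketch only: "fake-zookeeper-server" is a hypothetical stand-in, not the
# real /usr/bin/zookeeper-server script.
workdir="$(mktemp -d)"
wrapper="$workdir/fake-zookeeper-server"

cat > "$wrapper" <<'EOF'
#!/usr/bin/env bash
pidfile="$1"
marker="$2"

# On exit, signal the whole process group. The "-" before $$ makes kill
# treat the PID as a process group ID, so this assumes the script is the
# leader of its own process group.
trap "kill -- -$$" EXIT

# Stand-in for the long-running child (the ZooKeeper JVM in the real case).
# It leaves a marker file behind when it receives SIGTERM.
(
  trap 'echo stopped > "$marker"; exit 0' TERM
  while :; do sleep 1; done
) &

echo "$$ $!" > "$pidfile"
wait
EOF
chmod +x "$wrapper"

# setsid puts the wrapper into a new session, so its PID is also its PGID.
setsid "$wrapper" "$workdir/pids" "$workdir/marker" &
sleep 1
read -r wrapper_pid child_pid < "$workdir/pids"

# Simulate "killall zookeeper-server": signal only the wrapper script.
kill -TERM "$wrapper_pid"
sleep 2

if [ -f "$workdir/marker" ]; then
  result="child stopped"
else
  result="child orphaned"
  kill "$child_pid" 2>/dev/null   # clean up the stray child
fi
echo "$result"
rm -rf "$workdir"
```

With the trap line removed from the stand-in wrapper, the same run
should report the child as orphaned, which matches the behaviour Marc
describes below.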

The problem is fixed in puppet-zookeeper 1.0.4 and in the latest
(master/trunk) version of Wirbelsturm.

Hope this helps!
Michael



On 03/19/2014 07:18 PM, Marc Vaillant wrote:
> Hi Michael,
> 
> Thanks very much for your hard work on this, your puppet scripts have
> been very helpful.   We are having a specific issue with supervision of
> zookeeper and I wonder if you have encountered something similar or if
> we are doing something wrong.  Even with the stopasgroup=true
> supervisord option, there still seems to be a problem with orphaned
> child processes when the parent process (zookeeper-server script) goes
> down from an external event.  Although running "supervisorctl stop
> zookeeper" will take down the zookeeper-server script and its child
> processes, issuing a "killall zookeeper-server" will take down only the
> script, leaving the child processes running.  This sends supervisord
> into an infinite loop of attempting to restart zookeeper, but failing
> because the child processes are still alive and occupying the required
> ports.  
> 
> A fix we've found (refer to the last answer here
> http://stackoverflow.com/questions/9090683/supervisord-stopping-child-processes)
> is to put 
> 
> trap "kill -- -$$" EXIT
> 
> at the top of the zookeeper-server script.  However, it seems like the
> stopasgroup=true setting was designed to handle this case.  I know that
> stopasgroup was part of the 3.0b2 (05.28.2013) release of supervisord.  We
> are using the 3.0 (07.30.2013) release from your RPM
> https://github.com/miguno/wirbelsturm-rpm-supervisord so I believe it
> should be available.  
> 
> Thanks,
> Marc
> 
> 
> On Mon, Mar 17, 2014 at 09:02:11PM +0100, Michael G. Noll wrote:
>> Hi everyone,
>>
>> I have released a tool called Wirbelsturm
>> (https://github.com/miguno/wirbelsturm) that allows you to perform local
>> and remote deployments of Storm.  It's also a small way of saying a big
>> "thank you" to the Storm community.
>>
>> Wirbelsturm uses Vagrant for creating and managing machines, and Puppet
>> for provisioning the machines once they're up and running.  You can also
>> use Ansible to interact with deployed machines.  Deploying Storm is but
>> one example, of course -- you can deploy other software with Wirbelsturm
>> as well (e.g. Graphite, Kafka, Redis, ZooKeeper).
>>
>> I also wrote a quick intro and behind-the-scenes blog post at [1], which
>> covers, for instance, the motivation behind building Wirbelsturm and
>> lessons learned along the way (read: mistakes made :-P).
>>
>> Enjoy!
>> Michael
>>
>>
>> [1]
>> http://www.michael-noll.com/blog/2014/03/17/wirbelsturm-one-click-deploy-storm-kafka-clusters-with-vagrant-puppet/
>>
>>