Hi Michael,

Thanks so much for wrestling this one to the ground.  I've been out for
a couple of days, but an engineer on our team has pulled your update and
is using it successfully.

Best,
Marc

On Thu, Mar 20, 2014 at 07:57:29PM +0100, Michael G. Noll wrote:
> Thanks for bringing this up, Marc.
> 
> Yes, I can confirm your problem. And I can confirm this for both
> supervisor-3.0b2 (beta) and supervisor-3.0 (stable release).
> 
> I summarized the details here:
> https://github.com/miguno/puppet-zookeeper/issues/1
> 
> After some back-and-forth testing I learned that the correct fix (or
> maybe I should call it a workaround, as it may actually be a bug in
> supervisord) involves the use of /both/:
> 
> - `stopasgroup` must be set to true
> - `trap "kill -- -$$" EXIT` must be added to /usr/bin/zookeeper-server
> 
> Using only one of the two is not enough.  See the link above for what
> breaks in each case if you still try.
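> 
> As a rough sketch of the supervisord half of the fix (the section name
> and command line below are assumptions for illustration, and the
> actual puppet-zookeeper template may differ):
> 
> ```ini
> ; Hypothetical supervisord program section; name and command are assumed.
> [program:zookeeper]
> command=/usr/bin/zookeeper-server
> ; Send the stop signal to the whole process group, not only the script.
> stopasgroup=true
> ```
> 
> The `trap "kill -- -$$" EXIT` line then goes near the top of
> /usr/bin/zookeeper-server itself.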
> 
> The problem is fixed in puppet-zookeeper 1.0.4 and in the latest
> (master/trunk) version of Wirbelsturm.
> 
> Hope this helps!
> Michael
> 
> 
> 
> On 03/19/2014 07:18 PM, Marc Vaillant wrote:
> > Hi Michael,
> > 
> > Thanks very much for your hard work on this; your Puppet scripts have
> > been very helpful.   We are having a specific issue with supervision of
> > zookeeper and I wonder if you have encountered something similar or if
> > we are doing something wrong.  Even with the stopasgroup=true
> > supervisord option, there still seems to be a problem with orphaned
> > child processes when the parent process (zookeeper-server script) goes
> > down from an external event.  Although running "supervisorctl stop
> > zookeeper" will take down the zookeeper-server script and its child
> > processes, issuing a "killall zookeeper-server" will take down only the
> > script, leaving the child processes running.  This sends supervisord
> > into an infinite loop of attempting to restart zookeeper, but failing
> > because the child processes are still alive and occupying the required
> > ports.  
> > 
> > A fix we've found (refer to the last answer here
> > http://stackoverflow.com/questions/9090683/supervisord-stopping-child-processes)
> > is to put 
> > 
> > trap "kill -- -$$" EXIT
> > 
> > at the top of the zookeeper-server script.  However, it seems like the
> > stopasgroup=true setting was designed to handle this case.  I know that
> > stopasgroup was part of the 3.0b2 (05.28.2013) release of supervisord.  We
> > are using the 3.0 (07.30.2013) release from your RPM
> > https://github.com/miguno/wirbelsturm-rpm-supervisord so I believe it
> > should be available.  
> > 
> > Thanks,
> > Marc
> > 
> > 
> > On Mon, Mar 17, 2014 at 09:02:11PM +0100, Michael G. Noll wrote:
> >> Hi everyone,
> >>
> >> I have released a tool called Wirbelsturm
> >> (https://github.com/miguno/wirbelsturm) that allows you to perform local
> >> and remote deployments of Storm.  It's also a small way of saying a big
> >> "thank you" to the Storm community.
> >>
> >> Wirbelsturm uses Vagrant for creating and managing machines, and Puppet
> >> for provisioning the machines once they're up and running.  You can also
> >> use Ansible to interact with deployed machines.  Deploying Storm is but
> >> one example, of course -- you can deploy other software with Wirbelsturm
> >> as well (e.g. Graphite, Kafka, Redis, ZooKeeper).
> >>
> >> I also wrote a quick intro and behind-the-scenes blog post at [1], which
> >> covers, for instance, the motivation behind building Wirbelsturm and
> >> lessons learned along the way (read: mistakes made :-P).
> >>
> >> Enjoy!
> >> Michael
> >>
> >>
> >> [1]
> >> http://www.michael-noll.com/blog/2014/03/17/wirbelsturm-one-click-deploy-storm-kafka-clusters-with-vagrant-puppet/
> >>
> >>
> 
