> -----Original Message-----
> From: Spitzer, Andy (BL60:9D30) 
> Sent: Tuesday, December 09, 2008 2:23 PM
> To: Beeton, Carolyn (CAR:9D60); [EMAIL PROTECTED]
> Subject: Re: [sipX-dev] Recovering from supervisor crash...
> 
> Woof!
> 
> On Tue, 09 Dec 2008 14:02:42 -0500, Carolyn Beeton 
> <[EMAIL PROTECTED]> wrote:
> 
> > Woof commented this morning on another way of recovering from a 
> > supervisor crash... something about forking a child but not doing
> > anything unless/until the parent dies and then not exec'ing but
> > running as a copy of itself (i.e. a new supervisor)...
> 
> Yep.  Call fork(), have the "child" handler do the "wait for 
> parent to die, see if it should restart" trick.  If it is to 
> restart, then it can promote itself by calling main().  This 
> essentially creates two copies of the process: one of them 
> will continue and do whatever it is supposed to do 
> (sipXsupervisor stuff, in this case), the other becomes the 
> "supervisor in waiting" and will wait for the parent to die, 
> and then take over.
> 
> > I don't think this removes the need
> > to restart the supervisor
> 
> Yes, it does, as the "supervisor in waiting" is already 
> started (as the child of the previous supervisor, with all 
> the rights and privileges thereunto).  All it needs to do to 
> be promoted to current supervisor is call main().
> 
> > and all its original children,
> 
> Yes, the original children need to be killed and be re-born.  
> The only good orphan is a dead orphan.  But the supervisor 
> code already handles that case, right?  So when the 
> "supervisor in waiting" promotes itself when it calls 
> main(), the first thing it does is create a "supervisor in 
> waiting", check if any of the previous children are still 
> around and commit murder on them if so, (you can't have your 
> predecessor's children around to get in the way, you know, 
> even if they are your siblings), and then install his own 
> children in their stead.
> 
> --Woof!
> 

This isn't quite how things work now (the supervisor right now can only
stop/restart its own children, not the previous guy's orphans), but I
think I can make it work.  The "supervisor-in-waiting" won't be
implemented in a process.def, but rather explicitly in the supervisor
code (i.e. it can't share the portLib launch function) - and it won't
have to be enabled, get its config version set, etc, which makes the
package nicer.  I'll have another go at it.
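
For reference, a minimal sketch in C of the fork-and-wait idea as described
above, assuming a POSIX system.  It is not the sipXsupervisor code: the names
parent_is_gone(), wait_for_parent_death(), supervisor_main(), the body
callback and the 200 ms poll interval are all hypothetical.  Since C++ does
not allow a program to call main() itself, the sketch re-enters a separate
entry function instead.  Cleaning up the predecessor's orphans and the
"should it restart" check are left as comments, because how they would be
done is not settled here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* True once `parent` is no longer our parent, i.e. it has died and we
 * have been re-parented (to init or to a subreaper). */
int parent_is_gone(pid_t parent)
{
    return getppid() != parent;
}

/* Poll every `poll_ms` milliseconds until the original parent has died. */
void wait_for_parent_death(pid_t parent, long poll_ms)
{
    struct timespec ts;

    assert(parent > 0 && poll_ms > 0);
    ts.tv_sec = poll_ms / 1000;
    ts.tv_nsec = (poll_ms % 1000) * 1000000L;
    while (!parent_is_gone(parent))
        nanosleep(&ts, NULL);
}

/* Hypothetical entry point.  `body` stands in for the real supervisor
 * work (launching and monitoring the child processes). */
int supervisor_main(int (*body)(void))
{
    pid_t me = getpid();      /* recorded before fork(), so there is no race */
    pid_t standby = fork();

    if (standby < 0) {
        perror("fork");
        return 1;
    }
    if (standby == 0) {
        /* Supervisor-in-waiting: idle until the active supervisor dies. */
        wait_for_parent_death(me, 200);
        /* A "should I restart?" check would go here (assumption). */
        return supervisor_main(body);   /* promotion: no exec, same image */
    }
    /* Active supervisor: deal with the predecessor's orphans (not shown),
     * then do the real work. */
    return body();
}
```

The program's real main() would simply return supervisor_main(...) with the
actual supervisor loop as the callback.  Note that in this sketch a
deliberate shutdown also looks like a dead parent to the standby, so the
active supervisor would have to kill its standby first, or the restart check
would have to tell the two cases apart.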

thanks,
Carolyn