I currently see three options and I'm leaning toward number 1 but welcome
input.

1) Always close the sockets when the number of processes in the group hits
zero and figure out a way to do graceful restarts later.  This solution is
simple and foolproof (a rough sketch follows this list).

2) Create well-defined process group hooks in Supervisor (something like
beforeStart, afterStart, beforeStop, afterStop, beforeRestart, afterRestart,
etc., or a way of registering a callback after an async event completes) so
that process groups can implement custom behavior; there's a sketch of this
idea after the list too.  This would be great but would require a lot of
refactoring.  There currently isn't a single place in the code where this
process group logic lives.  For example, if you stop a process group through
supervisorctl, the RPC interface has its own logic for iterating through the
processes in the group and stopping them.  If you call 'reload' through
supervisorctl, it puts supervisord into the RESTARTING state, which later
triggers a different code path for stopping all the processes in a group.
The scope of this task seems too big for me to take on at this point.

3) Change the default behavior to close sockets when the reference count
hits zero as in #1, but leave a hook in place so that in special cases the
socket does not get closed.  Conceptually, I like this and started down this
path, but the implementation feels wrong.  As described in #2, there are no
hooks for handling process-group-level behavior, which is why I've resorted
to reference counting in the first place.  Adding a function to the
reference counter like "keep_socket_open_next_time_ref_count_hits_zero"
gives off an odor of poor design (see the first sketch below).
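
To make #1 and #3 concrete, here's roughly the kind of reference counting
I'm talking about.  This is just a minimal sketch; every name in it is made
up, not the real supervisor internals:

    import socket

    class RefCountedSocket(object):
        """Sketch: share one listening socket among the processes in a
        group and close it when the last process stops."""

        def __init__(self, host, port, backlog=128):
            self.host = host
            self.port = port
            self.backlog = backlog
            self.sock = None
            self.ref_count = 0
            self.keep_open_once = False  # the smelly hook from #3

        def get_socket(self):
            # The first process in the group creates and binds the socket;
            # later processes just bump the count and share it.
            if self.sock is None:
                self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
                self.sock.bind((self.host, self.port))
                self.sock.listen(self.backlog)
            self.ref_count += 1
            return self.sock

        def release_socket(self):
            self.ref_count -= 1
            if self.ref_count == 0:
                if self.keep_open_once:
                    # #3: skip the close this one time, e.g. during a
                    # graceful restart.  This is the part that smells.
                    self.keep_open_once = False
                else:
                    self.sock.close()
                    self.sock = None

Plain #1 is the same thing with keep_open_once deleted.  Holding on to the
bound socket object is also what avoids the "Address already in use" error
in Marco's traceback below: we never try to re-bind an address that is
still bound.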
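
For #2, the hook interface I'm imagining would look something like this
(again purely hypothetical; as noted above, nothing like it exists in the
code today):

    class ProcessGroupHooks(object):
        """Sketch: a registry of callbacks for process group lifecycle
        events."""

        EVENTS = ('beforeStart', 'afterStart', 'beforeStop', 'afterStop',
                  'beforeRestart', 'afterRestart')

        def __init__(self):
            self.callbacks = dict((event, []) for event in self.EVENTS)

        def register(self, event, callback):
            self.callbacks[event].append(callback)

        def fire(self, event, group):
            for callback in self.callbacks[event]:
                callback(group)

    # An FCGI process group could then keep its socket open across a
    # graceful restart without any reference-counting tricks, e.g.:
    #
    #   hooks.register('beforeRestart', lambda group: group.retain_socket())
    #   hooks.register('afterStop', lambda group: group.close_if_unused())

The hard part is that both the RPC stop path and the RESTARTING reload path
would have to call fire() at the right moments, which is exactly where the
refactoring gets big.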

Cheers,

Roger

On Thu, Sep 9, 2010 at 7:29 AM, Marco Vittorini Orgeas
<[email protected]> wrote:

> On 09/07/2010 11:01 PM, Roger Hoover wrote:
>
>> Hi Marco,
>>
>> This looks like a case that I missed.  I need to look more closely at why
>> it's happening to see if it requires a fundamental change to the way
>> sockets are handled.  The current implementation doesn't close the socket
>> when all the processes are stopped unless a group-level stop command was
>> issued.  The idea here was to support graceful restarts.  If you have a
>> single FCGI process and want to restart it, should the FCGI socket get
>> torn down and recreated?  If so, there will be a small window of time
>> during which web clients get errors while the socket is down.  I may have
>> to change it to always tear down the socket when the number of processes
>> hits zero, which would require you to always run at least two copies if
>> you want graceful restarts.
>>
>> I'll keep you posted on what I find out.
>>
>> Thanks,
>>
>> Roger
>>
>
> Hi Roger!
>
> The idea of recycling sockets, also to allow graceful restarts, is
> valuable.
>
> But at the moment it seems that, as you said, the sockets are not torn
> down.  Just looking at the exception thrown (I've not looked at the code
> yet, sorry):
>
>
> 2010-09-04 10:05:53,366 INFO Creating socket tcp://127.0.0.1:10005
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/supervisor/process.py", line
> 694, in __init__
>     raise ValueError('Could not create FastCGI socket %s: %s' %
>                      (self.socket_manager.config(), e))
> ValueError: Could not create FastCGI socket tcp://127.0.0.1:10005:
> [Errno 98] Address already in use
>
> svd is trying to re-create a socket, binding it to the same tcp
> address:port.  It's stepping on its own toes, because the tcp address is
> already bound!
>
> In order to recycle it (sorry, speaking again without having looked at the
> code), it should not try to create a new socket, or do whatever else it
> does that implies re-creating the socket!
>
> Anyway, I've switched to using local (i.e. unix) sockets and reloading
> works without problems.
>
> Thank you for your support.
>
> --
> Marco Vittorini Orgeas
>
