I currently see three options, and I'm leaning toward number 1, but I welcome input.
1) Always close the sockets when the number of processes in the group hits zero and figure out a way to do graceful restarts later. This solution is simple and foolproof.

2) Create well-defined process group hooks in Supervisor (something like beforeStart, afterStart, beforeStop, afterStop, beforeRestart, afterRestart, etc., or a way of registering a callback after an async event is complete) so that process groups can implement custom behavior. This would be great but would require a lot of refactoring. There currently isn't a single place in the code where this process group logic lives. For example, if you stop a process group through supervisorctl, the RPC interface has its own logic for iterating through the processes in the group and stopping them. If you call 'reload' through supervisorctl, it puts supervisord into the RESTARTING state, which later triggers a different code path for stopping all the processes in a group. The scope of this task seems too big for me to take on at this point.

3) Change the default behavior to close sockets when the reference count hits zero as in #1, but leave a hook in place so that in special cases the socket does not get closed. Conceptually, I like this and started down this path, but the implementation feels wrong. As described in #2, there are no hooks for handling process-group-level behavior, which is why I've resorted to reference counting in the first place. Adding a function to the reference counter like "keep_socket_open_next_time_ref_count_hits_zero" gives off an odor of poor design. (Rough sketches of both the reference-counting and hook ideas are included below, after the quoted thread.)

Cheers,

Roger

On Thu, Sep 9, 2010 at 7:29 AM, Marco Vittorini Orgeas <[email protected]> wrote:

> On 09/07/2010 11:01 PM, Roger Hoover wrote:
>
>> Hi Marco,
>>
>> This looks like a case that I missed. I need to look more closely at why
>> it's happening to see if it requires a fundamental change to the way
>> sockets are handled. The current implementation doesn't close the socket
>> when all the processes are stopped unless a group-level stop command was
>> issued. The idea here was to support graceful restarts. If you have a
>> single FCGI process and want to restart it, should the FCGI socket get
>> torn down and recreated? If so, there will be a small amount of time
>> where web clients are getting errors while the socket is down. I may
>> have to change it to always tear down the socket when the number of
>> processes hits zero, which would require someone to always run at least
>> two copies if they want graceful restart.
>>
>> I'll keep you posted on what I find out.
>>
>> Thanks,
>>
>> Roger
>
> Hi Roger!
>
> The idea of recycling sockets, also to allow graceful restarts, is
> valuable.
>
> But at the moment it seems that, as you said, the sockets are not torn
> down, and just looking at the exception thrown - I haven't looked at the
> code yet, sorry -
>
>> 2010-09-04 10:05:53,366 INFO Creating socket tcp://127.0.0.1:10005
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.6/site-packages/supervisor/process.py", line 694, in __init__
>>     raise ValueError('Could not create FastCGI socket %s: %s' %
>>                      (self.socket_manager.config(), e))
>> ValueError: Could not create FastCGI socket tcp://127.0.0.1:10005: [Errno 98] Address already in use
>
> svd is trying to re-create a socket, binding it to the same tcp
> address:port. It's stepping on its own toes, because the tcp address is
> already bound!
> In order to recycle it - sorry, speaking again without looking at the
> code - it should not try to create a new socket, or do whatever it is
> doing that implies re-creating the socket!
>
> Anyway, I've switched to using local (i.e. unix) sockets and reloading
> occurs without problems.
>
> Thank you for your support.
>
> --
> Marco Vittorini Orgeas
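
For anyone hitting the same "Address already in use" error, Marco's unix-socket workaround corresponds to a config change along these lines in the [fcgi-program:x] section. The program name, socket path, command, and process count here are only examples:

    [fcgi-program:myapp]
    ; was: socket=tcp://127.0.0.1:10005
    socket=unix:///var/run/supervisor/myapp.sock
    command=/usr/bin/myapp.fcgi
    numprocs=2
    process_name=%(program_name)s_%(process_num)02d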
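
For concreteness, here is a minimal sketch of the reference-counting behavior behind options 1 and 3. This is not Supervisor's actual code; the class and method names (ReferenceCountedSocket, get_socket, release_socket, keep_open_next_time) are hypothetical, and option 3's escape hatch is shown as a one-shot flag:

    import socket

    class ReferenceCountedSocket(object):
        """Hypothetical sketch of the reference-counting scheme; not
        Supervisor's real socket manager."""

        def __init__(self, host, port, backlog=128):
            self.host = host
            self.port = port
            self.backlog = backlog
            self.sock = None
            self.ref_count = 0
            # Option 3's escape hatch: skip the close the next time the
            # count hits zero (e.g. during a graceful restart).
            self.keep_open_next_time = False

        def get_socket(self):
            # The first reference creates and binds the listening socket.
            if self.sock is None:
                self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
                self.sock.bind((self.host, self.port))
                self.sock.listen(self.backlog)
            self.ref_count += 1
            return self.sock

        def release_socket(self):
            self.ref_count -= 1
            if self.ref_count > 0:
                return
            if self.keep_open_next_time:
                # One-shot exception for graceful restarts (option 3).
                self.keep_open_next_time = False
                return
            # Default behavior (option 1): tear the socket down as soon as
            # no process in the group holds a reference.
            self.sock.close()
            self.sock = None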
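
And a rough sketch of what the option 2 hooks could look like if process-group lifecycle events were funneled through one registry instead of the separate supervisorctl/RPC and RESTARTING code paths. The hook names come from option 2 above, but the registry, its methods, and the group.socket_manager attribute in the usage example are made up for illustration; Supervisor does not currently expose such an API:

    class ProcessGroupHooks(object):
        """Hypothetical hook registry for process-group lifecycle events."""

        EVENTS = ('beforeStart', 'afterStart', 'beforeStop', 'afterStop',
                  'beforeRestart', 'afterRestart')

        def __init__(self):
            self.callbacks = dict((event, []) for event in self.EVENTS)

        def register(self, event, callback):
            # A process group implementation (e.g. an FCGI group) registers
            # custom behavior here instead of scattering it across the RPC
            # interface and the RESTARTING code path.
            self.callbacks[event].append(callback)

        def fire(self, event, group):
            for callback in self.callbacks[event]:
                callback(group)

    # Usage sketch: an FCGI group registers a callback that releases its
    # shared socket only after the whole group has stopped.
    hooks = ProcessGroupHooks()
    hooks.register('afterStop',
                   lambda group: group.socket_manager.release_socket())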
_______________________________________________
Supervisor-users mailing list
[email protected]
http://lists.supervisord.org/mailman/listinfo/supervisor-users
