On Mon, 2010-06-07 at 10:00 +0100, Phillip Oldham wrote:
> We noticed an odd error over the weekend, and would like some advice.
> 
> One of our "services", a Python thrift[1] server which binds to a port, 
> had an error and stopped responding to requests. Supervisord "saw" this 
> and tried to bring up another instance.

I think you might mean that superlance httpok saw this and tried to
bring up another instance?  "Raw" supervisor doesn't monitor process
behavior, only process up/down status.

>  However the original instance 
> hadn't actually exited, so was still running and was still bound to the 
> port. Over the weekend supervisord brought up a number of instances of 
> the service, so in total we found ~30 running instances none of which 
> were responding correctly.
> 
> We are about to script a plug-in for supervisord to "ping" the service 
> to monitor the connection. How would we then kill/restart the service if 
> it doesn't respond as expected?

I think you probably need to answer the above question and maybe provide
your current config so we can figure out what's going on before any
other advice can be given.
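
In the meantime, here is a rough sketch of what a "ping" plug-in could look
like, written as a supervisor event listener that checks the service on each
TICK event and restarts it over XML-RPC when the check fails. Everything
named in it is an assumption for illustration, not taken from your setup:
the host, port, process name, the RPC URL (which presumes an
[inet_http_server] section is enabled), and the helper names. The check is
only a TCP connect; a hung process can still accept connections, so a real
check would probably need to make an actual Thrift call.

```python
import socket
import sys
import xmlrpc.client

# Hypothetical values for illustration; none of these come from the thread.
SERVICE_HOST = "127.0.0.1"
SERVICE_PORT = 9090
PROCESS_NAME = "thrift_service"
SUPERVISOR_RPC_URL = "http://127.0.0.1:9001/RPC2"


def port_is_responding(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def parse_tokens(line):
    """Parse a 'key:value key:value' event header line into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def restart(rpc, name):
    """Stop, then start, a process through supervisor's XML-RPC interface."""
    try:
        rpc.supervisor.stopProcess(name)
    except xmlrpc.client.Fault:
        pass  # for example, the process was not running
    rpc.supervisor.startProcess(name)


def run_listener(stdin, stdout, check, on_failure):
    """Speak the event listener protocol until stdin is closed."""
    while True:
        stdout.write("READY\n")  # tell supervisor we can take an event
        stdout.flush()
        line = stdin.readline()
        if not line:
            return  # supervisor closed our stdin
        headers = parse_tokens(line)
        stdin.read(int(headers["len"]))  # consume the event payload
        if headers["eventname"].startswith("TICK") and not check():
            on_failure()
        stdout.write("RESULT 2\nOK")  # acknowledge the event
        stdout.flush()


def main():
    rpc = xmlrpc.client.ServerProxy(SUPERVISOR_RPC_URL)
    run_listener(
        sys.stdin,
        sys.stdout,
        check=lambda: port_is_responding(SERVICE_HOST, SERVICE_PORT),
        on_failure=lambda: restart(rpc, PROCESS_NAME),
    )
```

The script would end with a call to main() and be registered in an
[eventlistener:x] section subscribed to events=TICK_60. Because the restart
goes through supervisor itself, the old process is stopped before a new one
is started, rather than a second copy being launched alongside it.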

- C


