I'm not sure if this is relevant to the issue, but I figured I'd throw
it out there in case it is.

I added a new server to our production environment running a build of
mongrel2 with my fix for filters getting the CLOSE transition.  My
filter increments a counter on the HANDLER event and decrements it on
the CLOSE event, then sends that count to statsd.  Looking at my stats,
I can see that CLOSE is happening much more frequently than HANDLER, so
it seems the same connection is getting closed multiple times.
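
To make the bookkeeping concrete, here is a minimal sketch of that
counting logic in Python. It is not the actual filter code and does not
use the mongrel2 filter API: the TransitionCounter class, its method
names, and the integer connection ids are all hypothetical, chosen only
to show how a per-connection set can separate matched CLOSE events from
unmatched ones.

```python
from collections import Counter


class TransitionCounter:
    """Hypothetical stand-in for the filter's bookkeeping (not mongrel2 API)."""

    def __init__(self):
        self.open_count = 0        # the value that would be reported to statsd
        self.events = Counter()    # raw totals per event name
        self.live = set()          # connection ids that saw HANDLER but no CLOSE yet
        self.unmatched_closes = 0  # CLOSE events with no live HANDLER behind them

    def on_event(self, event, conn_id):
        self.events[event] += 1
        if event == "HANDLER":
            self.open_count += 1
            self.live.add(conn_id)
        elif event == "CLOSE":
            self.open_count -= 1
            if conn_id in self.live:
                self.live.remove(conn_id)
            else:
                # Either a repeated CLOSE for this connection, or a
                # connection that never reached HANDLER at all.
                self.unmatched_closes += 1


counter = TransitionCounter()
for event, conn_id in [("HANDLER", 1), ("CLOSE", 1), ("CLOSE", 1)]:
    counter.on_event(event, conn_id)

print(counter.open_count, counter.unmatched_closes)  # prints: -1 1
```

With only the aggregate count, a negative value says that CLOSE
outnumbers HANDLER but not why. Tracking ids per connection, as
sketched here, would at least show whether the extra CLOSE events land
on connections that were already closed.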

-Rob

On 6/22/12 1:40 PM, Rob LaRubbio wrote:
> Thanks for looking into this.
>
> We aren't using websockets or proxies, just mongrel2 and Tir.  We have
> 4 mongrel2 servers behind a load balancer; each has 300 handlers.  The
> handlers are not shared across servers (I have a pull request into Tir
> to make it easier to run Tir on a server other than mongrel2).
>
> ==== mongrel2.conf =====
> houston = Filter(
>   name="/opt/mongrel2-1.8-dev/lib/mongrel2/filters/houston.so",
>   settings = {
>     <removed>
>   }
> )
>
> apollo = Handler(send_spec='tcp://127.0.0.1:9999',
>                 send_ident='38f857b8-cbaa-4b58-9271-0d36c27813c4',
>                 recv_spec='tcp://127.0.0.1:9998', recv_ident='',
>                 protocol='tnetstring')
>
> static = Dir(base='static/',
>              index_file='index.html',
>              default_ctype='text/plain')
>
> main = Server(
>     uuid="505417b8-1de4-454f-98b6-07eb9225cca1",
>     access_log="/logs/access.log",
>     error_log="/logs/error.log",
>     chroot="/opt/mongrel2-1.8-dev",
>     default_host="(.+)",
>     name="main",
>     pid_file="/run/mongrel2.pid",
>     port=6767,
>     hosts = [
>         Host(name="(.+)",
>         routes={ '/(.*/.*)': apollo,
>                  '/([^/]*)$': static })
>     ],
>     filters = [
>         houston
>     ]
>   )
>
> settings = {
>     "limits.content_length": 20480000
> }
>
> On 6/22/12 1:11 PM, Tordek wrote:
>> On 22/06/12 13:12, Rob LaRubbio wrote:
>>> Is the dev branch ready for a release? We're running it in production
>>> and at least three times a week it starts spinning and writing this
>>> to the logs in an endless loop:
>>>
>>> Fri, 22 Jun 2012 16:04:50 GMT [ERROR] (src/task/fd.c:217: errno:
>>> None) Attempt to wait on a dead socket/fd: (nil) or -1
>>>
>>> The server fills up a 500G disk in about 11 hours and we need to
>>> kill the server to get it handling requests again.
>> Jason and I are looking into this; could you show us your
>> mongrel2.conf? Are you using websockets or proxies?
>>
>>> -Rob
>
>

