Hi Rob,

I've not been able to reproduce this (though I found and fixed an
unrelated bug in the process of trying).  Any chance you could use
gdb to attach to mongrel2 the next time this happens, set a breakpoint
on the line that logs the error message, and print a backtrace?

If you're not familiar with gdb from the shell:

  sudo gdb -p <mongrel2 pid>
That will attach gdb to the running mongrel2 process and pause it
  break src/task/fd.c:217
This will set the breakpoint on the line that logs the error
  continue
This resumes mongrel2; the breakpoint ought to be hit immediately
  backtrace
Once gdb stops at the breakpoint, this will print the backtrace


-Jason

On 20:36 Fri 22 Jun, Rob LaRubbio wrote:
> I'm not sure if this is relevant to the issue, but I figured I'd throw
> it out there in case it is.
> 
> I added a new server to our production environment running a build of
> mongrel2 with my fix for filters getting the CLOSE transition.  My
> filter increments a counter on the HANDLER event and decrements it on
> the CLOSE event.  I then send that count to statsd.  Looking at my
> stats I can see that CLOSE is happening much more frequently than
> HANDLER, so it seems the same connection is getting closed multiple
> times.
> 
> -Rob
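The accounting Rob describes can be sketched outside mongrel2.  This is
a minimal illustration only: the event names and the ConnectionGauge
class are hypothetical stand-ins, not the actual mongrel2 filter API
(filters are C shared objects configured as shown further down).

```python
# Sketch of the HANDLER/CLOSE accounting described above.
# "HANDLER" and "CLOSE" are hypothetical event labels, not mongrel2 API.

from collections import Counter

class ConnectionGauge:
    """Tracks open connections; if the gauge goes negative, CLOSE has
    fired more often than HANDLER, i.e. some connection was closed
    more than once."""

    def __init__(self):
        self.open = 0
        self.events = Counter()

    def on_event(self, event):
        self.events[event] += 1
        if event == "HANDLER":
            self.open += 1
        elif event == "CLOSE":
            self.open -= 1

    def excess_closes(self):
        # If every CLOSE matched exactly one HANDLER, this stays at 0.
        return max(0, self.events["CLOSE"] - self.events["HANDLER"])

gauge = ConnectionGauge()
for ev in ["HANDLER", "CLOSE", "CLOSE", "CLOSE"]:
    gauge.on_event(ev)

print(gauge.open)            # -2
print(gauge.excess_closes()) # 2
```

A statsd gauge that trends negative like this is consistent with the
double-close symptom Rob reports.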
> 
> On 6/22/12 1:40 PM, Rob LaRubbio wrote:
> > Thanks for looking into this.
> >
> > We aren't using websockets or proxies, just mongrel2 and Tir.  We
> > have 4 mongrel2 servers behind a load balancer; each has 300
> > handlers.  The handlers are not shared across servers (I have a pull
> > request into Tir to make it easier to run Tir on a server other than
> > the mongrel2 server).
> >
> > ==== mongrel2.conf =====
> > houston = Filter(
> >   name="/opt/mongrel2-1.8-dev/lib/mongrel2/filters/houston.so",
> >   settings = {
> >     <removed>
> >   }
> > )
> >
> > apollo = Handler(send_spec='tcp://127.0.0.1:9999',
> >                 send_ident='38f857b8-cbaa-4b58-9271-0d36c27813c4',
> >                 recv_spec='tcp://127.0.0.1:9998', recv_ident='',
> >                 protocol='tnetstring')
> >
> > static = Dir(base='static/',
> >              index_file='index.html',
> >              default_ctype='text/plain')
> >
> > main = Server(
> >     uuid="505417b8-1de4-454f-98b6-07eb9225cca1",
> >     access_log="/logs/access.log",
> >     error_log="/logs/error.log",
> >     chroot="/opt/mongrel2-1.8-dev",
> >     default_host="(.+)",
> >     name="main",
> >     pid_file="/run/mongrel2.pid",
> >     port=6767,
> >     hosts = [
> >         Host(name="(.+)",
> >         routes={ '/(.*/.*)': apollo,
> >                  '/([^/]*)$': static })
> >     ],
> >     filters = [
> >         houston
> >     ]
> >   )
> >
> > settings = {
> >     "limits.content_length": 20480000
> > }
> >
> > On 6/22/12 1:11 PM, Tordek wrote:
> >> On 22/06/12 13:12, Rob LaRubbio wrote:
> >>> Is the dev branch ready for a release? We're running it in
> >>> production, and at least three times a week it starts spinning and
> >>> writing this to the logs in an endless loop:
> >>>
> >>> Fri, 22 Jun 2012 16:04:50 GMT [ERROR] (src/task/fd.c:217: errno:
> >>> None) Attempt to wait on a dead socket/fd: (nil) or -1
> >>>
> >>> The server fills up a 500G disk in about 11 hours and we need to
> >>> kill the server to get it handling requests again.
> >> Jason and I are looking into this; could you show us your
> >> mongrel2.conf? Are you using websockets or proxies?
> >>
> >>> -Rob
> >
> >
