Bharanee Rathna wrote:
> I'm encountering a weird error where the unicorn workers are stuck in
> a loop after hitting a 500 on the backend sinatra app.
Also, what extensions are you using in your app?
> strace at the point where it starts to go into a loop of death
> select(7, [4 5], NULL, [3 6], {30, 0}) = 1 (in [5], left {27, 274382})
> fchmod(8, 01) = 0
> fcntl(5, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
> accept4(5, {sa_family=AF_INET, sin_port=htons(56728),
> sin_addr=inet_addr("10.1.1.4")}, [16], SOCK_CLOEXEC) = 12
> recvfrom(12, 0x1c99fb0, 16384, 64, 0, 0) = -1 EAGAIN (Resource
> temporarily unavailable)
(I'm somewhat more awake now; I haven't been sleeping much.)
Two things look off in the line above:
1) recvfrom() isn't using the MSG_DONTWAIT flag. I know you're using
Linux, so kgio should be using MSG_DONTWAIT to do non-blocking
recv... Which versions of unicorn/kgio are you using?
2) TCP_DEFER_ACCEPT should prevent recvfrom() from hitting EAGAIN
in the common case under Linux.
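
For reference, the EAGAIN on its own is easy to reproduce outside of
unicorn. Below is a minimal sketch in plain Ruby, not how kgio does it
internally; read_right_after_accept is a made-up helper name, and the
listener deliberately does not set TCP_DEFER_ACCEPT. It accepts a
connection whose client has not written anything yet and then attempts
a non-blocking read:

```ruby
require "socket"

# Hypothetical helper (not part of unicorn or kgio): accept a client
# that has connected but sent no data, then try a non-blocking read.
def read_right_after_accept
  server = TCPServer.new("127.0.0.1", 0)   # port 0: kernel picks a free port
  client = TCPSocket.new("127.0.0.1", server.addr[1])
  conn = server.accept                     # handshake done, nothing queued yet
  begin
    conn.recv_nonblock(16384)              # same buffer size as in the strace
    :data
  rescue IO::WaitReadable                  # covers Errno::EAGAIN / EWOULDBLOCK
    :eagain
  ensure
    [conn, client, server].each(&:close)
  end
end

p read_right_after_accept
```

This should print :eagain as long as the client stays silent. With
TCP_DEFER_ACCEPT set on the listener, accept() would normally not
return until the client has sent something, which is why seeing
EAGAIN right after accept4() in the trace is unexpected.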
> select(13, [12], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
> --- SIGINT (Interrupt) @ 0 (0) ---
> rt_sigreturn(0x2) = -1 EINTR (Interrupted system call)
What triggered SIGINT?
> sched_yield() = 0
> sched_yield() = 0
> sched_yield() = 0
> sched_yield() = 0
> sched_yield() = 0
>
> Longer strace outputs can be found over at
> https://gist.github.com/fe4e3172994e5de21317
Actually, after many lines of sched_yield() in your gist, I can see
the process does exit. Did you kill it with SIGINT? If so, I see
nothing wrong...
Ruby 1.9 seems to sched_yield a lot during shutdown, but it does
eventually finish.
--
Eric Wong