Re: Strange quit behavior

Eric Wong Wed, 17 Aug 2011 13:16:53 -0700

Eric Wong <[email protected]> wrote:
> Below is a proposed patch (to unicorn.git) which may help debug issues
> in the signal -> handler master path (but only once it enters the Ruby
> runtime).  I'm hesitant to commit it since it is worthless if the Ruby
> process is stuck because of some bad C extension.  That's the most
> common cause of stuck/unresponsive processes I've seen.


I think that was a bad patch.  Adding signal handler debugging at the
Ruby layer leads to the false assumption that the interpreter/VM is in a
good state.  If you need to debug signal handlers, something is already
broken and tracing syscalls is the most reliable way to go.


Ruby (and any other high-level language) signal handling is not
straightforward[1].

Here's how things work in Matz Ruby 1.9.x[2]:

  you                 C timer thread             Ruby Thread(s)
  -------------------------------------------------------------------
                      traps signals              ignores most signals
                      sleeps                     runs Ruby...
  kill -USR2 ...
                      receives signal (async)
                      runs (system) sighandler[1]
                      wakes up from sleep
                      signals Ruby Thread(s)
                                                 *hopefully wakes up*
                                                 runs Ruby sighandler


The "*hopefully wakes up*" part is the part most likely to fail
as a result of a bad C extension or Ruby bug.
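
A rough way to observe the last two steps of the diagram from Ruby
itself, assuming Matz Ruby (other implementations may dispatch
handlers differently).  The helper name handler_thread_for is made up
for this illustration.  It sends a signal to the current process and
reports which thread ran the Ruby-level handler:

```ruby
# Sketch only: shows that the Ruby-level handler runs later, in a
# Ruby thread, and not inside the system-level sighandler.
# handler_thread_for is a hypothetical helper, not part of unicorn.
def handler_thread_for(sig = "USR1", wait = 2.0)
  ran_in = nil
  # The block is the Ruby sighandler; it only records its thread.
  old = trap(sig) { ran_in = Thread.current }
  Process.kill(sig, Process.pid)
  # Give the VM a chance to wake up and dispatch the handler.
  deadline = Time.now + wait
  sleep 0.01 while ran_in.nil? && Time.now < deadline
  ran_in
ensure
  # Restore whatever handler was installed before.
  trap(sig, old) if old
end
```

On Matz Ruby this should return Thread.main.  If the main thread is
stuck inside a C extension that never returns to the VM, the handler
never runs and the helper returns nil after the wait expires.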


PS. In Ruby 1.9.3, the timer thread uses the "self-pipe" sighandler
    implementation that the unicorn master process always used.
    This allows Ruby 1.9.3 to conserve power on idle processes.
    In 1.9.2, the timer thread signal handler just polls in
    10ms intervals to check if any signals were received.
    This is why "strace -f" is noisy and I recommend "-e '!futex'"
    for 1.9.2.

PPS. Unicorn still uses the "self-pipe" signal handler in Ruby-land
     because Ruby signal handlers are reentrant and so must execute
     reentrant-safe code.  Without the self-pipe to serialize signal
     handler dispatch, Ruby signal handler execution can nest and
     overlap with itself.  This means if USR2 is sent multiple times
     in quick succession, you could spawn multiple new unicorn
     masters.
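
A minimal sketch of the self-pipe idea in Ruby.  This is not
unicorn's actual code; the names SIG_QUEUE, SELF_RD, SELF_WR,
install_handlers and drain_signals are invented for the example.
The trap block only queues the signal and writes a byte to the pipe,
and the real work happens one signal at a time in the main loop:

```ruby
# Sketch only, not unicorn's implementation.
SIG_QUEUE = []                 # signals waiting to be processed
SELF_RD, SELF_WR = IO.pipe     # the "self-pipe"

# Keep the Ruby sighandler tiny: remember the signal, then poke the
# pipe so a main loop sleeping in IO.select wakes up.
def install_handlers(signals)
  signals.each do |sig|
    trap(sig) do
      SIG_QUEUE << sig
      begin
        SELF_WR.write_nonblock(".")
      rescue IO::WaitWritable, Errno::EINTR
        # Pipe is full or we were interrupted; a wakeup is already
        # pending, so there is nothing more to do here.
      end
    end
  end
end

# Called from the main loop.  Waits for a wakeup, empties the pipe,
# then returns the queued signals in order so the caller can act on
# them serially (handlers for them never nest).
def drain_signals(timeout = 1)
  if IO.select([SELF_RD], nil, nil, timeout)
    begin
      SELF_RD.read_nonblock(16384)
    rescue IO::WaitReadable, EOFError
      # Nothing left to read.
    end
  end
  handled = []
  while (sig = SIG_QUEUE.shift)
    handled << sig
  end
  handled
end
```

A master loop would call install_handlers(%w[USR2 QUIT]) once and
then act on each entry returned by drain_signals.  Because the
expensive action (for example, spawning a new master) runs in the
loop and not in the trap block, the loop can notice that a USR2 is
already being processed and skip the repeats.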


[1] - See "man 7 signal" in Linux manpages or POSIX specs for the
      small list of safe functions that may be called in system-level
      sighandlers.  Ruby-level signal handlers can't run inside
      system-level signal handlers for this reason.

[2] - I think any high-level language that implements signal handlers
      AND native threads must do something similar.  The only valid
      variation I can think of is to execute the high-level language
      code inside the timer thread, but that requires the coders of
      the high-level language to have thread-safety (not just
      reentrancy) in mind when writing signal handlers even if the
      rest of their code uses no threads.

-- 
Eric Wong
_______________________________________________
Unicorn mailing list - [email protected]
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying
