Hello,

On Thu, Jun 3, 2010 at 7:37 PM, Eric Wong <normalper...@yhbt.net> wrote:
>
> Hi,
>
> HTML attachments are wasteful and thus rejected from the mailing list.
> On the other hand, it actually helps to include the patch itself
> (inline) so it's readable without a (human) context switch :)

Indeed, sorry for the HTML attachment, I have no idea where it came from.

As for the patch, here you are. It is really just a way to handle SIGABRT
specifically in the worker and allow proper termination of the application.

Note the FIXME comment I added in the murder_lazy_workers method. If any
worker blocks while all the others are idle for a _timeout_ period of
time, they will all be killed anyway. The consequence is that Unicorn
will restart all its workers if traffic is very low on the server.

diff --git a/lib/unicorn.rb b/lib/unicorn.rb
index a363014..855f26a 100644
--- a/lib/unicorn.rb
+++ b/lib/unicorn.rb
@@ -84,7 +84,7 @@ module Unicorn
   # Listener sockets are started in the master process and shared with
   # forked worker children.
 
-  class HttpServer < Struct.new(:app, :timeout, :worker_processes,
+  class HttpServer < Struct.new(:app, :soft_timeout, :timeout, :worker_processes,
                                 :before_fork, :after_fork, :before_exec,
                                 :logger, :pid, :listener_opts, :preload_app,
                                 :reexec_pid, :orig_app, :init_listeners,
@@ -393,7 +393,7 @@ module Unicorn
           when nil
             # avoid murdering workers after our master process (or the
             # machine) comes out of suspend/hibernation
-            if (last_check + timeout) >= (last_check = Time.now)
+            if (last_check + soft_timeout) >= (last_check = Time.now)
               murder_lazy_workers
             else
               # wait for workers to wakeup on suspend
@@ -581,10 +581,20 @@ module Unicorn
         stat = worker.tmp.stat
         # skip workers that disable fchmod or have never fchmod-ed
         stat.mode == 0100600 and next
-        (diff = (Time.now - stat.ctime)) <= timeout and next
-        logger.error "worker=#{worker.nr} PID:#{wpid} timeout " \
+        # FIXME: if the worker has not been working for soft_timeout, it will be
+        # killed even if it is not blocking
+        (diff = (Time.now - stat.ctime)) <= soft_timeout and
+          diff <= timeout and next
+        # lazy since less than timeout, attempt soft kill
+        if diff < timeout
+          logger.error "worker=#{worker.nr} PID:#{wpid} soft timeout " \
+                       "(#{diff}s > #{soft_timeout}s), killing softly"
+          kill_worker(:ABRT, wpid)
+        else
+          logger.error "worker=#{worker.nr} PID:#{wpid} hard timeout " \
                      "(#{diff}s > #{timeout}s), killing"
-        kill_worker(:KILL, wpid) # take no prisoners for timeout violations
+          kill_worker(:KILL, wpid) # take no prisoners for timeout violations
+        end
       end
     end
 
@@ -657,6 +667,12 @@ module Unicorn
       proc_name "worker[#{worker.nr}]"
       START_CTX.clear
       init_self_pipe!
+
+      # try to handle SIGABRT correctly
+      trap('ABRT') do
+        raise SignalException, "SIGABRT"
+      end
+
       WORKERS.values.each { |other| other.tmp.close rescue nil }
       WORKERS.clear
       LISTENERS.each { |sock| sock.fcntl(Fcntl::F_SETFD, Fcntl::FD_CLOEXEC) }
diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 64a25e3..6efb0c5 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -14,6 +14,8 @@ module Unicorn
 
     # Default settings for Unicorn
     DEFAULTS = {
+      # Backward compatibility soft timeout (disabled in default configuration)
+      :soft_timeout => 60,
       :timeout => 60,
       :logger => Logger.new($stderr),
       :worker_processes => 1,
@@ -129,6 +131,23 @@ module Unicorn
 
     # sets the timeout of worker processes to +seconds+. Workers
     # handling the request/app.call/response cycle taking longer than
+    # this time period will be softly killed (via SIGABRT). This
+    # timeout is enforced by the master process itself and not subject
+    # to the scheduling limitations by the worker process. Due the
+    # low-complexity, low-overhead implementation, timeouts of less
+    # than 3.0 seconds can be considered inaccurate and unsafe.
+    # ABORT is handled by the worker and raise an exception, offering a
+    # way to log the stack trace in your rails application.
+
+    def soft_timeout(seconds)
+      Numeric === seconds or raise ArgumentError,
+                                  "not numeric: timeout=#{seconds.inspect}"
+      seconds >= 3 or raise ArgumentError,
+                            "too low: timeout=#{seconds.inspect}"
+      set[:soft_timeout] = seconds
+    end
+    # sets the timeout of worker processes to +seconds+. Workers
+    # handling the request/app.call/response cycle taking longer than
     # this time period will be forcibly killed (via SIGKILL). This
     # timeout is enforced by the master process itself and not subject
     # to the scheduling limitations by the worker process. Due the
@@ -159,6 +178,7 @@ module Unicorn
       set[:timeout] = seconds
     end
 
+
     # sets the current number of worker_processes to +nr+. Each worker
     # process will serve exactly one client at a time. You can
     # increment or decrement this value at runtime by sending SIGTTIN

Cheers,
--
Pierre Baillet <o...@fotonauts.com>
http://www.fotopedia.com/
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying
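
For readers who want to try the idea outside of Unicorn, the patch's two moving parts can be sketched in plain Ruby: the master-side choice between SIGABRT and SIGKILL, and the worker-side trap that turns SIGABRT into an exception. This is only a rough sketch, not code from the patch: `timeout_action` and `with_soft_abort` are hypothetical helper names, and the process signals itself as a stand-in for the master calling `kill_worker(:ABRT, wpid)`.

```ruby
# Hypothetical helper (not part of the patch): mirrors the conditions added
# to murder_lazy_workers. +diff+ is the age in seconds of the worker's
# heartbeat file. Returns nil when the worker is left alone.
def timeout_action(diff, soft_timeout, timeout)
  return nil if diff <= soft_timeout && diff <= timeout
  diff < timeout ? :ABRT : :KILL
end

# Hypothetical wrapper: installs the same ABRT trap the patch adds to the
# worker, runs the block, and returns the SignalException if one was raised.
def with_soft_abort
  previous = trap("ABRT") { raise SignalException, "SIGABRT" }
  yield
  nil
rescue SignalException => e
  e
ensure
  # restore whatever handler was installed before
  trap("ABRT", previous || "DEFAULT")
end

caught = with_soft_abort do
  # stand-in for the master's kill_worker(:ABRT, wpid)
  Process.kill(:ABRT, Process.pid)
  # stand-in for a request that is stuck
  sleep 5
end

# An application could log caught.backtrace here before shutting down.
warn "soft timeout: #{caught.message}" if caught
```

With both settings at 60, as in the patched DEFAULTS, `timeout_action` never returns `:ABRT`, which matches the "disabled in default configuration" comment. Because the exception is raised from the trap block, its backtrace points at wherever the worker was when the signal arrived, which is what makes the stack trace worth logging.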