Having recently migrated our Rails app to MRI 2.2.0 (which may or may not be
related), we’re experiencing problems with our Unicorn zero-downtime restarts.
When I send USR2 to the master process (PID 19216 in this example), I get the
following in the Unicorn log:
I, [2015-03-11T23:47:33.992274 #6848] INFO -- : executing
["/srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn",
"/srv/ourapp/current/config.ru", "-Dc", "/srv/ourapp/shared/config/unicorn.rb",
{10=>#<Kgio::UNIXServer:/srv/ourapp/shared/sockets/unicorn.sock>}] (in
/srv/ourapp/releases/a0e8b5df474ad5129200654f92a76af00a750f47)
I, [2015-03-11T23:47:36.504235 #6848] INFO -- : inherited
addr=/srv/ourapp/shared/sockets/unicorn.sock fd=10
/srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:206:in
`pid=': Already running on PID:19216 (or
pid=/srv/ourapp/shared/pids/unicorn.pid is stale) (ArgumentError)
from
/srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:134:in
`start'
from
/srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/bin/unicorn:126:in
`<top (required)>'
from /srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn:23:in `load'
from /srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn:23:in `<main>'
E, [2015-03-11T23:47:36.519549 #19216] ERROR -- : reaped #<Process::Status: pid
6848 exit 1> exec()-ed
E, [2015-03-11T23:47:36.520296 #19216] ERROR -- : master loop error: Already
running on PID:19216 (or pid=/srv/ourapp/shared/pids/unicorn.pid is stale)
(ArgumentError)
E, [2015-03-11T23:47:36.520496 #19216] ERROR -- :
/srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:206:in
`pid='
E, [2015-03-11T23:47:36.520650 #19216] ERROR -- :
/srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:404:in
`reap_all_workers'
E, [2015-03-11T23:47:36.520790 #19216] ERROR -- :
/srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:279:in
`join'
E, [2015-03-11T23:47:36.520928 #19216] ERROR -- :
/srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/bin/unicorn:126:in
`<top (required)>'
E, [2015-03-11T23:47:36.521115 #19216] ERROR -- :
/srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn:23:in `load'
E, [2015-03-11T23:47:36.521254 #19216] ERROR -- :
/srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn:23:in `<main>’
And when I check, indeed, there is now unicorn.pid and unicorn.pid.oldbin, both
containing 19216.
What could cause this situation to arise?
Here’s my unicorn.rb FWIW:
# Set your full path to application.
app_path = "/srv/ourapp/current"
# Set unicorn options
worker_processes 3
preload_app true
timeout 30
listen "/srv/ourapp/shared/sockets/unicorn.sock", :backlog => 64
# Spawn unicorn master worker for user deploy (group: deploy)
user 'deploy', 'deploy'
# Fill path to your app
working_directory app_path
# Should be 'production' by default, otherwise use other env
rails_env = ENV['RAILS_ENV'] || 'production'
# Log everything to one file
stderr_path "/srv/ourapp/shared/log/unicorn.log"
stdout_path "/srv/ourapp/shared/log/unicorn.log"
# Set master PID location
pid "/srv/ourapp/shared/pids/unicorn.pid"
before_exec do |server|
ENV["BUNDLE_GEMFILE"] = "#{app_path}/Gemfile"
end
before_fork do |server, worker|
ActiveRecord::Base.connection.disconnect!
sleep 10
old_pid = "#{server.config[:pid]}.oldbin"
if File.exists?(old_pid) && server.pid != old_pid
begin
Process.kill("QUIT", File.read(old_pid).to_i)
rescue Errno::ENOENT, Errno::ESRCH
# someone else did our job for us
end
end
end
after_fork do |server, worker|
ActiveRecord::Base.establish_connection
Sidekiq.configure_client do |config|
config.redis = { namespace: 'sidekiq' }
end
end
--
Kevin Yank
Chief Technology Officer, Avalanche Technology Group
http://avalanche.com.au/
ph: +61 4 2241 0083