Hi,
On our current servers, I keep running into problems if a particular
channel takes longer to complete a command than its siblings. For
example, say I have 2 servers and run a command on them simultaneously
via capistrano. If one server finishes early, then after a short period
of inactivity (around 40 seconds?) I get:
/Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/net-ssh-2.1.4/lib/net/ssh/transport/session.rb:174:in `poll_message': disconnected: Timeout, your session not responding. (2) (Net::SSH::Disconnect)
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/net-ssh-2.1.4/lib/net/ssh/transport/session.rb:164:in `loop'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/net-ssh-2.1.4/lib/net/ssh/transport/session.rb:164:in `poll_message'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:451:in `dispatch_incoming_packets'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:213:in `preprocess'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/capistrano-2.6.0/lib/capistrano/processable.rb:17:in `process_iteration'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/capistrano-2.6.0/lib/capistrano/processable.rb:43:in `ensure_each_session'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/capistrano-2.6.0/lib/capistrano/processable.rb:41:in `each'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/capistrano-2.6.0/lib/capistrano/processable.rb:41:in `ensure_each_session'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/capistrano-2.6.0/lib/capistrano/processable.rb:17:in `process_iteration'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/capistrano-2.6.0/lib/capistrano/command.rb:165:in `process!'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/capistrano-2.6.0/lib/capistrano/command.rb:164:in `loop'
    from /Users/jon/.rvm/gems/ree-1.8.7-2011.03/gems/capistrano-2.6.0/lib/capistrano/command.rb:164:in `process!'
However, the ssh connection will stay alive quite happily if the
command takes a more-or-less equal time on both servers. That is, if
one server has a file 'foo', but the other doesn't, then running
cap invoke COMMAND="test -f foo && sleep 60 || echo 'closing'"
will die with the above error, as the server without 'foo' will finish
its command early.
However, an indiscriminate sleep (cap invoke COMMAND="sleep 60"),
where all servers are stuck for the same amount of time, will
successfully complete.
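To make the imbalance concrete, here is a rough local sketch of what the
one-liner does on each kind of server. It does not go through capistrano
or ssh at all; it only runs the same shell command in two temporary
directories, one containing 'foo' and one without. The names run_probe
and simulate_two_servers are made up for this illustration, and the sleep
is shortened so it finishes quickly.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: run the one-liner from this post inside `dir`
# and return [stdout, elapsed_seconds]. No ssh is involved.
def run_probe(dir, sleep_seconds)
  started = Time.now
  output = Dir.chdir(dir) do
    `test -f foo && sleep #{sleep_seconds} || echo 'closing'`
  end
  [output.strip, Time.now - started]
end

# Hypothetical helper: stand in for the two servers with two temp
# directories, only one of which contains the file 'foo'.
def simulate_two_servers(sleep_seconds = 1)
  Dir.mktmpdir do |with_foo|
    Dir.mktmpdir do |without_foo|
      FileUtils.touch(File.join(with_foo, "foo"))
      {
        :with_foo    => run_probe(with_foo, sleep_seconds),
        :without_foo => run_probe(without_foo, sleep_seconds)
      }
    end
  end
end

if __FILE__ == $PROGRAM_NAME
  simulate_two_servers.each do |name, (output, elapsed)|
    puts format("%-12s output=%-10s elapsed=%.2fs", name, output.inspect, elapsed)
  end
end
```

The directory without 'foo' returns almost immediately with 'closing',
while the other one blocks for the whole sleep. That gap is the period
during which one of the real channels sits idle.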
I've seen a few mentions of this error before, but not in connection
to this 'imbalance' of execution time between different ssh channels.
I've tried playing with ClientAliveInterval and ClientAliveCountMax
without much success. I can't figure out why it dies after 40 seconds
either; it doesn't seem related to any of the settings I've come
across. Any other suggestions for what I might try?
-Jonathan