We are running Jenkins 1.478. The master node is running on Windows 2003 (xp). It has 3 slaves: 2 other Windows machines and 1 Mac. The Mac was working fine, but when I attempted to upgrade the OS (from Snow Leopard to Lion) the upgrade failed due to disk errors. I've since rebuilt the machine from scratch, so all of the hardware is the same but all of the software and configuration is brand new (Mountain Lion).

Something appears to be causing one of our slave nodes (the one on Mac OS X) to take longer and longer to respond. Its response time is currently ~1000ms and has gotten as high as 3000ms.

I have added two things to the slave's launch JVM options to help diagnose and resolve the problem:

1) -Dcom.sun.management.jmxremote (so I can monitor the performance of the slave process via jconsole)
2) -Xmx2048m (to use 2GB of the 3GB of physical memory available on the machine)

The timeouts have apparently caused jobs to fail with errors about the channel closing:

Started by upstream project "ScapeFolio <http://cruisecontrol.office.everyscape.com:8080/job/ScapeFolio/>" build number 83 <http://cruisecontrol.office.everyscape.com:8080/job/ScapeFolio/83>
[EnvInject] - Loading node environment variables.
[EnvInject] - [ERROR] - SEVERE ERROR occurs: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
Archiving artifacts
ERROR: Publisher hudson.tasks.Mailer aborted due to exception
hudson.remoting.ChannelClosedException <http://stacktrace.jenkins-ci.org/search?query=hudson.remoting.ChannelClosedException>: channel is already closed
    at hudson.remoting.Channel.send(Channel.java:492) <http://stacktrace.jenkins-ci.org/search/?query=hudson.remoting.Channel.send&entity=method>

Does anyone have any recommendations on how to diagnose and resolve these problems?

Thanks,
Chuck
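
P.S. As a quick sanity check that the -Xmx2048m setting actually takes effect on the Mac, here is a minimal sketch that prints the heap figures of the JVM it runs in, using the standard java.lang.management API (the same kind of memory data jconsole displays). The class name HeapCheck and the helper usedFraction are made up for illustration and are not part of Jenkins. It would need to be launched with the same JVM options as the slave, and the reported maximum may come out somewhat below the configured -Xmx value.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of Jenkins): reports heap usage of the JVM it runs in.
public class HeapCheck {

    // Fraction of the heap limit currently in use.
    // Returns -1.0 when the limit is undefined (reported as a non-positive value).
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return (double) usedBytes / (double) maxBytes;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();

        long mb = 1024L * 1024L;
        long used = heap.getUsed();
        long max = heap.getMax(); // -1 if the JVM does not define a maximum

        System.out.println("heap used: " + (used / mb) + " MB");
        System.out.println("heap max:  " + (max < 0 ? "undefined" : (max / mb) + " MB"));
        System.out.println("used fraction: " + usedFraction(used, max));
    }
}
```

For example, compile it with javac HeapCheck.java and run it as java -Xmx2048m HeapCheck on the slave machine. This only inspects a fresh JVM, so it confirms the memory setting is honored there but says nothing about what the running slave process is doing; for that, jconsole attached to the slave remains the better tool.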
