Here is more information. I just saw this message on the Manage Jenkins screen (on the master node, about the Mac slave with problems):

There are more SCM polling activities scheduled than handled, so the threads are not keeping up with the demands. Check if your polling is hanging, and/or increase the number of threads if necessary <http://cruisecontrol.office.everyscape.com:8080/descriptor/hudson.triggers.SCMTrigger/>.

When I clicked on the link, I saw this:

Current SCM Polling Activities

There are more SCM polling activities scheduled than handled, so the threads are not keeping up with the demands. Check if your polling is hanging, and/or increase the number of threads if necessary.

The following polling activities are currently in progress:

  Project    Running for
  ESSDK <http://cruisecontrol.office.everyscape.com:8080/job/ESSDK/scmPollLog/>    2 days 21 hr
  Uscapeit-Android <http://cruisecontrol.office.everyscape.com:8080/job/Uscapeit-Android/scmPollLog/>    2 days 21 hr
  ScapeFolio <http://cruisecontrol.office.everyscape.com:8080/job/ScapeFolio/scmPollLog/>    2 days 21 hr

These are all projects that only run on the Mac slave node.

I'm not sure how to kill these SCM polling jobs. I do know how to kill regular build jobs.
Perhaps I can try SCM notification instead (notify Jenkins to rebuild upon check-in).

Chuck

On Aug 27, 2012, at 10:11 AM, Chuck Doucette <[email protected]> wrote:

Yes, I believe the Mac hardware is in good general health.
The machine has 3GB of physical memory, so I believe it has plenty of free memory.
I don't believe it is swapping - but I'm not sure how to tell.

I have tried running Activity Monitor and JConsole. As far as I can tell, there is no other software running.
There is no Time Machine backup set up, nor has any antivirus software been installed.

As I said below, I had to wipe the disk and reinstall everything from scratch.
So, it has: Mountain Lion, Java, Xcode. That's about it.
Nobody else is logged on except the jenkins user over ssh.

Now builds that should take a few minutes are taking multiple hours, and I see that time synchronization is off by a few minutes. I will try to fix the latter right now.

Chuck

On Aug 24, 2012, at 4:54 PM, Sami Tikka <[email protected]> wrote:

Just to rule out the obvious culprits:

- The Mac hardware is in good general health?
- There is plenty of free memory? The system is not swapping?
- There isn't some process running and taking a lot of CPU? Spotlight indexing, Time Machine backup, some antivirus real-time scanner?

Even though Macs are great machines, even they can get messed up and become slow.

-- Sami

Chuck Doucette <[email protected]> wrote on 24.8.2012 at 20.19:

We are running Jenkins 1.478. The master node is running on Windows 2003 (xp). It has 3 slaves - 2 other Windows machines and 1 Mac.

The Mac was working fine; then, when I attempted to upgrade the O/S (from Snow Leopard to Lion), the upgrade failed due to disk errors. I've since rebuilt the machine from scratch, so all of the hardware is the same but all of the software (and configuration) is brand new (Mountain Lion).

Something appears to be causing one of our slave nodes (on Mac OS X) to take longer and longer to respond. It's currently at ~1000ms response time. It has gotten up to 3000ms response time.

I have added two things to the slave's launch JVM options to help in diagnosing and resolving the problem:

1) -Dcom.sun.management.jmxremote (so I can monitor the performance of the slave process via jconsole)
2) -Xmx2048m (to use 2GB of the 3GB of physical memory available on the machine)

The timeouts have apparently caused jobs to fail with errors about channel closing:

Started by upstream project "ScapeFolio" build number 83
[EnvInject] - Loading node environment variables.
[EnvInject] - [ERROR] - SEVERE ERROR occurs: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
Archiving artifacts
ERROR: Publisher hudson.tasks.Mailer aborted due to exception
hudson.remoting.ChannelClosedException: channel is already closed
    at hudson.remoting.Channel.send(Channel.java:492)

Does anyone have any recommendations on how to diagnose and resolve these problems?

Thanks,
Chuck
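
On the -Xmx2048m option listed above: one way to confirm that a heap setting actually reached a JVM is to read the heap figures through the standard java.lang.management API, which is the same data JConsole shows over JMX. The following is only a minimal sketch, not something from this thread. The class name HeapCheck and its helper methods are hypothetical, and it reports on the JVM it runs in, so it would have to be launched with the same options as the slave process to be meaningful. The reported maximum can be somewhat lower than the -Xmx value, depending on the garbage collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper class: prints the heap usage and heap limit of the current JVM.
public class HeapCheck {
    static final long MIB = 1024L * 1024L;

    /** Converts a byte count to whole MiB, rounding down. */
    public static long toMiB(long bytes) {
        return bytes / MIB;
    }

    /** Formats heap figures; a max of -1 means the JVM reports no defined limit. */
    public static String describe(long usedBytes, long maxBytes) {
        String max = maxBytes < 0 ? "undefined" : toMiB(maxBytes) + " MiB";
        return "heap used: " + toMiB(usedBytes) + " MiB, max heap: " + max;
    }

    public static void main(String[] args) {
        // getHeapMemoryUsage() is part of the standard MemoryMXBean interface.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println(describe(heap.getUsed(), heap.getMax()));
    }
}
```

Running it as `java -Xmx2048m HeapCheck` should print a maximum close to 2048 MiB if the option is honored.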
