Re: Mac OSX slave timeouts

Chuck Doucette Mon, 27 Aug 2012 07:22:15 -0700

Here is more information. I just saw this message on the Manage Jenkins screen 
(on the master node, about the Mac slave with problems):


There are more SCM polling activities scheduled than handled, so the threads 
are not keeping up with the demands. Check if your polling is hanging, and/or 
increase the number of threads if 
necessary<http://cruisecontrol.office.everyscape.com:8080/descriptor/hudson.triggers.SCMTrigger/>.

When I clicked on the link, I saw this:

Current SCM Polling Activities
There are more SCM polling activities scheduled than handled, so the threads 
are not keeping up with the demands. Check if your polling is hanging, and/or 
increase the number of threads if necessary.

The following polling activities are currently in progress:

Project             Running for
ESSDK               2 days 21 hr
  <http://cruisecontrol.office.everyscape.com:8080/job/ESSDK/scmPollLog/>
Uscapeit-Android    2 days 21 hr
  <http://cruisecontrol.office.everyscape.com:8080/job/Uscapeit-Android/scmPollLog/>
ScapeFolio          2 days 21 hr
  <http://cruisecontrol.office.everyscape.com:8080/job/ScapeFolio/scmPollLog/>

These are all projects that run only on the Mac slave node.

I'm not sure how to kill these SCM polling jobs.
I do know how to kill regular build jobs.
Perhaps I can try SCM notification instead (notify Jenkins to rebuild upon 
check-in).
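
For anyone who wants to try the notification route, here is a minimal sketch of 
a commit hook helper. It assumes the job has "Trigger builds remotely" enabled 
with an authentication token; the token value and the function names are 
hypothetical, and the hook wiring for your particular SCM is not shown.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_trigger_url(jenkins_base, job_name, token):
    """Return the remote-trigger URL for one job.

    Assumes the job has "Trigger builds remotely" enabled and that
    `token` matches the token configured on that job (hypothetical here).
    """
    base = jenkins_base.rstrip("/")
    query = urlencode({"token": token})
    return f"{base}/job/{quote(job_name, safe='')}/build?{query}"


def notify_jenkins(jenkins_base, job_name, token, timeout=10):
    """Ask Jenkins to schedule a build; returns the HTTP status code."""
    with urlopen(build_trigger_url(jenkins_base, job_name, token),
                 timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Print the URL a post-commit hook would request; nothing is sent.
    print(build_trigger_url(
        "http://cruisecontrol.office.everyscape.com:8080/", "ESSDK", "SECRET"))
```

Calling notify_jenkins() from a post-commit hook would replace polling for that 
job, so the polling trigger could then be switched off on the job itself.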

Chuck

On Aug 27, 2012, at 10:11 AM, Chuck Doucette <[email protected]> wrote:

Yes, I believe the Mac hardware is in good general health.
The machine has 3GB of physical memory, so I believe it has plenty of free 
memory.
I don't believe it is swapping - but I'm not sure how to tell.
I have tried running Activity Monitor and JConsole.
As far as I can tell, there is no other software running.
No Time Machine backup is set up, and no antivirus software has been 
installed.
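
On the question of how to tell whether the machine is swapping: one option on 
OS X is to read the `vm.swapusage` value from `sysctl`. The sketch below 
assumes the usual one-line output with sizes in megabytes (for example 
"total = 1024.00M  used = 3.25M  free = 1020.75M"); the function names are 
mine, not part of any tool.

```python
import re
import subprocess

# Matches fields such as "used = 3.25M" in the sysctl output.
_FIELD = re.compile(r"(total|used|free) = ([0-9.]+)M")


def parse_swapusage(line):
    """Parse one line of `sysctl vm.swapusage` output into megabytes."""
    return {name: float(value) for name, value in _FIELD.findall(line)}


def read_swapusage():
    """Run `sysctl vm.swapusage` (macOS only) and parse its output."""
    result = subprocess.run(["sysctl", "vm.swapusage"],
                            capture_output=True, text=True, check=True)
    return parse_swapusage(result.stdout)


if __name__ == "__main__":
    usage = read_swapusage()
    print("swap used: %.2f MB of %.2f MB" % (usage["used"], usage["total"]))
```

A "used" figure that keeps climbing while a build runs would suggest the slave 
is short of physical memory.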

As I said below, I had to wipe the disk and reinstall everything from scratch.
So, it has: Mountain Lion, Java, Xcode.
That's about it.
Nobody else is logged on except the jenkins user over ssh.

Now builds that should take a few minutes are taking multiple hours, and I see 
that time synchronization is off by a few minutes. I will try to fix the latter 
right now.

Chuck

On Aug 24, 2012, at 4:54 PM, Sami Tikka <[email protected]> wrote:

Just to rule out the obvious culprits:

- The Mac hardware is in good general health?

- There is plenty of free memory? The system is not swapping?

- There isn't some process running and taking a lot of cpu? Spotlight indexing, 
Time Machine backup, some anti-virus real-time scanner?

Even though Macs are great machines, even they can get messed up and become 
slow.

-- Sami

Chuck Doucette <[email protected]> wrote on 24.8.2012 at 20:19:

We are running Jenkins 1.478.
The master node is running on Windows 2003 (xp).
It has 3 slaves: 2 other Windows machines and 1 Mac.
The Mac was working fine, but when I attempted to upgrade the OS (from Snow 
Leopard to Lion) the upgrade failed due to disk errors.
I have since rebuilt the machine from scratch, so the hardware is the same but 
all of the software and configuration is brand new (Mountain Lion).

Something appears to be causing one of our slave nodes (on Mac OSX) to take 
longer and longer to respond.
It's currently at ~1000ms response time.
It has gotten up to 3000ms response time.

I have added two things to the slave's JVM launch options to help diagnose and 
resolve the problem:
1) -Dcom.sun.management.jmxremote (so I can monitor the performance of the 
slave process via jconsole)
2) -Xmx2048m (to use 2GB of the 3GB of physical memory available on the machine)
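
As a rough illustration of what those two options amount to, the sketch below 
assembles the launch command and works out how much of the 3GB the heap may 
claim. It assumes the slave is started as `java <options> -jar slave.jar`; the 
jar name and the helper functions are assumptions for the example.

```python
import re

# Multipliers for the size suffixes accepted by -Xmx (no suffix means bytes).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def slave_launch_command(jvm_options, jar="slave.jar"):
    """Return the argument list for launching the slave JVM."""
    return ["java", *jvm_options, "-jar", jar]


def max_heap_mb(jvm_options):
    """Return the -Xmx limit in megabytes, or None if it is not set."""
    for option in jvm_options:
        match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", option)
        if match:
            size_bytes = int(match.group(1)) * _UNITS[match.group(2).lower()]
            return size_bytes / 1024 ** 2
    return None


if __name__ == "__main__":
    options = ["-Dcom.sun.management.jmxremote", "-Xmx2048m"]
    physical_mb = 3 * 1024  # 3GB of physical memory, as stated above
    print(" ".join(slave_launch_command(options)))
    print("left for the OS and other processes: %d MB"
          % (physical_mb - max_heap_mb(options)))
```

With a 2048MB heap on a 3072MB machine, about 1024MB remains for the OS and 
everything else the build launches.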

The timeouts have apparently caused jobs to fail with errors about channel 
closing:
Started by upstream project "ScapeFolio" build number 83
[EnvInject] - Loading node environment variables.
[EnvInject] - [ERROR] - SEVERE ERROR occurs: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
Archiving artifacts
ERROR: Publisher hudson.tasks.Mailer aborted due to exception
hudson.remoting.ChannelClosedException: channel is already closed
	at hudson.remoting.Channel.send(Channel.java:492)

Does anyone have any recommendations on how to diagnose and resolve these 
problems?

Thanks,
Chuck
