Hi all,
One of our nightly builds has started regularly failing due to a hang. It gets
stuck at the very beginning of the job attempting to clean up the workspace.
When I look at the console output, it just shows the spinning progress wheel.
The job in question is configured with the "Always check out a fresh copy" SVN
checkout strategy. The job runs on a build slave via ssh. The job configuration
itself has not changed in a long time. But the SVN project is very large and
very active. The only recent Jenkins environment changes have been the addition
of some plugins used by other jobs, including, ironically, the "Jenkins
Workspace Cleanup Plugin".
There is nothing interesting in the log when the failure happens. I only see
this when I manually stop the build after it's been trying to clean up for 6
hours:
--snip--
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at hudson.remoting.Request.call(Request.java:127)
at hudson.remoting.Channel.call(Channel.java:681)
at hudson.FilePath.act(FilePath.java:777)
at hudson.FilePath.act(FilePath.java:770)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:743)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:685)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1197)
at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:579)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:468)
at hudson.model.Run.run(Run.java:1410)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:238)
--snip--
Operating under the assumption that an open file might be hanging the build, I
tried restarting the build slave, then restarting the build. Same hang. The
only workaround has been to manually throw the workspace away on the slave,
then start the build.
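In case it helps anyone script that manual step, here is a minimal sketch in
Python of what I mean by throwing the workspace away. It is not part of Jenkins
or any plugin; the function name discard_workspace and the ".doomed-" suffix
are made up for illustration, and it assumes it runs on the slave itself with
permission to rename and delete the workspace directory. Renaming first frees
the original path right away, even if the delete itself is slow.

```python
import os
import shutil
import uuid


def discard_workspace(workspace):
    """Rename the workspace aside, then delete the renamed copy.

    Returns True if a workspace existed and was fully removed,
    False if there was nothing to remove or the delete was incomplete.
    (Hypothetical helper, not a Jenkins API.)
    """
    workspace = os.path.abspath(workspace)
    if not os.path.isdir(workspace):
        return False
    # Rename first so the original path is free immediately,
    # even when deleting a very large tree takes a long time.
    doomed = workspace + ".doomed-" + uuid.uuid4().hex
    os.rename(workspace, doomed)
    # ignore_errors=True keeps going past files that cannot be removed.
    shutil.rmtree(doomed, ignore_errors=True)
    return not os.path.exists(doomed)
```

It would be called with the job's workspace path on the slave before the build
is started.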
The problem sounds exactly like a post to this group with the subject "jenkins
hanging on build" from August 10, 2011. But I don't see any reply or resolution.
This problem is really killing us because we're in total crunch mode right now
and this is the first build in a series that takes nearly 6 hours. Every time
the build hangs, we have to restart it manually, which loses valuable testing
time.
As a desperate workaround attempt, I'm trying the Jenkins Workspace Cleanup
Plugin, which we recently installed but which is not currently used with this
job. It's been churning away for 20 minutes with no visible progress at the
console level. But at least it seems to be doing something under the hood. I
see the Java process on the client side at 100+% CPU (it's a multi-core system), and
according to fs_usage (it's a Mac) it is actually unlinking files. Maybe
someday it will finish...
If anyone else has any experience with this problem or suggestions for a
workaround, I'd appreciate it.
Best,
--
Allen Cronce