Hudson does a terrible job of killing the underlying processes when a build is aborted, whether someone kills it from the UI or it hits a timeout. For these hadoop builds, that usually means 3 or 4 processes are left lying around that can and do interfere with subsequent jobs. It's not clear to me why they are hanging, but I suspect NFS issues on these hadoop slaves. We're going to disable NFS on a couple of them later this week and see if that helps.
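
As an illustration only (this is not how Hudson is implemented, and not the cleanup procedure used on these slaves), a build wrapper on a POSIX host could avoid leaving stray children behind by running the build in its own process group and signalling the whole group on timeout. The names `run_with_timeout`, `timeout_s` and `grace_s` are hypothetical.

```python
import os
import signal
import subprocess
import time


def run_with_timeout(cmd, timeout_s, grace_s=5.0):
    """Run cmd in its own process group; on timeout, kill the whole group.

    Hypothetical helper, POSIX only. Returns the exit code, or None if the
    command had to be killed because it exceeded timeout_s.
    """
    # start_new_session=True makes the child a process-group leader, so the
    # group ID equals proc.pid and every process it forks joins that group
    # (unless a descendant deliberately moves itself to another group).
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Ask politely first, then force-kill whatever is still around.
        for sig in (signal.SIGTERM, signal.SIGKILL):
            try:
                os.killpg(proc.pid, sig)
            except ProcessLookupError:
                break  # no process left in the group
            time.sleep(grace_s)
        proc.wait()  # reap the group leader so it does not stay a zombie
        return None
```

For example, `run_with_timeout(["sh", "-c", "sleep 30 & sleep 30"], timeout_s=1)` terminates both `sleep` processes rather than only the shell that started them. This does not help if a process is stuck in uninterruptible I/O (as can happen on a hung NFS mount), since such a process will not act on signals until the I/O returns.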
I try to monitor for this situation regularly and properly kill builds that seem hung. Since these are on the hadoop slaves, it doesn't impact other project builds.

Cheers,
Nige

On Jan 17, 2011, at 7:20 AM, Niklas Gustavsson wrote:

> Hi
>
> The following build keeps getting locked up in Hudson and requires
> frequent killing. Could someone have a look at it or should we disable
> it for now?
>
> https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/
>
> /niklas
