Hudson does a terrible job of killing underlying processes when a build is 
aborted, whether someone kills it from the UI or it hits a timeout.  For these 
Hadoop builds, it usually means that 3 or 4 processes are left lying around 
that can and do interfere with subsequent jobs.  It's not clear to me why they 
are hanging, but I suspect NFS issues on these Hadoop slaves.  We're going to 
disable NFS on a couple of them later this week and see if that helps.

I try to monitor for this situation regularly and properly kill builds that 
seem hung.  Since these run on the Hadoop slaves, the problem doesn't impact 
other project builds.

Cheers,
Nige


On Jan 17, 2011, at 7:20 AM, Niklas Gustavsson wrote:

> Hi
> 
> The following build keeps getting locked up in Hudson and requires
> frequent killing. Could someone have a look at it or should we disable
> it for now?
> 
> https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/
> 
> /niklas
