[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219501#comment-16219501
 ] 

Allen Wittenauer commented on HDFS-12711:
-----------------------------------------

OK, I just hit it here. On my laptop's Linux VM, things start falling apart 
once about 35 Java processes are running, almost all of them fired off from 
surefire. Some things start to fail (I couldn't even run ps!). Eventually I 
can run commands again and see what is happening. Only *one* process has been 
killed, even though the shell couldn't fork for ps.

Given that my CPU is currently pegged (uptime says the loadavg is over 50), I'm 
guessing the same thing is happening on a bigger scale on the ASF build boxes. 
There aren't enough CPU cycles for the OOM killer to run fast enough to kill 
things. As a result, other stuff fails to launch due to insufficient memory. 
This confuses the Jenkins agent and it all falls apart from there.

It *eventually* gets over the hump, but by then it's too late.

> deadly hdfs test
> ----------------
>
>                 Key: HDFS-12711
>                 URL: https://issues.apache.org/jira/browse/HDFS-12711
>             Project: Hadoop HDFS
>          Issue Type: Test
>    Affects Versions: 2.9.0, 2.8.2
>            Reporter: Allen Wittenauer
>            Priority: Critical
>         Attachments: HDFS-12711.branch-2.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
