[
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16224146#comment-16224146
]
Allen Wittenauer commented on HDFS-12711:
-----------------------------------------
I've got a hypothesis.
Somewhere in the HDFS code base is a try/catch block that is effectively
ignoring system exceptions. A test times out and surefire sends (probably) a
SIGINT. The try/catch grabs the exception and tosses it to the side, all the
while eating CPU and IO. This situation makes more tests time out. Surefire
sends more SIGINTs, which also either get ignored or never "make it" to the
process because CPU is scarce. Surefire, thinking those signals were received,
fires off even more tests ...
This pattern continues until eventually there is nothing left for surefire
and/or maven to do but die on its own, leaving lots of unreaped children
doing nothing but destroying the box.
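To make the hypothesis concrete, here's a minimal sketch (hypothetical, not
actual HDFS code) of the kind of try/catch block I mean: a worker loop that
catches InterruptedException and tosses it aside, so the thread keeps running
and burning CPU even after it has been asked to stop. The class and method
names are made up for illustration.

```java
public class SwallowedInterrupt {
    // Starts a worker thread whose loop discards InterruptedException.
    public static Thread startWorker() {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    // Simulated work; sleep stands in for real IO/CPU activity.
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    // BUG: exception is swallowed and the interrupt status is
                    // cleared, so the loop spins on as if nothing happened.
                    // Correct code would restore the flag
                    // (Thread.currentThread().interrupt()) and/or exit the loop.
                }
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        Thread worker = startWorker();
        worker.interrupt();   // ask the thread to stop
        Thread.sleep(200);
        // The interrupt was eaten, so the worker is still alive.
        System.out.println("worker alive after interrupt: " + worker.isAlive());
    }
}
```

Scale that up to a forked test JVM full of such loops and you get exactly the
unreaped, CPU-eating children described above.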
One thing has been bothering me. Why are projects like HBase that are using
openjdk7 + some form of branch-2 code base not seeing these problems?
What if the code path were a less frequently traveled one, a feature that isn't
heavily used? For the vast majority of committers testing a release, it's
probably not even tested "for reals", never mind in a hostile environment where
CPU, IO, whatever is scarce. But the HDFS unit tests (and maybe the MR unit
tests) would almost certainly hit that path, probably several times over.
> deadly hdfs test
> ----------------
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
> Issue Type: Test
> Affects Versions: 2.9.0, 2.8.2
> Reporter: Allen Wittenauer
> Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)