[
https://issues.apache.org/jira/browse/HADOOP-15711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622509#comment-16622509
]
Allen Wittenauer commented on HADOOP-15711:
-------------------------------------------
bq. Ran a build for branch-2.7... but it was running fine
Thanks for confirming exactly what I said above: 2.7 does not suffer from the
same sets of problems that 2.8 and up do. The irony is that it's not running
fine. (More below)
bq. So I am wondering if we can try a test run with this patch reverted so we
can see the results. Allen Wittenauer thoughts on this? Do you know if
reverting this will cause issues on the jenkins infra?
Did anyone even bother to follow the chain as to WHY that JIRA exists? I'm
guessing no, because if anyone did they would have realized this:
2.7 suffers from the exact problem that HADOOP-15251 fixes.
{code}
Found and killed 14 left over processes
{code}
This means that surefire is not properly killing the JVMs that are spawned for
unit tests. Those unit tests that are killed in this way are NOT reported.
This, in turn, means that unit test results are COMPLETELY UNRELIABLE. So yes,
you get a report (usually...), but with large chunks of missing data. In some
cases, up to 70% of the unit tests are never executed and never reported.
[This was specifically reported in HDFS-12711.]
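To make the "missing data" point concrete, here is a minimal sketch (not part of the Hadoop or Yetus tooling) of how one could check for test classes that never produced a report. It assumes surefire's default layout, where each test class that finishes writes a file named TEST-<fully.qualified.ClassName>.xml into a reports directory. The helper names reported_classes and missing_classes, and the way the expected class list is obtained, are hypothetical.

```python
from pathlib import Path


def reported_classes(report_dir):
    """Return the class names that have a report file in report_dir.

    Assumes report files are named TEST-<class name>.xml (surefire's
    default naming); any other file in the directory is ignored.
    """
    prefix, suffix = "TEST-", ".xml"
    return {
        p.name[len(prefix):-len(suffix)]
        for p in Path(report_dir).glob("TEST-*.xml")
    }


def missing_classes(expected, reported):
    """Return, sorted, the expected test classes that have no report.

    A class listed here was either never run or its JVM died before a
    report was written, so a "green" summary says nothing about it.
    """
    return sorted(set(expected) - set(reported))
```

Calling missing_classes(expected, reported_classes("target/surefire-reports")) with a list of expected class names (however that list is built) gives the classes whose results are absent. A non-empty result means the report cannot be trusted as complete.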
FWIW: Hadoop is no longer my day job. So someone needs to sit down and really
comprehend all the bits and pieces in play here. It's a lot more complex than
just a surface reading of one JIRA.
> Fix branch-2 builds
> -------------------
>
> Key: HADOOP-15711
> URL: https://issues.apache.org/jira/browse/HADOOP-15711
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Jonathan Hung
> Priority: Critical
> Attachments: HADOOP-15711.001.branch-2.patch
>
>
> Branch-2 builds have been disabled for a while:
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/
> A test run here causes hdfs tests to hang:
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86-jhung/4/
> Running hadoop-hdfs tests locally reveals some errors such as:
> {noformat}
> [ERROR] testComplexAppend2(org.apache.hadoop.hdfs.TestFileAppend2) Time elapsed: 0.059 s <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
>   at java.lang.Thread.start0(Native Method)
>   at java.lang.Thread.start(Thread.java:714)
>   at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1164)
>   at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1128)
>   at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:174)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1172)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:403)
>   at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:234)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:1080)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:883)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:514)
>   at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:473)
>   at org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend(TestFileAppend2.java:489)
>   at org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend2(TestFileAppend2.java:543)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
> I was able to get more tests passing locally by increasing the max user
> process count on my machine. But the error suggests that there's an issue in
> the tests themselves. I'm not sure whether the error seen locally is the same
> reason the Jenkins builds are failing; I wasn't able to confirm because the
> Jenkins builds lack output.