[ 
https://issues.apache.org/jira/browse/HADOOP-15711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623288#comment-16623288
 ] 

Allen Wittenauer commented on HADOOP-15711:
-------------------------------------------

bq. Can we disable parallel tests for full builds then ?

I don't even want to contemplate the runtime.  With parallelization, it is 
already in the 14-15 hour mark, easily.  

[BTW, have a client that was using Yetus internally for 2.7.  As a result, one 
of the changes in Yetus 0.8.0 was to disable parallelization for 2.7 and lower 
since those aren't thread safe anyway.  That's why the 2.7 run timed out after 
20 hours. ]

bq. Will that help track down specific tests that go OOM for instance ?

Probably not, and definitely not without also setting the test order.  Also: no 
one has really done the homework to figure out why things are going catatonic.  
My guess is something in hadoop-common (and FileSystem in particular) that 
doesn't work well on JDK7.  That's why trunk doesn't have these issues.  My 
other hunch is that commit date of the problematic code is was committed 
between June 2016-June 2017.  Because we had other issues and no one was aware 
of how bad SUREFIRE-524 was, the issues had been hidden.

> Fix branch-2 builds
> -------------------
>
>                 Key: HADOOP-15711
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15711
>             Project: Hadoop Common
>          Issue Type: Task
>            Reporter: Jonathan Hung
>            Priority: Critical
>         Attachments: HADOOP-15711.001.branch-2.patch
>
>
> Branch-2 builds have been disabled for a while: 
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/
> A test run here causes hdfs tests to hang: 
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86-jhung/4/
> Running hadoop-hdfs tests locally reveal some errors such 
> as:{noformat}[ERROR] 
> testComplexAppend2(org.apache.hadoop.hdfs.TestFileAppend2)  Time elapsed: 
> 0.059 s  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1164)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1128)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:174)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1172)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:403)
>         at 
> org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:234)
>         at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:1080)
>         at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:883)
>         at 
> org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:514)
>         at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:473)
>         at 
> org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend(TestFileAppend2.java:489)
>         at 
> org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend2(TestFileAppend2.java:543)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){noformat}
> I was able to get more tests passing locally by increasing the max user 
> process count on my machine. But the error suggests that there's an issue in 
> the tests themselves. Not sure if the error seen locally is the same reason 
> as why jenkins builds are failing, I wasn't able to confirm based on the 
> jenkins builds' lack of output.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to