[jira] [Commented] (HDFS-12711) deadly hdfs test

2018-08-13 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578504#comment-16578504
 ] 

Allen Wittenauer commented on HDFS-12711:
-

bq. Was a follow up jira filed for this work? (and if so, which one was chosen)

Nope.  Few others seemed to care; patches go in regardless of what Jenkins says 
and/or how they may impact the build negatively.  

Yetus 0.7.0 and bumping up the surefire version (at least in trunk) stopped 
Hadoop from crashing ASF Jenkins build nodes.  It's still horribly broken, just 
less obviously so. branch-2 nightlies were turned off months ago since they 
were failing at such a high rate as to be pointless. I don't think anyone really 
pays attention to the trunk nightlies, so they should probably be turned off too.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch, fakepatch.branch-2.txt
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12711) deadly hdfs test

2018-08-13 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578258#comment-16578258
 ] 

Ewan Higgs commented on HDFS-12711:
---

{quote}2. hadoop-hdfs-project either needs to get refactored into multiple 
maven modules or the simultaneous thread counts need to get greatly reduced. 
e.g., just changing the unit test's DN RPC thread count may work.
{quote}
Was a follow up jira filed for this work?

Thanks




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-20 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259536#comment-16259536
 ] 

Allen Wittenauer commented on HDFS-12711:
-

FYI HADOOP-13514.




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257668#comment-16257668
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Doing some quick math, my estimation is that we received 730 test results out 
of ~3000.  So yes, we lost 75% of the test results in that run.

HDFS-12731's https://builds.apache.org/job/PreCommit-HDFS-Build/22132/ run only 
dropped ~34%.  So hey, that's an improvement...
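
The quick math above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is made up for illustration, and the inputs are the rough figures quoted in this comment (730 results received out of roughly 3000 expected).

```python
def dropped_fraction(received, expected):
    """Fraction of the expected test results that never arrived."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return 1.0 - received / expected


# Figures quoted in the comment: 730 results received out of ~3000 expected.
lost = dropped_fraction(730, 3000)
print(f"lost {lost:.1%} of the test results")  # prints "lost 75.7% of the test results"
```

The ~34% figure for the HDFS-12731 run would come out of the same formula, but the raw counts for that run are not given here.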




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257558#comment-16257558
 ] 

Allen Wittenauer commented on HDFS-12711:
-

bq. We usually try to rerun the failed tests locally to check if they are 
related to the patch. 

I think this may be the key as to why I don't think enough people are in panic 
mode. Let's take Erik's log as an example.  It's from HDFS-12823.  Precommit 
reported ~20 tests that either failed or timed out.  It reaped 20 excess 
surefire jvms after mvn returned.  The asflicense check came back with 130 dump 
log files.  Those 130 dump log files in almost every case I looked at were not 
reported to surefire.  That means that we're probably looking at a minimum of 
150 failed tests, not 20, and likely more, given that each of those 130 broken 
JVMs probably had more than 1 test.

We're basically dropping a very large percentage (maybe even the majority) of 
test results on the ground.
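
As a rough sketch of that lower bound, assuming every unreported dump log stands for a crashed surefire JVM that was running at least one test: the function name and the tests_per_jvm parameter below are hypothetical, and only the 20 and 130 figures come from the log discussed above.

```python
def estimate_failed_tests(reported_failures, unreported_dump_logs, tests_per_jvm=1):
    """Estimate how many tests really failed in a run.

    reported_failures:    failures and timeouts that precommit reported
    unreported_dump_logs: hs_err dump logs that never made it into surefire
    tests_per_jvm:        assumed tests lost per crashed JVM (1 gives the lower bound)
    """
    if tests_per_jvm < 1:
        raise ValueError("a crashed JVM was running at least one test")
    return reported_failures + unreported_dump_logs * tests_per_jvm


# Lower bound from the numbers above: 20 reported + 130 dump logs.
print(estimate_failed_tests(20, 130))  # prints 150
```

Raising tests_per_jvm above 1 shows how quickly the real number could grow, but that value is a guess, not something measured in this run.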




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-17 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257363#comment-16257363
 ] 

Arpit Agarwal commented on HDFS-12711:
--

I don't think committers are ignoring precommit run results. We usually try to 
rerun the failed tests locally to check if they are related to the patch. Being 
a manual process, it is time consuming and error-prone. I also don't think 
anyone is happy about the current situation. Over time we have come to rely on 
complex MiniDFSCluster-based tests over real unit tests and these tend to be 
flaky and fail in hard-to-debug ways. We probably need a community effort to 
revamp our unit tests. This will also require extensive refactoring of existing 
classes to make them unit-testable, a risky task in itself.




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-16 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256584#comment-16256584
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Ignoring the hs_err_pid log files is pretty much just sticking our collective 
heads in the sand about actual, real problems with the unit tests. The unit 
tests themselves haven't been rock solid for a very long time, even before all 
of this started happening.   Entries have been put into the ignore pile so often 
that I wouldn't be surprised if the community is already at the point that most 
developers are ignoring precommit.  (e.g., commits with findbugs reported in 
the issues, javadoc compilation failures being treated as "environmental", etc, 
etc.) 

If I were actually paying more attention to day-to-day Hadoop bits these days, 
I'd probably be ready to disable unit tests (at least HDFS) to specifically 
avoid the "cried wolf" condition.  The rest of the precommit tests work 
properly the vast majority of the time and are probably more important given 
the current state of things. (Never mind the massive speed up. QBT is hitting 
the 15 hour mark for a full run for branch-2 when it is actually allowed to 
complete.)  No one seems to actually care that the unit tests are a broken mess 
and I doubt they'd be missed.

My goal here was to prevent Hadoop from bringing down the rest of the ASF build 
infrastructure.  It's under enough stress without this project making things 
that much worse.  Achievement unlocked and other Yetus users will pick up those 
new safety features in the next release.  I should probably close this JIRA 
issue. Unless someone else plans to spend some effort on these bugs?  At least 
at this point in time, I view my work here as complete. 

Also:

{code}
/build/
{code}

ARGH.  That hasn't been valid since Hadoop used ant.  A great example of "well, 
if we ignore it, it doesn't exist, right?"  Anything that is still using 
/build/ almost certainly isn't safe for parallel tests and is likely 
contributing to a whole host of problems.




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-16 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256368#comment-16256368
 ] 

Erik Krogen commented on HDFS-12711:


Thanks Sean. Agreed that it is not really a big issue but it does make it more 
likely for a developer to miss an actual license violation (a "QA bot cried 
wolf" situation). It seems maybe it would make more sense for the 
{{hs_err_pid*.log}} files to appear in an already-excluded area, like within 
{{/build/}}, to represent their transient nature. I assume their location 
should be configurable in some way?
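
On the JVM side, HotSpot's -XX:ErrorFile option controls where a crash log is written. Independent of that, a cleanup step could sweep any stray dumps out of the source tree before the license check runs, while still keeping them for debugging. The following is a minimal sketch of such a sweep; the function name and the archive directory are hypothetical and are not part of Yetus or the Hadoop build.

```python
import shutil
from pathlib import Path


def collect_crash_logs(source_root, archive_dir):
    """Move hs_err_pid*.log files found under source_root into archive_dir.

    Returns the list of new paths. Files already inside archive_dir are skipped.
    """
    source_root = Path(source_root)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() materializes the listing first, so moving files does not
    # disturb the directory walk.
    for log in sorted(source_root.rglob("hs_err_pid*.log")):
        if archive_dir in log.parents:
            continue
        # Prefix with the parent directory name to reduce name clashes.
        target = archive_dir / f"{log.parent.name}-{log.name}"
        shutil.move(str(log), str(target))
        moved.append(target)
    return moved
```

The number of files moved is itself useful: as noted elsewhere in this thread, each dump is a sign of tests that never reported a result.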




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-16 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256295#comment-16256295
 ] 

Erik Krogen commented on HDFS-12711:


Yeah so although we obviously need to fix the unit tests, the license checker 
also shouldn't be picking up those temp output files in the meantime, right?




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-16 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256283#comment-16256283
 ] 

Allen Wittenauer commented on HDFS-12711:
-

It's probably also worth pointing out that those files also represent tests 
that weren't actually executed.  So they aren't recorded in the fail/success 
output. 




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-16 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256275#comment-16256275
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Those files are the stack dumps from the unit tests that ran out of resources.  
Fix the unit tests, those files go away.




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-16 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256166#comment-16256166
 ] 

Erik Krogen commented on HDFS-12711:


Hey [~aw], in addition to the wild fluctuations in success of HDFS unit tests 
(not your fault, but unfortunate) I'm seeing lots of false license violations 
caused by these changes, e.g.: 
https://builds.apache.org/job/PreCommit-HDFS-Build/22122/artifact/out/patch-asflicense-problems.txt

Can we do something to solve that?




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251809#comment-16251809
 ] 

Allen Wittenauer commented on HDFS-12711:
-

With the kill code in place, I'm seeing wild fluctuations in hdfs and mr unit 
tests.  Lots of unreaped processes.  Probably a hint that they are paused for 
some reason.  I have a hunch that we're pretty much bottlenecked on IO. Tests 
happen on a single disk that is shared among all the executors on that jenkins 
node.  If, say, two HDFS test runs are going at once, that could easily be 
thousands of threads doing IO to the same disk.

It might be smart to decrease the # of parallel tests, at least in HDFS. This 
obviously impacts runtime (which is already out of control) but will probably 
increase accuracy.  Or, we could attempt to split up the tests such that 
compute-heavy tests run in parallel and IO-heavy tests run serially. 
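
The split could look roughly like the sketch below. It only illustrates the schedule: the (name, is_io_heavy, callable) tuples and the run_suite name are hypothetical, and in a real build each callable would stand for a forked test JVM rather than a Python function.

```python
from concurrent.futures import ThreadPoolExecutor


def run_suite(tests, max_parallel=4):
    """Run compute-heavy tests in parallel, then IO-heavy tests one at a time.

    tests is a list of (name, is_io_heavy, callable) tuples.
    Returns a dict mapping each test name to its callable's result.
    """
    compute_heavy = [t for t in tests if not t[1]]
    io_heavy = [t for t in tests if t[1]]
    results = {}

    # Phase 1: compute-heavy tests share a bounded worker pool.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {name: pool.submit(fn) for name, _, fn in compute_heavy}
        for name, future in futures.items():
            results[name] = future.result()

    # Phase 2: IO-heavy tests run serially so they never compete for the disk.
    for name, _, fn in io_heavy:
        results[name] = fn()

    return results
```

Deciding which tests count as IO-heavy is the hard part and is left open here; the sketch assumes that classification already exists.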

Of course, if no one is paying attention to the tests anyway, we could just 
disable them altogether I guess.




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249035#comment-16249035
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  2s{color} | {color:blue} The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 13s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 48s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 37s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 20s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} xml {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 ill-formed XML file(s). {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 36s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 33s{color} | {color:green} hadoop-nfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 49s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-common:1 |
| Timed out junit tests | org.apache.hadoop.log.TestLogLevel |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-12711 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895190/fakepatch.branch-2.txt |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  |
| uname | Linux 946a92c520e9 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / c153bed |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_151 |
| xml | https://builds.apache.org/job/PreCommit-HDFS-Build2/12/artifact/out/xml.txt |
| Unreaped Processes Log | https://builds.apache.org/job/PreCommit-HDFS-Build2/12/artifact/out/patch-unit-hadoop-common-project_hadoop-common-reaper.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build2/12/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
|  Test 

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249032#comment-16249032
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  2s{color} | {color:blue} The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 10s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 49s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 39s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 22s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} xml {color} | {color:red}  0m  1s{color} | {color:red} The patch has 1 ill-formed XML file(s). {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 25s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 29s{color} | {color:green} hadoop-nfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 27s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-common:1 |
| Timed out junit tests | org.apache.hadoop.log.TestLogLevel |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-12711 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895190/fakepatch.branch-2.txt |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  |
| uname | Linux 401f661e60f0 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / c153bed |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_151 |
| xml | https://builds.apache.org/job/PreCommit-HDFS-Build2/11/artifact/out/xml.txt |
| Unreaped Processes Log | https://builds.apache.org/job/PreCommit-HDFS-Build2/11/artifact/out/patch-unit-hadoop-common-project_hadoop-common-reaper.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build2/11/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
|  Test 

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238181#comment-16238181
 ] 

stack commented on HDFS-12711:
--

bq. For now, though, I'm sort of tired of looking at this problem and will go 
work on something else for a while.

Thanks for putting Hadoop in a box.




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238107#comment-16238107
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Thanks!

I'll have to play around with sending a SIGQUIT. The other thing is that some 
process types may need different types of signals.  It might be useful to be 
able to define the "signal path"... e.g., surefire processes get QUIT -> TERM 
-> KILL.
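
A per-process-type signal path might be sketched as follows. This is not how Yetus does it: the SIGNAL_PATHS table, the reap function and its parameters are all hypothetical, only the surefire path (QUIT -> TERM -> KILL) comes from the idea above, and the code assumes a POSIX system. On HotSpot, SIGQUIT normally makes the JVM print a thread dump and carry on, which is why it sits first in the path.

```python
import os
import signal
import time

# Hypothetical table of signal paths; only the surefire entry is taken
# from the comment above.
SIGNAL_PATHS = {
    "surefire": (signal.SIGQUIT, signal.SIGTERM, signal.SIGKILL),
    "default": (signal.SIGTERM, signal.SIGKILL),
}


def pid_alive(pid):
    """Best-effort liveness check: signal 0 tests for existence only."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by someone else
    return True


def reap(pid, kind="default", grace=2.0, poll=0.1, send=os.kill, alive=pid_alive):
    """Walk the signal path for this kind of process until it goes away.

    Returns the list of signals that were actually sent.
    """
    sent = []
    for sig in SIGNAL_PATHS.get(kind, SIGNAL_PATHS["default"]):
        if not alive(pid):
            break
        try:
            send(pid, sig)
        except ProcessLookupError:
            break  # exited between the check and the signal
        sent.append(sig)
        # Give the process a grace period before escalating.
        deadline = time.monotonic() + grace
        while alive(pid) and time.monotonic() < deadline:
            time.sleep(poll)
    return sent
```

The send and alive hooks are injectable so the escalation logic can be exercised without killing real processes. Note that a zombie child still counts as alive to the signal-0 check until its parent waits on it.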

The other thing I know of is for the archiver to save off the stack trace logs 
(hs_err_pidXX.log files) we do get.  That's just a settings thing that I've 
been too busy to set up in Jenkins. 

For now, though, I'm sort of tired of looking at this problem and will go work 
on something else for a while.  It's at the point that the issues are firmly 
contained from the ASF build infra perspective, and it rests solely in the hands 
of the Hadoop community to fix their unit tests (or even base code) to be less broken. 
 




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238056#comment-16238056
 ] 

stack commented on HDFS-12711:
--

This is excellent work.

Would a kill -QUIT before you do the actual kill of the errant processes be of 
use? It'd do a dump of the stack trace before the process goes away (processes 
might not be connected to stdout/stderr anymore?). Thanks.




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-11-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237685#comment-16237685
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Finally got a full qbt run on branch-2, thanks to YETUS-561 and YETUS-570:

https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/27/console

branch-2 is still a broken mess (those test times! argh!), but at least it 
won't kill nodes anymore.




[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226299#comment-16226299
 ] 

Allen Wittenauer commented on HDFS-12711:
-

For those playing at home:

YETUS-570 (in development) changes how precommit handles tests.  It will now 
seek out and kill processes that match a certain pattern after unit tests are 
run.  It reports the number that it had to kill:

| Stuck Test Processes | hadoop-hdfs-project/hadoop-hdfs:21 |

and generates a log to show which processes those were:

| Stuck Test Processes Log | https://builds.apache.org/job/PreCommit-HDFS-Build2/9/artifact/out/reaper-hadoop-hdfs-project_hadoop-hdfs.log |

It's supposed to do this after every individual module, but I've got a (simple) 
bug to fix first.  In any case, this should help give a metric as to just how 
broken a particular set of tests actually is.  Hopefully at some point we'll 
have the logic to pinpoint it to individual tests, but using the actual unit 
test log should be pretty helpful.

It's also worth pointing out that this change will help the full qbt 
actually complete.  But I need to fix that bug first.
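
For readers who want a concrete picture of the "seek out and kill" step described above, here is a minimal Python sketch. It is not the YETUS-570 code: the function names, the use of `ps`, and the `surefirebooter` pattern in the usage note are all assumptions made for illustration. It lists processes, selects the ones whose command line matches a pattern, writes them to a log, kills them, and returns the count.

```python
import os
import re
import signal
import subprocess


def list_processes():
    """Return (pid, command line) pairs for every process visible to `ps`."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    procs = []
    for line in out.splitlines():
        pid, _, args = line.strip().partition(" ")
        if pid.isdigit():
            procs.append((int(pid), args.strip()))
    return procs


def find_stuck(procs, pattern):
    """Select processes whose command line matches the (hypothetical) pattern."""
    regex = re.compile(pattern)
    return [(pid, args) for pid, args in procs if regex.search(args)]


def reap(procs, pattern, log_path, kill=os.kill):
    """Log and kill matching processes; return how many matched."""
    stuck = find_stuck(procs, pattern)
    with open(log_path, "w") as log:
        for pid, args in stuck:
            log.write(f"{pid} {args}\n")
            try:
                kill(pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # the process exited between listing and killing
    return len(stuck)
```

A call such as `reap(list_processes(), "surefirebooter", "reaper.log")` would produce the kind of count and log shown in the two table rows above; the pattern is a guess at what a forked test JVM looks like, not the one Yetus uses.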

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226133#comment-16226133
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 58s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  9s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  0s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m  1s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  4m 42s{color} | {color:red} The patch generated 322 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 34s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Stuck Test Processes | hadoop-hdfs-project/hadoop-hdfs:21 |
| Failed junit tests | hadoop.fs.TestUnbuffer |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.hdfs.TestDatanodeRegistration |
|   | org.apache.hadoop.hdfs.TestDFSClientFailover |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsTokens |
|   | org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | org.apache.hadoop.hdfs.TestFileAppendRestart |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter |
|   | org.apache.hadoop.hdfs.TestDFSMkdirs |
|   | org.apache.hadoop.hdfs.TestDFSOutputStream |
|   | org.apache.hadoop.hdfs.TestDatanodeReport |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
|   | org.apache.hadoop.hdfs.TestDistributedFileSystem |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSForHA |
|   | org.apache.hadoop.hdfs.web.TestHftpFileSystem |
|   | org.apache.hadoop.hdfs.TestReplaceDatanodeFailureReplication |
|   | org.apache.hadoop.hdfs.TestDFSShell |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSAcl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-12711 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893963/HDFS-12711.branch-2.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  |
| uname | Linux 9ec3abba2a98 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 76ec5ea |
| maven | version: Apache 

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-30 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225562#comment-16225562
 ] 

Sean Busbey commented on HDFS-12711:


Interesting. What's the spread of surefire versions across the Hadoop branches? 
If we are triggering things in HBase, those builds all use a surefire version 
that includes that fix, with the exception of HBase branch-1.1.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-29 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224389#comment-16224389
 ] 

Allen Wittenauer commented on HDFS-12711:
-

SUREFIRE-524 ?

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-29 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224146#comment-16224146
 ] 

Allen Wittenauer commented on HDFS-12711:
-

I've got a hypothesis.

Somewhere in the HDFS code base is a try/catch block that is effectively 
ignoring system exceptions.  A test times out and surefire sends (probably) a 
SIGINT.  The try/catch grabs the exception and tosses it to the side, all the 
while eating CPU and IO.  This situation makes more tests time out.  surefire 
sends more SIGINTs which also either get ignored or never "make it" to the 
process due to CPU being scarce.  Surefire, thinking that those were received, 
fires off even more tests ...

This pattern continues until eventually there is nothing left for surefire 
and/or maven to do but die on their own, leaving lots of unreaped children 
that do nothing but destroy the box.

One thing has been bothering me.   Why are projects like HBase that are using 
openjdk7 + some form of branch-2 code base not seeing these problems?

What if the code path was a less frequently traveled one? A feature that isn't 
heavily used. For the vast majority of committers testing a release, it's 
probably not even tested "for reals", never mind in a hostile environment where 
CPU, IO, whatever is scarce. But the HDFS unit tests (and maybe the MR unit 
tests) would almost certainly hit that path, probably several times over.
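
To make the hypothesis above concrete, here is a small Python analogue. It is not HDFS code and the mechanics differ from the JVM's; it only illustrates the general failure mode of an overly broad handler discarding an interrupt. In Python a SIGINT surfaces as `KeyboardInterrupt`, which a handler for `BaseException` catches along with everything else. Both function names are made up for this sketch.

```python
def run_swallowing(steps):
    """Run every step; the broad handler also eats interrupts."""
    completed = 0
    for step in steps:
        try:
            step()
        except BaseException:
            # Catches KeyboardInterrupt and SystemExit too, so the request
            # to stop is discarded and the loop keeps burning CPU and IO.
            pass
        completed += 1
    return completed


def run_cooperative(steps):
    """Run every step; ordinary failures are tolerated, interrupts propagate."""
    completed = 0
    for step in steps:
        try:
            step()
        except Exception:
            # KeyboardInterrupt is not a subclass of Exception, so an
            # interrupt escapes this handler and stops the loop.
            pass
        completed += 1
    return completed
```

With the first variant, whoever sent the interrupt has no way to know it was ignored, which is the situation the hypothesis describes between surefire and the test JVMs.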

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-28 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223845#comment-16223845
 ] 

Allen Wittenauer commented on HDFS-12711:
-

YETUS-570-wip-01 w/minor output fix test run.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-28 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223483#comment-16223483
 ] 

Allen Wittenauer commented on HDFS-12711:
-

https://issues.apache.org/jira/browse/INFRA-15373?focusedCommentId=16223481=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16223481

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223251#comment-16223251
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 20s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 58s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 10s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  0s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 51s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  2m 40s{color} | {color:red} The patch generated 183 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}113m  4s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestSecureEncryptionZoneWithKMS |
|   | hadoop.hdfs.TestCrcCorruption |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.hdfs.TestMaintenanceState |
|   | org.apache.hadoop.hdfs.TestSetrepIncreasing |
|   | org.apache.hadoop.hdfs.TestRollingUpgradeDowngrade |
|   | org.apache.hadoop.hdfs.TestHDFSFileSystemContract |
|   | org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage |
|   | org.apache.hadoop.hdfs.TestFileCreationDelete |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter |
|   | org.apache.hadoop.hdfs.TestBlockStoragePolicy |
|   | org.apache.hadoop.hdfs.TestDFSOutputStream |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.TestAppendSnapshotTruncate |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr |
|   | org.apache.hadoop.hdfs.TestRollingUpgradeRollback |
|   | org.apache.hadoop.hdfs.TestMiniDFSCluster |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSForHA |
|   | org.apache.hadoop.hdfs.TestDFSShell |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSAcl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-12711 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893963/HDFS-12711.branch-2.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  |
| uname | Linux f9f5a922a080 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 9093ad6 |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_151 |
| unit | 

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223191#comment-16223191
 ] 

Allen Wittenauer commented on HDFS-12711:
-

bq. Max. thread count   3114 

A 5000 proc limit is looking like a realistic number.  It gives some wiggle 
room but shouldn't be so much that it breaks the world, unless things really go 
south or we get unlucky.

Let's try the branch-2 HDFS fake patch and see what happens, now that I've got 
some metrics gathering.
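
As a quick check on the "wiggle room" claim, the arithmetic can be sketched as below. The 3114 peak and the 5000 limit come from this comment; the ~4000 figure is the trunk HDFS spike reported elsewhere in this thread. The `headroom` helper is a name made up for this illustration.

```python
def headroom(limit, peak):
    """Return (spare slots, fraction of the limit used) for an observed peak."""
    return limit - peak, peak / limit


for peak in (3114, 4000):
    spare, used = headroom(5000, peak)
    print(f"limit=5000 peak={peak} spare={spare} used={used:.0%}")
# limit=5000 peak=3114 spare=1886 used=62%
# limit=5000 peak=4000 spare=1000 used=80%
```

So even the larger observed spike leaves roughly a fifth of the limit free, assuming those peaks are representative of a normal run.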



> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223185#comment-16223185
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 144 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m  2s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 13s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 21s{color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient generated 2 new + 95 unchanged - 48 fixed = 97 total (was 143) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 51s{color} | {color:orange} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: The patch generated 137 new + 2573 unchanged - 305 fixed = 2710 total (was 2878) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m  3s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 26s{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 24s{color} | {color:red} The patch generated 4 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 42s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.mapreduce.TestMapCollection |
|   | hadoop.mapreduce.lib.jobcontrol.TestMapReduceJobControl |
| Timed out junit tests | org.apache.hadoop.mapred.pipes.TestPipeApplication |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12711 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894259/MAPREDUCE-4980.016.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  findbugs  checkstyle  |
| uname | Linux 5ca15500a7c1 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8be5707 |
| maven | version: Apache Maven 3.3.9 |
| 

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222841#comment-16222841
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Adding some better metrics gathering for YETUS-561 so we have some raw numbers 
to look at.  Also working on some automatic child killing drones in YETUS-570.

In the process of this, I learned that while running the trunk HDFS unit tests, 
the thread count spikes up to ~4000 threads on its own. I think some tests or 
maybe even some core code needs some major tuning.  It's pretty easy to see that 
if those JVMs were to stall, the thread count is going to climb to ridiculous 
heights.

In other news, all Yetus-based Hadoop jobs on Jenkins are now running this code 
with the process limit set to 5k.  This is likely too few for mapreduce, given 
its proclivity for bringing up multiple clusters in their entirety on a regular 
basis.
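
For anyone who wants to watch the thread count on their own Linux box while the tests run, a rough Python sketch follows. It is an assumption about how such a number could be gathered, not a description of what YETUS-561 does: it sums the `Threads:` field of every `/proc/<pid>/status` file, and both function names are invented for this example.

```python
import glob


def threads_from_status(status_text):
    """Extract the thread count from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    return 0  # field missing, e.g. on a non-Linux system


def total_threads(proc_root="/proc"):
    """Sum the thread counts of all processes currently listed under proc_root."""
    total = 0
    for path in glob.glob(f"{proc_root}/[0-9]*/status"):
        try:
            with open(path) as handle:
                total += threads_from_status(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
    return total
```

Sampling `total_threads()` every few seconds during a test run and keeping the maximum gives a peak that can be compared against the process limit.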


> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch, MAPREDUCE-4980.016.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221741#comment-16221741
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 144 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m  4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 13s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 21s{color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient generated 2 new + 95 unchanged - 48 fixed = 97 total (was 143) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 52s{color} | {color:orange} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: The patch generated 137 new + 2573 unchanged - 305 fixed = 2710 total (was 2878) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 45s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m  8s{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 24s{color} | {color:red} The patch generated 4 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 78m 34s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.mapreduce.TestMapCollection |
|   | hadoop.mapred.TestReduceFetch |
|   | hadoop.mapreduce.v2.TestMRJobs |
|   | hadoop.mapreduce.TestMRJobClient |
| Timed out junit tests | org.apache.hadoop.mapred.pipes.TestPipeApplication |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12711 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894259/MAPREDUCE-4980.016.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  findbugs  checkstyle  |
| uname | Linux 7cd4aa4da144 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221701#comment-16221701
 ] 

Allen Wittenauer commented on HDFS-12711:
-

1500 is too strict.  Trying 3000.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch, MAPREDUCE-4980.016.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221685#comment-16221685
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 144 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  3m 48s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 12s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 23s{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 22s{color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient generated 2 new + 95 unchanged - 48 fixed = 97 total (was 143) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 46s{color} | {color:orange} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: The patch generated 137 new + 2573 unchanged - 305 fixed = 2710 total (was 2878) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 32m 52s{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
18s{color} | {color:red} The patch generated 4 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.mapreduce.TestLargeSort |
|   | hadoop.mapreduce.v2.TestRMNMInfo |
|   | hadoop.mapreduce.v2.TestSpeculativeExecution |
|   | hadoop.mapreduce.v2.TestMRJobsWithProfiler |
|   | hadoop.mapreduce.TestMapCollection |
|   | hadoop.mapreduce.lib.output.TestJobOutputCommitter |
|   | hadoop.mapreduce.v2.TestUberAM |
|   | hadoop.mapreduce.v2.TestMRJobsWithHistoryService |
|   | hadoop.mapreduce.v2.TestMRJobs |
|   | hadoop.mapreduce.v2.TestNonExistentJob |
|   | hadoop.mapreduce.v2.TestMRAppWithCombiner |
|   | hadoop.mapreduce.v2.TestMROldApiJobs |
|   | hadoop.mapreduce.v2.TestMRAMWithNonNormalizedCapabilities |
| Timed out junit tests | org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | 

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221587#comment-16221587
 ] 

Allen Wittenauer commented on HDFS-12711:
-

For later reference, here are two branch-2 patches running on HADOOP that have 
triggered HDFS unit tests:

* https://builds.apache.org/job/PreCommit-hadoop-Build/13585/ , H9
* https://builds.apache.org/job/PreCommit-hadoop-Build/13583/ , H1 


> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch, MAPREDUCE-4980.016.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221575#comment-16221575
 ] 

Allen Wittenauer commented on HDFS-12711:
-

This is a patch whose general behavior I know well: usually one or two 
failures, highly threaded, etc.  So let's see if a 2k proc limit is 
realistic.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch, MAPREDUCE-4980.016.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221500#comment-16221500
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Some status, after taking part of the afternoon to do something besides look at 
this.

a) It's clear that surefire, and therefore both Jenkins and Yetus, doesn't by 
default report on tests that failed to launch under low-memory conditions.  This 
is a bit problematic: we could have had this problem for a very long time and 
never known it.  It'd be great if there were a way to ask surefire what is 
expected to run and then do a pre-/post- comparison.  Something else to 
consider:  there are likely lots and lots of projects that may not know they 
have tests failing in this way...

b) That said, there are two things we might be able to add to Yetus that could 
help:  some sort of reporting of JVMs that are still running after maven 
exits, and killing leftover Java processes in between module unit testing.  
Obviously, this would only be available when running under Docker.

c) This patch is working well enough that I've reconfigured 
Precommit-HADOOP-build to use my private Yetus branch with this patch.  This is 
primarily to give me more general usability data.  I've set the proclimit to 
5000 and the mem limit to 20g (slightly less than half of the node's full 
memory size).  The proclimit is likely too high, but it should give us a lot of 
wiggle room to tune downwards. 

d) As part of deploying this version of Yetus, I also took the opportunity to 
reconfigure the job to be laid out in a more sane manner as well as setting it 
to run on all of the Hadoop build nodes.
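
The pre-/post- comparison wished for in (a) could be approximated outside of 
surefire.  The following is only a sketch under stated assumptions, not 
something surefire or Yetus provides: it assumes test classes live in files 
named Test*.java (a naming convention, not surefire's full include rules) and 
that surefire writes one TEST-<class>.xml report per class that actually ran.  
The function names are hypothetical.

```python
from pathlib import Path


def expected_test_classes(test_root: Path) -> set[str]:
    """Fully qualified names of Test*.java files under a test source root.

    Assumption: the Test*.java pattern identifies the tests expected to run.
    """
    names = set()
    for path in test_root.rglob("Test*.java"):
        relative = path.relative_to(test_root).with_suffix("")
        names.add(".".join(relative.parts))
    return names


def reported_test_classes(report_dir: Path) -> set[str]:
    """Class names that left a TEST-<class>.xml report behind."""
    return {
        path.name[len("TEST-"):-len(".xml")]
        for path in report_dir.glob("TEST-*.xml")
    }


def never_reported(expected: set[str], reported: set[str]) -> list[str]:
    """Tests that were expected to run but produced no report at all."""
    return sorted(expected - reported)
```

Anything returned by never_reported() is a candidate for "never launched", 
which is exactly the class of failure that goes unreported today.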

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221229#comment-16221229
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 34s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 56s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 10s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 30m 49s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:blue}0{color} | {color:blue} asflicense {color} | {color:blue}  0m 14s{color} | {color:blue} ASF License check generated no output? {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 40s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestHftpFileSystem |
| Timed out junit tests | org.apache.hadoop.hdfs.TestBlockStoragePolicy |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.TestRollingUpgradeRollback |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSAcl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-12711 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893963/HDFS-12711.branch-2.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  |
| uname | Linux 9c99ee79afc9 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 4678315 |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_151 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build2/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build2/3/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build2/3/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>





[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221083#comment-16221083
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Note to self:

Kicking off another run with proclimit set to 1500.



> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221077#comment-16221077
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Success!

One interesting thing:  surefire is a liar.  A lot more tests failed than it 
reported.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221076#comment-16221076
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 21s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 47s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 59s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  8s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m  7s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 59s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestFileStatus |
|   | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | hadoop.hdfs.web.TestWebHDFSAcl |
|   | hadoop.hdfs.TestDFSMkdirs |
|   | hadoop.hdfs.TestSnapshotCommands |
|   | hadoop.hdfs.TestRollingUpgradeRollback |
|   | hadoop.hdfs.TestTrashWithEncryptionZones |
|   | hadoop.hdfs.web.TestFSMainOperationsWebHdfs |
|   | hadoop.hdfs.TestBlockStoragePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-12711 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893963/HDFS-12711.branch-2.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  |
| uname | Linux 1ad1a0b77a83 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 4678315 |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_151 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build2/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build2/2/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build2/2/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221069#comment-16221069
 ] 

Allen Wittenauer commented on HDFS-12711:
-


https://builds.apache.org/job/PreCommit-HDFS-Build2/2

This is better or worse, depending upon your point of view.

Many of these:
{code}
testDeleteEZWithMultipleUsers(org.apache.hadoop.hdfs.TestTrashWithEncryptionZones)  Time elapsed: 5.134 sec  <<< ERROR!
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:717)
at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:557)
at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:146)
at io.netty.util.concurrent.AbstractEventExecutorGroup.shutdownGracefully(AbstractEventExecutorGroup.java:69)
at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.close(DatanodeHttpServer.java:272)
at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1986)
at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNode(MiniDFSCluster.java:1868)
at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1858)
at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1837)
at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1811)
at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1804)
at org.apache.hadoop.hdfs.TestTrashWithEncryptionZones.teardown(TestTrashWithEncryptionZones.java:118)
{code}

which eventually turn into these:

{code}
/bin/sh: 1: Cannot fork
/bin/sh: 1: Cannot fork
/bin/sh: 1: Cannot fork
/bin/sh: 1: Cannot fork
/bin/sh: 1: Cannot fork
/bin/sh: 1: Cannot fork
{code}

But that's not its final form!  Oh no, it then turns into reams and reams of 
this:

{code}
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.
# An error report file with more information is saved as:
# /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/hs_err_pid20306.log
#
{code}

Now we just have to wait and see if the walls put up were enough to stop H4 
from crashing.

(the 1k default process limit is *probably* a smidge too low, but not by too 
much.)
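
To see how far a given run got through the three stages quoted above, the 
console log can be tallied line by line.  This is a rough sketch only: the 
patterns are copied from the excerpts in this comment, while the stage names 
and the function are made up for illustration and are not part of Yetus or 
surefire.

```python
import re
from collections import Counter

# Patterns taken from the three excerpts above; the stage names are illustrative.
STAGES = [
    ("thread_exhaustion",
     re.compile(r"OutOfMemoryError: unable to create new native thread")),
    ("fork_failure", re.compile(r"Cannot fork")),
    ("jvm_abort",
     re.compile(r"insufficient memory for the Java Runtime Environment")),
]


def count_stages(lines):
    """Count how many log lines match each failure stage."""
    counts = Counter()
    for line in lines:
        for name, pattern in STAGES:
            if pattern.search(line):
                counts[name] += 1
                break  # a line is attributed to at most one stage
    return counts
```

Feeding it the console output (for example, open(path) on a saved log) gives a 
quick picture of whether a process limit stopped things at the first stage or 
let them progress to JVM aborts.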

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219757#comment-16219757
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Ok, at least locally, YETUS-561.04 is successfully preventing things from 
running wild.  It's pretty obvious at this point that something is very wrong 
with Hadoop code + OpenJDK 7.  Now whether that is only caused by the unit 
tests or something actually in the core code base, I have no idea.  I'll let 
someone else deal with that.

As soon as Jenkins clears itself up, I'll try testing -04 with the Hadoop code 
base on the ASF boxes.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219641#comment-16219641
 ] 

Allen Wittenauer commented on HDFS-12711:
-

lol yeah, that's... too many.

OK, I know what I need to do to at least get this under control.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219618#comment-16219618
 ] 

Íñigo Goiri commented on HDFS-12711:


Almost a hundred java processes hanging around.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219587#comment-16219587
 ] 

Allen Wittenauer commented on HDFS-12711:
-

See how many java processes are on that JDK7 node. Also, is that Oracle JDK7 or 
OpenJDK7?

Thanks!
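
One way to get that count on a node is to tally process names from ps.  This is 
a minimal sketch, assuming a POSIX ps where "-o comm=" prints only the command 
name with no header; the function names are hypothetical.

```python
import subprocess
from collections import Counter


def count_commands(ps_output: str) -> Counter:
    """Count processes per command name, one name per non-empty line."""
    return Counter(
        line.strip() for line in ps_output.splitlines() if line.strip()
    )


def running_java_processes() -> int:
    """Run ps and return how many processes are named 'java'."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return count_commands(result.stdout)["java"]
```

Note that this needs to be able to fork ps, so it is only useful before the 
node reaches the point where the shell itself cannot fork.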


> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219573#comment-16219573
 ] 

Íñigo Goiri commented on HDFS-12711:


[~aw] just to confirm, I just ran the unit tests with OpenJDK 8 and everything 
finished in around 5 hours.
The machine running the tests with OpenJDK 7 is still crawling after 3 days.
So I'd say it is an issue with Java 7.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219512#comment-16219512
 ] 

Allen Wittenauer commented on HDFS-12711:
-

For reference:

{code}
$ free -h
              total        used        free      shared  buff/cache   available
Mem:           9.7G        9.3G        157M        9.0M        265M         39M
Swap:            0B          0B          0B
{code}

I have the docker -m parameter set to 10g, just to see what would happen...
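
For anyone wanting to reproduce this kind of cap by hand, here is a small 
sketch of assembling the docker command line.  "--memory" is the long form of 
docker's -m option; "--pids-limit" is included purely as an assumption about 
how a process cap could be expressed with docker, since nothing here says how 
the patched Yetus enforces its proclimit.  The image name and command in the 
usage note are placeholders.

```python
from typing import List, Optional


def docker_limit_args(memory: str, pids_limit: Optional[int] = None) -> List[str]:
    """Resource-limit flags for `docker run`."""
    args = ["--memory", memory]
    if pids_limit is not None:
        # Assumption: cap the number of processes with docker's own pids limit.
        args += ["--pids-limit", str(pids_limit)]
    return args


def docker_run_command(image: str, command: List[str], memory: str,
                       pids_limit: Optional[int] = None) -> List[str]:
    """Full argv for running a command in a resource-capped container."""
    return ["docker", "run", "--rm",
            *docker_limit_args(memory, pids_limit), image, *command]
```

For example, docker_run_command("example/image", ["mvn", "test"], "10g") builds 
the argv for a run capped at 10g of memory, which could then be passed to 
subprocess.run().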



> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219501#comment-16219501
 ] 

Allen Wittenauer commented on HDFS-12711:
-

OK, I just hit it here. On my laptop's Linux VM, things start falling apart 
when around 35 Java processes are running, almost all of them fired off from 
surefire.  Some things started to fail (I couldn't do a ps!).   Eventually I 
could again, and saw what was happening. Only *one* process had been killed, 
even though the shell couldn't fork for ps.  

Given that my CPU is currently pegged (uptime says the loadavg is over 50), I'm 
guessing the same thing is happening on a bigger scale on the ASF build boxes.  
There aren't enough cycles on the CPU to run the OOM killer fast enough for it 
to kill things.  Eventually, other stuff fails to launch due to insufficient 
memory.  This eventually confuses the Jenkins agent and it all falls apart from 
there.

It *eventually* gets over the hump, but by then it's too late.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219475#comment-16219475
 ] 

Allen Wittenauer commented on HDFS-12711:
-

pono just provided this ps from one of the dead nodes:

https://paste.apache.org/p/2D4I

Umm, yeah, that's not good.  java isn't reaping or there's a fork bomb in one 
of the unit tests.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219416#comment-16219416
 ] 

Allen Wittenauer commented on HDFS-12711:
-

FYI:

I'm reducing the number of nodes that HDFS, HADOOP, and Hadoop's QBT jobs may 
run on, to limit the impact on the rest of the ASF.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219365#comment-16219365
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Something else worth mentioning that I failed to mention earlier:  that run was 
done without the patched version of Yetus. The goal was to reproduce with just 
the hdfs chunk running, and that was accomplished.  Now we just need to see if 
patched Yetus will protect everyone else.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219353#comment-16219353
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Re-launched the agent using the Jenkins UI.  But now Jenkins doesn't appear to 
want to schedule *any* jobs.

Just shoot me.

... In other news, a bit of "inside baseball" that I bet a lot of people don't 
know.

When Yetus launches a docker container, Jenkins doesn't know how to kill it.  
It sends the equivalent of ctrl-c to the docker CLI, but the CLI doesn't seem to 
respond to it. So the Docker container *continues to run*. (That is why "timed 
out" tasks will still run if they are running in their docker armor.) In qbt mode, there is 
no JIRA to write to.  So the output is actually handled by Jenkins.  In 
test-patch mode, Yetus has a JIRA to write output to.  What we are seeing is 
that Yetus is continuing to run, finishes, then says "yeah, a bunch of stuff 
failed.  fix your code."  Meanwhile, outside the container, it's death and 
destruction and the loss of the Jenkins agent and probably other stuff.

But this does mean, at least in this run, that it was *NOT* a kernel panic, 
because otherwise we would never have gotten any feedback at all.  That's 
fantastic news because it means there are likely some controls that can be put 
around it; it's just a matter of whether they are OS-level/infra or docker-related.

It's worth noting that from what I can tell, surefire will report OOM'd and/or 
otherwise externally killed tests as "timed out".  So there was still a lot of 
death and destruction inside the container as well.
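
A minimal sketch of one way around the kill problem described above; this is 
not Yetus or Jenkins code. The wrapper gives the container a known name and, 
when it receives SIGINT or SIGTERM, asks the Docker daemon to kill that 
container rather than relying on the docker CLI to pass the signal along. The 
function names are hypothetical; `docker run --name` and `docker kill` are the 
only Docker commands assumed.

```python
import signal
import subprocess
import sys


def docker_run_argv(name, image, command):
    """Build a `docker run` command line that gives the container a known name."""
    return ["docker", "run", "--rm", "--name", name, image] + list(command)


def docker_kill_argv(name):
    """Build the command that asks the Docker daemon to kill the named container."""
    return ["docker", "kill", name]


def run_with_cleanup(name, image, command):
    """Run the container; if this wrapper is interrupted, kill the container by name."""
    def on_signal(signum, frame):
        # `docker kill` goes through the daemon, so it does not depend on the
        # original docker CLI process reacting to the signal.
        subprocess.run(docker_kill_argv(name), check=False)
        sys.exit(128 + signum)

    signal.signal(signal.SIGINT, on_signal)
    signal.signal(signal.SIGTERM, on_signal)
    return subprocess.run(docker_run_argv(name, image, command)).returncode
```

A caller would invoke something like `run_with_cleanup("precommit-hdfs", 
"hadoop-build-env", ["mvn", "test"])`, where both the container name and the 
image name are made up for illustration.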

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219295#comment-16219295
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Uh, that was unexpected.

According to Jenkins, that node is dead!

Maybe there is hope after all.

Just need to give some cycles to 
https://builds.apache.org/job/PreCommit-HDFS-Build2/ to test if the OOM 
protection features in Yetus work.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 2.9.0, 2.8.2
>Reporter: Allen Wittenauer
>Priority: Critical
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219285#comment-16219285
 ] 

Hadoop QA commented on HDFS-12711:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 20s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 47s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 11s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}184m 46s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  2m 53s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}204m  0s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hdfs.TestEncryptionZones |
|   | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.hdfs.TestHdfsAdmin |
|   | org.apache.hadoop.hdfs.TestDatanodeRegistration |
|   | org.apache.hadoop.hdfs.TestParallelRead |
|   | org.apache.hadoop.hdfs.TestMaintenanceState |
|   | org.apache.hadoop.hdfs.TestBlocksScheduledCounter |
|   | org.apache.hadoop.hdfs.TestMultiThreadedHflush |
|   | org.apache.hadoop.hdfs.TestReservedRawPaths |
|   | org.apache.hadoop.hdfs.TestSetrepIncreasing |
|   | org.apache.hadoop.hdfs.TestReplication |
|   | org.apache.hadoop.hdfs.TestDataTransferKeepalive |
|   | org.apache.hadoop.hdfs.TestDatanodeDeath |
|   | org.apache.hadoop.hdfs.TestFileAppend |
|   | org.apache.hadoop.hdfs.TestBlockMissingException |
|   | org.apache.hadoop.hdfs.TestFileAppend4 |
|   | org.apache.hadoop.hdfs.TestRollingUpgradeDowngrade |
|   | org.apache.hadoop.hdfs.TestHDFSFileSystemContract |
|   | org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage |
|   | org.apache.hadoop.hdfs.TestDFSPermission |
|   | org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | org.apache.hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | org.apache.hadoop.hdfs.TestCrcCorruption |
|   | org.apache.hadoop.hdfs.TestFileCreationDelete |
|   | org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS |
|   | org.apache.hadoop.hdfs.TestDFSAddressConfig |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter |
|   | org.apache.hadoop.hdfs.TestLeaseRecovery |
|   | org.apache.hadoop.hdfs.TestBlockStoragePolicy |
|   | org.apache.hadoop.hdfs.TestDatanodeConfig |
|   | org.apache.hadoop.hdfs.TestSeekBug |
|   | org.apache.hadoop.hdfs.TestDFSOutputStream |
|   | org.apache.hadoop.hdfs.TestDFSUpgrade |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr |
|   | org.apache.hadoop.hdfs.TestRollingUpgradeRollback |
|   | 

[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219243#comment-16219243
 ] 

Allen Wittenauer commented on HDFS-12711:
-

OK, messing with OOM from the docker perspective is NOT protecting from this 
crash.  I'm at the point where unless I can reproduce this locally, I'm not 
going to be able to diagnose it any further.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219178#comment-16219178
 ] 

stack commented on HDFS-12711:
--

Rah rah [~aw]! Thanks for digging in.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219103#comment-16219103
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Total run time was a bit over an hour.  Much easier now to reproduce.  Gonna 
set up a new job on jenkins so that I have more control over the situation.

It's also interesting to note that a trunk job just failed, but it seemed to be 
at a different spot in the hdfs-unit tests.  Unrelated? Probably.  The gear 
being used in the ASF build inf is probably getting old, s



> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219099#comment-16219099
 ] 

Allen Wittenauer commented on HDFS-12711:
-

Hooray it crashed!


> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219020#comment-16219020
 ] 

Allen Wittenauer commented on HDFS-12711:
-

The problem that we're seeing is that if we run the full gamut of Yetus tests 
on branch-2, it invariably hard crashes the node on the ASF build 
infrastructure.  Always. Every time.  It also always happens during the hdfs 
unit tests. I'm attempting to isolate it to see if it is *just* the HDFS unit 
tests that trigger the crash or if I need to run through common, etc, first 
too.  This will do two things:

* drop the reproducible test case down from 5+ hours to 1+ hours
* confirm that it is entirely in the hdfs unit tests and not from something 
else in the path

My ultimate goal is to get Yetus to configure the Docker container to at least 
prevent the crashes.  But I need a 'faster' way to test...

FWIW, I'm still unsure if it is a kernel panic or just breaking Jenkins enough 
that it thinks the node is catatonic.  From the *one* time I was able to see 
logs before they disappeared, there were a ton of OOM errors, a core dump, and 
more.  This leads me to believe that it is likely a kernel panic caused by the 
OOM killer going nuts, since it's well established how badly the Linux kernel 
behaves under low mem.  (Thus why I can't really test at home either... I'm not 
using Linux on my "big box")
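
As a hedged sketch of what "configure the Docker container" could look like 
(this is an illustration, not how Yetus actually does it), the function below 
builds a `docker run` command line with a memory cap and a process cap. 
`--memory`, `--memory-swap`, and `--pids-limit` are standard `docker run` 
options; the default limit values and the function name are assumptions chosen 
only for the example.

```python
def limited_docker_run_argv(image, command, memory="20g", pids_limit=5000):
    """Return a `docker run` argv with memory and process-count caps."""
    argv = [
        "docker", "run", "--rm",
        # Hard cap on container memory: exceeding it triggers an OOM kill
        # scoped to the container's cgroup rather than host-wide pressure.
        "--memory", memory,
        # Same value as --memory, so the container gets no additional swap.
        "--memory-swap", memory,
        # Cap on the number of processes and threads the container may create.
        "--pids-limit", str(pids_limit),
        image,
    ]
    return argv + list(command)
```

Whether limits like these are enough to keep a node alive is exactly the open 
question in this issue; the sketch only shows where such controls would go.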

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219009#comment-16219009
 ] 

Anu Engineer commented on HDFS-12711:
-

[~aw] [~daryn] [~kihwal] Care to enlighten us lesser souls about how this empty 
line impacts tests?

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218997#comment-16218997
 ] 

Íñigo Goiri commented on HDFS-12711:


[~aw] the unit tests seemed to be running a week ago with Java 8.
I started them 2 days ago with Java 7 and I'm still only halfway through.
Haven't double checked but the OpenJDK 7 issue seems to match.
I can start running again with Java 8 to see.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218914#comment-16218914
 ] 

Allen Wittenauer commented on HDFS-12711:
-

FWIW, my current hypothesis is that we're triggering a bug in OpenJDK 7.  If 
others could run some tests against it, that'd be super.  I don't have the 
equipment here at home to really test it properly and I'm certainly not going 
to burn money on AWS/Azure/whatever on it.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218895#comment-16218895
 ] 

Allen Wittenauer commented on HDFS-12711:
-

BTW, there are a bunch of tests that still use build/test/data in HDFS.  I wouldn't 
be surprised if that's where some of the flaky tests are.  Meanwhile, back to 
trying to figure out if I can break a node just running them... *sigh*
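
For anyone who wants to find those tests, a small sketch that walks a source 
tree and lists the Java files containing the literal string build/test/data. 
The function name and the choice to match only a plain substring are 
assumptions; paths built up from several pieces would not be found this way.

```python
import os


def find_files_mentioning(root, needle="build/test/data", suffix=".java"):
    """Return sorted paths under `root` whose text contains `needle`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(suffix):
                continue
            path = os.path.join(dirpath, filename)
            # Ignore undecodable bytes; only a plain ASCII literal is searched for.
            with open(path, encoding="utf-8", errors="ignore") as handle:
                if needle in handle.read():
                    hits.append(path)
    return sorted(hits)
```

Pointing it at the hadoop-hdfs-project directory of a checkout would give a 
starting list to review by hand.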

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218892#comment-16218892
 ] 

Kihwal Lee commented on HDFS-12711:
---

bq. -1 on adding Skynet logic to the hdfs tests.
+2 to make it +1. We should make it deadlier.  Jokes aside, I've noticed 
TestPread hits maven timeout in 2.8. It passes in trunk.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
> Attachments: HDFS-12711.branch-2.00.patch
>
>







[jira] [Commented] (HDFS-12711) deadly hdfs test

2017-10-25 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218869#comment-16218869
 ] 

Daryn Sharp commented on HDFS-12711:


-1 on adding Skynet logic to the hdfs tests.

> deadly hdfs test
> 
>
> Key: HDFS-12711
> URL: https://issues.apache.org/jira/browse/HDFS-12711
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Allen Wittenauer
>



