[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-06-27 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045574#comment-14045574
 ] 

Hong Zhiguo commented on YARN-1872:
---

Hi, Vinod, this is not just a test failure. It occurs frequently in our real 
cluster. When this happens, the application remains running forever until it's 
killed manually.

And I don't think the fix is just a workaround until YARN-1902 gets in. It 
makes DistributedShell make fewer assumptions about the outside, and makes it 
more tolerant of unexpected behavior from the outside.

 TestDistributedShell occasionally fails in trunk
 

 Key: YARN-1872
 URL: https://issues.apache.org/jira/browse/YARN-1872
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Hong Zhiguo
 Attachments: TestDistributedShell.out, YARN-1872.patch


 From https://builds.apache.org/job/Hadoop-Yarn-trunk/520/console :
 TestDistributedShell#testDSShellWithCustomLogPropertyFile failed and 
 TestDistributedShell#testDSShell timed out.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-05-20 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004373#comment-14004373
 ] 

Hong Zhiguo commented on YARN-1872:
---

Binglin, I got the same failure. The symptom and cause of your failure are 
different from the one reported by Ted Yu.
I fixed it in YARN-2081.



[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-05-15 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998493#comment-13998493
 ] 

Binglin Chang commented on YARN-1872:
-

Hi, testDSShell fails with an assertion failure; I don't know whether it is relevant:

https://builds.apache.org/job/Hadoop-Yarn-trunk/561/consoleText

testDSShell(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)  Time elapsed: 27.557 sec   FAILURE!
java.lang.AssertionError: expected:<1> but was:<0>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.junit.Assert.assertEquals(Assert.java:542)
    at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:198)


Results :

Failed tests: 
  TestDistributedShell.testDSShell:198 expected:<1> but was:<0>

Tests run: 8, Failures: 1, Errors: 0, Skipped: 0
 



[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-05-02 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13988106#comment-13988106
 ] 

Tsuyoshi OZAWA commented on YARN-1872:
--

As a workaround for the 2.4.1 release, +1 (non-binding) for [~zhiguohong]'s patch.

[~zjshen], [~ste...@apache.org], how about fixing the root cause in YARN-1902 
for the 2.5.0 release, as Hong mentioned? What do you think?

 TestDistributedShell occasionally fails in trunk
 

 Key: YARN-1872
 URL: https://issues.apache.org/jira/browse/YARN-1872
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: Hong Zhiguo
Priority: Blocker
 Attachments: TestDistributedShell.out, YARN-1872.patch


 From https://builds.apache.org/job/Hadoop-Yarn-trunk/520/console :
 TestDistributedShell#testDSShellWithCustomLogPropertyFile failed and 
 TestDistributedShell#testDSShell timed out.





[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-04-23 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978975#comment-13978975
 ] 

Steve Loughran commented on YARN-1872:
--

All current AMs have to deal with this event; surplus containers need to be 
released. Does the AM do this once patched? This is important not so much for 
the test as because Distributed Shell is the foundation of most YARN apps: it 
needs the robustness for others to pick up.
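
As a rough illustration of what "releasing surplus containers" could look like 
inside an AM, here is a minimal sketch in plain Java. It is not the 
DistributedShell code and not the attached patch: the class name, the 
`releaser` callback and the string container ids are all assumptions made for 
the example. In a real AM the release step would go through the client library 
(for instance `releaseAssignedContainer` on AMRMClientAsync).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of an AM that tolerates over-allocation; not Hadoop code.
final class SurplusContainerSketch {
    private final int numTotalContainers;      // how many containers the AM wants
    private final Consumer<String> releaser;   // stand-in for "release this container"
    final List<String> launched = new ArrayList<>();

    SurplusContainerSketch(int numTotalContainers, Consumer<String> releaser) {
        this.numTotalContainers = numTotalContainers;
        this.releaser = releaser;
    }

    // Called with each batch of container ids handed back by the RM.
    synchronized void onContainersAllocated(List<String> allocated) {
        for (String id : allocated) {
            if (launched.size() < numTotalContainers) {
                launched.add(id);        // still needed: launch it
            } else {
                releaser.accept(id);     // surplus: give it back to the RM
            }
        }
    }

    public static void main(String[] args) {
        List<String> released = new ArrayList<>();
        SurplusContainerSketch am = new SurplusContainerSketch(2, released::add);
        am.onContainersAllocated(Arrays.asList("c1", "c2", "c3"));
        // prints: launched=[c1, c2] released=[c3]
        System.out.println("launched=" + am.launched + " released=" + released);
    }
}
```

The point of the sketch is only that the AM decides per container whether it 
is still needed, so an unexpected extra allocation is returned instead of 
being launched or silently held.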



[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-04-23 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979297#comment-13979297
 ] 

Zhijie Shen commented on YARN-1872:
---

bq. all current AMs have to deal with this event;

This is why I'm thinking we should fix the issue in the scope of 
AMRMClient(Async), to isolate it from AMs. If the AM uses the client library, 
developers don't need to worry about it. On the other hand, AMs that interact 
with ApplicationMasterProtocol directly (like MR) have to work it out 
themselves.
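
To make the "fix it in the client library" idea concrete, here is a small 
sketch in plain Java of a filter sitting between the RM response and the AM 
callback. It is only an assumption about how such isolation might be 
structured, not the AMRMClient(Async) implementation or the YARN-1902 patch; 
`AllocationFilterSketch`, `addContainerRequest` as used here, and the two 
callbacks are hypothetical names for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical client-side filter; not the real AMRMClient(Async).
final class AllocationFilterSketch {
    private int outstanding;                          // requested but not yet delivered
    private final Consumer<List<String>> amCallback;  // what the AM would implement
    private final Consumer<String> releaser;          // stand-in for releasing to the RM

    AllocationFilterSketch(Consumer<List<String>> amCallback, Consumer<String> releaser) {
        this.amCallback = amCallback;
        this.releaser = releaser;
    }

    // The AM asks for one more container.
    synchronized void addContainerRequest() {
        outstanding++;
    }

    // The library receives a batch from the RM and forwards only what was asked for.
    synchronized void onAllocated(List<String> allocated) {
        List<String> wanted = new ArrayList<>();
        for (String id : allocated) {
            if (outstanding > 0) {
                outstanding--;
                wanted.add(id);
            } else {
                releaser.accept(id);   // never shown to the AM
            }
        }
        if (!wanted.isEmpty()) {
            amCallback.accept(wanted);
        }
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        List<String> released = new ArrayList<>();
        AllocationFilterSketch client =
            new AllocationFilterSketch(delivered::addAll, released::add);
        client.addContainerRequest();
        client.addContainerRequest();
        client.onAllocated(Arrays.asList("c1", "c2", "c3"));
        // prints: delivered=[c1, c2] released=[c3]
        System.out.println("delivered=" + delivered + " released=" + released);
    }
}
```

With this shape the AM callback never sees more containers than it requested, 
which is the isolation described above; an AM that bypasses the library would 
still need its own check.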




[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-04-07 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962528#comment-13962528
 ] 

Hong Zhiguo commented on YARN-1872:
---

I suggest taking this patch, since it makes DistributedShell more robust 
regardless of AMRMClient accuracy.
And it really fixes this failure.

The MapReduce V2 AM has similar logic to tolerate extra allocations.



[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-04-04 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960357#comment-13960357
 ] 

Zhijie Shen commented on YARN-1872:
---

bq. After the DistributedShell AM requested numTotalContainers containers, RM 
may allocate more than that.

[~zhiguohong], thanks for working on the test failure. Do you know why the RM 
is likely to allocate more containers than the AM requested? Is it related to 
what YARN-1902 describes?



[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-04-04 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960797#comment-13960797
 ] 

Hong Zhiguo commented on YARN-1872:
---

Yes, it is. And the MapReduce V2 AM contains some code to work around this 
strange behavior.
I'll review the YARN-1902 patch later.
But anyway, it's better to move the check inside the loop (which is what this 
patch does).
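
The patch itself is not shown in this thread, so the following is only a 
hypothetical illustration, not the actual change. It contrasts one exact 
equality check made after a batch loop with a threshold check made inside the 
loop, assuming the completed count can overshoot `numTotalContainers` when the 
RM allocates extra containers. All class and method names are made up for the 
example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical completion tracking; not the DistributedShell ApplicationMaster.
final class CompletionCheckSketch {
    private final int numTotalContainers;
    private final AtomicInteger numCompleted = new AtomicInteger();
    private volatile boolean done;

    CompletionCheckSketch(int numTotalContainers) {
        this.numTotalContainers = numTotalContainers;
    }

    // Fragile variant: count the whole batch, then test for exact equality once.
    void onCompletedExact(int batchSize) {
        for (int i = 0; i < batchSize; i++) {
            numCompleted.incrementAndGet();
        }
        if (numCompleted.get() == numTotalContainers) {
            done = true;   // missed if the count jumps past the target
        }
    }

    // Tolerant variant: test inside the loop, and accept "at least" the target.
    void onCompletedTolerant(int batchSize) {
        for (int i = 0; i < batchSize; i++) {
            if (numCompleted.incrementAndGet() >= numTotalContainers) {
                done = true;
            }
        }
    }

    boolean isDone() {
        return done;
    }

    public static void main(String[] args) {
        CompletionCheckSketch exact = new CompletionCheckSketch(2);
        exact.onCompletedExact(3);        // one surplus container completed
        CompletionCheckSketch tolerant = new CompletionCheckSketch(2);
        tolerant.onCompletedTolerant(3);
        // prints: exact=false tolerant=true
        System.out.println("exact=" + exact.isDone() + " tolerant=" + tolerant.isDone());
    }
}
```

Under that assumption, the fragile variant never sets `done`, which would 
match an application that keeps running until it is killed manually.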




[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-04-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13958485#comment-13958485
 ] 

Hadoop QA commented on YARN-1872:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12638402/YARN-1872.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3508//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3508//console

This message is automatically generated.

 TestDistributedShell occasionally fails in trunk
 

 Key: YARN-1872
 URL: https://issues.apache.org/jira/browse/YARN-1872
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: Hong Zhiguo
  Labels: patch
 Attachments: TestDistributedShell.out, YARN-1872.patch


 From https://builds.apache.org/job/Hadoop-Yarn-trunk/520/console :
 TestDistributedShell#testDSShellWithCustomLogPropertyFile failed and 
 TestDistributedShell#testDSShell timed out.





[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-03-31 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13956122#comment-13956122
 ] 

Hong Zhiguo commented on YARN-1872:
---

I hit the timeout too, but I can't reproduce it.
Can you reproduce the timeout? If so, could you attach the 
TestDistributedShell-output.txt from surefire-reports?

 TestDistributedShell occasionally fails in trunk
 

 Key: YARN-1872
 URL: https://issues.apache.org/jira/browse/YARN-1872
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
 Attachments: TestDistributedShell.out


 From https://builds.apache.org/job/Hadoop-Yarn-trunk/520/console :
 TestDistributedShell#testDSShellWithCustomLogPropertyFile failed and 
 TestDistributedShell#testDSShell timed out.





[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-03-26 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947649#comment-13947649
 ] 

Tsuyoshi OZAWA commented on YARN-1872:
--

I faced this issue too. But the failure is a bit different from Ted's report:

{code}
---
Test set: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
---
Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 74.081 sec  FAILURE! - in org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
testDSShell(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)  Time elapsed: 0.032 sec   FAILURE!
java.lang.AssertionError: null
    at org.junit.Assert.fail(Assert.java:92)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.junit.Assert.assertTrue(Assert.java:54)
    at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:186)
{code}




[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-03-26 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948489#comment-13948489
 ] 

Tsuyoshi OZAWA commented on YARN-1872:
--

The failure I reported is fixed in YARN-1873. Please ignore it; this JIRA 
should focus on the bug Ted reported.
