[jira] [Commented] (YARN-4385) TestDistributedShell times out

2015-12-19 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15065598#comment-15065598
 ] 

Naganarasimha G R commented on YARN-4385:
-

Hit one more intermittent failure on the YARN-2928 branch, though it is not 
related to the ATS v2 code:
{code}
--
 T E S T S
---
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 476.165 sec <<< FAILURE! - in org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
testDSShellWithDomain(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)  Time elapsed: 29.211 sec  <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.junit.Assert.assertEquals(Assert.java:542)
    at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV1(TestDistributedShell.java:356)
    at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:317)
    at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithDomain(TestDistributedShell.java:195)

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.703 sec - in org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.yarn.applications.distributedshell.TestDSAppMaster
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.508 sec - in org.apache.hadoop.yarn.applications.distributedshell.TestDSAppMaster

Results :

Failed tests: 
  TestDistributedShell.testDSShellWithDomain:195->testDSShell:317->checkTimelineV1:356 expected:<2> but was:<3>

Tests run: 16, Failures: 1, Errors: 0, Skipped: 0
{code}
{{TestDistributedShell.checkTimelineV1}} checks that exactly 2 containers (the 
number requested) are launched, but in reality more than 2 are being launched. 
Possible reasons:
* The RM assigned additional containers and the Distributed Shell AM launched 
them. I have observed similar over-assignment in MR as well, but the MR AM 
takes care of returning the extra containers assigned by the RM. A similar 
approach should exist in the Distributed Shell AM too.
* A container was killed for some reason and an extra container was started.
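
A minimal sketch of the first case, assuming the AM simply counts the containers it has accepted and hands back anything beyond the requested number. {{ContainerReleaser}} and {{SurplusAwareAllocator}} are hypothetical names, not Hadoop classes, and container ids are plain strings here; in the real Distributed Shell AM the release would presumably go through {{AMRMClientAsync#releaseAssignedContainer}}, which is an assumption about where this would plug in and not something verified against the branch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the call that returns a container to the RM. */
interface ContainerReleaser {
    void release(String containerId);
}

/** Hypothetical helper: accepts at most 'requested' containers, releases the rest. */
class SurplusAwareAllocator {
    private final int requested;
    private final ContainerReleaser releaser;
    private int accepted = 0;

    SurplusAwareAllocator(int requested, ContainerReleaser releaser) {
        this.requested = requested;
        this.releaser = releaser;
    }

    /**
     * Called for every batch the RM hands out. Returns the containers that
     * should be launched; any container beyond the requested total is released.
     * Synchronized because allocation callbacks may arrive on different threads.
     */
    synchronized List<String> onContainersAllocated(List<String> allocated) {
        List<String> toLaunch = new ArrayList<>();
        for (String id : allocated) {
            if (accepted < requested) {
                accepted++;
                toLaunch.add(id);
            } else {
                releaser.release(id);
            }
        }
        return toLaunch;
    }

    synchronized int acceptedCount() {
        return accepted;
    }
}

class SurplusAwareAllocatorDemo {
    public static void main(String[] args) {
        List<String> released = new ArrayList<>();
        SurplusAwareAllocator allocator = new SurplusAwareAllocator(2, released::add);
        List<String> launch = allocator.onContainersAllocated(List.of("c1", "c2", "c3"));
        System.out.println("launch=" + launch + " released=" + released);
    }
}
```

This only covers over-assignment by the RM. It does nothing for the second case, where a replacement container is started after an earlier one was killed.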

I am not sure which of these cases is causing the additional containers to be 
assigned; analyzing this requires more RM and AM logs.
Possible solutions:
* Instead of checking for exactly 2, check for at least 2, so that the test 
case does not fail if more than 2 containers are launched.
* Ensure that no more than the desired number of containers are launched, even 
when the RM allocates more.
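
The first option could look roughly like the sketch below. {{ContainerCountCheck}} and its message format are hypothetical and written in plain Java so the snippet stands alone; inside the actual test the same check would more likely be a JUnit {{Assert.assertTrue(message, actual >= 2)}} in place of the current {{assertEquals}}.

```java
/** Hypothetical helper; the name and message format are illustrative only. */
final class ContainerCountCheck {
    private ContainerCountCheck() {
    }

    /** Fails only when fewer than expectedMin items were observed. */
    static void assertAtLeast(String what, int expectedMin, int actual) {
        if (actual < expectedMin) {
            throw new AssertionError(what + " expected at least:<" + expectedMin
                + "> but was:<" + actual + ">");
        }
    }
}
```

With this, the run above (2 requested, 3 observed) would pass. The trade-off is that the test no longer notices over-allocation at all, so it works around the flakiness without explaining it.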
 

> TestDistributedShell times out
> --
>
> Key: YARN-4385
> URL: https://issues.apache.org/jira/browse/YARN-4385
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Tsuyoshi Ozawa
>Assignee: Naganarasimha G R
> Attachments: 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell-output.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4385) TestDistributedShell times out

2015-12-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062444#comment-15062444
 ] 

Naganarasimha G R commented on YARN-4385:
-

Hi [~ozawa], 
Please confirm whether this is reproducible; if not, I am planning to disable it!



[jira] [Commented] (YARN-4385) TestDistributedShell times out

2015-12-12 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054206#comment-15054206
 ] 

Naganarasimha G R commented on YARN-4385:
-

Hi [~ozawa],
Is this still reproducible? I tried many times but was not able to reproduce 
it. The only related exception I found in the logs is the one below, and from 
it I suspect this was a temporary issue on your machine. Please confirm so 
that I can analyze further.
{code}
2015-11-22 19:29:54,739 DEBUG [IPC Client (273924) connection to ip-172-31-20-42.ap-northeast-1.compute.internal/172.31.20.42:52028 from ubuntu] ipc.Client (Client.java:close(1208)) - closing ipc connection to ip-172-31-20-42.ap-northeast-1.compute.internal/172.31.20.42:52028: null
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1110)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1005)
2015-11-22 19:29:54,739 DEBUG [IPC Client (273924) connection to ip-172-31-20-42.ap-northeast-1.compute.internal/172.31.20.42:52028 from ubuntu] ipc.Client (Client.java:close(1217)) - IPC Client (273924) connection to ip-172-31-20-42.ap-northeast-1.compute.internal/172.31.20.42:52028 from ubuntu: closed
2015-11-22 19:29:54,739 DEBUG [IPC Client (273924) connection to ip-172-31-20-42.ap-northeast-1.compute.internal/172.31.20.42:52028 from ubuntu] ipc.Client (Client.java:run(1018)) - IPC Client (273924) connection to ip-172-31-20-42.ap-northeast-1.compute.internal/172.31.20.42:52028 from ubuntu: stopped, remaining connections 0
2015-11-22 19:29:54,743 DEBUG [Thread-3684] retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(151)) - Exception while invoking getApplicationReport of class ApplicationClientProtocolPBClientImpl over null. Retrying after sleeping for 3ms.
java.io.EOFException: End of File Exception between local host is: "ip-172-31-20-42.ap-northeast-1.compute.internal/172.31.20.42"; destination host is: "ip-172-31-20-42.ap-northeast-1.compute.internal":52028; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
    at org.apache.hadoop.ipc.Client.call(Client.java:1452)
    at org.apache.hadoop.ipc.Client.call(Client.java:1385)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy87.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:220)
    at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy88.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:446)
    at org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:740)
    at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:715)
    at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithCustomLogPropertyFile(TestDistributedShell.java:502)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at

[jira] [Commented] (YARN-4385) TestDistributedShell times out

2015-12-07 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046092#comment-15046092
 ] 

Naganarasimha G R commented on YARN-4385:
-

Hi [~ozawa],
I would like to take a look at this, as it is related to another JIRA I have 
been working on. Please reassign it if you are already handling it.




[jira] [Commented] (YARN-4385) TestDistributedShell times out

2015-11-23 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022660#comment-15022660
 ] 

Tsuyoshi Ozawa commented on YARN-4385:
--

From https://builds.apache.org/job/Hadoop-Yarn-trunk/1380/

{quote}

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 11262 lines...]
  TestDistributedShell.setup:72->setupInternal:94 » YarnRuntime java.io.IOExcept...
  TestDistributedShell.setup:72->setupInternal:94 » YarnRuntime java.io.IOExcept...
  TestDistributedShell.setup:72->setupInternal:94 » YarnRuntime java.io.IOExcept...
  TestDistributedShell.setup:72->setupInternal:94 » YarnRuntime java.io.IOExcept...
  TestDistributedShell.setup:72->setupInternal:94 » YarnRuntime java.io.IOExcept...
  TestDistributedShell.setup:72->setupInternal:94 » YarnRuntime java.io.IOExcept...
  TestDistributedShell.setup:72->setupInternal:94 » YarnRuntime java.io.IOExcept...
  TestDistributedShellWithNodeLabels.setup:47 » YarnRuntime java.io.IOException:...

Tests run: 14, Failures: 0, Errors: 12, Skipped: 0

[INFO] 
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop YARN  SUCCESS [  4.803 s]
[INFO] Apache Hadoop YARN API  SUCCESS [04:44 min]
[INFO] Apache Hadoop YARN Common . SUCCESS [03:31 min]
[INFO] Apache Hadoop YARN Server . SUCCESS [  0.109 s]
[INFO] Apache Hadoop YARN Server Common .. SUCCESS [ 57.348 s]
[INFO] Apache Hadoop YARN NodeManager  SUCCESS [10:05 min]
[INFO] Apache Hadoop YARN Web Proxy .. SUCCESS [ 29.458 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService .. SUCCESS [03:46 min]
[INFO] Apache Hadoop YARN ResourceManager  SUCCESS [  01:03 h]
[INFO] Apache Hadoop YARN Server Tests ... SUCCESS [01:52 min]
[INFO] Apache Hadoop YARN Client . SUCCESS [07:21 min]
[INFO] Apache Hadoop YARN SharedCacheManager . SUCCESS [ 32.136 s]
[INFO] Apache Hadoop YARN Applications ... SUCCESS [  0.053 s]
[INFO] Apache Hadoop YARN DistributedShell ... FAILURE [ 29.403 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher .. SKIPPED
[INFO] Apache Hadoop YARN Site ... SKIPPED
[INFO] Apache Hadoop YARN Registry ... SKIPPED
[INFO] Apache Hadoop YARN Project  SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 01:37 h
[INFO] Finished at: 2015-11-09T20:36:25+00:00
[INFO] Final Memory: 81M/690M
[INFO] 
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on project hadoop-yarn-applications-distributedshell: There are test failures.
[ERROR]
[ERROR] Please refer to /home/jenkins/jenkins-slave/workspace/Hadoop-Yarn-trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-yarn-applications-distributedshell
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating HDFS-9234
Sending e-mails to: yarn-...@hadoop.apache.org
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
12 tests failed.
FAILED:  org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithInvalidArgs

Error Message:
java.io.IOException: ResourceManager failed to start. Final state is STOPPED

Stack Trace:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: ResourceManager failed to start. Final state is STOPPED
    at org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:331)
    at