[ 
https://issues.apache.org/jira/browse/FLINK-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900419#comment-14900419
 ] 

Robert Metzger commented on FLINK-2392:
---------------------------------------

This one failed because the TaskManager hit an OutOfMemoryError (Java heap
space) while allocating its heap memory during startup:

{code}
16:58:16,631 ERROR org.apache.flink.runtime.taskmanager.TaskManager              - Error while starting up taskManager
java.lang.Exception: OutOfMemory error (Java heap space) while allocating the TaskManager heap memory (17973782 bytes).
        at org.apache.flink.runtime.taskmanager.TaskManager$.startTaskManagerComponentsAndActor(TaskManager.scala:1651)
        at org.apache.flink.runtime.taskmanager.TaskManager$.runTaskManager(TaskManager.scala:1460)
        at org.apache.flink.runtime.taskmanager.TaskManager$.selectNetworkInterfaceAndRunTaskManager(TaskManager.scala:1326)
        at org.apache.flink.runtime.taskmanager.TaskManager.selectNetworkInterfaceAndRunTaskManager(TaskManager.scala)
        at org.apache.flink.yarn.appMaster.YarnTaskManagerRunner$1.run(YarnTaskManagerRunner.java:99)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1608)
        at org.apache.flink.yarn.appMaster.YarnTaskManagerRunner.main(YarnTaskManagerRunner.java:95)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.flink.runtime.memory.MemoryManager$HeapMemoryPool.<init>(MemoryManager.java:611)
        at org.apache.flink.runtime.memory.MemoryManager.<init>(MemoryManager.java:163)
        at org.apache.flink.runtime.taskmanager.TaskManager$.startTaskManagerComponentsAndActor(TaskManager.scala:1640)
        ... 8 more
16:58:16,634 ERROR org.apache.flink.yarn.appMaster.YarnTaskManagerRunner         - Error while starting the TaskManager
java.lang.Exception: OutOfMemory error (Java heap space) while allocating the TaskManager heap memory (17973782 bytes).
        at org.apache.flink.runtime.taskmanager.TaskManager$.startTaskManagerComponentsAndActor(TaskManager.scala:1651)
        at org.apache.flink.runtime.taskmanager.TaskManager$.runTaskManager(TaskManager.scala:1460)
        at org.apache.flink.runtime.taskmanager.TaskManager$.selectNetworkInterfaceAndRunTaskManager(TaskManager.scala:1326)
        at org.apache.flink.runtime.taskmanager.TaskManager.selectNetworkInterfaceAndRunTaskManager(TaskManager.scala)
        at org.apache.flink.yarn.appMaster.YarnTaskManagerRunner$1.run(YarnTaskManagerRunner.java:99)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1608)
        at org.apache.flink.yarn.appMaster.YarnTaskManagerRunner.main(YarnTaskManagerRunner.java:95)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.flink.runtime.memory.MemoryManager$HeapMemoryPool.<init>(MemoryManager.java:611)
        at org.apache.flink.runtime.memory.MemoryManager.<init>(MemoryManager.java:163)
        at org.apache.flink.runtime.taskmanager.TaskManager$.startTaskManagerComponentsAndActor(TaskManager.scala:1640)
        ... 8 more
16:58:16,639 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
{code}

> Instable test in flink-yarn-tests
> ---------------------------------
>
>                 Key: FLINK-2392
>                 URL: https://issues.apache.org/jira/browse/FLINK-2392
>             Project: Flink
>          Issue Type: Bug
>          Components: Tests
>            Reporter: Matthias J. Sax
>            Assignee: Robert Metzger
>            Priority: Critical
>              Labels: test-stability
>
> The test YARNSessionFIFOITCase fails from time to time on an irregular basis. 
> For example see: https://travis-ci.org/apache/flink/jobs/72019690
> {noformat}
> Tests run: 12, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 205.163 sec <<< FAILURE! - in org.apache.flink.yarn.YARNSessionFIFOITCase
> perJobYarnClusterWithParallelism(org.apache.flink.yarn.YARNSessionFIFOITCase)  Time elapsed: 60.651 sec  <<< FAILURE!
> java.lang.AssertionError: During the timeout period of 60 seconds the expected string did not show up
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at org.apache.flink.yarn.YarnTestBase.runWithArgs(YarnTestBase.java:478)
>       at org.apache.flink.yarn.YARNSessionFIFOITCase.perJobYarnClusterWithParallelism(YARNSessionFIFOITCase.java:435)
> Results :
> Failed tests: 
>   YARNSessionFIFOITCase.perJobYarnClusterWithParallelism:435->YarnTestBase.runWithArgs:478 During the timeout period of 60 seconds the expected string did not show up
> {noformat}
> Another error case is this (see 
> https://travis-ci.org/mjsax/flink/jobs/77313444)
> {noformat}
> Tests run: 12, Failures: 3, Errors: 0, Skipped: 2, Time elapsed: 182.008 sec <<< FAILURE! - in org.apache.flink.yarn.YARNSessionFIFOITCase
> testTaskManagerFailure(org.apache.flink.yarn.YARNSessionFIFOITCase)  Time elapsed: 27.356 sec  <<< FAILURE!
> java.lang.AssertionError: Found a file /home/travis/build/mjsax/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1440595422559_0007/container_1440595422559_0007_01_000003/taskmanager.log with a prohibited string: [Exception, Started [email protected]:8081]
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.apache.flink.yarn.YarnTestBase.ensureNoProhibitedStringInLogFiles(YarnTestBase.java:294)
>       at org.apache.flink.yarn.YARNSessionFIFOITCase.checkForProhibitedLogContents(YARNSessionFIFOITCase.java:94)
> testNonexistingQueue(org.apache.flink.yarn.YARNSessionFIFOITCase)  Time elapsed: 17.421 sec  <<< FAILURE!
> java.lang.AssertionError: Found a file /home/travis/build/mjsax/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1440595422559_0007/container_1440595422559_0007_01_000003/taskmanager.log with a prohibited string: [Exception, Started [email protected]:8081]
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.apache.flink.yarn.YarnTestBase.ensureNoProhibitedStringInLogFiles(YarnTestBase.java:294)
>       at org.apache.flink.yarn.YARNSessionFIFOITCase.checkForProhibitedLogContents(YARNSessionFIFOITCase.java:94)
> testJavaAPI(org.apache.flink.yarn.YARNSessionFIFOITCase)  Time elapsed: 11.984 sec  <<< FAILURE!
> java.lang.AssertionError: Found a file /home/travis/build/mjsax/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1440595422559_0007/container_1440595422559_0007_01_000003/taskmanager.log with a prohibited string: [Exception, Started [email protected]:8081]
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.apache.flink.yarn.YarnTestBase.ensureNoProhibitedStringInLogFiles(YarnTestBase.java:294)
>       at org.apache.flink.yarn.YARNSessionFIFOITCase.checkForProhibitedLogContents(YARNSessionFIFOITCase.java:94)
> {noformat}
> Furthermore, this build failed too: 
> https://travis-ci.org/apache/flink/jobs/77313450
> (no error, but Travis terminated the build because there was no progress for 
> 300 seconds -> deadlock?)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
