[
https://issues.apache.org/jira/browse/MAPREDUCE-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322515#comment-16322515
]
Peter Bacsko commented on MAPREDUCE-7020:
-----------------------------------------
OK, I found the problem. It is caused by MAPREDUCE-5124 combined with the fact
that a background thread is not stopped.
1. Mapper is running in uber mode inside the AM
2. Mapper times out
3. Uber launcher calls {{future.cancel()}} which results in an
{{InterruptedException}}
4. Background task reporter ("communication thread") keeps running
5. Task attempt is unregistered and removed from {{attemptIdToStatus}} map
6. Task reporter calls {{umbilical.statusUpdate()}} and it gets an
{{IllegalStateException}} because the task attempt is missing from
{{attemptIdToStatus}}
7. In the {{catch}} clause, {{System.exit(65)}} is eventually called
If step 7 runs quickly enough, the AM does not get a chance to unregister from
the ResourceManager, so it is restarted. Sometimes it is slower, and the
unregistration succeeds. The restart would explain the second AppMaster log that
the test finds (expected:<1> but was:<2>).
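To make the sequence above easier to follow, here is a minimal, self-contained
sketch of the race. It is not the Hadoop code: the class {{UberTimeoutRace}},
the local {{statusUpdate()}} method and the simplified {{attemptIdToStatus}} map
are hypothetical stand-ins, and a latch forces the ordering that only happens
sometimes in the real AM. Instead of calling {{System.exit(65)}}, the reporter
records the exit code it would have used.

```java
import java.util.Map;
import java.util.concurrent.*;

// Hypothetical stand-in for the uber AM; none of these names are Hadoop APIs.
final class UberTimeoutRace {
    // Simplified stand-in for the AM-side attemptIdToStatus map.
    static final Map<String, String> attemptIdToStatus = new ConcurrentHashMap<>();

    // Stand-in for umbilical.statusUpdate(): rejects attempts that are not registered.
    static void statusUpdate(String attemptId) {
        if (!attemptIdToStatus.containsKey(attemptId)) {
            throw new IllegalStateException("Status update from unknown attempt " + attemptId);
        }
    }

    /** Replays steps 1-7 and returns the exit code the reporter would use, or -1 if none. */
    static int simulate() {
        String attemptId = "attempt_0";
        attemptIdToStatus.put(attemptId, "RUNNING");
        CountDownLatch mapperStarted = new CountDownLatch(1);
        CountDownLatch unregistered = new CountDownLatch(1);
        CompletableFuture<Integer> exitCode = new CompletableFuture<>();

        // Step 4: the reporter ("communication thread") is not stopped by the cancel.
        Thread reporter = new Thread(() -> {
            try {
                unregistered.await();       // forces the unlucky ordering deterministically
                statusUpdate(attemptId);    // step 6: attempt is already gone
                exitCode.complete(-1);
            } catch (IllegalStateException e) {
                exitCode.complete(65);      // step 7: the real code ends up in System.exit(65)
            } catch (InterruptedException e) {
                exitCode.complete(-1);
            }
        }, "communication thread");
        reporter.setDaemon(true);
        reporter.start();

        // Step 1: the mapper runs on a launcher thread inside the same JVM.
        ExecutorService launcher = Executors.newSingleThreadExecutor();
        Future<?> future = launcher.submit(() -> {
            mapperStarted.countDown();
            try {
                Thread.sleep(60_000);       // pretend to be a hung mapper
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            mapperStarted.await();
            future.cancel(true);                  // steps 2-3: timeout, mapper thread interrupted
            attemptIdToStatus.remove(attemptId);  // step 5: attempt unregistered
            unregistered.countDown();
            return exitCode.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            launcher.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("reporter exit code: " + simulate());
    }
}
```

In this sketch the cancel only interrupts the mapper thread; nothing tells the
reporter to stop, which is the condition described in step 4.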
> TestUberAM is failing
> ---------------------
>
> Key: MAPREDUCE-7020
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7020
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: test
> Reporter: Akira Ajisaka
> Assignee: Peter Bacsko
>
> TestUberAM is failing
> {noformat}
> java.lang.AssertionError: No AppMaster log found! expected:<1> but was:<2>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.apache.hadoop.mapreduce.v2.TestMRJobs.testThreadDumpOnTaskTimeout(TestMRJobs.java:1228)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/testReport/junit/org.apache.hadoop.mapreduce.v2/TestUberAM/testThreadDumpOnTaskTimeout/