[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357235#comment-16357235
 ] 

Jason Lowe commented on MAPREDUCE-7048:
---------------------------------------

Thanks for updating the patch!

Now that the uberized flag is looked up on every status update, I think it
makes sense to do the lookup just once when the task is configured (i.e., make
it a field that is initialized in the setConf method).  Then we don't have to
do conf key lookups every time we do a status update.
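
A minimal sketch of that idea follows. It is not the actual patch: FakeConf and SketchTask are hypothetical stand-ins for the Hadoop configuration and Task classes, and the key name "mapreduce.task.uberized" is an assumption. The point is only that the flag is read once in setConf and every later status update uses the cached field.

```java
import java.util.HashMap;
import java.util.Map;

public class UberFlagSketch {

  /** Hypothetical stand-in for a job configuration; it counts key lookups. */
  static class FakeConf {
    private final Map<String, String> values = new HashMap<>();
    int lookups = 0;

    void setBoolean(String key, boolean value) {
      values.put(key, Boolean.toString(value));
    }

    boolean getBoolean(String key, boolean defaultValue) {
      lookups++;
      String raw = values.get(key);
      return raw == null ? defaultValue : Boolean.parseBoolean(raw);
    }
  }

  /** Hypothetical stand-in for Task; only the flag handling is modeled. */
  static class SketchTask {
    // Assumed key name, used here for illustration only.
    static final String UBER_KEY = "mapreduce.task.uberized";

    private boolean uberized;
    int statusUpdates = 0;

    /** The single place where the configuration is consulted. */
    void setConf(FakeConf conf) {
      this.uberized = conf.getBoolean(UBER_KEY, false);
    }

    boolean isUberized() {
      return uberized;
    }

    /** Reads the cached field; returns true when running in uber mode. */
    boolean statusUpdate() {
      statusUpdates++;
      return uberized;
    }
  }

  public static void main(String[] args) {
    FakeConf conf = new FakeConf();
    conf.setBoolean(SketchTask.UBER_KEY, true);

    SketchTask task = new SketchTask();
    task.setConf(conf);
    for (int i = 0; i < 100; i++) {
      task.statusUpdate();
    }
    System.out.println("status updates: " + task.statusUpdates
        + ", conf lookups: " + conf.lookups);
  }
}
```

However many status updates run, the lookup counter stays at one, because only setConf touches the configuration.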

Rather than mess with the security manager, it would be simpler to change the 
System.exit calls to use ExitUtil.terminate.  Task already does this in 
another place, and arguably it should be consistent.  Then the test for 
non-uber mode can be just as simple as the uber test: make sure 
ExitUtil.disableSystemExit is called and add 
{{expected=ExitException.class}} to the Test annotation.
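
To illustrate why this makes the test simple, here is a self-contained sketch. MiniExitUtil and its ExitException are simplified, hypothetical stand-ins for org.apache.hadoop.util.ExitUtil, not the real class, and the statusUpdate method below is a reduced model of the Task logic. What the uber branch does (here it simply returns) is an assumption.

```java
public class ExitSketch {

  /** Simplified stand-in for ExitUtil.ExitException; carries the exit status. */
  static class ExitException extends RuntimeException {
    final int status;

    ExitException(int status, String message) {
      super(message);
      this.status = status;
    }
  }

  /** Simplified stand-in for ExitUtil: exit normally, or throw when disabled. */
  static final class MiniExitUtil {
    private static volatile boolean systemExitDisabled = false;

    static void disableSystemExit() {
      systemExitDisabled = true;
    }

    static void terminate(int status, String message) {
      if (systemExitDisabled) {
        // Tests land here: the JVM survives and the exit becomes observable.
        throw new ExitException(status, message);
      }
      System.exit(status);
    }
  }

  /** Reduced model of the status update: only the "task not found" path. */
  static void statusUpdate(boolean taskFound, boolean uberized) {
    if (taskFound) {
      return;
    }
    if (uberized) {
      // Assumption: in uber mode the task shares the AM's JVM, so do not exit.
      return;
    }
    MiniExitUtil.terminate(66, "Parent died.  Exiting");
  }

  public static void main(String[] args) {
    MiniExitUtil.disableSystemExit();

    statusUpdate(false, true);  // uber mode: returns quietly

    try {
      statusUpdate(false, false);  // non-uber mode: terminate is called
    } catch (ExitException e) {
      System.out.println("caught exit with status " + e.status);
    }
  }
}
```

With JUnit 4, the try/catch in main would be replaced by the {{expected=ExitException.class}} attribute on the Test annotation, so the non-uber test body is just the call itself.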


> AM can still crash after MAPREDUCE-7020
> ---------------------------------------
>
>                 Key: MAPREDUCE-7048
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7048
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: MAPREDUCE-7048-001.patch, MAPREDUCE-7048-002.patch
>
>
> The testcase TestUberAM#testThreadDumpOnTaskTimeout was supposed to be fixed 
> by MAPREDUCE-7020. However, it still fails, see: 
> https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7325/testReport/junit/org.apache.hadoop.mapreduce.v2/TestMRJobs/testThreadDumpOnTaskTimeout/
>  (note: other tests failed as well, but those look unrelated).
> When I tried to reproduce it locally, it failed again, although with a 
> slightly different error message (it was actually the same as before):
> {noformat}
> [INFO] -------------------------------------------------------
> [INFO]  T E S T S
> [INFO] -------------------------------------------------------
> [INFO] Running org.apache.hadoop.mapreduce.v2.TestUberAM
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 128.192 s <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestUberAM
> [ERROR] testThreadDumpOnTaskTimeout(org.apache.hadoop.mapreduce.v2.TestUberAM)  Time elapsed: 79.539 s  <<< FAILURE!
> java.lang.AssertionError: No AppMaster log found! expected:<1> but was:<2>
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.failNotEquals(Assert.java:743)
>       at org.junit.Assert.assertEquals(Assert.java:118)
>       at org.junit.Assert.assertEquals(Assert.java:555)
>       at org.apache.hadoop.mapreduce.v2.TestMRJobs.testThreadDumpOnTaskTimeout(TestMRJobs.java:1228)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
> *Root cause:* {{System.exit()}} is still invoked in {{Task.statusUpdate()}}:
> {noformat}
>   public void statusUpdate(TaskUmbilicalProtocol umbilical) 
>   throws IOException {
>     int retries = MAX_RETRIES;
>     while (true) {
>       try {
>         if (!umbilical.statusUpdate(getTaskID(), taskStatus).getTaskFound()) {
>           LOG.warn("Parent died.  Exiting "+taskId);
>           System.exit(66);
>         }
>         taskStatus.clearStatus();
>         return;
>         ...
> {noformat}
> At this point, the task was not found, so {{getTaskFound()}} on the result of 
> {{umbilical.statusUpdate()}} returns false. Checking whether we are running 
> in uber mode before exiting seems to solve the problem.


