[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801874#comment-13801874
 ] 

Jason Lowe commented on MAPREDUCE-5561:
---------------------------------------

Yes, the test is definitely racy.  There's no guarantee the job will still be 
in the FAIL_ABORT state when we look at it asynchronously; it may already have 
moved on to FAILED, which is what the assertion failure below shows.  A couple 
of approaches to fixing this:

# As [~kkambatl] points out, we can skip the FAIL_ABORT check.  The real 
purpose of this test is to verify we eventually get to the FAILED state without 
hanging when tasks fail.
# A more deterministic, explicit test for FAIL_ABORT would be to use an output 
committer with a barrier, similar to TestingOutputCommitter but with the 
barrier in the abortJob method, so we can guarantee the job will pause in the 
FAIL_ABORT state.  Then we can release the committer from the barrier and 
verify the job proceeds to FAILED.
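
To illustrate the second approach, here is a minimal, self-contained sketch 
of the barrier idea.  It does not use the real Hadoop classes: 
BarrierCommitter, the State enum and the job thread are hypothetical 
stand-ins for the output committer, the job state and the async dispatcher, 
and a two-party CyclicBarrier is assumed as the synchronization primitive.  
The point is only that the observer reads the state between two rendezvous 
points, so the job cannot leave FAIL_ABORT before the check is made.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class AbortBarrierSketch {
    // Hypothetical, simplified job states (not the real JobStateInternal enum).
    enum State { RUNNING, FAIL_ABORT, FAILED }

    // Hypothetical stand-in for an output committer whose abortJob blocks on a barrier.
    static class BarrierCommitter {
        private final CyclicBarrier barrier;

        BarrierCommitter(CyclicBarrier barrier) {
            this.barrier = barrier;
        }

        void abortJob() throws Exception {
            // First rendezvous: tell the test that the abort has started.
            barrier.await(10, TimeUnit.SECONDS);
            // Second rendezvous: stay in the abort until the test releases us.
            barrier.await(10, TimeUnit.SECONDS);
        }
    }

    // Runs the scenario and returns "<state seen while paused>-><final state>".
    static String run() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        BarrierCommitter committer = new BarrierCommitter(barrier);
        AtomicReference<State> state = new AtomicReference<>(State.RUNNING);

        // Simulated job: enters FAIL_ABORT, runs the abort, then becomes FAILED.
        Thread job = new Thread(() -> {
            state.set(State.FAIL_ABORT);
            try {
                committer.abortJob();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
            state.set(State.FAILED);
        });
        job.start();

        try {
            // Wait until the job is inside abortJob.
            barrier.await(10, TimeUnit.SECONDS);
            // The job is parked at the second await, so this read is deterministic.
            State paused = state.get();
            // Release the committer and let the job finish.
            barrier.await(10, TimeUnit.SECONDS);
            job.join(10_000);
            return paused + "->" + state.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In a real test the same shape would presumably apply: assert FAIL_ABORT after 
the first rendezvous, release the barrier, then assert FAILED.  The timeouts 
on await keep a broken barrier from hanging the test.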



> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl testcase failing on 
> trunk
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5561
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5561
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Cindy Li
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl-output.txt
>
>
> Running org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl
> Tests run: 15, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.029 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl
> testFailAbortDoesntHang(org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl)  Time elapsed: 5.507 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<FAIL_ABORT> but was:<FAILED>
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.failNotEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:128)
> at org.junit.Assert.assertEquals(Assert.java:147)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:817)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testFailAbortDoesntHang(TestJobImpl.java:418)



--
This message was sent by Atlassian JIRA
(v6.1#6144)
