[ https://issues.apache.org/jira/browse/MAPREDUCE-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801874#comment-13801874 ]
Jason Lowe commented on MAPREDUCE-5561:
---------------------------------------

Yes, the test is definitely racy. There's no guarantee the job will still be in the FAIL_ABORT state when we look at it asynchronously. A couple of approaches to fixing this:

# As [~kkambatl] points out, we can skip the FAIL_ABORT check. The real purpose of this test is to verify that we eventually reach the FAILED state without hanging when tasks fail.
# A more deterministic, explicit test for FAIL_ABORT would be to use an output committer with a barrier, similar to TestingOutputCommitter but with the barrier in the abortJob method, so we can guarantee the job will pause in the FAIL_ABORT state. Then we can release the committer from the barrier and verify the job proceeds to FAILED.

> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl testcase failing on trunk
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5561
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5561
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>   Affects Versions: 2.2.0
>            Reporter: Cindy Li
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl-output.txt
>
>
> Running org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl
> Tests run: 15, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.029 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl
> testFailAbortDoesntHang(org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl)  Time elapsed: 5.507 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<FAIL_ABORT> but was:<FAILED>
>     at org.junit.Assert.fail(Assert.java:93)
>     at org.junit.Assert.failNotEquals(Assert.java:647)
>     at org.junit.Assert.assertEquals(Assert.java:128)
>     at org.junit.Assert.assertEquals(Assert.java:147)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:817)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testFailAbortDoesntHang(TestJobImpl.java:418)

--
This message was sent by Atlassian JIRA
(v6.1#6144)
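
The second approach in the comment (a committer that blocks inside abortJob) can be illustrated without Hadoop on the classpath. The following is only a minimal sketch of the barrier idea, not the actual TestJobImpl change: `BlockingAbortCommitter`, `FakeJob`, and the `State` enum are hypothetical stand-ins for the real output committer, JobImpl, and job states, and two `CountDownLatch` instances play the role of the barrier.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AbortBarrierSketch {
    // Hypothetical, reduced set of job states for illustration only.
    enum State { RUNNING, FAIL_ABORT, FAILED }

    /** Hypothetical committer whose abortJob parks until the test releases it. */
    static class BlockingAbortCommitter {
        final CountDownLatch abortEntered = new CountDownLatch(1);
        final CountDownLatch releaseAbort = new CountDownLatch(1);

        void abortJob() throws InterruptedException {
            abortEntered.countDown(); // tell the test that the abort has started
            releaseAbort.await();     // block until the test lets the abort finish
        }
    }

    /** Hypothetical job that runs the abort asynchronously, as the real job does. */
    static class FakeJob {
        private volatile State state = State.RUNNING;
        private final BlockingAbortCommitter committer;
        private final CountDownLatch finished = new CountDownLatch(1);

        FakeJob(BlockingAbortCommitter committer) { this.committer = committer; }

        void failTasks() {
            state = State.FAIL_ABORT;
            Thread abortThread = new Thread(() -> {
                try {
                    committer.abortJob();
                    state = State.FAILED;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    finished.countDown();
                }
            });
            abortThread.setDaemon(true);
            abortThread.start();
        }

        State getState() { return state; }

        boolean awaitFinished(long timeout, TimeUnit unit) throws InterruptedException {
            return finished.await(timeout, unit);
        }
    }

    /** Returns the state seen while the abort is paused and the state after release. */
    static State[] run() throws InterruptedException {
        BlockingAbortCommitter committer = new BlockingAbortCommitter();
        FakeJob job = new FakeJob(committer);
        job.failTasks();

        // Wait until abortJob is in progress; the job cannot leave FAIL_ABORT yet.
        if (!committer.abortEntered.await(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("abortJob was never called");
        }
        State whilePaused = job.getState();

        // Release the committer and wait for the job to finish failing.
        committer.releaseAbort.countDown();
        if (!job.awaitFinished(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("job hung after abort was released");
        }
        return new State[] { whilePaused, job.getState() };
    }

    public static void main(String[] args) throws InterruptedException {
        State[] observed = run();
        System.out.println(observed[0] + " -> " + observed[1]); // FAIL_ABORT -> FAILED
    }
}
```

Because the intermediate state is read only after `abortEntered` fires and before `releaseAbort` is counted down, the FAIL_ABORT check no longer depends on thread timing; the bounded waits keep the "doesn't hang" property testable.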