Jason Lowe created MAPREDUCE-4813:
-------------------------------------
Summary: AM timing out during job commit
Key: MAPREDUCE-4813
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4813
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster
Affects Versions: 2.0.1-alpha, 0.23.3
Reporter: Jason Lowe
Priority: Critical
The AM calls the output committer's {{commitJob}} method synchronously during
JobImpl state transitions, so the JobImpl write lock is held for the entire
duration of the job commit. While the write lock is held, the RM allocator
thread cannot heartbeat to the RM. Therefore, if committing the job takes too
long (e.g., the job has a very large number of files to commit and/or the
namenode is overloaded), the AM appears unresponsive to the RM and the RM
kills the AM attempt.
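
The locking interaction described above can be reproduced in isolation. The following is only a minimal sketch, not the actual AM code: {{CommitLockSketch}}, {{tryHeartbeat}} and {{commitUnderWriteLock}} are hypothetical names, and it assumes the heartbeat path needs the job's read lock before it can proceed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the job object; not the real JobImpl.
class CommitLockSketch {
    // Plays the role of the JobImpl read/write lock.
    private final ReentrantReadWriteLock jobLock = new ReentrantReadWriteLock();

    // Assumed heartbeat path: it must take the read lock to look at job state.
    // Returns false if the lock could not be obtained within the timeout.
    boolean tryHeartbeat(long timeoutMs) {
        try {
            if (!jobLock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        jobLock.readLock().unlock();
        return true;
    }

    // Mimics a state transition that runs the commit synchronously,
    // keeping the write lock for the whole duration of the commit.
    void commitUnderWriteLock(Runnable commit) {
        jobLock.writeLock().lock();
        try {
            commit.run();
        } finally {
            jobLock.writeLock().unlock();
        }
    }

    // Returns {heartbeat succeeded during commit, heartbeat succeeded after commit}.
    static boolean[] run() {
        CommitLockSketch job = new CommitLockSketch();
        CountDownLatch commitStarted = new CountDownLatch(1);
        CountDownLatch commitMayFinish = new CountDownLatch(1);

        // The "slow commit" blocks until the main thread releases it.
        Thread committer = new Thread(() -> job.commitUnderWriteLock(() -> {
            commitStarted.countDown();
            awaitQuietly(commitMayFinish);
        }));
        committer.start();
        awaitQuietly(commitStarted);

        // The write lock is held by the committer thread at this point.
        boolean during = job.tryHeartbeat(100);

        commitMayFinish.countDown();
        try {
            committer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        // The commit has finished and the write lock has been released.
        boolean after = job.tryHeartbeat(100);
        return new boolean[] {during, after};
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("heartbeat during commit: " + result[0]);
        System.out.println("heartbeat after commit: " + result[1]);
    }
}
```

In this sketch the heartbeat attempt made while the commit is in progress fails, and the attempt made after the commit finishes succeeds. In the real AM the heartbeat thread would simply stay blocked until the commit completes, which is what lets the RM liveness timeout expire.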