[
https://issues.apache.org/jira/browse/MAPREDUCE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479306#comment-13479306
]
Thomas Graves commented on MAPREDUCE-4729:
------------------------------------------
OK, I figured this out. The job uses the output format
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat,
whose OutputCommitter is set to null. This prevented the MRAppMaster
RecoveryService from starting:
"org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Not starting RecoveryService:
recoveryEnabled: true recoverySupportedByCommitter: false ApplicationAttemptID:
4"
Since the recovery service didn't start, it didn't parse the old job history
files and thus didn't have the list of previous AM attempts.
I think we should fix that so that even if recovery isn't supported, we at
least parse the old history files and get the previous AM attempt info.
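To make the proposal concrete, here is a rough, self-contained sketch of the
two behaviors. It is not the actual MRAppMaster code: the class, the methods
and the "attempt > 1" check are hypothetical stand-ins chosen to mirror the
flags in the log message above, and the history parsing is faked.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model only; none of these names come from Hadoop. */
public class AttemptHistorySketch {

    /** Hypothetical stand-in for what is recorded about one AM attempt. */
    static final class AmAttemptInfo {
        final int attemptId;

        AmAttemptInfo(int attemptId) {
            this.attemptId = attemptId;
        }
    }

    /** Fakes parsing the old job history files: one entry per earlier attempt. */
    static List<AmAttemptInfo> parsePreviousAttempts(int currentAttemptId) {
        List<AmAttemptInfo> previous = new ArrayList<>();
        for (int id = 1; id < currentAttemptId; id++) {
            previous.add(new AmAttemptInfo(id));
        }
        return previous;
    }

    /** Behavior described in this comment: parsing happens only if recovery starts. */
    static List<AmAttemptInfo> previousAttemptsSeenToday(
            boolean recoveryEnabled, boolean recoverySupportedByCommitter, int attemptId) {
        // Assumption: recovery is only considered from the second attempt on.
        if (recoveryEnabled && recoverySupportedByCommitter && attemptId > 1) {
            return parsePreviousAttempts(attemptId);
        }
        return new ArrayList<>(); // recovery skipped, so earlier attempts are lost
    }

    /** Proposed behavior: always read earlier attempt info, whatever the committer supports. */
    static List<AmAttemptInfo> previousAttemptsSeenWithFix(
            boolean recoveryEnabled, boolean recoverySupportedByCommitter, int attemptId) {
        // The two recovery flags would still gate task recovery itself;
        // they no longer gate reading the previous attempt info.
        if (attemptId > 1) {
            return parsePreviousAttempts(attemptId);
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        // Values from the log line: recoveryEnabled=true, supported=false, attempt 4.
        System.out.println("today:    " + previousAttemptsSeenToday(true, false, 4).size());
        System.out.println("with fix: " + previousAttemptsSeenWithFix(true, false, 4).size());
    }
}
```

With the values from the log line, the first method returns an empty list,
which matches the history UI showing only the last attempt, while the second
returns the three earlier attempts.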
> job history UI not showing all job attempts
> -------------------------------------------
>
> Key: MAPREDUCE-4729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4729
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Affects Versions: 0.23.3
> Reporter: Thomas Graves
>
> We are seeing a case where a job runs but the AM is running out of memory in
> the first 3 attempts. The job eventually finishes on the 4th attempt. When
> you go to the job history UI for that job, it only shows the last attempt.
> This is bad since we want to see why the first 3 attempts failed.
> The RM web UI shows all 4 attempts.
> Also, I tested this locally by running "kill" on the app master, and in that
> case the history server UI does show all attempts.