[
https://issues.apache.org/jira/browse/MAPREDUCE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304041#comment-14304041
]
Karthik Kambatla commented on MAPREDUCE-5718:
---------------------------------------------
[~yanghaogn] - initially, I was also trying to delete the startCommitFile if
there is no corresponding endFile. However, we can't do that, for the reasons
Jason described here -
https://issues.apache.org/jira/browse/MAPREDUCE-5718?focusedCommentId=13872189&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13872189
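
For context, the condition under discussion can be sketched roughly as below.
This is only an illustration, not the actual MRAppMaster code: it assumes commit
progress is recorded as marker files in a staging directory, and the class, enum,
and file names (CommitMarkerCheck, COMMIT_STARTED, COMMIT_SUCCESS, COMMIT_FAIL)
are hypothetical stand-ins for the startCommitFile and the end files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of how a restarted AM might classify a previous commit
// attempt from marker files. Names are assumptions, not Hadoop's API.
class CommitMarkerCheck {
    enum CommitState { NOT_STARTED, SUCCEEDED, FAILED, INTERRUPTED }

    // Assumed marker file names, used for illustration only.
    static final String START = "COMMIT_STARTED";
    static final String SUCCESS = "COMMIT_SUCCESS";
    static final String FAIL = "COMMIT_FAIL";

    static CommitState classify(Path stagingDir) {
        boolean started = Files.exists(stagingDir.resolve(START));
        boolean succeeded = Files.exists(stagingDir.resolve(SUCCESS));
        boolean failed = Files.exists(stagingDir.resolve(FAIL));

        if (succeeded) {
            return CommitState.SUCCEEDED;
        }
        if (failed) {
            return CommitState.FAILED;
        }
        // Start marker without any end marker: the previous attempt stopped
        // somewhere in the middle of the commit.
        if (started) {
            return CommitState.INTERRUPTED;
        }
        return CommitState.NOT_STARTED;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("staging");
        Files.createFile(dir.resolve(START));
        System.out.println(classify(dir));
    }
}
```

In this sketch, a start marker with no end marker is the case in which the new AM
would report "We crashed durring a commit". Simply deleting the start marker
would make that case indistinguishable from NOT_STARTED.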
> MR job will fail after commit fail
> ----------------------------------
>
> Key: MAPREDUCE-5718
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5718
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.3.0, 2.6.0
> Reporter: Karthik Kambatla
> Assignee: Yang Hao
> Fix For: 2.6.0
>
> Attachments: MAPREDUCE-5718.v2.patch, mr-5718-0.patch
>
>
> When either of the following happens:
> * while testing RM HA, the RM fails over while an MR AM is in the middle
> of a commit,
> * while testing preemption, the MR AM fails over in the middle of a
> commit,
> the subsequent AM gets spawned but dies with the diagnostic message "We
> crashed durring a commit".