[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066609#comment-13066609
 ] 

Sharad Agarwal commented on MAPREDUCE-2702:
-------------------------------------------


1. Job level commit changes:
Currently in FileOutputCommitter:
- tasks write output to <outputpath>/_temporary/<taskid>
- task commit promotes it to <outputpath>/<taskid>

Currently there is no job-level commit: the task commit promotes the output
directly to the final destination.
This is a problem for recovery. With network partitions and similar failures,
more than one AppMaster may be running simultaneously for the same job. To
keep them from stepping on each other, we need to change the job-level commit
as follows:
[Below, <AppAttemptCount> is the number of times the application master has been started]
- tasks write output to <outputpath>/_<AppAttemptCount>/_temporary/<taskid>
- task commit promotes it to <outputpath>/_<AppAttemptCount>/<taskid>
- job commit promotes all tasks from the current AppAttemptCount dir to 
<outputpath>/<taskid>
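
The layout above can be sketched roughly as follows. This is only an
illustration on the local filesystem using java.nio, not the actual
FileOutputCommitter code (which works against the Hadoop FileSystem API). The
class and method names (AttemptScopedCommitSketch, taskWorkDir,
committedTaskDir, commitTask, commitJob) are hypothetical, and skipping the
_temporary dir at job commit is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of the proposed directory layout; names are illustrative. */
public class AttemptScopedCommitSketch {

    /** <outputpath>/_<AppAttemptCount> */
    public static Path attemptDir(Path outputPath, int appAttemptCount) {
        return outputPath.resolve("_" + appAttemptCount);
    }

    /** <outputpath>/_<AppAttemptCount>/_temporary/<taskid> : where a running task writes. */
    public static Path taskWorkDir(Path outputPath, int appAttemptCount, String taskId) {
        return attemptDir(outputPath, appAttemptCount).resolve("_temporary").resolve(taskId);
    }

    /** <outputpath>/_<AppAttemptCount>/<taskid> : where a committed task's output lives. */
    public static Path committedTaskDir(Path outputPath, int appAttemptCount, String taskId) {
        return attemptDir(outputPath, appAttemptCount).resolve(taskId);
    }

    /** Task commit: promote the task output within the current attempt dir only. */
    public static void commitTask(Path outputPath, int appAttemptCount, String taskId)
            throws IOException {
        Path src = taskWorkDir(outputPath, appAttemptCount, taskId);
        Path dst = committedTaskDir(outputPath, appAttemptCount, taskId);
        Files.move(src, dst);
    }

    /** Job commit: promote every committed task of this attempt to <outputpath>/<taskid>. */
    public static void commitJob(Path outputPath, int appAttemptCount) throws IOException {
        List<Path> committed;
        try (Stream<Path> entries = Files.list(attemptDir(outputPath, appAttemptCount))) {
            // Assumption: anything still under _temporary was never task-committed.
            committed = entries
                    .filter(p -> !p.getFileName().toString().equals("_temporary"))
                    .collect(Collectors.toList());
        }
        for (Path task : committed) {
            Files.move(task, outputPath.resolve(task.getFileName()));
        }
    }
}
```

Since each AppMaster only touches its own _<AppAttemptCount> dir until job
commit, two concurrently running AppMasters would not overwrite each other's
task output in this sketch.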

2. *New API* required in OutputCommitter:
To recover the *completed* task output from the previous life of the
AppMaster, the output needs to be moved from
<outputpath>/_<AppAttemptCount-1>/<taskid> to
<outputpath>/_<AppAttemptCount>/<taskid>.
For this we will need a new recoverTask API in OutputCommitter.
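
A rough sketch of what such a recoverTask could do is given below. The
signature is not decided here: the parameters, the boolean return value and
the RecoverTaskSketch class are all hypothetical, and the example uses
java.nio on the local filesystem rather than the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a recoverTask operation; not the real OutputCommitter API. */
public class RecoverTaskSketch {

    /**
     * Moves the committed output of taskId from the previous AppMaster attempt
     * dir to the current one.
     *
     * @return true if output was found and moved; false if the previous attempt
     *         has no committed output for this task (assumed to mean the task
     *         must be rerun).
     */
    public static boolean recoverTask(Path outputPath, int appAttemptCount, String taskId)
            throws IOException {
        // <outputpath>/_<AppAttemptCount-1>/<taskid>
        Path previous = outputPath.resolve("_" + (appAttemptCount - 1)).resolve(taskId);
        // <outputpath>/_<AppAttemptCount>/<taskid>
        Path current = outputPath.resolve("_" + appAttemptCount).resolve(taskId);

        if (!Files.isDirectory(previous)) {
            return false;
        }
        // The current attempt dir may not exist yet right after an AppMaster restart.
        Files.createDirectories(current.getParent());
        Files.move(previous, current);
        return true;
    }
}
```

After recovery the task output sits in the current attempt dir, so the
job-level commit described in (1) can promote it like any other committed task.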


> OutputCommitter changes for MR Application Master recovery
> ----------------------------------------------------------
>
>                 Key: MAPREDUCE-2702
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2702
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2
>            Reporter: Sharad Agarwal
>            Assignee: Sharad Agarwal
>
> If the MR AM recovers from a crash, it only reruns the non-completed tasks. The
> completed tasks (along with their output, if any) need to be recovered from
> the previous life. This would require some changes in OutputCommitter.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
