[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248151#comment-15248151
 ] 

Junping Du edited comment on MAPREDUCE-6608 at 4/19/16 4:46 PM:
----------------------------------------------------------------

[~vinodkv], thanks for the review and comments. I think most of your points 
here are solid; however, the comment about "Output Commit of previous tasks" 
is a bit stale.

bq. The new AM needs to make sure that output of previously running containers 
can be safely committed. IIRC, with today's FileOutputCommitter, new AM will 
only promote task-outputs that are present in 
$jobOutput/_temporary/$currentAttemptID/
This was true before MAPREDUCE-4815. However, after MAPREDUCE-4815, most of 
the work of committing task output to the final job output is handled by 
{{FileOutputCommitter.commitTask()}} instead of 
{{FileOutputCommitter.commitJob()}}, so commitJob() is left only with cleaning 
up $jobOutput/_temporary. There is nothing more to do here, except to make 
sure "mapreduce.fileoutputcommitter.algorithm.version" is set to 2. 
The same setting is also assumed by MAPREDUCE-5485, which is a prerequisite 
for this feature - otherwise the new AM will fail immediately if the previous 
AM ended during job commit.
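
To make the difference concrete, here is a rough toy model of the two commit 
algorithms on a local file system. It is only a sketch and not Hadoop code: 
the class {{CommitSketch}}, its method names, and the task/attempt IDs are 
hypothetical, it uses plain java.nio instead of the Hadoop FileSystem API, and 
it deliberately ignores the task recovery path that a real restarted AM may 
run. It models only the statement quoted above, i.e. that commitJob() promotes 
what sits under the current AM attempt's directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Toy model of the v1/v2 commit layouts on a local file system (not Hadoop code). */
public class CommitSketch {

    /** Where a task attempt writes: $jobOutput/_temporary/$appAttempt/_temporary/$taskAttempt */
    static Path taskAttemptDir(Path jobOutput, int appAttempt, String taskAttempt) {
        return jobOutput.resolve("_temporary").resolve(String.valueOf(appAttempt))
                .resolve("_temporary").resolve(taskAttempt);
    }

    /** Moves every regular file directly under src into dst (flat directories only). */
    static void moveFiles(Path src, Path dst) throws IOException {
        Files.createDirectories(dst);
        List<Path> files;
        try (Stream<Path> s = Files.list(src)) {
            files = s.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path f : files) {
            Files.move(f, dst.resolve(f.getFileName()), StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /** commitTask: v1 parks output under the AM attempt directory; v2 moves it to the final output. */
    static void commitTask(Path jobOutput, int appAttempt, String taskId,
                           String taskAttempt, int version) throws IOException {
        Path src = taskAttemptDir(jobOutput, appAttempt, taskAttempt);
        Path dst = (version == 2)
                ? jobOutput
                : jobOutput.resolve("_temporary").resolve(String.valueOf(appAttempt)).resolve(taskId);
        moveFiles(src, dst);
    }

    /** commitJob: v1 promotes only the given AM attempt's committed tasks; both versions clean up. */
    static void commitJob(Path jobOutput, int appAttempt, int version) throws IOException {
        Path tmp = jobOutput.resolve("_temporary");
        Path attemptDir = tmp.resolve(String.valueOf(appAttempt));
        if (version == 1 && Files.isDirectory(attemptDir)) {
            List<Path> committed;
            try (Stream<Path> s = Files.list(attemptDir)) {
                committed = s.filter(Files::isDirectory)
                        .filter(p -> !p.getFileName().toString().equals("_temporary"))
                        .collect(Collectors.toList());
            }
            for (Path taskDir : committed) {
                moveFiles(taskDir, jobOutput);
            }
        }
        deleteRecursively(tmp);
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> s = Files.walk(root)) {
            paths = s.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /** AM attempt 1 commits one task, then AM attempt 2 commits the job (no recovery modeled). */
    static boolean outputSurvivesAmRestart(int version) throws IOException {
        Path jobOutput = Files.createTempDirectory("jobOutput");
        Path attemptDir = taskAttemptDir(jobOutput, 1, "attempt_0_m_000000_0");
        Files.createDirectories(attemptDir);
        Files.write(attemptDir.resolve("part-m-00000"), "data".getBytes());
        commitTask(jobOutput, 1, "task_0_m_000000", "attempt_0_m_000000_0", version);
        commitJob(jobOutput, 2, version);
        boolean survived = Files.exists(jobOutput.resolve("part-m-00000"));
        deleteRecursively(jobOutput);
        return survived;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("v1: " + outputSurvivesAmRestart(1));
        System.out.println("v2: " + outputSurvivesAmRestart(2));
    }
}
```

In this simplified model the task output committed under the first AM attempt 
is lost with version 1 and is already in $jobOutput with version 2, which is 
why the second AM attempt has nothing left to promote.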

I am investigating the rest of the issues and will bring some proposals later.


bq. I'd suggest spending more time on the design, at least on some of the areas 
I pointed above and then create a branch, create sub-tasks, do some prototypes 
etc.
+1. This feature is turning out to be more work than I originally expected. I 
agree we may need a separate branch to develop it in parallel. I will create a 
branch once we finalize the design. 



was (Author: djp):
[~vinodkv], thanks for the review and comments. I think most of your points 
here are solid; however, the comment about "Output Commit of previous tasks" 
is a bit stale.

bq. The new AM needs to make sure that output of previously running containers 
can be safely committed. IIRC, with today's FileOutputCommitter, new AM will 
only promote task-outputs that are present in 
$jobOutput/_temporary/$currentAttemptID/
This was true before MAPREDUCE-4815. However, after MAPREDUCE-4815, most of 
the work of committing task output to the final job output is handled by 
{{FileOutputCommitter.commitTask()}} instead of 
{{FileOutputCommitter.commitJob()}}, so commitJob() is left only with cleaning 
up $jobOutput/_temporary. There is nothing more to do here unless we make 
sure "mapreduce.fileoutputcommitter.algorithm.version" is set to 2. 
The same setting is also assumed by MAPREDUCE-5485, which is a prerequisite 
for this feature - otherwise the new AM will fail immediately if the previous 
AM ended during job commit.

I am investigating the rest of the issues and will propose some possible 
solutions later.


bq. I'd suggest spending more time on the design, at least on some of the areas 
I pointed above and then create a branch, create sub-tasks, do some prototypes 
etc.
+1. This feature is turning out to be more work than I originally expected. I 
agree we may need a separate branch to develop it in parallel. I will create a 
branch once we finalize the design. 


> Work Preserving AM Restart for MapReduce
> ----------------------------------------
>
>                 Key: MAPREDUCE-6608
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Srikanth Sampath
>            Assignee: Srikanth Sampath
>         Attachments: Patch1.patch, WorkPreservingMRAppMaster-1.pdf, 
> WorkPreservingMRAppMaster-2.pdf, WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM restart was provided in 
> [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489].  We would like 
> to take advantage of this for MapReduce (MR) applications.  There are some 
> challenges, which are described in the attached document along with a few 
> options.  We solicit feedback from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
