[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136279#comment-15136279
 ] 

Srikanth Sampath commented on MAPREDUCE-6608:
---------------------------------------------

I have attached a design patch, 
[Patch1|https://issues.apache.org/jira/secure/attachment/12786705/Patch1.patch], 
that outlines the implementation approach at a high level.  The 
[Design|https://issues.apache.org/jira/secure/attachment/12786706/WorkPreservingMRAppMaster-2.pdf]
 document describes the high-level design.

*Notes:*
1. This is a patch against Apache Hadoop 2.6.1.
2. It works for the example Hadoop sleep job: I killed the AM at random 
points and the in-flight tasks continued to run.
3. SS_DEBUG in the patch marks a debug statement that helps me. Some of 
these will be removed eventually.
4. SS_FIXME in the patch tags a known issue that I still need to fix and have 
noted in a comment.  I will clean these up before the next submission.

I would welcome comments on the high-level design and on the approach I have 
taken in the patch.

*Next Steps:*
1. I will iron out the known issues (all SS_FIXME), clean up the interfaces, 
make the code compliant with Apache coding standards, rebase the code against 
trunk, and test it thoroughly.  I will also factor in the comments and 
suggestions made on the design doc and design patch.
2. I will identify the components and issues involved and raise sub-tasks.

> Work Preserving AM Restart for MapReduce
> ----------------------------------------
>
>                 Key: MAPREDUCE-6608
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Srikanth Sampath
>            Assignee: Srikanth Sampath
>         Attachments: Patch1.patch, WorkPreservingMRAppMaster-1.pdf, 
> WorkPreservingMRAppMaster-2.pdf, WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM restart is provided by 
> [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489].  We would like 
> to take advantage of this for MapReduce (MR) applications.  There are some 
> challenges, which are described in the attached document along with a few 
> options.  We solicit feedback from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)