[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105267#comment-15105267
 ] 

Junping Du commented on MAPREDUCE-6608:
---------------------------------------

Thanks [~srikanth.sampath] and [~raju.bairishetti] for proposing this JIRA and 
uploading a design document. This work could significantly improve the 
reliability of our MapReduce framework. 
Having gone through the current design doc, I think storing the new MR AM 
attempt's address in ZooKeeper could have scalability issues when an MR job 
has a massive number of running tasks (tens of thousands or more). I think it 
would be better to store and fetch the new MR AM location in HDFS, which has 
better scalability. 
Also, in my understanding, the YARN Service Registry may not be the best fit 
for this case. CC [~ste...@apache.org], who is the author of YSR.
I could propose another version of the design with more details in the next 
few days, in case development work hasn't started yet.

> Work Preserving AM Restart for MapReduce
> ----------------------------------------
>
>                 Key: MAPREDUCE-6608
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Srikanth Sampath
>            Assignee: Raju Bairishetti
>         Attachments: WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM restart was provided in 
> [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489].  We would like 
> to take advantage of this for MapReduce (MR) applications.  The attached 
> document describes some challenges and discusses a few options.  We solicit 
> feedback from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
