[ https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163297#comment-15163297 ]
Junping Du commented on MAPREDUCE-6608:
---------------------------------------

bq. Please let me know your recommendation on Issue 1

Even with optimization, the ZK-based reader approach still sounds risky for AM failure on a large MR job (which could have 10-100 thousand tasks). So I think we need a separate JIRA to track the YARN issue, since this one is a MAPREDUCE JIRA that tracks changes to the MR project. Per my comments in YARN-1489, we already have YARN-4602 to track a generic message-passing problem between containers in YARN. Please check whether that one fits our case here. If so, we can consider working on this in parallel (based on a hacked/faked read path at first, until we have a real one later). Thoughts?

> Work Preserving AM Restart for MapReduce
> ----------------------------------------
>
>                 Key: MAPREDUCE-6608
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Srikanth Sampath
>            Assignee: Srikanth Sampath
>         Attachments: Patch1.patch, WorkPreservingMRAppMaster-1.pdf, WorkPreservingMRAppMaster-2.pdf, WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM is provided in [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489]. We would like to take advantage of this for MapReduce (MR) applications. There are some challenges, which are described in the attached document along with a few options. We solicit feedback from the community.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)