[
https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163297#comment-15163297
]
Junping Du commented on MAPREDUCE-6608:
---------------------------------------
bq. Please let me know your recommendation on Issue 1
Even with the optimization, the ZK-based reader approach still sounds risky in
the event of an AM failure on a large MR job (which could have tens to hundreds
of thousands of tasks). So I think we need a separate JIRA to track the YARN
issue, since this one is a MAPREDUCE JIRA that tracks changes for the MR project.
Per my comments in YARN-1489, we already have YARN-4602 to track the generic
problem of message passing between containers in YARN. Please check whether that
one fits our case here. If so, we can consider working on this in parallel
(based on a hacked/faked read path at first, until we have a real one later).
Thoughts?
> Work Preserving AM Restart for MapReduce
> ----------------------------------------
>
> Key: MAPREDUCE-6608
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Srikanth Sampath
> Assignee: Srikanth Sampath
> Attachments: Patch1.patch, WorkPreservingMRAppMaster-1.pdf,
> WorkPreservingMRAppMaster-2.pdf, WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM is provided in
> [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489]. We would like
> to take advantage of this for MapReduce (MR) applications. There are some
> challenges, which are described in the attached document along with a few
> options. We solicit feedback from the community.