[jira] [Commented] (MAPREDUCE-6608) Work Preserving AM Restart for MapReduce

Srikanth Sampath (JIRA) Wed, 24 Feb 2016 04:03:43 -0800

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15162887#comment-15162887
 ]


Srikanth Sampath commented on MAPREDUCE-6608:
---------------------------------------------

Thank you, [~djp], for your review and comments.  I appreciate it very much.

*Issue 1*
{quote}+1 on Vinod's proposal of separating write and read path.{quote}
I agree and will log a separate YARN JIRA.  Do you think that effort should be 
linked to this work, or can it be done separately and incorporated later?  
With your suggested optimization (using the service record only for attempts 
after the first one), there will be far fewer reads.

*Issue 2*
{quote} We can involve a new MR config to switch on/off this feature (off by 
default). However, I didn't see any implementation on this in demo patch {quote}
Yes, not in the demo patch.  I will add it in the next version and also 
maintain the old code path when the configuration is off (the default).
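
A minimal sketch of such a switch, for illustration only and not taken from the patch.  The property name below is hypothetical (no key was agreed on in this thread), and java.util.Properties stands in for the job configuration so that the example is self-contained:

```java
import java.util.Properties;

public class AmRestartSwitch {
    // Hypothetical key name; the real property has not been defined in this thread.
    static final String WORK_PRESERVING_KEY =
            "mapreduce.am.work-preserving-restart.enabled";
    // Off by default, as proposed in the review.
    static final boolean WORK_PRESERVING_DEFAULT = false;

    /** Returns true only when the feature has been switched on explicitly. */
    static boolean isWorkPreservingEnabled(Properties conf) {
        String raw = conf.getProperty(WORK_PRESERVING_KEY);
        return raw == null ? WORK_PRESERVING_DEFAULT : Boolean.parseBoolean(raw.trim());
    }

    /** Picks the code path; "legacy" is the existing behaviour. */
    static String selectRecoveryPath(Properties conf) {
        return isWorkPreservingEnabled(conf) ? "work-preserving" : "legacy";
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(selectRecoveryPath(conf)); // key unset, old code path
        conf.setProperty(WORK_PRESERVING_KEY, "true");
        System.out.println(selectRecoveryPath(conf)); // feature switched on
    }
}
```

Keeping the default in one constant means the old code path stays in effect for every job that does not set the key.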

{quote} Beside we need to replace the read path of registry service, another 
point is we don't necessary to keep the first attempt AM info which could 
saving most of overhead we are adding here as most applications are expected to 
end with single attempt. Isn't it? {quote}
Yes.  That's correct.  Very good suggestion.

{quote}Agree that named argument sounds better. However, this way has work for 
a long time for MapReduce project and we won't prefer to change unless we find 
some issue or bug. For path to service record, we need keep consistent with our 
decision on read path. {quote}
I think named arguments are better.  If we end up changing the interface of 
YarnChild anyway, I think we should switch to named arguments at the same 
time.  That depends on what we decide on *Issue 1*.

{quote}UmbilicalWithRetries should follow other existing practice (for RPC 
client retry during service down time) that to create a RetryProxy with 
FailoverProxyProvider (may be call it as MRAMProxy) for task attempt to contact 
with new attempt instance for AM.{quote}
Thanks much for this very useful suggestion.  I will incorporate it.
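
To make the idea concrete, here is a rough sketch of the retry-with-failover pattern using only JDK dynamic proxies.  It does not use Hadoop's RetryProxy or FailoverProxyProvider classes and is not the planned implementation.  The Umbilical interface, the withFailover helper and the locator are hypothetical stand-ins: the locator represents whatever lookup returns the current AM attempt's endpoint.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

public class RetryingUmbilicalSketch {

    /** Hypothetical stand-in for the task-to-AM protocol; not the real interface. */
    public interface Umbilical {
        String ping(String taskAttemptId) throws IOException;
    }

    /**
     * Wraps the protocol in a proxy that, on IOException, asks the locator for
     * the current AM endpoint again and retries, for at most maxAttempts calls.
     */
    public static <T> T withFailover(Class<T> protocol, Supplier<T> locator, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        InvocationHandler handler = new InvocationHandler() {
            private T current = locator.get();

            @Override
            public synchronized Object invoke(Object proxy, Method method, Object[] args)
                    throws Throwable {
                Throwable last = null;
                for (int attempt = 0; attempt < maxAttempts; attempt++) {
                    try {
                        return method.invoke(current, args);
                    } catch (InvocationTargetException e) {
                        last = e.getCause();
                        if (!(last instanceof IOException)) {
                            throw last; // not a connectivity problem, do not retry
                        }
                        current = locator.get(); // fail over to the latest AM attempt
                    }
                }
                throw last; // every attempt failed with an IOException
            }
        };
        return protocol.cast(Proxy.newProxyInstance(
                protocol.getClassLoader(), new Class<?>[] {protocol}, handler));
    }

    public static void main(String[] args) throws IOException {
        Umbilical dead = id -> { throw new IOException("first AM attempt is gone"); };
        Umbilical live = id -> "ack:" + id;
        int[] lookups = {0};
        // First lookup returns the dead endpoint, later lookups the restarted AM.
        Supplier<Umbilical> locator = () -> lookups[0]++ == 0 ? dead : live;

        Umbilical umbilical = withFailover(Umbilical.class, locator, 3);
        System.out.println(umbilical.ping("attempt_1")); // prints "ack:attempt_1"
    }
}
```

The sketch retries immediately; a real client would also need a backoff policy and a bound on total wait time while the new AM attempt starts.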

Please let me know your recommendation on *Issue 1*.

> Work Preserving AM Restart for MapReduce
> ----------------------------------------
>
>                 Key: MAPREDUCE-6608
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Srikanth Sampath
>            Assignee: Srikanth Sampath
>         Attachments: Patch1.patch, WorkPreservingMRAppMaster-1.pdf, 
> WorkPreservingMRAppMaster-2.pdf, WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM restart is provided by 
> [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489].  We would like 
> to take advantage of it for MapReduce (MR) applications.  The attached 
> document describes some challenges and discusses a few options.  We solicit 
> feedback from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
