[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557483#comment-13557483
 ] 

Andrew Purtell commented on MAPREDUCE-4495:
-------------------------------------------

Hi Bo,

bq. In terms of implementation, a prototype based on the v2 design in the 
document is finished.

Would it be possible to refresh the patch on this JIRA?

Also, I'm curious if you (or others) have put any thought into Robert's 
question:
{quote}
But for V2 and V3 when an AM is launched by the WF AM and not directly by the 
RM the WF AM must take over some responsibilities of the RM. I am curious how 
many of those responsibilities it will take over. I am also curious about what 
modifications will be required to other AMs so that they can interact with both 
the WF AM and also the RM directly.
{quote}

Could this be handled by an RM<->AM delegation API, with consideration for 
when the RM may kill a delegate that is not adequately fulfilling its 
responsibilities?
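
To make the delegation idea concrete, here is a rough sketch of how the RM side 
might track a delegate and decide when it has stopped responding. Everything in 
it is hypothetical: DelegationLease, reportProgress, and isExpired are 
illustrative names and are not part of any existing YARN API. Time is passed in 
explicitly so the logic is easy to reason about.

```java
/**
 * Hypothetical sketch, not a YARN API: the RM's view of one delegate AM
 * (for example the WF AM) that has taken over some RM responsibilities.
 */
final class DelegationLease {
    private final String delegateId;
    private final long timeoutMillis;
    private long lastReportMillis;

    DelegationLease(String delegateId, long timeoutMillis, long nowMillis) {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        this.delegateId = delegateId;
        this.timeoutMillis = timeoutMillis;
        this.lastReportMillis = nowMillis;
    }

    /** Called whenever the delegate reports in to the RM. */
    void reportProgress(long nowMillis) {
        lastReportMillis = nowMillis;
    }

    /**
     * True once the delegate has been silent for longer than the timeout.
     * At that point the RM could kill the delegate and reclaim what it held.
     */
    boolean isExpired(long nowMillis) {
        return nowMillis - lastReportMillis > timeoutMillis;
    }

    String delegateId() {
        return delegateId;
    }
}
```

A single timeout is the simplest possible policy. What "responding 
sufficiently" should mean for a delegate that launches other AMs remains an 
open design question.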

Finally, it would be interesting and useful if something like the WFAM proposed 
on this issue could maintain a persistent pool of workers, when configured to 
do so, to avoid container allocation and startup costs. One might imagine the 
cluster admin setting a minimum and a maximum reservation. There would also be 
design considerations for when and how the WFAM could spin up new containers on 
a best-effort basis, if resources are available, to improve the parallelism of 
DAG execution, and for notifying the WFAM when there is resource pressure and 
it should release idle persistent containers.
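
As a rough illustration of the pool bookkeeping described above, the following 
sketch models a minimum and maximum reservation, best-effort growth, and 
release of idle containers under resource pressure. The class and method names 
(WorkerPool, grow, onResourcePressure) are hypothetical and do not correspond 
to any existing YARN or WFAM API. Containers are modeled as plain counters.

```java
/** Hypothetical sketch of a persistent worker pool; not an existing API. */
final class WorkerPool {
    private final int min;   // admin-configured minimum reservation
    private final int max;   // admin-configured maximum
    private int held;        // containers currently held (busy + idle)
    private int busy;        // containers currently running work

    WorkerPool(int min, int max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("require 0 <= min <= max");
        }
        this.min = min;
        this.max = max;
        this.held = min;     // the minimum reservation is held from the start
    }

    /**
     * Best-effort growth: ask for up to 'wanted' more containers, bounded by
     * the configured maximum and by what the cluster has free right now.
     * Returns the number of containers actually added.
     */
    int grow(int wanted, int clusterFree) {
        int granted = Math.max(0, Math.min(wanted, Math.min(max - held, clusterFree)));
        held += granted;
        return granted;
    }

    /** Marks one idle container as busy; returns false if none is idle. */
    boolean acquire() {
        if (busy < held) {
            busy++;
            return true;
        }
        return false;
    }

    /** Returns one busy container to the idle set. */
    void release() {
        if (busy > 0) {
            busy--;
        }
    }

    /**
     * Reaction to a resource-pressure notification: give back idle
     * containers, but never drop below the minimum reservation.
     * Returns the number of containers released.
     */
    int onResourcePressure() {
        int idle = held - busy;
        int releasable = Math.max(0, Math.min(idle, held - min));
        held -= releasable;
        return releasable;
    }

    int held() { return held; }

    int idle() { return held - busy; }
}
```

For example, a pool created with a minimum of 2 and a maximum of 5 that asks 
for 10 more containers while the cluster has 2 free grows to 4, and shrinks 
back to 2 on resource pressure if at least two of those containers are idle.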
                
> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>         Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, 
> MapReduceWorkflowAM.pdf, yapp_proposal.txt
>
>
> It is useful to have a workflow application master, which will be capable of 
> running a DAG of jobs. The workflow client submits a DAG request to the AM 
> and then the AM will manage the life cycle of this application in terms of 
> requesting the needed resources from the RM, and starting, monitoring and 
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master, 
> these are some of the advantages:
>  - Fewer consumed resources, since only one application master will be 
> spawned for the whole workflow.
>  - Reuse of resources, since the same resources can be used by multiple 
> consecutive jobs in the workflow (no need to request/wait for resources for 
> every individual job from the central RM).
>  - More optimization opportunities in terms of collective resource requests.
>  - Optimization opportunities in terms of rewriting and composing jobs in the 
> workflow (e.g. pushing down Mappers).
>  - This Application Master can be reused/extended by higher-level systems 
> like Pig and Hive to provide an optimized way of running their workflows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
