[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474722#comment-13474722
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-4495:
----------------------------------------------------

Just caught up with the discussion.

I am willing to help with all the support and improvements that this AM will
need from the YARN side. I am also willing to discuss and vet the design of the
workflow AM; it clearly has many champions. I am generally +1 on the workflow
idea, though I don't quite agree with some of the design points. We can argue
about those separately.

Regarding the project home, I agree with the general sentiment here. I am sure
there will be a proliferation of YARN frameworks and apps in the future.
Burdening the Hadoop YARN or MapReduce communities with the maintenance of all
those projects is clearly a road that we should all look to avoid, given our
experience with the contrib projects in the past.

We have an opportunity to set a clear precedent here. In my opinion, this code
is best hosted at Oozie, but the Incubator is a clear alternative, as others
have noted. I am willing to help with any efforts needed to push this into the
Incubator. Others like PaaS and MPI should also follow suit. Thanks.
                
> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>         Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, 
> MapReduceWorkflowAM.pdf
>
>
> It is useful to have a workflow application master, which will be capable of 
> running a DAG of jobs. The workflow client submits a DAG request to the AM 
> and then the AM will manage the life cycle of this application in terms of 
> requesting the needed resources from the RM, and starting, monitoring and 
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master, 
> these are some of the advantages:
>  - Fewer resources consumed, since only one application master will
> be spawned for the whole workflow.
>  - Reuse of resources, since the same resources can be used by multiple 
> consecutive jobs in the workflow (no need to request/wait for resources for 
> every individual job from the central RM).
>  - More optimization opportunities in terms of collective resource requests.
>  - Optimization opportunities in terms of rewriting and composing jobs in the 
> workflow (e.g. pushing down Mappers).
>  - This Application Master can be reused/extended by higher-level systems like
> Pig and Hive to provide an optimized way of running their workflows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
