[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473467#comment-13473467
]
Arun C Murthy commented on MAPREDUCE-4495:
------------------------------------------
Let me restate what I've been saying all along:
WFAM has a very wide scope and imports a whole new chunk (500KB) of code from
Oozie, i.e. workflowlib.
Given that, it belongs in a separate project by itself; there is no need to
extend Hadoop to incorporate Oozie.
---
OTOH, if you merely want to support a DAG of MR jobs, we already have
JobControl - we can, trivially, change JobControl to run in an AM without any
need for workflowlib. So, let's not import that in.
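To make the "DAG of MR jobs" idea concrete: it amounts to running each job once every job it depends on has succeeded. The following is a minimal standalone sketch of that loop in plain Java. It is not JobControl's code or API; `DagSketch`, `Node`, `dependsOn`, and `run` are hypothetical names chosen for illustration, and each job is reduced to a callback that reports success or failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch only: this is NOT Hadoop's JobControl, just the
// "run a job once its dependencies have succeeded" idea in isolation.
class DagSketch {
    enum State { WAITING, SUCCEEDED, FAILED, SKIPPED }

    static final class Node {
        final String name;
        final BooleanSupplier work;               // returns true on success
        final List<Node> deps = new ArrayList<>();
        State state = State.WAITING;

        Node(String name, BooleanSupplier work) {
            this.name = name;
            this.work = work;
        }

        Node dependsOn(Node other) {
            deps.add(other);
            return this;
        }
    }

    // Sweeps over the nodes until a full pass makes no progress.
    // Returns the names of the nodes that were actually executed, in order.
    // Nodes caught in a dependency cycle are never executed and stay WAITING.
    static List<String> run(List<Node> nodes) {
        List<String> executed = new ArrayList<>();
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (Node n : nodes) {
                if (n.state != State.WAITING) {
                    continue;
                }
                boolean blocked = n.deps.stream().anyMatch(
                        d -> d.state == State.FAILED || d.state == State.SKIPPED);
                if (blocked) {
                    // A dependency failed or was skipped, so this node cannot run.
                    n.state = State.SKIPPED;
                    progressed = true;
                } else if (n.deps.stream().allMatch(d -> d.state == State.SUCCEEDED)) {
                    n.state = n.work.getAsBoolean() ? State.SUCCEEDED : State.FAILED;
                    executed.add(n.name);
                    progressed = true;
                }
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        Node extract = new Node("extract", () -> true);
        Node transform = new Node("transform", () -> true).dependsOn(extract);
        Node load = new Node("load", () -> true).dependsOn(transform);

        // Nodes are deliberately listed out of order; dependencies decide the order.
        List<Node> nodes = new ArrayList<>();
        nodes.add(load);
        nodes.add(transform);
        nodes.add(extract);
        System.out.println(run(nodes));
    }
}
```

The sketch runs jobs one at a time for simplicity; a real controller would submit all ready jobs concurrently and poll for their completion.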
----
Let's not blow up Hadoop into an even bigger umbrella project by importing
Oozie into it. Let's do it in an incubator project.
I have a proposal which I'll share; you can be part of it from day one.
Makes sense? Thanks.
> Workflow Application Master in YARN
> -----------------------------------
>
> Key: MAPREDUCE-4495
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Affects Versions: 2.0.0-alpha
> Reporter: Bo Wang
> Assignee: Bo Wang
> Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch,
> MapReduceWorkflowAM.pdf
>
>
> It is useful to have a workflow application master, which will be capable of
> running a DAG of jobs. The workflow client submits a DAG request to the AM
> and then the AM will manage the life cycle of this application in terms of
> requesting the needed resources from the RM, and starting, monitoring and
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master,
> these are some of the advantages:
> - Fewer resources consumed, since only one application master will
> be spawned for the whole workflow.
> - Reuse of resources, since the same resources can be used by multiple
> consecutive jobs in the workflow (no need to request/wait for resources for
> every individual job from the central RM).
> - More optimization opportunities in terms of collective resource requests.
> - Optimization opportunities in terms of rewriting and composing jobs in the
> workflow (e.g. pushing down Mappers).
> - This Application Master can be reused/extended by higher-level systems like
> Pig and Hive to provide an optimized way of running their workflows.