[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473453#comment-13473453
]
Alejandro Abdelnur commented on MAPREDUCE-4495:
-----------------------------------------------
Arun, I don't see why you should be disappointed: you told me you'd look at it
over the (last) weekend, and I waited as we agreed.
Sure, I'll wait.
bq *Why aren't ...*
As I've mentioned in the JIRA and to you personally, I think the WFAM has to
be close to the MRAM, at least until it matures.
bq *What is the need for the complex event system here?*
Which complex event system are you referring to? The patch uses YARN events. If
you mean the pattern of creating events in the workflowlib handlers and
dispatching them after the workflowlib call, that pattern (learned from Oozie)
makes it possible to checkpoint consistent states of the workflow job state
machine, thus enabling recovery from the last known state.
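A minimal sketch of the pattern described above, assuming a handler that only
computes a transition and a caller that checkpoints before dispatching. Every
name here (DeferredDispatchSketch, Transition, onActionDone, apply) is
hypothetical and is not taken from the patch, YARN, or Oozie; the in-memory
log merely stands in for durable storage and an event dispatcher.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: these names do not come from the patch, YARN, or Oozie.
class DeferredDispatchSketch {

    /** Result of a handler call: the next state plus events to dispatch later. */
    static final class Transition {
        final String nextState;
        final List<String> pendingEvents;

        Transition(String nextState, List<String> pendingEvents) {
            this.nextState = nextState;
            this.pendingEvents =
                Collections.unmodifiableList(new ArrayList<>(pendingEvents));
        }
    }

    // Records side effects in order; stands in for durable storage and a dispatcher.
    final List<String> log = new ArrayList<>();

    /** Handler: computes the transition but performs no side effects itself. */
    static Transition onActionDone(String finishedAction, String nextAction) {
        List<String> events = new ArrayList<>();
        events.add("DONE:" + finishedAction);
        events.add("START:" + nextAction);
        return new Transition(nextAction, events);
    }

    /** Checkpoint the new state first, then dispatch the collected events. */
    void apply(Transition t) {
        log.add("checkpoint:" + t.nextState);
        for (String event : t.pendingEvents) {
            log.add("dispatch:" + event);
        }
    }
}
```

Because nothing is dispatched until the checkpoint is written, a restart after
a failure would see either the old state with no events sent, or the new state
from which the events can be re-created.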
bq *Why aren't we using JobControl ..*
JobControl is job-dependency driven, relying on a topological sort to create
the execution DAG.
WFAM, by contrast, executes a specified DAG.
Certainly we could have an implementation of JobControl that creates a WFAM
job. IMO this would be a follow-up JIRA, and it would probably require some
tweaks to the JobControl API to be able to use both implementations (IMO this
reinforces my previous comment that WFAM should live in Hadoop MR).
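To illustrate the dependency-driven side of the distinction above, here is a
rough sketch of deriving an execution order from declared job dependencies
with a topological sort (Kahn's algorithm). It is not JobControl's actual
code; DependencyOrderSketch and topologicalOrder are hypothetical names. A
workflow that arrives as an already specified DAG would not need this
derivation step.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: not JobControl's implementation.
class DependencyOrderSketch {

    /**
     * deps maps each job to the jobs it depends on.
     * Returns an order in which every job appears after its dependencies.
     */
    static List<String> topologicalOrder(Map<String, List<String>> deps) {
        // Number of unfinished dependencies per job (sorted for stable output).
        Map<String, Integer> remaining = new TreeMap<>();
        // Reverse edges: dependency -> jobs waiting on it.
        Map<String, List<String>> dependents = new HashMap<>();

        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            remaining.putIfAbsent(e.getKey(), 0);
            for (String dependency : e.getValue()) {
                remaining.putIfAbsent(dependency, 0);
                remaining.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dependency, k -> new ArrayList<>())
                          .add(e.getKey());
            }
        }

        // Jobs with no dependencies can run immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            order.add(job);
            for (String next : dependents.getOrDefault(job, Collections.emptyList())) {
                // A job becomes ready once its last dependency has been ordered.
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }

        if (order.size() != remaining.size()) {
            throw new IllegalArgumentException("dependency cycle detected");
        }
        return order;
    }
}
```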
Thx
> Workflow Application Master in YARN
> -----------------------------------
>
> Key: MAPREDUCE-4495
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Affects Versions: 2.0.0-alpha
> Reporter: Bo Wang
> Assignee: Bo Wang
> Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch,
> MapReduceWorkflowAM.pdf
>
>
> It is useful to have a workflow application master, which will be capable of
> running a DAG of jobs. The workflow client submits a DAG request to the AM
> and then the AM will manage the life cycle of this application in terms of
> requesting the needed resources from the RM, and starting, monitoring and
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master,
> these are some of the advantages:
> - Fewer consumed resources, since only one application master will
> be spawned for the whole workflow.
> - Reuse of resources, since the same resources can be used by multiple
> consecutive jobs in the workflow (no need to request/wait for resources for
> every individual job from the central RM).
> - More optimization opportunities in terms of collective resource requests.
> - Optimization opportunities in terms of rewriting and composing jobs in the
> workflow (e.g. pushing down Mappers).
> - This Application Master can be reused/extended by higher-level systems like
> Pig and Hive to provide an optimized way of running their workflows.