[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473467#comment-13473467 ]
Arun C Murthy commented on MAPREDUCE-4495:
------------------------------------------

Let me restate what I've been saying all along: WFAM has a very wide scope and is importing a whole new bunch (500KB) of code from Oozie, i.e. workflowlib. Given that, it belongs in a separate project by itself; there is no need to extend Hadoop to incorporate Oozie.

---

OTOH, if you merely want to support a DAG of MR jobs, we already have JobControl. We can trivially change JobControl to run in an AM without any need for workflowlib. So, let's not import that.

----

Let's not blow up Hadoop into an even bigger umbrella project by importing Oozie into it. Let's do it in an incubator project. I have a proposal which I'll share; you can be part of it from day one.

Makes sense?

Thanks.

> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>   Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>         Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch,
>                      MapReduceWorkflowAM.pdf
>
>
> It is useful to have a workflow application master, which will be capable of
> running a DAG of jobs. The workflow client submits a DAG request to the AM
> and then the AM will manage the life cycle of this application in terms of
> requesting the needed resources from the RM, and starting, monitoring and
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master,
> these are some of the advantages:
> - Fewer consumed resources, since only one application master will
> be spawned for the whole workflow.
> - Reuse of resources, since the same resources can be used by multiple
> consecutive jobs in the workflow (no need to request/wait for resources for
> every individual job from the central RM).
> - More optimization opportunities in terms of collective resource requests.
> - Optimization opportunities in terms of rewriting and composing jobs in the
> workflow (e.g. pushing down Mappers).
> - This Application Master can be reused/extended by higher-level systems like
> Pig and Hive to provide an optimized way of running their workflows.
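
For readers new to the thread, the core of "running a DAG of jobs" is ordering the jobs so that each one starts only after its prerequisites finish. Below is a minimal, self-contained sketch of that ordering step, using Kahn's topological sort. It is an illustration only: `WorkflowSketch`, `runOrder`, and the job names are hypothetical, and the code uses neither Hadoop's JobControl nor anything from the attached patch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not Hadoop's JobControl and not the attached patch.
class WorkflowSketch {

    /**
     * Returns job names in an order that respects dependencies.
     * deps maps each job name to the jobs that must finish before it starts.
     * Throws IllegalArgumentException on a cycle or an unknown prerequisite.
     */
    static List<String> runOrder(Map<String, List<String>> deps) {
        Map<String, Integer> pending = new LinkedHashMap<>();        // unmet prerequisites per job
        Map<String, List<String>> dependents = new LinkedHashMap<>(); // jobs unblocked by each job
        for (String job : deps.keySet()) {
            pending.put(job, 0);
            dependents.put(job, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            for (String prereq : e.getValue()) {
                if (!deps.containsKey(prereq)) {
                    throw new IllegalArgumentException("unknown job: " + prereq);
                }
                pending.merge(e.getKey(), 1, Integer::sum);
                dependents.get(prereq).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.poll();
            // A real application master would launch the job here and wait for it.
            order.add(job);
            for (String next : dependents.get(job)) {
                if (pending.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != deps.size()) {
            throw new IllegalArgumentException("workflow contains a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("extract", List.of());
        deps.put("clean", List.of("extract"));
        deps.put("join", List.of("extract"));
        deps.put("report", List.of("clean", "join"));
        System.out.println(runOrder(deps)); // prints [extract, clean, join, report]
    }
}
```

The sketch only computes an order. Resource requests to the RM, monitoring, and retries, which the description lists as the AM's responsibilities, are left out.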