[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428257#comment-13428257
]
Arun C Murthy commented on MAPREDUCE-4495:
------------------------------------------
Alejandro, making MR AM thread-safe is a good goal. We can do that
independently of the new AM. I have opened MAPREDUCE-4513 for the same.
I don't know which other 'private' classes you need; frankly, that concerns me.
It means you are adding new requirements on the MR-AM, which isn't an 'api' of
that nature.
Also, if we are going that route I strongly suggest we do not import code from
Oozie and merely take JobControl api and support it. That should be a trivial
exercise without adding any new 'interfaces' to MapReduce.
So, I see two options:
# Enhance the JobControl api to work in the AM by making the MR-AM, specifically
MRAppMaster, thread-safe. This will allow multiple MRAppMaster objects to be
created. This means there are no new interfaces to MapReduce.
# Go the full distance, make it generic, import code from Oozie, come up with a
new set of interfaces etc. etc. and do it in a separate Incubator project.
As I indicated previously, my preference is option #2 and I have already
offered help to deal with the specifics so you and Bo can concentrate on
getting the code out.
Thoughts?
> Workflow Application Master in YARN
> -----------------------------------
>
> Key: MAPREDUCE-4495
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Affects Versions: 2.0.0-alpha
> Reporter: Bo Wang
> Assignee: Bo Wang
>
> It is useful to have a workflow application master, which will be capable of
> running a DAG of jobs. The workflow client submits a DAG request to the AM
> and then the AM will manage the life cycle of this application in terms of
> requesting the needed resources from the RM, and starting, monitoring and
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master,
> these are some of the advantages:
> - Fewer resources consumed, since only one application master will
> be spawned for the whole workflow.
> - Reuse of resources, since the same resources can be used by multiple
> consecutive jobs in the workflow (no need to request/wait for resources for
> every individual job from the central RM).
> - More optimization opportunities in terms of collective resource requests.
> - Optimization opportunities in terms of rewriting and composing jobs in the
> workflow (e.g. pushing down Mappers).
> - This Application Master can be reused/extended by higher-level systems like
> Pig and Hive to provide an optimized way of running their workflows.
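
To make the "DAG of jobs" idea in the description above concrete, here is a minimal, self-contained sketch of how a single master could derive a valid run order from job dependencies. It is not code from this issue, from Oozie, or from Hadoop; the class name WorkflowSketch, the method executionOrder, and the job names in the usage note are all hypothetical, and real scheduling, resource requests, and retries are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: orders the jobs of a workflow DAG so that
// each job comes after all of its prerequisites (Kahn's topological sort).
public class WorkflowSketch {

    /**
     * @param deps maps each job name to the names of the jobs it depends on;
     *             every job, including those without prerequisites, must be a key
     * @return job names in an order that respects all dependencies
     */
    public static List<String> executionOrder(Map<String, List<String>> deps) {
        // Number of unfinished prerequisites per job.
        Map<String, Integer> pending = new LinkedHashMap<>();
        // Reverse edges: prerequisite -> jobs waiting on it.
        Map<String, List<String>> dependents = new HashMap<>();

        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
            for (String prerequisite : e.getValue()) {
                if (!deps.containsKey(prerequisite)) {
                    throw new IllegalArgumentException("Unknown job: " + prerequisite);
                }
                dependents.computeIfAbsent(prerequisite, k -> new ArrayList<>())
                          .add(e.getKey());
            }
        }

        // Jobs with no prerequisites can start immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.poll();
            order.add(job); // a real master would launch the job here
            for (String waiting : dependents.getOrDefault(job, Collections.<String>emptyList())) {
                // Release a waiting job once its last prerequisite is done.
                if (pending.merge(waiting, -1, Integer::sum) == 0) {
                    ready.add(waiting);
                }
            }
        }

        if (order.size() != deps.size()) {
            throw new IllegalStateException("Workflow contains a dependency cycle");
        }
        return order;
    }
}
```

For example, with hypothetical jobs where "clean" and "aggregate" both depend on "extract" and "report" depends on both of them, executionOrder places "extract" first and "report" last, so one master could run all four jobs in sequence without launching a separate application master per job.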