[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427585#comment-13427585
]
Chris A. Mattmann commented on MAPREDUCE-4495:
----------------------------------------------
Hi Josh:
bq. @Chris, thanks for clarifying your meaning. I think the line that threw me
for a curve in your original comment was "Putting code into Apache Hadoop is
the same as putting code in yet-to-be-named-Apache-Incubator-project," since
for that to be true, we would need for the incubating project to have, at the
very least, a source code repository created by infrastructure.
And again, having the repo created isn't a big deal. Most mentors can do this
themselves on day 1 of the project being accepted. It's a simple svn mkdir
https://svn.apache.org/repos/asf/...
Besides that, regarding your statistics, I wouldn't put much faith in their
data quality, since:
* JIRA issues aren't always closed the minute/second/millisecond the work is
done. I've been involved in lots of projects where an issue is finished, and
then the issue is closed days, weeks, even months later (e.g., "oh, I forgot to
close that issue...")
* Not all work in INFRA is a JIRA issue.
* The JIRA SVN plugin requires that the committer tag the SVN commit with the
JIRA issue ID. Not everything is tracked in Subversion.
Thus, unfortunately, garbage in, garbage out.
> Workflow Application Master in YARN
> -----------------------------------
>
> Key: MAPREDUCE-4495
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Affects Versions: 2.0.0-alpha
> Reporter: Bo Wang
> Assignee: Bo Wang
>
> It is useful to have a workflow application master, which will be capable of
> running a DAG of jobs. The workflow client submits a DAG request to the AM
> and then the AM will manage the life cycle of this application in terms of
> requesting the needed resources from the RM, and starting, monitoring and
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master,
> these are some of the advantages:
> - Fewer consumed resources, since only one application master will
> be spawned for the whole workflow.
> - Reuse of resources, since the same resources can be used by multiple
> consecutive jobs in the workflow (no need to request/wait for resources for
> every individual job from the central RM).
> - More optimization opportunities in terms of collective resource requests.
> - Optimization opportunities in terms of rewriting and composing jobs in the
> workflow (e.g. pushing down Mappers).
> - This Application Master can be reused/extended by higher-level systems like
> Pig and Hive to provide an optimized way of running their workflows.