[
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arun C Murthy updated MAPREDUCE-279:
------------------------------------
Attachment: MR-279_MR_files_to_move.txt
MR-279.sh
MR-279.patch
Folks, we are happy to put out a first cut of MRv2.
A brief overview:
A global ResourceManager (RM) tracks machine availability and scheduling
invariants while a per-application ApplicationMaster (AM) runs inside the
cluster and tracks the program semantics for a given job. An application can be
a single MapReduce job, as the JobTracker supports today, a directed acyclic
graph (DAG) of MapReduce jobs, or a new framework. Each machine in the cluster
runs a per-node daemon, the NodeManager
(NM), responsible for enforcing and reporting the resource allocations made by
the RM and monitoring the lifecycle of processes spawned on behalf of an
application. Each process started by the NM is conceptually a container, or a
bundle of resources allocated by the RM.
We call the new framework (RM/NM) YARN (Yet Another Resource Negotiator)...
;-)
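
To make the division of labour above concrete, here is a rough toy model in Java. It is only a sketch under loud assumptions: every class, method and field name below (YarnSketch, allocate, requestContainers, memory in MB as the only resource, first-fit placement) is hypothetical and made up for illustration. It is not the API or the scheduling policy in the attached patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model of the RM/AM/NM split; NOT the real YARN API. */
public class YarnSketch {

    /** A container: a bundle of resources on one node, allocated by the RM. */
    static final class Container {
        final int id;
        final String node;
        final int memoryMb;

        Container(int id, String node, int memoryMb) {
            this.id = id;
            this.node = node;
            this.memoryMb = memoryMb;
        }
    }

    /** Global RM: tracks free memory per machine and hands out containers. */
    static final class ResourceManager {
        private final Map<String, Integer> freeMb = new LinkedHashMap<>();
        private int nextId = 0;

        void registerNode(String node, int capacityMb) {
            freeMb.put(node, capacityMb);
        }

        /** First-fit (an assumption): first node with room, or null if none. */
        Container allocate(int memoryMb) {
            for (Map.Entry<String, Integer> e : freeMb.entrySet()) {
                if (e.getValue() >= memoryMb) {
                    e.setValue(e.getValue() - memoryMb);
                    return new Container(nextId++, e.getKey(), memoryMb);
                }
            }
            return null;
        }

        /** Returns a finished container's resources to its node. */
        void release(Container c) {
            freeMb.merge(c.node, c.memoryMb, Integer::sum);
        }

        int available(String node) {
            return freeMb.getOrDefault(node, 0);
        }
    }

    /** Per-node NM: starts processes in containers and tracks their lifecycle. */
    static final class NodeManager {
        final String node;
        final List<Integer> running = new ArrayList<>();

        NodeManager(String node) {
            this.node = node;
        }

        void launch(Container c) {
            if (!c.node.equals(node)) {
                throw new IllegalArgumentException("container belongs to " + c.node);
            }
            running.add(c.id);
        }
    }

    /** Per-application AM: knows the job's tasks and asks the RM for resources. */
    static final class ApplicationMaster {
        private final int tasks;
        private final int mbPerTask;

        ApplicationMaster(int tasks, int mbPerTask) {
            this.tasks = tasks;
            this.mbPerTask = mbPerTask;
        }

        /** Requests one container per task; stops when the RM has no room. */
        List<Container> requestContainers(ResourceManager rm) {
            List<Container> granted = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Container c = rm.allocate(mbPerTask);
                if (c == null) {
                    break;
                }
                granted.add(c);
            }
            return granted;
        }
    }

    public static void main(String[] args) {
        ResourceManager rm = new ResourceManager();
        Map<String, NodeManager> nms = new LinkedHashMap<>();
        for (String n : new String[] {"node1", "node2"}) {
            rm.registerNode(n, 2048);
            nms.put(n, new NodeManager(n));
        }
        ApplicationMaster am = new ApplicationMaster(3, 1024);
        for (Container c : am.requestContainers(rm)) {
            nms.get(c.node).launch(c); // the NM on that machine starts the process
            System.out.println("container " + c.id + " on " + c.node);
        }
    }
}
```

The point of the sketch is only that the RM knows nothing about tasks or job semantics, while the AM knows nothing about machine capacity; the two meet at the container.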
Source layout:
# A new yarn source folder contains the RM and NM.
# A new mr-client folder contains all of the MapReduce runtime. This includes
the MapReduce ApplicationMaster and all of the classes for running MapReduce
applications. Please note that the MR runtime has not changed at all, including
the user APIs - we continue to support both the old 'mapred' API and the new
'mapreduce' API (context-objects). We are moving some classes from
src/java/mapred/* to mr-client to achieve this.
# We have continued to keep the old JobTracker/TaskTracker based MapReduce
framework in src/java.
Build:
# We decided to embrace maven for MRv2, hence yarn and mr-client are built via
maven.
# For now the old JT/TT based MR framework continues to use ant/ivy. Hopefully
we can change this soon - I know Giri is working on this for common, hdfs and
mapreduce in one go.
There is an INSTALL file which describes how to build and deploy MRv2, and also
how to run MR applications.
----
I'm planning on committing this patch to a development branch (named
MAPREDUCE-279) soon so that we can continue all our work via Apache in the
open. We *really* look forward to feedback and working with the community
henceforth. We have many many miles to go and promises to keep! ;-)
PS: I have attached a script (MR-279.sh) to show the files being moved to
mr-client for the MR runtime, a list of the files being moved, and the actual
patch to apply afterwards. Also, please note that the patch is significantly
bigger than it should be since it includes binary images (via git diff --text).
> Map-Reduce 2.0
> --------------
>
> Key: MAPREDUCE-279
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobtracker, tasktracker
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.23.0
>
> Attachments: MR-279.patch, MR-279.sh, MR-279_MR_files_to_move.txt
>
>
> Re-factor MapReduce into a generic resource scheduler and a per-job,
> user-defined component that manages the application execution.