[jira] Updated: (MAPREDUCE-279) Map-Reduce 2.0

Arun C Murthy (JIRA) Wed, 16 Mar 2011 18:22:59 -0700

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Arun C Murthy updated MAPREDUCE-279:
------------------------------------

    Attachment: MR-279_MR_files_to_move.txt
                MR-279.sh
                MR-279.patch

Folks, we are happy to put out a first cut of MRv2.

A brief overview:

A global ResourceManager (RM) tracks machine availability and scheduling 
invariants while a per-application ApplicationMaster (AM) runs inside the 
cluster and tracks the program semantics for a given job. An application is 
either a single MapReduce job, as the JobTracker supports today, a directed 
acyclic graph (DAG) of MapReduce jobs, or a new framework. Each machine in 
the cluster runs a per-node daemon, the NodeManager 
(NM), responsible for enforcing and reporting the resource allocations made by 
the RM and monitoring the lifecycle of processes spawned on behalf of an 
application. Each process started by the NM is conceptually a container, or a 
bundle of resources allocated by the RM.
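
To make the division of roles concrete, here is a toy, in-memory sketch of the 
RM/AM/NM split described above. It is not code from the patch: every class, 
method and field name is hypothetical, memory is the only resource modelled, 
and the first-fit placement is an assumption chosen purely for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Container:
    """A bundle of resources on one node, granted by the ResourceManager."""
    container_id: int
    node: str
    memory_mb: int


@dataclass
class NodeManager:
    """Per-node daemon: enforces allocations and tracks running containers."""
    name: str
    free_memory_mb: int
    running: Dict[int, Container] = field(default_factory=dict)

    def launch(self, container: Container) -> None:
        # Enforce the allocation: refuse anything the node cannot hold.
        if container.memory_mb > self.free_memory_mb:
            raise RuntimeError(f"{self.name}: allocation exceeds free memory")
        self.free_memory_mb -= container.memory_mb
        self.running[container.container_id] = container

    def complete(self, container_id: int) -> None:
        # The process exited; its resources are free again on this node.
        container = self.running.pop(container_id)
        self.free_memory_mb += container.memory_mb


class ResourceManager:
    """Global scheduler: tracks node availability, knows nothing about jobs."""

    def __init__(self, nodes: List[NodeManager]) -> None:
        self.nodes = {nm.name: nm for nm in nodes}
        self.available = {nm.name: nm.free_memory_mb for nm in nodes}
        self._next_id = 0

    def allocate(self, memory_mb: int, count: int) -> List[Container]:
        # First-fit placement (an assumption of this sketch): grant up to
        # `count` containers wherever memory is still unreserved.
        granted: List[Container] = []
        for name in self.available:
            while len(granted) < count and self.available[name] >= memory_mb:
                self.available[name] -= memory_mb
                self._next_id += 1
                granted.append(Container(self._next_id, name, memory_mb))
        return granted

    def release(self, container: Container) -> None:
        self.available[container.node] += container.memory_mb


class ApplicationMaster:
    """Per-application master: owns job semantics (here, a list of tasks)."""

    def __init__(self, rm: ResourceManager, tasks: List[str], memory_mb: int) -> None:
        self.rm = rm
        self.tasks = tasks
        self.memory_mb = memory_mb

    def run(self) -> List[Tuple[str, str]]:
        pending = list(self.tasks)
        finished: List[Tuple[str, str]] = []
        while pending:
            containers = self.rm.allocate(self.memory_mb, len(pending))
            if not containers:
                raise RuntimeError("cluster cannot fit a single task")
            for container in containers:
                task = pending.pop(0)
                nm = self.rm.nodes[container.node]
                nm.launch(container)                 # NM starts the process
                finished.append((task, container.node))
                nm.complete(container.container_id)  # task ran to completion
                self.rm.release(container)           # resources go back to the RM
        return finished
```

The point of the split is visible in what each class knows: the RM sees only 
nodes and free memory, the NM sees only the containers on its own machine, and 
the notion of a "task" exists only in the AM.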

We call the new framework (RM/NM) YARN (Yet Another Resource Negotiator)... 
;-)

Source layout:

# A new yarn source folder contains the RM and NM.
# A new mr-client folder contains all of the MapReduce runtime. This includes 
the MapReduce ApplicationMaster and all of the classes for running MapReduce 
applications. Please note that the MR runtime has not changed at all, including 
the user APIs: we continue to support both the old 'mapred' API and the new 
'mapreduce' API (context-objects). We are moving some classes from 
src/java/mapred/* to mr-client to achieve this.
# We have continued to keep the old JobTracker/TaskTracker based MapReduce 
framework in src/java.

Build:
# We decided to embrace maven for MRv2, hence yarn and mr-client are built via 
maven.
# For now the old JT/TT based MR framework continues to use ant/ivy. Hopefully 
we can change this soon - I know Giri is working on this for common, hdfs and 
mapreduce in one go.

There is an INSTALL file which describes how to build and deploy MRv2, and 
also how to run MR applications.


----

I'm planning on committing this patch to a development branch (named 
MAPREDUCE-279) soon so that we can continue all our work via Apache in the 
open. We *really* look forward to feedback and working with the community 
henceforth. We have many many miles to go and promises to keep! ;-)

PS: I have attached a script (MR-279.sh) that shows the files being moved to 
mr-client for the MR runtime, a list of the files being moved, and the actual 
patch to apply afterwards. Also, please note that the patch is significantly 
bigger than it should be since it includes binary images (via git diff --text).

> Map-Reduce 2.0
> --------------
>
>                 Key: MAPREDUCE-279
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, tasktracker
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.23.0
>
>         Attachments: MR-279.patch, MR-279.sh, MR-279_MR_files_to_move.txt
>
>
> Re-factor MapReduce into a generic resource scheduler and a per-job, 
> user-defined component that manages the application execution. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
