-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9811/
-----------------------------------------------------------

(Updated March 24, 2013, 8:08 p.m.)


Review request for giraph.


Changes
-------

Just a rebase. See last revision comments for good build and run instructions.


Description
-------

Port Giraph to "pure YARN" clusters, using Hadoop MapReduce classes in our code 
(IO formats etc.) but running the cluster job without any active participation 
by a running MapReduce framework. This means doing some things ourselves that 
Hadoop used to do for us.

I am putting this up for review to aid some non-Giraphers in having a peek at 
the YARN component. There is a bit of latency in the job launch that I am still 
diagnosing. I am also still finishing up an integration test to verify the YARN 
components can run a no-op Giraph job successfully. All BSP code is covered by 
our MRv1 tests, which are sufficient because once Giraph is running, it does 
not know or care whether it's running on YARN. Across the entire patch, only 
TWO files contain munge directives, FOUR in all. Everything else is 
conditionally compiled and/or controlled through conf settings, without ever 
calling into YARN-specific code from inside Giraph. This lets us put off 
ripping apart our IO formats and other baked-in MRv1 dependencies until we're 
ready to abandon MR. It also sets up a pattern that should make it easy to 
port Giraph to other cluster frameworks (Mesos, etc.).
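
The conf-driven half of that idea can be sketched roughly as follows. This is 
an illustration only, not code from the patch: the TaskLauncher interface, both 
launcher classes, and the sketch.cluster.backend key are hypothetical names, 
and a plain Map stands in for the Hadoop Configuration object.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; none of these names come from the patch. */
public class BackendSelectionSketch {

  /** Hypothetical seam between framework-agnostic code and the cluster. */
  interface TaskLauncher {
    String launch(String jobName);
  }

  /** Stand-in for the existing MapReduce-based launch path. */
  static class MapReduceLauncher implements TaskLauncher {
    @Override
    public String launch(String jobName) {
      return "mapreduce:" + jobName;
    }
  }

  /** Stand-in for a YARN launch path; makes no real YARN calls. */
  static class YarnLauncher implements TaskLauncher {
    @Override
    public String launch(String jobName) {
      return "yarn:" + jobName;
    }
  }

  /** Hypothetical conf key, not a real Giraph option. */
  static final String BACKEND_KEY = "sketch.cluster.backend";

  /** Picks a launcher from the conf so callers never name a backend. */
  static TaskLauncher chooseLauncher(Map<String, String> conf) {
    String backend = conf.getOrDefault(BACKEND_KEY, "mapreduce");
    switch (backend) {
      case "yarn":
        return new YarnLauncher();
      case "mapreduce":
        return new MapReduceLauncher();
      default:
        throw new IllegalArgumentException("Unknown backend: " + backend);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // No key set: falls back to the MapReduce path.
    System.out.println(chooseLauncher(conf).launch("noop"));
    // Switching backends is a conf change, not a code change.
    conf.put(BACKEND_KEY, "yarn");
    System.out.println(chooseLauncher(conf).launch("noop"));
  }
}
```

The point of the sketch is that the calling code only ever sees TaskLauncher, 
so adding another backend would mean one more implementation and one more conf 
value rather than changes scattered through the BSP code.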

I will ping Giraph folks when this is really ready for review (hopefully in 
the next day or so), but feel free to drop me a line now if you see something 
you are 
curious about or just plain don't like. The sooner I fix it, the sooner this 
gets committed, so please speak up if you do.

My goal is to make this not only our port to YARN, but also another good, 
well-commented example (there aren't many) of how to run "real applications" 
like Giraph on YARN clusters. So I'm hoping it's clear and easy to follow on 
that level too. Happy to hear feedback on that angle as well!

Thanks! I will post a wiki page explaining a bit more about this when it's all 
finished. This version still depends on Hadoop-2.0.3-alpha, but I will 
attempt to backport to 2.0.2 before I'm done, and a future JIRA should bring 
us to 2.0.0 or higher (and trunk, of course).
 


Diffs (updated)
-----

  checkstyle.xml 3d8a6d4 
  giraph-core/pom.xml 3580d0c 
  giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 5bd5686 
  giraph-core/src/main/java/org/apache/giraph/bsp/BspInputFormat.java bce84b1 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 6886d58 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ad9073d 
  giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java e74c59a 
  giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 9188a23 
  giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 41238d0 
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 74c1f87 
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphApplicationMaster.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnClient.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnTask.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/yarn/YarnUtils.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/yarn/package-info.java PRE-CREATION 
  giraph-core/src/test/java/org/apache/giraph/yarn/TestYarnJob.java PRE-CREATION 
  giraph-core/src/test/resources/capacity-scheduler.xml PRE-CREATION 
  giraph-examples/pom.xml 3b6a08c 
  pom.xml e576e4b 

Diff: https://reviews.apache.org/r/9811/diff/


Testing
-------

Getting there. An in-progress integration test is included for your amusement.


Thanks,

Eli Reisman
