-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9811/
-----------------------------------------------------------

(Updated March 28, 2013, 10:06 p.m.)


Review request for giraph.


Changes
-------

Another rebase to trunk; this is good to go as of today (for now!). See the previous updated descriptions for a good command line to build and run this. Thanks for your time!


Description
-------

Port Giraph to "pure YARN" clusters, using Hadoop MapReduce classes in our code (IO formats, etc.) but running the cluster job without any active participation by a running MapReduce framework. This means doing some things ourselves that Hadoop used to do for us.

I am putting this up for review to aid some non-Giraphers in having a peek at the YARN component. There is a bit of latency in the job launch that I am still diagnosing. I am also still finishing up an integration test to verify that the YARN components can run a no-op Giraph job successfully. All BSP code is covered by our MRv1 tests, which are sufficient since, once Giraph is running, it does not know or care whether it's running on YARN.

The grand total is TWO files with FOUR actual munges for the entire patch. All the rest is conditionally compiled and/or manipulated through conf settings without ever calling into YARN-specific code from inside Giraph. This will allow us to hold off on ripping apart our IO formats or other MRv1 baked-in dependencies until we're ready to abandon MR. This also sets up a paradigm by which it will be easy to port us to other cluster frameworks (Mesos, etc.).

I will ping Giraph folks when this is really ready for review (hopefully in the next day or so), but feel free to drop me a line now if you see something you are curious about or just plain don't like. The sooner I fix it, the sooner this gets committed, so please speak up if you do.

My goal is to make this not only our port to YARN, but another (there aren't many) good and well-commented example of how to run "real applications" like Giraph on YARN clusters. So I'm hoping it's clear and easy to follow on that level too. Happy to hear feedback on that angle as well! Thanks!

Will post a wiki page explaining a bit more about this when it's all finished. This version still depends on Hadoop-2.0.3-alpha, but I will attempt to backport to 2.0.2 before I'm done, and a future JIRA should bring us to 2.0.0 or higher (and trunk, of course).


Diffs (updated)
-----

  checkstyle.xml 370c120
  giraph-core/pom.xml 3580d0c
  giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 5bd5686
  giraph-core/src/main/java/org/apache/giraph/bsp/BspInputFormat.java cc53271
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 963b82a
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c5b9b93
  giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java 57f7dff
  giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 404e47e
  giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java bd30455
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 74c1f87
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphApplicationMaster.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnClient.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnTask.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/yarn/YarnUtils.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/yarn/package-info.java PRE-CREATION
  giraph-core/src/test/java/org/apache/giraph/yarn/TestYarnJob.java PRE-CREATION
  giraph-core/src/test/resources/capacity-scheduler.xml PRE-CREATION
  giraph-examples/pom.xml 3b6a08c
  pom.xml 1e321b8

Diff: https://reviews.apache.org/r/9811/diff/


Testing
-------

Getting there; an in-progress integration test is included for your amusement.


Thanks,

Eli Reisman
