Merge pull request #199 from harveyfeng/yarn-2.2

Hadoop 2.2 migration

Includes support for the YARN API stabilized in the Hadoop 2.2 release, and a few style patches.

Short description for each set of commits:

a98f5a0 - "Misc style changes in the 'yarn' package"
a67ebf4 - "A few more style fixes in the 'yarn' package"
Both of these are some minor style changes, such as fixing lines over 100 chars, to the existing YARN code.

ab8652f - "Add a 'new-yarn' directory ... "
Copies everything from `SPARK_HOME/yarn` to `SPARK_HOME/new-yarn`. No actual code changes here.

4f1c3fa - "Hadoop 2.2 YARN API migration ..."
API patches to code in the `SPARK_HOME/new-yarn` directory. There are a few more small style changes mixed in, too.
Based on @colorant's Hadoop 2.2 support for the scala-2.10 branch in #141.

a1a1c62 - "Add optional Hadoop 2.2 settings in sbt build ... "
If Spark should be built against Hadoop 2.2, then:
a) the `org.apache.spark.deploy.yarn` package will be compiled from the `new-yarn` directory.
b) Protobuf v2.5 will be used as a Spark dependency, since Hadoop 2.2 depends on it. Also, Spark will be built against a version of Akka v2.0.5 that's built against Protobuf 2.5, named `akka-2.0.5-protobuf-2.5`. The patched Akka is here: https://github.com/harveyfeng/akka/tree/2.0.5-protobuf-2.5, and was published to local Ivy during testing.

There's also a new boolean environment variable, `SPARK_IS_NEW_HADOOP`, that users can manually set if their `SPARK_HADOOP_VERSION` specification does not start with `2.2`, which is how the build file tries to detect a 2.2 version. Not sure if this is necessary or done in the best way, though...
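
The detection rule described for a1a1c62 can be sketched roughly as follows. This is an illustration only, not the code in `project/SparkBuild.scala`: the object and method names are hypothetical, the environment is passed in as a plain `Map` so the logic can be exercised in isolation, and it assumes that an explicitly set `SPARK_IS_NEW_HADOOP` takes precedence over the version-prefix check.

```scala
object HadoopBuildSettings {
  // Hypothetical helper; `env` stands in for the process environment.
  def isNewHadoop(env: Map[String, String]): Boolean =
    env.get("SPARK_IS_NEW_HADOOP") match {
      // Assumption: an explicit setting wins. toBoolean accepts "true" or
      // "false" (case-insensitive) and throws on anything else.
      case Some(flag) => flag.toBoolean
      // Otherwise fall back to the prefix check on SPARK_HADOOP_VERSION.
      case None => env.getOrElse("SPARK_HADOOP_VERSION", "").startsWith("2.2")
    }

  // Directory the org.apache.spark.deploy.yarn package would be compiled from.
  def yarnSourceDir(env: Map[String, String]): String =
    if (isNewHadoop(env)) "new-yarn" else "yarn"
}
```

In a real build the map would come from the process environment, for example `HadoopBuildSettings.yarnSourceDir(sys.env)`. A vendor version string that does not begin with `2.2` would need `SPARK_IS_NEW_HADOOP=true` to select `new-yarn` under this sketch.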

Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/72b69615
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/72b69615
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/72b69615

Branch: refs/heads/scala-2.10
Commit: 72b696156c8662cae2cef4b943520b4be86148ea
Parents: 182f9ba 46b87b8
Author: Matei Zaharia <[email protected]>
Authored: Wed Dec 4 23:33:04 2013 -0800
Committer: Matei Zaharia <[email protected]>
Committed: Wed Dec 4 23:33:04 2013 -0800

----------------------------------------------------------------------
 core/pom.xml                                    |  10 +-
 .../scala/org/apache/spark/SparkContext.scala   |   2 +-
 new-yarn/pom.xml                                | 161 +++++
 .../spark/deploy/yarn/ApplicationMaster.scala   | 446 ++++++++++++
 .../yarn/ApplicationMasterArguments.scala       |  94 +++
 .../org/apache/spark/deploy/yarn/Client.scala   | 519 ++++++++++++++
 .../spark/deploy/yarn/ClientArguments.scala     | 148 ++++
 .../yarn/ClientDistributedCacheManager.scala    | 228 ++++++
 .../spark/deploy/yarn/WorkerLauncher.scala      | 223 ++++++
 .../spark/deploy/yarn/WorkerRunnable.scala      | 209 ++++++
 .../deploy/yarn/YarnAllocationHandler.scala     | 687 +++++++++++++++++++
 .../spark/deploy/yarn/YarnSparkHadoopUtil.scala |  43 ++
 .../cluster/YarnClientClusterScheduler.scala    |  47 ++
 .../cluster/YarnClientSchedulerBackend.scala    | 109 +++
 .../cluster/YarnClusterScheduler.scala          |  55 ++
 .../ClientDistributedCacheManagerSuite.scala    | 220 ++++++
 pom.xml                                         |  61 +-
 project/SparkBuild.scala                        |  34 +-
 streaming/pom.xml                               |   9 +-
 .../spark/deploy/yarn/ApplicationMaster.scala   | 172 ++---
 .../org/apache/spark/deploy/yarn/Client.scala   | 151 ++--
 .../spark/deploy/yarn/WorkerRunnable.scala      |  85 ++-
 .../deploy/yarn/YarnAllocationHandler.scala     | 346 ++++++----
 23 files changed, 3716 insertions(+), 343 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/72b69615/core/src/main/scala/org/apache/spark/SparkContext.scala
----------------------------------------------------------------------
