GitHub user pwendell commented on a diff in the pull request:
https://github.com/apache/spark/pull/2014#discussion_r16933038
--- Diff: README.md ---
@@ -66,78 +69,24 @@ Many of the example programs print usage help if no params are given.
## Running Tests
-Testing first requires [building Spark](#building-spark). Once Spark is built, tests
-can be run using:
-
-    ./dev/run-tests
+Please see the guidance on how to
+[run all automated tests](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-AutomatedTesting)
## A Note About Hadoop Versions
Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
storage systems. Because the protocols have changed in different versions of
Hadoop, you must build Spark against the same version that your cluster runs.
-You can change the version by setting `-Dhadoop.version` when building Spark.
-
-For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
-versions without YARN, use:
-
-    # Apache Hadoop 1.2.1
-    $ sbt/sbt -Dhadoop.version=1.2.1 assembly
-
-    # Cloudera CDH 4.2.0 with MapReduce v1
-    $ sbt/sbt -Dhadoop.version=2.0.0-mr1-cdh4.2.0 assembly
-
-For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
-with YARN, also set `-Pyarn`:
-
-    # Apache Hadoop 2.0.5-alpha
-    $ sbt/sbt -Dhadoop.version=2.0.5-alpha -Pyarn assembly
-
-    # Cloudera CDH 4.2.0 with MapReduce v2
-    $ sbt/sbt -Dhadoop.version=2.0.0-cdh4.2.0 -Pyarn assembly
-
-    # Apache Hadoop 2.2.X and newer
-    $ sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly
-
-When developing a Spark application, specify the Hadoop version by adding the
-"hadoop-client" artifact to your project's dependencies. For example, if you're
-using Hadoop 1.2.1 and build your application using SBT, add this entry to
-`libraryDependencies`:
-
-    "org.apache.hadoop" % "hadoop-client" % "1.2.1"
-
-If your project is built with Maven, add this to your POM file's `<dependencies>` section:
-
-    <dependency>
-      <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-client</artifactId>
-      <version>1.2.1</version>
-    </dependency>
-
-
-## A Note About Thrift JDBC server and CLI for Spark SQL
-
-Spark SQL supports Thrift JDBC server and CLI.
-See sql-programming-guide.md for more information about using the JDBC server and CLI.
-You can use those features by setting `-Phive` when building Spark as follows.
-
-    $ sbt/sbt -Phive assembly
+Please refer to the build documentation at
+["Specifying the Hadoop
Version"](http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version)
--- End diff --
Yeah, maybe we should change the title of the doc to "Building Spark". It might also be nice to add a quick note at the bottom saying that SBT is supported for developer builds, but that Maven is the reference build for all packaging. That might be the best way to do this overall.
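
Just to sketch what that note might look like (illustrative commands only; the flags mirror the README examples above, and the exact Maven invocation should be checked against the build docs):

    # Reference build used for all packaging (Maven)
    mvn -Pyarn -Dhadoop.version=2.2.0 -DskipTests clean package

    # Quicker iterative build for day-to-day development (SBT)
    sbt/sbt -Pyarn -Dhadoop.version=2.2.0 assembly

That would keep the doc SBT-friendly for contributors while making it clear that releases are cut with Maven.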