[GitHub] spark pull request: [WIP] SPARK-1314: Use SPARK_HIVE to determine ...

aarondav Wed, 26 Mar 2014 00:37:22 -0700

GitHub user aarondav opened a pull request:

    https://github.com/apache/spark/pull/237


    [WIP] SPARK-1314: Use SPARK_HIVE to determine if we include Hive in packaging

    Previously, we decided whether to include the datanucleus jars based on
    the existence of a spark-hive-assembly jar, which was incidentally built
    whenever "sbt assembly" was run. This meant that a typical and previously
    supported pathway would start using Hive jars.
    
    This patch has the following features/bug fixes:
    
    - Use of SPARK_HIVE (default false) to determine if we should include Hive
      in the assembly jar.
    - Analogous feature in Maven with -Phive (previously, there was no support
      for adding Hive to any of our jars produced by Maven)
    - assemble-deps fixed since we no longer use a different ASSEMBLY_DIR
    - avoid adding log message in compute-classpath.sh to the classpath :)
    
    Still TODO before mergeable:
    - We need to download the datanucleus jars outside of sbt. Perhaps we can
      have spark-class download them if SPARK_HIVE is set, similar to how sbt
      downloads itself.
    - Spark SQL documentation updates.
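
The SPARK_HIVE switch and the compute-classpath.sh log-message fix described above can be illustrated with a minimal sketch. This is not the patch itself (the real logic lives in shell scripts and the sbt/Maven builds); the function names, the jar file names, the accepted truthy values, and the ":" separator are all assumptions made for illustration.

```python
import sys


def hive_enabled(env):
    """Return True when SPARK_HIVE is set to a truthy value.

    Assumption: "true" and "1" count as enabled; the default is false,
    matching the "default false" behaviour described in the pull request.
    """
    return env.get("SPARK_HIVE", "false").strip().lower() in ("true", "1")


def compute_classpath(env, assembly_jar, datanucleus_jars):
    """Build a Unix-style classpath string (hypothetical helper).

    The Hive-related jars are appended only when SPARK_HIVE is enabled,
    rather than whenever a Hive assembly jar happens to exist on disk.
    """
    entries = [assembly_jar]
    if hive_enabled(env):
        # Diagnostics go to stderr so that a caller capturing stdout
        # receives only the classpath, never the log message.
        print("SPARK_HIVE is set, including Hive support.", file=sys.stderr)
        entries.extend(datanucleus_jars)
    return ":".join(entries)


if __name__ == "__main__":
    # Hypothetical jar names, used only to show the two outcomes.
    jars = ["datanucleus-core.jar", "datanucleus-rdbms.jar"]
    print(compute_classpath({}, "spark-assembly.jar", jars))
    print(compute_classpath({"SPARK_HIVE": "true"}, "spark-assembly.jar", jars))
```

Keying the decision on an explicit variable rather than on the presence of a build artifact means that running "sbt assembly" alone does not change which jars end up on the classpath.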

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aarondav/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/237.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #237
    
----
commit d09b3a0dea86a8a16e7901af3b3a801d47515c6b
Author: Aaron Davidson <[email protected]>
Date:   2014-03-13T18:26:28Z

    [WIP] Use SPARK_HIVE to determine if we include Hive in packaging
    
    Previously, we decided whether to include the datanucleus jars based on
    the existence of a spark-hive-assembly jar, which was incidentally built
    whenever "sbt assembly" was run. This meant that a typical and
    previously supported pathway would start using Hive jars.
    
    This patch has the following features/bug fixes:
    
    - Use of SPARK_HIVE (default false) to determine if we should include Hive
      in the assembly jar.
    - Analogous feature in Maven with -Phive.
    - assemble-deps fixed since we no longer use a different ASSEMBLY_DIR
    
    Still TODO before mergeable:
    - We need to download the datanucleus jars outside of sbt. Perhaps we can
      have spark-class download them if SPARK_HIVE is set, similar to how sbt
      downloads itself.
    - Spark SQL documentation updates.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
