GitHub user aarondav opened a pull request:
https://github.com/apache/spark/pull/237
[WIP] SPARK-1314: Use SPARK_HIVE to determine if we include Hive in
packaging
Previously, we decided whether to include the datanucleus jars based on
the existence of a spark-hive-assembly jar, which was incidentally built
whenever "sbt assembly" was run. This meant that a typical and previously
supported pathway would start using Hive jars.
This patch has the following features/bug fixes:
- Use of SPARK_HIVE (default false) to determine if we should include Hive
in the assembly jar.
- Analogous feature in Maven with -Phive (previously, there was no support
for adding Hive to any of our jars produced by Maven)
- assemble-deps fixed since we no longer use a different ASSEMBLY_DIR
- avoid adding log message in compute-classpath.sh to the classpath :)
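
As a rough illustration of the first and last items above, here is a minimal
sketch of how a classpath script could gate the datanucleus jars on SPARK_HIVE
while keeping log output off stdout. This is not the actual
compute-classpath.sh: the function name compute_classpath, the lib_dir
argument, and the jar name spark-assembly.jar are assumptions made for the
example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the actual compute-classpath.sh.

# Treat SPARK_HIVE as false unless it is explicitly set to "true".
SPARK_HIVE="${SPARK_HIVE:-false}"

# Print a classpath for the jars found in the given directory.
# The directory layout and the assembly jar name are placeholders.
compute_classpath() {
  local lib_dir="$1"
  local classpath="$lib_dir/spark-assembly.jar"
  local jar

  if [ "$SPARK_HIVE" = "true" ]; then
    # Log to stderr so the message never ends up in the captured classpath.
    echo "SPARK_HIVE is set, including datanucleus jars" >&2
    for jar in "$lib_dir"/datanucleus-*.jar; do
      # Skip the literal glob pattern when no jar matches.
      if [ -e "$jar" ]; then
        classpath="$classpath:$jar"
      fi
    done
  fi

  # Only the classpath itself goes to stdout.
  echo "$classpath"
}
```

A caller that captures the output with CLASSPATH=$(compute_classpath "$dir")
receives only the classpath, since the log line is written to stderr.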
Still TODO before mergeable:
- We need to download the datanucleus jars outside of sbt. Perhaps we can
have spark-class download them if SPARK_HIVE is set, similar to how sbt
downloads itself.
- Spark SQL documentation updates.
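
One possible shape for the download step in the first TODO item, sketched
under assumptions: the helper name ensure_jar and the variables
DATANUCLEUS_URL and DATANUCLEUS_JAR are hypothetical, and no real download
location is implied. The idea is simply to fetch a jar once and reuse it
afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of spark-class.

# Download a jar to dest unless it is already present.
# Both arguments come from the caller; no real URL is assumed here.
ensure_jar() {
  local url="$1"
  local dest="$2"
  local tmp="$dest.part"

  if [ -f "$dest" ]; then
    return 0   # already downloaded, nothing to do
  fi

  mkdir -p "$(dirname "$dest")"
  # Download to a temporary name so a failed transfer never leaves a
  # truncated jar at the final path.
  if command -v curl >/dev/null 2>&1; then
    curl --fail --silent --show-error --location --output "$tmp" "$url" || return 1
  elif command -v wget >/dev/null 2>&1; then
    wget --quiet --output-document="$tmp" "$url" || return 1
  else
    echo "Neither curl nor wget is available to fetch $url" >&2
    return 1
  fi
  mv "$tmp" "$dest"
}

# Only fetch when Hive support was requested.
# DATANUCLEUS_URL and DATANUCLEUS_JAR are placeholders for illustration.
if [ "${SPARK_HIVE:-false}" = "true" ] && [ -n "${DATANUCLEUS_URL:-}" ]; then
  ensure_jar "$DATANUCLEUS_URL" "${DATANUCLEUS_JAR:-lib/datanucleus.jar}"
fi
```

With SPARK_HIVE unset, the script does nothing, which matches the proposed
default of false.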
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/aarondav/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/237.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #237
----
commit d09b3a0dea86a8a16e7901af3b3a801d47515c6b
Author: Aaron Davidson <[email protected]>
Date: 2014-03-13T18:26:28Z
[WIP] Use SPARK_HIVE to determine if we include Hive in packaging
Previously, we decided whether to include the datanucleus jars based on
the existence of a spark-hive-assembly jar, which was incidentally built
whenever "sbt assembly" was run. This meant that a typical and previously
supported pathway would start using Hive jars.
This patch has the following features/bug fixes:
- Use of SPARK_HIVE (default false) to determine if we should include Hive
in the assembly jar.
- Analogous feature in Maven with -Phive.
- assemble-deps fixed since we no longer use a different ASSEMBLY_DIR
Still TODO before mergeable:
- We need to download the datanucleus jars outside of sbt. Perhaps we can
have spark-class download them if SPARK_HIVE is set similar to how sbt
downloads itself.
- Spark SQL documentation updates.
----