[
https://issues.apache.org/jira/browse/SPARK-22419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235774#comment-16235774
]
Adam Kramer edited comment on SPARK-22419 at 11/2/17 2:01 PM:
--------------------------------------------------------------
I'll assume it's on purpose, for the reasons I stated above. Apologies for not
posting to the mailing list, but I have a feeling this could act as a good web
reference from search; I rarely get results from the mailing list while
troubleshooting in Google. Also, the documentation for using Spark with
upgraded versions of Hadoop (e.g. 2.8) is definitely lacking, or at best
confusing (i.e. a binary distribution that includes one version of the Hadoop
libs can still be configured to use another version of Hadoop by following the
instructions from the "without hadoop" wiki page). I suspect those instructions
are old; when using SPARK_DIST_CLASSPATH to override the Hadoop libraries, you
run into things like log4j.properties files being hijacked by the Hadoop
version, which changes your application logging altogether. My guess is that
it's something that worked well a while ago, or only in a very specific
situation, and thus requires a lot of trial and error.
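
To make the log4j.properties problem above easier to track down, here is a
minimal diagnostic sketch in Python. It is not part of Spark or Hadoop, and the
helper names (expand_entry, find_resource) are made up for illustration. It
assumes the classpath string uses the platform path separator and may contain
JVM-style "dir/*" wildcard entries. It lists, in classpath order, every entry
that ships a log4j.properties, on the assumption that the earliest entry on the
effective classpath is usually the one that gets loaded.

```python
import glob
import os
import zipfile


def expand_entry(entry):
    """Expand a JVM-style wildcard entry ("dir/*") into the jars it matches."""
    if entry.endswith("*"):
        base = entry[:-1]
        # Sorted only to keep the output stable; the JVM does not promise
        # any particular order inside a wildcard entry.
        return sorted(glob.glob(os.path.join(base, "*.jar")))
    return [entry]


def find_resource(classpath, resource="log4j.properties"):
    """Return the classpath entries that contain `resource`, in classpath order."""
    hits = []
    for raw in classpath.split(os.pathsep):
        if not raw:
            continue
        for entry in expand_entry(raw):
            if os.path.isdir(entry):
                # Directory entry: look for the file at its root.
                if os.path.isfile(os.path.join(entry, resource)):
                    hits.append(entry)
            elif zipfile.is_zipfile(entry):
                # Jar entry: look for the file at the archive root.
                with zipfile.ZipFile(entry) as jar:
                    if resource in jar.namelist():
                        hits.append(entry)
    return hits


if __name__ == "__main__":
    # Assumption: SPARK_DIST_CLASSPATH is set in this shell; if it is not,
    # nothing is printed.
    for hit in find_resource(os.environ.get("SPARK_DIST_CLASSPATH", "")):
        print(hit)
```

Note that this only inspects the string it is given. Spark puts its own conf
directory and jars on the classpath as well, so the output shows which Hadoop
entries are candidates for the hijacking, not which file wins in the end.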
> Hive and Hive Thriftserver jars missing from "without hadoop" build
> -------------------------------------------------------------------
>
> Key: SPARK-22419
> URL: https://issues.apache.org/jira/browse/SPARK-22419
> Project: Spark
> Issue Type: Question
> Components: Build
> Affects Versions: 2.1.1
> Reporter: Adam Kramer
> Priority: Minor
>
> The "without hadoop" binary distribution does not have Hive-related libraries
> in the jars directory. This may be due to Hive being tied to major releases
> of Hadoop. My project requires Hadoop 2.8, so the "without hadoop" version
> seemed the best option. Should I use make-distribution.sh instead?