[
https://issues.apache.org/jira/browse/PHOENIX-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586618#comment-14586618
]
Josh Mahonin commented on PHOENIX-2040:
---------------------------------------
[~ndimiduk] I've applied the patch to branch 4.4-HBase-0.98
(4.4.1-HBase-0.98-SNAPSHOT) and can confirm that both sqlline and psql work,
though the benign sqlline NPE I've seen in 4.4.0 is still there.
Re: Spark jobs and client JAR... that one's tricky. Other users [1] and I
have run into Spark classpath issues when working with HBase. Although it
should be technically possible to manually include in a job's JAR all of the
dependencies necessary to get Spark talking to Phoenix, in practice it's
fraught with problems. Since most Spark distributions already include some
HBase libs, the classpath generally has to get updated with the proper
'hbase-protocol' JAR no matter what. What I've found to be the easiest
solution is just to include the Phoenix client JAR in the Spark classpath for
both the driver and executors and forget about it [2].
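For illustration, putting the client JAR on both classpaths can look roughly
like the following in spark-defaults.conf. The two property names are standard
Spark settings; the JAR path and file name are placeholders, not taken from
this thread, and need to match the local install.

```properties
# spark-defaults.conf (sketch; the path below is a hypothetical location)
# Puts the Phoenix client JAR on the classpath of the driver and every executor.
spark.driver.extraClassPath    /opt/phoenix/phoenix-4.4.0-HBase-0.98-client.jar
spark.executor.extraClassPath  /opt/phoenix/phoenix-4.4.0-HBase-0.98-client.jar
```

The JAR has to exist at that path on every node that runs a driver or an
executor, since these settings only extend the classpath and do not ship the
file.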
I don't necessarily think that's the best way to do it, but it's the only one
I've found that a) works, and b) keeps the pom.xml and build.sbt files
manageable. I am sort of hoping that in time some other smart folks (maybe
Hortonworks or Cloudera?) who have a bit more experience with Spark and
classpath issues can help out here.
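As a rough sketch of how small a build.sbt can stay under this approach, the
Spark artifacts are scoped as "provided" so they are available at compile time
but not bundled into the job JAR. The project name and version numbers are
assumptions for illustration only, and a job that compiles against Phoenix
classes would presumably scope that dependency the same way.

```scala
// build.sbt (illustrative sketch; name and versions are assumptions)
name := "phoenix-spark-job" // hypothetical project name

scalaVersion := "2.10.4" // assumed; match the Scala version of the Spark build

val sparkVersion = "1.3.1" // assumed; match the cluster's Spark version

// "provided": needed to compile, but supplied by the Spark runtime,
// so these artifacts are not packaged into the job JAR.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided"
)
```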
[1] http://mail-archives.apache.org/mod_mbox/phoenix-user/201506.mbox/browser
[2] https://phoenix.apache.org/phoenix_spark.html
> Mark spark/scala dependencies as 'provided'
> -------------------------------------------
>
> Key: PHOENIX-2040
> URL: https://issues.apache.org/jira/browse/PHOENIX-2040
> Project: Phoenix
> Issue Type: Bug
> Reporter: Josh Mahonin
> Assignee: Josh Mahonin
> Fix For: 5.0.0, 4.5.0
>
> Attachments: PHOENIX-2040.patch
>
>
> The Spark runtime provides both the Scala library and the Spark
> dependencies, so these should be marked as 'provided' in the phoenix-spark
> module. This greatly reduces the size of the resulting client JAR.
> This patch also adds back phoenix-spark to the list of modules in the
> assembly JAR, to be included in the client JAR.