[ 
https://issues.apache.org/jira/browse/PHOENIX-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586618#comment-14586618
 ] 

Josh Mahonin commented on PHOENIX-2040:
---------------------------------------

[~ndimiduk] I've applied the patch to branch 4.4-HBase-0.98 
(4.4.1-HBase-0.98-SNAPSHOT) and can confirm that both sqlline and psql work, 
though the benign sqlline NPE I've seen in 4.4.0 is still there.

Re: Spark jobs and client JAR... that one's tricky. Other users [1] and I 
have run into Spark classpath issues when working with HBase. Although it 
should be technically possible to manually include in a job's JAR all of the 
dependencies necessary to get Spark talking to Phoenix, in practice it's 
fraught with problems. Since most Spark distributions already include some 
HBase libs, the classpath generally has to be updated with the proper 
'hbase-protocol' JAR no matter what. The easiest solution I've found is just 
to include the Phoenix client JAR in the Spark classpath for both the driver 
and executors and forget about it [2].
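
For reference, a minimal sketch of that setup in spark-defaults.conf might 
look like the following. The JAR location is a hypothetical placeholder, not 
a path from this patch; substitute wherever the Phoenix client JAR actually 
lives on your driver and executor hosts.

```
# Assumed, illustrative path; point this at your real Phoenix client JAR.
spark.driver.extraClassPath    /path/to/phoenix-client.jar
spark.executor.extraClassPath  /path/to/phoenix-client.jar
```

The JAR has to be present at that path on every node that runs an executor, 
since these settings only prepend to the classpath and do not ship the file.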

I don't necessarily think that's the best way to do it, but it's the only one 
I've found that a) works, and b) keeps the pom.xml and build.sbt files 
manageable. I am sort of hoping that in time some other smart folks (maybe 
Hortonworks or Cloudera?) who have a bit more experience with Spark and 
classpath issues can help out here.

[1] http://mail-archives.apache.org/mod_mbox/phoenix-user/201506.mbox/browser
[2] https://phoenix.apache.org/phoenix_spark.html

> Mark spark/scala dependencies as 'provided'
> -------------------------------------------
>
>                 Key: PHOENIX-2040
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2040
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Josh Mahonin
>            Assignee: Josh Mahonin
>             Fix For: 5.0.0, 4.5.0
>
>         Attachments: PHOENIX-2040.patch
>
>
> The Spark runtime provides both the scala library, as well as the Spark 
> dependencies, so these should be marked as 'provided' in the phoenix-spark 
> module. This greatly reduces the size of the resulting client JAR.
> This patch also adds back phoenix-spark to the list of modules in the 
> assembly JAR, to be included in the client JAR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
