[ https://issues.apache.org/jira/browse/SPARK-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568444#comment-14568444 ]
Cheolsoo Park edited comment on SPARK-8020 at 6/2/15 3:17 AM:
--------------------------------------------------------------
[~yhuai], the patch seems to fix the original error. Thank you!
But it still doesn't make it easy for me to use the Hive 0.12 metastore. The
challenge now is that I set {{spark.sql.hive.metastore.jars}} to
{{/home/cheolsoop/hive-0.12.0-bin/lib/*:$(hadoop classpath)}}, and that brings
in all sorts of class conflicts that I didn't have when using the built-in Hive
metastore. For now, I'll probably continue to use my workaround (i.e.
commenting out [this
code|https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala#L174])
and use the built-in Hive metastore. Btw, the Hive 0.13 client is almost
compatible with the Hive 0.12 metastore server, except for one incompatibility
introduced by HIVE-6330. That is not too bad as long as users can build their
own jars.
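For reference, one way to pass these settings is on the command line, e.g.
(sketch only; the Hive lib path is from my machine):
{code}
# Sketch: pointing the Spark SQL CLI at Hive 0.12 metastore jars.
# $(hadoop classpath) is expanded by the shell before Spark sees the value.
spark-sql \
  --conf spark.sql.hive.metastore.version=0.12.0 \
  --conf "spark.sql.hive.metastore.jars=/home/cheolsoop/hive-0.12.0-bin/lib/*:$(hadoop classpath)"
{code}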
> Spark SQL in spark-defaults.conf makes metadataHive get constructed too early
> ----------------------------------------------------------------------------
>
> Key: SPARK-8020
> URL: https://issues.apache.org/jira/browse/SPARK-8020
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.4.0
> Reporter: Yin Huai
> Assignee: Yin Huai
> Priority: Critical
>
> To correctly construct a {{metadataHive}} object, we need two settings,
> {{spark.sql.hive.metastore.version}} and {{spark.sql.hive.metastore.jars}}.
> If users want to use Hive 0.12's metastore, they need to set
> {{spark.sql.hive.metastore.version}} to {{0.12.0}} and set
> {{spark.sql.hive.metastore.jars}} to {{maven}} or a classpath containing Hive
> and Hadoop's jars. However, any Spark SQL setting in {{spark-defaults.conf}}
> triggers the construction of {{metadataHive}} and causes Spark SQL to connect
> to the wrong metastore (e.g. the local Derby metastore instead of a remote
> MySQL-backed Hive 0.12 metastore). Also, if
> {{spark.sql.hive.metastore.version 0.12.0}} is the first setting applied to
> the SQL conf, we get
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Builtin jars can only be used when hive execution version == hive metastore version. Execution: 0.13.1 != Metastore: 0.12.0. Specify a vaild path to the correct hive jars using $HIVE_METASTORE_JARS or change spark.sql.hive.metastore.version to 0.13.1.
>   at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:186)
>   at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:175)
>   at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:358)
>   at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:186)
>   at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:185)
>   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
>   at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:185)
>   at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:71)
>   at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53)
>   at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:248)
>   at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:136)
>   at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
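> Below is a minimal Scala sketch of the ordering problem (illustrative only;
> the names loosely mirror the real classes, and this is not the actual Spark
> source). A lazy val that validates two settings is forced while the settings
> are still being applied one at a time, so it observes a half-configured state:
> {code}
> // Sketch of the 1.4.0 initialization order, not the real implementation.
> object EarlyInitSketch extends App {
>   val settings = scala.collection.mutable.Map[String, String]()
>
>   // Like HiveContext.metadataHive: a lazy val whose initializer compares the
>   // configured metastore version against the built-in execution version.
>   lazy val metadataHive: String = {
>     val version = settings.getOrElse("spark.sql.hive.metastore.version", "0.13.1")
>     require(version == "0.13.1",
>       s"Builtin jars can only be used when execution version == metastore " +
>         s"version (0.13.1 != $version)")
>     s"client for Hive $version"
>   }
>
>   // Like HiveContext.setConf in 1.4.0: applying any setting forces metadataHive.
>   def setConf(key: String, value: String): Unit = {
>     settings(key) = value
>     metadataHive // constructed too early: later settings are not yet applied
>   }
>
>   // SQLContext's constructor applies spark-defaults.conf entries one by one,
>   // so the very first SQL setting triggers the version check above.
>   setConf("spark.sql.hive.metastore.version", "0.12.0") // throws IllegalArgumentException
>   setConf("spark.sql.hive.metastore.jars", "maven")     // never reached
> }
> {code}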