[jira] [Comment Edited] (SPARK-8020) Spark SQL in spark-defaults.conf make metadataHive get constructed too early

2015-06-01 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568444#comment-14568444
 ] 

Cheolsoo Park edited comment on SPARK-8020 at 6/2/15 3:17 AM:
--

[~yhuai], the patch seems to fix the original error. Thank you!

But it still doesn't make it easy for me to use the Hive 0.12 metastore. Now the 
challenge is that I set {{spark.sql.hive.metastore.jars}} to 
{{/home/cheolsoop/hive-0.12.0-bin/lib/*:$(hadoop classpath)}}, and that brings 
in all sorts of class conflicts that I didn't have when using the built-in Hive 
metastore. For now, I'll probably continue to use my workaround (i.e. 
commenting out [this 
code|https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala#L174])
 and use the built-in Hive metastore. Btw, the Hive 0.13 client is almost 
compatible with a Hive 0.12 metastore server, except for one incompatibility 
introduced by HIVE-6330. That is not too bad as long as users can build their 
own jars.
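(For anyone following along, the configuration under discussion would look roughly like this in {{spark-defaults.conf}}; the paths are from the comment above and purely illustrative, and {{$(hadoop classpath)}} has to be expanded to a literal classpath before being written into the file, since Spark's conf loader does no shell expansion:)
{code}
spark.sql.hive.metastore.version  0.12.0
spark.sql.hive.metastore.jars     /home/cheolsoop/hive-0.12.0-bin/lib/*:<output of `hadoop classpath`>
{code}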


was (Author: cheolsoo):
[~yhuai], the patch seems to fix the original error. Thank you!

But it doesn't make it easy for me to use Hive 0.12 metastore. Now the 
challenge is that I set {{spark.sql.hive.metastore.jars}} to 
{{/home/cheolsoop/hive-0.12.0-bin/lib/*:$(hadoop classpath)}}, and that brings 
in all sorts of class conflicts that I didn't have when using the built-in Hive 
metastore. For now, I'll probably continue to use my workaround (i.e. 
commenting out [this 
code|https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala#L174])
 and use the built-in Hive metastore. Btw, Hive 0.12 client is almost 
compatible with Hive 0.13 metastore server except one introduced by HIVE-6330. 
It is not too bad as long as users can build their own jars.

> Spark SQL in spark-defaults.conf make metadataHive get constructed too early
> 
>
> Key: SPARK-8020
> URL: https://issues.apache.org/jira/browse/SPARK-8020
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.0
>Reporter: Yin Huai
>Assignee: Yin Huai
>Priority: Critical
>
> To correctly construct a {{metadataHive}} object, we need two settings, 
> {{spark.sql.hive.metastore.version}} and {{spark.sql.hive.metastore.jars}}. 
> If users want to use Hive 0.12's metastore, they need to set 
> {{spark.sql.hive.metastore.version}} to {{0.12.0}} and set 
> {{spark.sql.hive.metastore.jars}} to {{maven}} or a classpath containing Hive 
> and Hadoop's jars. However, any Spark SQL setting in 
> {{spark-defaults.conf}} will trigger the construction of {{metadataHive}} and 
> cause Spark SQL to connect to the wrong metastore (e.g. connect to the local 
> Derby metastore instead of a remote MySQL Hive 0.12 metastore). Also, if 
> {{spark.sql.hive.metastore.version 0.12.0}} is the first conf set in the SQL 
> conf, we will get
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Builtin jars 
> can only be used when hive execution version == hive metastore version. 
> Execution: 0.13.1 != Metastore: 0.12.0. Specify a vaild path to the correct 
> hive jars using $HIVE_METASTORE_JARS or change 
> spark.sql.hive.metastore.version to 0.13.1.
>   at 
> org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:186)
>   at 
> org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:175)
>   at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:358)
>   at 
> org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:186)
>   at 
> org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:185)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
>   at org.apache.spark.sql.SQLContext.(SQLContext.scala:185)
>   at org.apache.spark.sql.hive.HiveContext.(HiveContext.scala:71)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:248)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:136)
>   at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.j
> {code}
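(Editor's note: the initialization-order bug described above can be sketched as follows. This is a simplified model, not Spark's actual code; the names merely mirror {{HiveContext}} for illustration.)
{code}
// Simplified model of SPARK-8020: metadataHive is lazy, but setConf touches
// it, so the very first conf applied from spark-defaults.conf forces its
// construction before spark.sql.hive.metastore.version has taken effect.
class SketchContext {
  private var metastoreVersion = "0.13.1" // built-in execution version

  lazy val metadataHive: String =
    s"metastore client for version $metastoreVersion"

  def setConf(key: String, value: String): Unit = {
    metadataHive // forced here, too early (the essence of this bug)
    if (key == "spark.sql.hive.metastore.version") metastoreVersion = value
  }
}

object Demo extends App {
  val ctx = new SketchContext
  // Applying defaults in order: the first setConf call freezes metadataHive
  // at 0.13.1, so the 0.12.0 setting never reaches the constructed client.
  ctx.setConf("spark.sql.hive.metastore.version", "0.12.0")
  println(ctx.metadataHive) // prints "metastore client for version 0.13.1"
}
{code}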

[jira] [Comment Edited] (SPARK-8020) Spark SQL in spark-defaults.conf make metadataHive get constructed too early

2015-06-01 Thread jeanlyn (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568427#comment-14568427
 ] 

jeanlyn edited comment on SPARK-8020 at 6/2/15 3:16 AM:


[~yhuai], when I set *spark.sql.hive.metastore.jars* in spark-defaults.conf, I 
got errors like yours. But when I set *spark.sql.hive.metastore.jars* in 
*hive-site.xml*, I got:
{code}
15/06/02 10:42:04 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/06/02 10:42:04 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:41416 with 706.6 MB RAM, BlockManagerId(driver, localhost, 41416)
15/06/02 10:42:04 INFO storage.BlockManagerMaster: Registered BlockManager
SET spark.sql.hive.metastore.version=0.12.0
15/06/02 10:42:04 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
15/06/02 10:42:04 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
15/06/02 10:42:04 INFO hive.HiveContext: Initializing HiveMetastoreConnection 
version 0.12.0 using maven.
Ivy Default Cache set to: /home/dd_edw/.ivy2/cache
The jars for the packages stored in: /home/dd_edw/.ivy2/jars
http://www.datanucleus.org/downloads/maven2 added as a remote repository with 
the name: repo-1
:: loading settings :: url = 
jar:file:/data0/spark-1.3.0-bin-2.2.0/lib/spark-assembly-1.4.0-SNAPSHOT-hadoop2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.hive#hive-metastore added as a dependency
org.apache.hive#hive-exec added as a dependency
org.apache.hive#hive-common added as a dependency
org.apache.hive#hive-serde added as a dependency
com.google.guava#guava added as a dependency
org.apache.hadoop#hadoop-client added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
   confs: [default]
   found org.apache.hive#hive-metastore;0.12.0 in central
   found org.antlr#antlr;3.4 in central
   found org.antlr#antlr-runtime;3.4 in central

Exception in thread "main" java.lang.ClassNotFoundException: java.lang.NoClassDefFoundError: com/google/common/base/Preconditions when creating Hive client using classpath:
  file:/tmp/hive3795822184995995241vv12/aopalliance_aopalliance-1.0.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.hive_hive-exec-0.12.0.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.thrift_libfb303-0.9.0.jar,
  file:/tmp/hive3795822184995995241vv12/commons-digester_commons-digester-1.8.jar,
  file:/tmp/hive3795822184995995241vv12/com.sun.jersey_jersey-client-1.9.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.httpcomponents_httpclient-4.2.5.jar,
  file:/tmp/hive3795822184995995241vv12/org.antlr_stringtemplate-3.2.1.jar,
  file:/tmp/hive3795822184995995241vv12/commons-logging_commons-logging-1.1.3.jar,
  file:/tmp/hive3795822184995995241vv12/org.antlr_antlr-runtime-3.4.jar,
  file:/tmp/hive3795822184995995241vv12/org.mockito_mockito-all-1.8.2.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.derby_derby-10.4.2.0.jar,
  file:/tmp/hive3795822184995995241vv12/antlr_antlr-2.7.7.jar,
  file:/tmp/hive3795822184995995241vv12/commons-net_commons-net-3.1.jar,
  file:/tmp/hive3795822184995995241vv12/org.slf4j_slf4j-log4j12-1.7.5.jar,
  file:/tmp/hive3795822184995995241vv12/junit_junit-3.8.1.jar,
  file:/tmp/hive3795822184995995241vv12/org.codehaus.jackson_jackson-jaxrs-1.8.8.jar,
  file:/tmp/hive3795822184995995241vv12/commons-cli_commons-cli-1.2.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.hive_hive-serde-0.12.0.jar,
  file:/tmp/hive3795822184995995241vv12/org.codehaus.jettison_jettison-1.1.jar,
  file:/tmp/hive3795822184995995241vv12/javax.xml.stream_stax-api-1.0-2.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.avro_avro-1.7.4.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.hadoop_hadoop-mapreduce-client-app-2.4.0.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.hadoop_hadoop-mapreduce-client-common-2.4.0.jar,
  file:/tmp/hive3795822184995995241vv12/org.codehaus.jackson_jackson-xc-1.8.8.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.hadoop_hadoop-annotations-2.4.0.jar,
  file:/tmp/hive3795822184995995241vv12/org.mortbay.jetty_jetty-util-6.1.26.jar,
  file:/tmp/hive3795822184995995241vv12/org.apache.commons_commons-math3-3.1.1.jar,
  file:/tmp/hive3795822184995995241vv12/javax.transaction_jta-1.1.jar,
  file:/tmp/hive3795822184995995241vv12/commons-httpclient_commons-httpclient-3.1.jar,
  file:/tmp/hive3795822184995995241vv12/xmlenc_xmlenc-0.52.jar,
  file:/tmp/hive3795822184995995241vv12/org.sonatype.sisu.inject_cglib-2.2.1-v20090111.jar,
  file:/tmp/hive3795822184995995241vv12/com.google.code.findbugs_jsr305-1.3.9.jar,
  file:/tmp/hive3795822184995995241vv12/commons-codec_commons-codec-1.4.jar, fi
{code}
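(Editor's note: the *hive-site.xml* variant mentioned above would be a property entry of roughly this shape; this is illustrative only, and the {{maven}} value is inferred from the "using maven" line in the log above:)
{code}
<property>
  <name>spark.sql.hive.metastore.jars</name>
  <value>maven</value>
</property>
{code}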
