Re: Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class refers to term cache in package com.google.common which is not available
Hi Yong,

Sorry for the duplicate post; I wanted to reply to all. I just downloaded the prebuilt bits from the Apache Spark download site, started the spark-shell, and got the same error. I then started the shell as follows:

./bin/spark-shell --master spark://radtech.io:7077 --total-executor-cores 2 \
  --driver-class-path /usr/local/spark/lib/mysql-connector-java-5.1.34-bin.jar \
  --jars $(echo ~/Downloads/apache-hive-0.13.1-bin/lib/*.jar | tr ' ' ',')

This worked, or at least got rid of this:

scala> case class MetricTable(path: String, pathElements: String, name: String, value: String)
scala.reflect.internal.Types$TypeError: bad symbolic reference. A signature in HiveMetastoreCatalog.class refers to term cache
in package com.google.common which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling HiveMetastoreCatalog.class.
That entry seems to have slain the compiler.  Shall I replay
your session? I can re-run each line except the last one. [y/n]

I am still getting the ClassNotFoundException for json_tuple from this statement, the same as in 1.2.1:

sql(
  """SELECT path, name, value, v1.peValue, v1.peName
     FROM metric_table
     lateral view json_tuple(pathElements, 'name', 'value') v1
       as peName, peValue
  """).collect.foreach(println(_))

15/04/02 20:50:14 INFO ParseDriver: Parsing command: SELECT path, name, value, v1.peValue, v1.peName FROM metric_table lateral view json_tuple(pathElements, 'name', 'value') v1 as peName, peValue
15/04/02 20:50:14 INFO ParseDriver: Parse Completed
java.lang.ClassNotFoundException: json_tuple
        at scala.tools.nsc.interpreter.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:83)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

Any ideas on the json_tuple exception?

I modified the syntax to take into account some minor changes in 1.3; the version I posted this morning was from my 1.2.1 test.

import sqlContext.implicits._

case class MetricTable(path: String, pathElements: String, name: String, value: String)

val mt = new MetricTable("""path": "/DC1/HOST1/""",
  """pathElements": [{"node": "DataCenter","value": "DC1"},{"node": "host","value": "HOST1"}]""",
  """name": "Memory Usage (%)""",
  """value": 29.590943279257175""")

val rdd1 = sc.makeRDD(List(mt))
val df = rdd1.toDF
df.printSchema
df.show
df.registerTempTable("metric_table")

sql(
  """SELECT path, name, value, v1.peValue, v1.peName
     FROM metric_table
     lateral view json_tuple(pathElements, 'name', 'value') v1
       as peName, peValue
  """).collect.foreach(println(_))

On Thu, Apr 2, 2015 at 8:21 PM, java8964 wrote:

> Hmm, I just tested my own Spark 1.3.0 build. I have the same problem, but
> I cannot reproduce it on Spark 1.2.1.
>
> If we check the code change below:
>
> Spark 1.3 branch:
> https://github.com/apache/spark/blob/branch-1.3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
>
> vs
>
> Spark 1.2 branch:
> https://github.com/apache/spark/blob/branch-1.2/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
>
> you can see that on line 24:
>
> import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}
>
> is introduced on the 1.3 branch.
>
> The error basically means the com.google.common.cache package cannot be
> found on the classpath at runtime.
>
> Either you and I made the same mistake when we built Spark 1.3.0, or there
> is something wrong with the Spark 1.3 pom.xml file.
>
> Here is how I built 1.3.0:
>
> 1) Download the Spark 1.3.0 source
> 2) make-distribution --targz -Dhadoop.version=1.1.1 -Phive -Phive-0.12.0
>    -Phive-thriftserver -DskipTests
>
> Is this only because I built against Hadoop 1.x?
>
> Yong
>
> --
> Date: Thu, 2 Apr 2015 13:56:33 -0400
> Subject: Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class
> refers to term cache in package com.google.common which is not available
> From: tsind...@gmail.com
> To: user@spark.apache.org
>
> I was trying a simple test from the spark-shell to see if 1.3.0 would
> address a problem I was having with locating the json_tuple class and got
> the following error:
>
> scala> import org.apache.spark.sql.hive._
> import org.apache.spark.sql.hive._
>
> scala> val sqlContext = new HiveContext(sc)
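For anyone hitting the same json_tuple error: json_tuple is a Hive UDTF (GenericUDTFJSONTuple), not a Spark-native function, so it only resolves when the statement goes through a HiveContext with the Hive classes on the classpath. Below is a minimal sketch of the same test routed explicitly through a HiveContext; the single well-formed JSON object in pathElements is an illustrative assumption, since json_tuple extracts top-level keys from a JSON object string and yields NULLs on malformed input:

import org.apache.spark.sql.hive.HiveContext

val hc = new HiveContext(sc)
import hc.implicits._

case class MetricTable(path: String, pathElements: String, name: String, value: String)

// pathElements carries one complete JSON object here, unlike the fragment
// in the repro above, so json_tuple has valid input to pull keys from.
val mt = MetricTable(
  "/DC1/HOST1/",
  """{"node": "DataCenter", "value": "DC1"}""",
  "Memory Usage (%)",
  "29.590943279257175")

val df = sc.makeRDD(List(mt)).toDF()
df.registerTempTable("metric_table")

// LATERAL VIEW json_tuple(col, k1, k2, ...) emits one output column per key.
hc.sql(
  """SELECT path, name, value, v1.peName, v1.peValue
    |FROM metric_table
    |LATERAL VIEW json_tuple(pathElements, 'node', 'value') v1 AS peName, peValue
  """.stripMargin).collect().foreach(println)

If even this sketch dies with the bad symbolic reference above, the Guava problem has to be fixed first: the REPL cannot compile anything touching HiveMetastoreCatalog until com.google.common.cache is visible.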
RE: Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class refers to term cache in package com.google.common which is not available
Hmm, I just tested my own Spark 1.3.0 build. I have the same problem, but I cannot reproduce it on Spark 1.2.1.

If we check the code change below:

Spark 1.3 branch:
https://github.com/apache/spark/blob/branch-1.3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala

vs

Spark 1.2 branch:
https://github.com/apache/spark/blob/branch-1.2/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala

you can see that on line 24:

import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}

is introduced on the 1.3 branch.

The error basically means the com.google.common.cache package cannot be found on the classpath at runtime.

Either you and I made the same mistake when we built Spark 1.3.0, or there is something wrong with the Spark 1.3 pom.xml file.

Here is how I built 1.3.0:

1) Download the Spark 1.3.0 source
2) make-distribution --targz -Dhadoop.version=1.1.1 -Phive -Phive-0.12.0 -Phive-thriftserver -DskipTests

Is this only because I built against Hadoop 1.x?

Yong

--
Date: Thu, 2 Apr 2015 13:56:33 -0400
Subject: Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class refers to term cache in package com.google.common which is not available
From: tsind...@gmail.com
To: user@spark.apache.org

I was trying a simple test from the spark-shell to see if 1.3.0 would address a problem I was having with locating the json_tuple class and got the following error:

scala> import org.apache.spark.sql.hive._
import org.apache.spark.sql.hive._

scala> val sqlContext = new HiveContext(sc)
sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@79c849c7

scala> import sqlContext._
import sqlContext._

scala> case class MetricTable(path: String, pathElements: String, name: String, value: String)
scala.reflect.internal.Types$TypeError: bad symbolic reference. A signature in HiveMetastoreCatalog.class refers to term cache
in package com.google.common which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling HiveMetastoreCatalog.class.
That entry seems to have slain the compiler.  Shall I replay
your session? I can re-run each line except the last one. [y/n]
Abandoning crashed session.

I entered the shell as follows:

./bin/spark-shell --master spark://radtech.io:7077 --total-executor-cores 2 \
  --driver-class-path /usr/local/spark/lib/mysql-connector-java-5.1.34-bin.jar

hive-site.xml looks like this:

<configuration>
  <property>
    <name>hive.semantic.analyzer.factory.impl</name>
    <value>org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory</value>
  </property>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.warehouse.subdir.inherit.perms</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore_db?createDatabaseIfNotExist=true</value>
    <description>metadata is stored in a MySQL server</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>MySQL JDBC driver class</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>***</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>***</value>
  </property>
</configuration>

I have downloaded a clean version of 1.3.0 and tried it again, but I get the same error. Is this a known issue, or a configuration issue on my part?

TIA for the assistance.

-Todd
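A quick way to test the classpath theory directly from the spark-shell, assuming the shell itself still starts: ask the JVM for one of the Guava cache classes that the HiveMetastoreCatalog signature references, and for the jar it was loaded from. This is only a diagnostic sketch, not a fix:

scala> // Throws ClassNotFoundException if com.google.common.cache is absent at runtime
scala> Class.forName("com.google.common.cache.CacheBuilder")

scala> // If the class loads, this reports which jar is supplying it
scala> Class.forName("com.google.common.cache.CacheBuilder").
     |   getProtectionDomain.getCodeSource.getLocation

If the first line throws ClassNotFoundException, the package really is missing at runtime, matching the compiler error. If it succeeds, the second line shows which jar provides Guava; Spark shades Guava into its assembly, so a partially shaded or version-mismatched Guava is one plausible way for just this sub-package to go missing.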
Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class refers to term cache in package com.google.common which is not available
I was trying a simple test from the spark-shell to see if 1.3.0 would address a problem I was having with locating the json_tuple class and got the following error:

scala> import org.apache.spark.sql.hive._
import org.apache.spark.sql.hive._

scala> val sqlContext = new HiveContext(sc)
sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@79c849c7

scala> import sqlContext._
import sqlContext._

scala> case class MetricTable(path: String, pathElements: String, name: String, value: String)
scala.reflect.internal.Types$TypeError: bad symbolic reference. A signature in HiveMetastoreCatalog.class refers to term cache
in package com.google.common which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling HiveMetastoreCatalog.class.
That entry seems to have slain the compiler.  Shall I replay
your session? I can re-run each line except the last one. [y/n]
Abandoning crashed session.

I entered the shell as follows:

./bin/spark-shell --master spark://radtech.io:7077 --total-executor-cores 2 \
  --driver-class-path /usr/local/spark/lib/mysql-connector-java-5.1.34-bin.jar

hive-site.xml looks like this:

<configuration>
  <property>
    <name>hive.semantic.analyzer.factory.impl</name>
    <value>org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory</value>
  </property>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.warehouse.subdir.inherit.perms</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore_db?createDatabaseIfNotExist=true</value>
    <description>metadata is stored in a MySQL server</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>MySQL JDBC driver class</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>***</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>***</value>
  </property>
</configuration>

I have downloaded a clean version of 1.3.0 and tried it again, but I get the same error. Is this a known issue, or a configuration issue on my part?

TIA for the assistance.

-Todd
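As a side check on the metastore wiring above, a minimal smoke test, assuming hive-site.xml sits in $SPARK_HOME/conf where spark-shell reads it: listing tables forces the HiveContext to open a connection to the MySQL metastore, so a missing JDBC driver or misread configuration fails immediately rather than on the first real query.

scala> // Forces a metastore round-trip; fails fast on driver/config problems
scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
scala> sqlContext.sql("SHOW TABLES").collect().foreach(println)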