RE: Build Spark 1.2.0-rc1 encounter exceptions when running HiveContext - Caused by: java.lang.ClassNotFoundException: com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy
Hi Patrick,

I manually hardcoded the Hive version to 0.13.1a and it works. It turns out that, for some reason, 0.13.1 was being picked up from Maven instead of the 0.13.1a version. So my solution was to hardcode hive.version to 0.13.1a: since I am building against Hive 0.13 only, I hardcoded the pom.xml with that version string, and the final JAR now works with hive-exec 0.13.1a embedded.

Possible reason why it didn't work: I suspect our internal environment was picking up 0.13.1, since we use our own Maven repo as a proxy and cache. 0.13.1a did appear in our own repo, replicated from the Maven Central repo, but during the build Maven picked up 0.13.1 instead of 0.13.1a.

Date: Wed, 10 Dec 2014 12:23:08 -0800
Subject: Re: Build Spark 1.2.0-rc1 encounter exceptions when running HiveContext - Caused by: java.lang.ClassNotFoundException: com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy
From: pwend...@gmail.com
To: alee...@hotmail.com
CC: dev@spark.apache.org

> Hi Andrew,
>
> It looks like somehow you are including jars from the upstream Apache Hive
> 0.13 project on your classpath. For Spark 1.2 Hive 0.13 support, we had to
> modify Hive to use a different version of Kryo that was compatible with
> Spark's Kryo version.
>
> https://github.com/pwendell/hive/commit/5b582f242946312e353cfce92fc3f3fa472aedf3
>
> I would look through the actual classpath and make sure you aren't
> including your own hive-exec jar somehow.
>
> - Patrick
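For anyone hitting the same version mix-up, a quick way to confirm which hive-exec Maven actually resolved (a sketch only -- the org.spark-project.hive groupId for the forked 0.13.1a artifacts is an assumption, so check it against your pom.xml):

  # Show the resolved hive-exec artifact under the hive-0.13.1 profile.
  # Note: a -Dhive.version=0.13.1 flag on the mvn command line overrides the
  # profile's 0.13.1a default, so drop it or set it to 0.13.1a explicitly.
  mvn -Phadoop-2.4 -Pyarn -Phive -Phive-0.13.1 \
      -Dhadoop.version=2.4.1 -Dyarn.version=2.4.1 \
      dependency:tree -Dincludes=org.spark-project.hive:hive-exec

  # If an internal proxy served a stale 0.13.1 copy, purge it from the local
  # repository so the next mvn -U build re-downloads 0.13.1a:
  rm -rf ~/.m2/repository/org/spark-project/hive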
RE: Build Spark 1.2.0-rc1 encounter exceptions when running HiveContext - Caused by: java.lang.ClassNotFoundException: com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy
Apologies for the format; somehow it got messed up and the linefeeds were removed. Here's a reformatted version.

Hi All,

I tried to include the necessary libraries in SPARK_CLASSPATH in spark-env.sh (auxiliary JARs and the datanucleus*.jar files from Hive); however, when I run HiveContext, it gives me the following error:

Caused by: java.lang.ClassNotFoundException: com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy

I have checked the JARs with (jar tf), and it looks like this class is already included (shaded) in the assembly JAR (spark-assembly-1.2.0-hadoop2.4.1.jar), which is already configured on the system classpath. I couldn't figure out what is going on with the shading of the esotericsoftware JARs here. Any help is appreciated.

How to reproduce the problem? Run the following 3 statements in spark-shell. (This is how I launched my spark-shell:

cd /opt/spark; ./bin/spark-shell --master yarn --deploy-mode client --queue research --driver-memory 1024M)

import org.apache.spark.SparkContext
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.hql("CREATE TABLE IF NOT EXISTS spark_hive_test_table (key INT, value STRING)")

For reference, my environment:

Apache Hadoop 2.4.1
Apache Hive 0.13.1
Apache Spark branch-1.2 (installed under /opt/spark/, config under /etc/spark/)
Source code commit label: eb4d457a870f7a281dc0267db72715cd00245e82

Maven build command:

mvn -U -X -Phadoop-2.4 -Pyarn -Phive -Phive-0.13.1 -Dhadoop.version=2.4.1 -Dyarn.version=2.4.1 -Dhive.version=0.13.1 -DskipTests install

My spark-env.sh has the following contents when I executed spark-shell:

HADOOP_HOME=/opt/hadoop/
HIVE_HOME=/opt/hive/
HADOOP_CONF_DIR=/etc/hadoop/
YARN_CONF_DIR=/etc/hadoop/
HIVE_CONF_DIR=/etc/hive/
HADOOP_SNAPPY_JAR=$(find $HADOOP_HOME/share/hadoop/common/lib/ -type f -name snappy-java-*.jar)
HADOOP_LZO_JAR=$(find $HADOOP_HOME/share/hadoop/common/lib/ -type f -name hadoop-lzo-*.jar)
SPARK_YARN_DIST_FILES=/user/spark/libs/spark-assembly-1.2.0-hadoop2.4.1.jar
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:$HADOOP_HOME/lib/native
export SPARK_CLASSPATH=$SPARK_CLASSPATH:$HADOOP_SNAPPY_JAR:$HADOOP_LZO_JAR:$HIVE_CONF_DIR:/opt/hive/lib/datanucleus-api-jdo-3.2.6.jar:/opt/hive/lib/datanucleus-core-3.2.10.jar:/opt/hive/lib/datanucleus-rdbms-3.2.9.jar

Here's what I see from my stack trace:
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
Hive history file=/home/hive/log/alti-test-01/hive_job_log_b5db9539-4736-44b3-a601-04fa77cb6730_1220828461.txt
java.lang.NoClassDefFoundError: com/esotericsoftware/shaded/org/objenesis/strategy/InstantiatorStrategy
    at org.apache.hadoop.hive.ql.exec.Utilities.<clinit>(Utilities.java:925)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.validate(SemanticAnalyzer.java:9718)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.validate(SemanticAnalyzer.java:9712)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:434)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
    at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:305)
    at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
    at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
    at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
    at org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
    at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:30)
    at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
    at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
    at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
    at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
    at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:102)
    at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:106)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:16)
    at $iwC$$iwC$$iwC.<init>(<console>:21)
    at $iwC$$iwC.<init>(<console>:23)
    at $iwC.<init>(<console>:25)
    at <init>(<console>:27)
    at .<init>(<console>:31)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
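A sketch of the jar tf check described above (the local assembly jar path is an assumption -- adjust it to wherever your build placed the assembly):

  # Confirm the shaded Objenesis class really is inside the assembly jar:
  jar tf /opt/spark/lib/spark-assembly-1.2.0-hadoop2.4.1.jar \
    | grep 'objenesis/strategy/InstantiatorStrategy'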
Re: Build Spark 1.2.0-rc1 encounter exceptions when running HiveContext - Caused by: java.lang.ClassNotFoundException: com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy
Hi Andrew,

It looks like somehow you are including jars from the upstream Apache Hive 0.13 project on your classpath. For Spark 1.2 Hive 0.13 support, we had to modify Hive to use a different version of Kryo that was compatible with Spark's Kryo version.

https://github.com/pwendell/hive/commit/5b582f242946312e353cfce92fc3f3fa472aedf3

I would look through the actual classpath and make sure you aren't including your own hive-exec jar somehow.

- Patrick
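One way to act on that suggestion (a sketch; the /opt/hive/lib and /opt/spark/lib paths are taken from the spark-env.sh above and may differ on your machine):

  # Find every jar in the likely classpath locations that contains the Hive
  # class whose static initializer is failing:
  for j in /opt/hive/lib/*.jar /opt/spark/lib/*.jar; do
    if unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hadoop/hive/ql/exec/Utilities.class'; then
      echo "$j"
    fi
  done
  # Any hit outside the Spark assembly (e.g. an upstream hive-exec-0.13.1.jar)
  # is a candidate for the Kryo conflict and should come off SPARK_CLASSPATH.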