* i've installed Hive 2.3 and Spark 2.2
* i've read this doc plenty of times -> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
* i run this query:

      hive --hiveconf hive.root.logger=DEBUG,console -e 'set hive.execution.engine=spark; select date_key, count(*) from fe_inventory.merged_properties_hist group by 1 order by 1;'

* i get this error:

      Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/scheduler/SparkListenerInterface

* this class is in: /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar
* i have copied all the Spark jars to hdfs://dwrdevnn1/spark-2.2-jars
* i have updated hive-site.xml to point spark.yarn.jars at it
* i see this in the console:

      2017-09-26T13:34:15,505 INFO [334aa7db-ad0c-48c3-9ada-467aaf05cff3 main] spark.HiveSparkClientFactory: load spark property from hive configuration (spark.yarn.jars -> hdfs://dwrdevnn1.sv2.trulia.com:8020/spark-2.2-jars/*).

* i also see this on the console:

      2017-09-26T14:04:45,678 INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3 main] client.SparkClientImpl: Running client driver with argv: /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --properties-file /tmp/spark-submit.6105784757200912217.properties --class org.apache.hive.spark.client.RemoteDriver /usr/lib/apache-hive-2.3.0-bin/lib/hive-exec-2.3.0.jar --remote-host dwrdevnn1.sv2.trulia.com --remote-port 53393 --conf hive.spark.client.connect.timeout=1000 --conf hive.spark.client.server.connect.timeout=90000 --conf hive.spark.client.channel.log.level=null --conf hive.spark.client.rpc.max.size=52428800 --conf hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256 --conf hive.spark.client.rpc.server.address=null

* i even print out CLASSPATH in the /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit script, and /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar is in it

so i ask... what am i missing?

thanks,
Stephen
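P.S. for reference, the hive-site.xml entry behind that HiveSparkClientFactory log line looks roughly like this (a sketch — the value is copied from the log line above, and I'm assuming the property is set directly in hive-site.xml rather than in spark-defaults.conf):

```xml
<!-- Hive on Spark: point spark.yarn.jars at the Spark jars uploaded to
     HDFS so YARN uses those instead of shipping a local copy each run.
     Path taken from the HiveSparkClientFactory log line above. -->
<property>
  <name>spark.yarn.jars</name>
  <value>hdfs://dwrdevnn1.sv2.trulia.com:8020/spark-2.2-jars/*</value>
</property>
```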