Hi, I'm getting the following error when submitting a Spark SQL application to our Spark cluster:
14/09/29 16:02:11 WARN scheduler.TaskSetManager: Loss was due to java.lang.NoClassDefFoundError
*java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf*
        at org.apache.spark.sql.hive.SparkHiveHadoopWriter.setIDs(SparkHadoopWriter.scala:169)
        at org.apache.spark.sql.hive.SparkHiveHadoopWriter.setup(SparkHadoopWriter.scala:69)
        at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(hiveOperators.scala:260)
        at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$1.apply(hiveOperators.scala:274)
        at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$1.apply(hiveOperators.scala:274)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
        at org.apache.spark.scheduler.Task.run(Task.scala:51)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

I assume this happens because the executors don't have the hadoop-core.jar file on their classpath. I've tried adding it to the SparkContext with addJar, but that didn't help.

I also see that the documentation says you must rebuild Spark with Hive support if you want to use Hive tables:
https://spark.apache.org/docs/1.0.2/sql-programming-guide.html#hive-tables
Is this really true, or can we just package the required jars with the Spark application we build? Rebuilding Spark itself isn't possible for us, as it is installed on a VM without internet access and we are using the Cloudera distribution (Spark 1.0).

Is it possible to assemble the Hive dependencies into our Spark application and submit this to the cluster? I've tried to do this with spark-submit (the Hadoop JobConf class is in AAC-assembly-1.0.jar), but the executor still doesn't find the class. Here is the command:

sudo ./spark-submit --class aac.main.SparkDriver --master spark://localhost:7077 --jars AAC-assembly-1.0.jar aacApp_2.10-1.0.jar

Any pointers would be appreciated!

Best regards,
Patrick
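
P.S. For reference, here is a simplified sketch of what our driver does (the jar path and the query are placeholders, not our real code):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object SparkDriver {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("AAC")
        val sc = new SparkContext(conf)

        // This is the addJar attempt mentioned above: it ships the assembly
        // to the executors, but the NoClassDefFoundError still occurs.
        sc.addJar("/path/to/AAC-assembly-1.0.jar")

        val hiveContext = new HiveContext(sc)
        // Simplified stand-in for the INSERT that triggers the stack trace.
        hiveContext.hql("INSERT INTO TABLE t SELECT * FROM src")
      }
    }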