Hi, I have defined SPARK_HOME in my env file, so I am assuming that Zeppelin is launching Spark through spark-submit.
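For reference, the definition in my env file (conf/zeppelin-env.sh in my case) is a single export along these lines; the path shown here is just illustrative:

    export SPARK_HOME=/path/to/spark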
I have a Scala class that I test using spark-submit, and it works fine. I have now split this class into a couple of paragraphs in a notebook: one for loading dependencies (%dep) and the others containing the code. Somehow I get the exception below when running the last paragraph. The error seems to be related to the fact that I am using Job in the following snippet:

    val hConf = sc.hadoopConfiguration
    var job = new Job(hConf)
    FileInputFormat.setInputPaths(job, new Path(path))

    // construct RDD and proceed
    var hRDD = new NewHadoopRDD(sc,
      classOf[RandomAccessInputFormat],
      classOf[IntWritable],
      classOf[BytesWritable],
      job.getConfiguration())

    val result = hRDD.mapPartitionsWithInputSplit { (split, iter) =>
      extractCAS(split, iter)
    }.collect()

Has anyone faced a similar issue?

Cheers
Guillaume

The paragraph output is:

    path: String = /Users/tog/Downloads/zeppelin-0.5.5-incubating_all_modified/data/MERGED_DAR.DAT_256
    hConf: org.apache.hadoop.conf.Configuration = Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml
    warning: there were 1 deprecation warning(s); re-run with -deprecation for details
    java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
        at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:294)
        at org.apache.hadoop.mapreduce.Job.toString(Job.java:463)
        at scala.runtime.ScalaRunTime$.scala$runtime$ScalaRunTime$$inner$1(ScalaRunTime.scala:324)
        at scala.runtime.ScalaRunTime$.stringOf(ScalaRunTime.scala:329)
        at scala.runtime.ScalaRunTime$.replStringOf(ScalaRunTime.scala:337)
        at .<init>(<console>:10)
        at .<clinit>(<console>)
        at $print(<console>)
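PS: Reading the stack trace again, Job.toString is not called from my code at all: it is triggered by the interpreter printing the paragraph's values (ScalaRunTime.replStringOf), and Hadoop's Job.toString throws while the job is still in the DEFINE state. One workaround I plan to try (just a sketch, untested, assuming the REPL printing is indeed the cause) is to keep the Job local to a block, so the interpreter only ever sees the resulting Configuration:

    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.mapreduce.Job
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat

    // The value of the block is a plain Configuration (safe to print);
    // the Job itself stays local, so the REPL never calls Job.toString.
    val jobConf = {
      val job = Job.getInstance(sc.hadoopConfiguration) // also avoids the 'new Job(conf)' deprecation warning
      FileInputFormat.setInputPaths(job, new Path(path))
      job.getConfiguration
    }

    var hRDD = new NewHadoopRDD(sc,
      classOf[RandomAccessInputFormat],
      classOf[IntWritable],
      classOf[BytesWritable],
      jobConf)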