Found a related issue: https://issues.apache.org/jira/browse/SPARK-4289
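
According to that ticket, the exception comes from the Scala REPL calling toString on the Hadoop Job while it is still in the DEFINE state (you can see Job.toString in your stack trace), not from the job itself. A common workaround is to make sure the Job is not the value the interpreter tries to print, for example by building it inside a block that returns the Configuration instead. Rough sketch only, untested, reusing your RandomAccessInputFormat and extractCAS plus the sc and path already defined in the notebook:

import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{BytesWritable, IntWritable}
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.spark.rdd.NewHadoopRDD

// Keep the Job inside the block: the paragraph's printed value is then the
// Configuration, so Job.toString is never called while the job is in DEFINE.
val jobConf = {
  val job = Job.getInstance(sc.hadoopConfiguration) // non-deprecated replacement for new Job(hConf)
  FileInputFormat.setInputPaths(job, new Path(path))
  job.getConfiguration
}

val hRDD = new NewHadoopRDD(sc, classOf[RandomAccessInputFormat],
  classOf[IntWritable], classOf[BytesWritable], jobConf)

val result = hRDD.mapPartitionsWithInputSplit { (split, iter) =>
  extractCAS(split, iter)
}.collect()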
On Sat, Dec 12, 2015 at 6:31 PM tog <guillaume.all...@gmail.com> wrote:

> Hi
>
> I have defined SPARK_HOME in my env file and I am therefore assuming that
> Zeppelin is using spark-submit.
>
> I have a Scala class that I test using spark-submit and it works fine. I
> have now split this class into a couple of paragraphs in a notebook. One is
> for loading dependencies (%dep) and the others contain the code.
>
> Somehow I get the exception below when running the last paragraph. The error
> seems to be related to the fact that I am using Job in the following
> snippet:
>
> val hConf = sc.hadoopConfiguration
> var job = new Job(hConf)
> FileInputFormat.setInputPaths(job, new Path(path));
>
> // construct RDD and proceed
> var hRDD = new NewHadoopRDD(sc, classOf[RandomAccessInputFormat],
>   classOf[IntWritable], classOf[BytesWritable], job.getConfiguration())
> val result = hRDD.mapPartitionsWithInputSplit{ (split, iter) =>
>   extractCAS(split, iter)}.collect()
>
> Has anyone faced a similar issue?
>
> Cheers
> Guillaume
>
> path: String =
> /Users/tog/Downloads/zeppelin-0.5.5-incubating_all_modified/data/MERGED_DAR.DAT_256
> hConf: org.apache.hadoop.conf.Configuration = Configuration:
> core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml,
> yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml
> warning: there were 1 deprecation warning(s); re-run with -deprecation for details
> java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
>   at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:294)
>   at org.apache.hadoop.mapreduce.Job.toString(Job.java:463)
>   at scala.runtime.ScalaRunTime$.scala$runtime$ScalaRunTime$$inner$1(ScalaRunTime.scala:324)
>   at scala.runtime.ScalaRunTime$.stringOf(ScalaRunTime.scala:329)
>   at scala.runtime.ScalaRunTime$.replStringOf(ScalaRunTime.scala:337)
>   at .<init>(<console>:10)
>   at .<clinit>(<console>)
>   at $print(<console>)