Tried again and it yielded the same error. Many thanks.
Best,
--
Axel Oehmichen
Research Assistant
Data Science Institute
+44 (0) 7 842 734 702
[email protected]

-----Original Message-----
From: Oussama Chougna [mailto:[email protected]]
Sent: 11 November 2015 14:23
To: [email protected]
Subject: RE: Spark action using python file as JAR

OK, now in your job.properties include:

oozie.use.system.libpath=true

This tells Oozie to use that sharelib.

Cheers,

Oussama Chougna

> From: [email protected]
> To: [email protected]
> Subject: RE: Spark action using python file as JAR
> Date: Wed, 11 Nov 2015 14:18:45 +0000
>
> Hello Oussama,
>
> Thanks for the response. The sharelib folder does exist on HDFS under
> /oozie/share/lib/spark
>
> Best,
> Axel
>
> -----Original Message-----
> From: Oussama Chougna [mailto:[email protected]]
> Sent: 11 November 2015 13:30
> To: [email protected]
> Subject: RE: Spark action using python file as JAR
>
> Hi Axel,
> Did you also install the Oozie sharelib? It sounds like you are missing the
> sharelib; it is installed on HDFS. See the Oozie/MapR docs for a how-to.
> Cheers,
>
> Oussama Chougna
>
> > From: [email protected]
> > To: [email protected]
> > Subject: Spark action using python file as JAR
> > Date: Wed, 11 Nov 2015 11:09:32 +0000
> >
> > Hello,
> >
> > I am trying to use Oozie to run some Python workflows. I have
> > installed Oozie and Spark using MapR 5.0, which comes with Oozie 4.2 and
> > Spark 1.4.1.
> > No matter what I do, I get this error message:
> > "java.lang.ClassNotFoundException: Class
> > org.apache.oozie.action.hadoop.SparkMain not found" (error code JA018).
> >
> > I was able to reproduce this using the wordcount.py example.
> > (https://github.com/apache/spark/blob/master/examples/src/main/python/wordcount.py)
> > (The idea of running wordcount comes from Nitin Kumar's message.)
> >
> > The command I run is:
> > $ /opt/mapr/oozie/oozie-4.2.0/bin/oozie job -oozie="http://localhost:11000/oozie" -config job.properties -run
> > I have tried through the Java API as well and end up with the same result.
> >
> > My job.properties contains:
> >
> > nameNode=maprfs:///
> > jobTracker=spark-master:8032
> > oozie.wf.application.path=maprfs:/user/mapr/
> >
> > My workflow.xml:
> >
> > <workflow-app xmlns='uri:oozie:workflow:0.5' name='Test'>
> >   <start to='spark-node' />
> >
> >   <action name='spark-node'>
> >     <spark xmlns="uri:oozie:spark-action:0.1">
> >       <job-tracker>${jobTracker}</job-tracker>
> >       <name-node>${nameNode}</name-node>
> >       <master>yarn-client</master>
> >       <mode>client</mode>
> >       <name>wordcount</name>
> >       <jar>wordcount.py</jar>
> >       <spark-opts>--num-executors 2 --driver-memory 1024m --executor-memory 512m --executor-cores 1</spark-opts>
> >     </spark>
> >     <ok to="end" />
> >     <error to="fail" />
> >   </action>
> >
> >   <kill name="fail">
> >     <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
> >   </kill>
> >   <end name='end' />
> > </workflow-app>
> >
> > I have tried changing oozie.wf.application.path, specifying the jar path
> > explicitly, removing or adding different fields in the XML, placing
> > wordcount.py in various locations, and other things, but nothing changed...
> >
> > I welcome any suggestions or pointers to errors I may have made.
> >
> > Many thanks.
> >
> > Axel
> >
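[Archive note] Putting Oussama's suggestion together with Axel's original configuration, the full job.properties would look roughly like the sketch below. The nameNode, jobTracker, and application-path values are copied from Axel's mail; the admin command in the comments assumes a standard Oozie 4.2 client and server on localhost:

```properties
# job.properties -- Spark action via Oozie on MapR (sketch)
nameNode=maprfs:///
jobTracker=spark-master:8032
oozie.wf.application.path=maprfs:/user/mapr/

# Tell Oozie to put the installed sharelib on the launcher classpath.
# The spark sharelib is what provides
# org.apache.oozie.action.hadoop.SparkMain, so without this flag the
# launcher fails with the ClassNotFoundException seen above.
oozie.use.system.libpath=true

# To verify what Oozie will actually ship with the action, run:
#   oozie admin -oozie http://localhost:11000/oozie -shareliblist spark
# An empty list suggests the sharelib was never created, or that the
# Oozie server is configured with a different sharelib path.
```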
