Hi Robert, thank you very much for your support.
Concerning the first problem, I got it working simply by deleting the
<property>
<name>oozie.base.url</name>
<value>http://publicEC2ip:11000/oozie/</value>
</property>
and letting Oozie communicate with the job launcher using the EC2 private
IP address.
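
For reference, instead of deleting the property entirely, the base URL can also point at the private address directly; a minimal oozie-site.xml sketch, assuming the master's private IP from later in this thread (172.31.25.237):

```xml
<!-- oozie-site.xml: sketch only; 172.31.25.237 is the private IP mentioned
     elsewhere in this thread and must be reachable from the cluster nodes -->
<property>
  <name>oozie.base.url</name>
  <value>http://172.31.25.237:11000/oozie</value>
</property>
```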
Concerning running Spark on YARN through Oozie, I added the Spark assembly
jar, compiled against the version of Hadoop that I am using, to the sharelib,
and after executing the sharelibupdate command that error message disappeared.
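
For anyone hitting the same error, the steps were roughly the following (the jar name and the lib_<timestamp> directory are placeholders; the actual sharelib directory can be checked with `oozie admin -shareliblist`):

```shell
# copy the Hadoop-matched Spark assembly into the spark sharelib on HDFS
# (the jar name and lib_<timestamp> are placeholders for the real ones)
hdfs dfs -put spark-assembly-*.jar /user/oozie/share/lib/lib_<timestamp>/spark/

# tell the running Oozie server to reload the sharelib
oozie admin -oozie http://localhost:11000/oozie -sharelibupdate
```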
I still have problems running a Spark jar on YARN, but it seems to be
something unrelated to Oozie (a container memory overflow). I'll investigate
the YARN and Spark settings further.
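
In case it helps anyone searching later: for container memory overflows the usual knobs are on the YARN side; a sketch of the relevant yarn-site.xml properties (values are illustrative, not recommendations), alongside spark.executor.memory on the Spark side:

```xml
<!-- yarn-site.xml: illustrative values only -->
<property>
  <!-- total memory the NodeManager can hand out to containers -->
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>6144</value>
</property>
<property>
  <!-- largest single container the scheduler will grant -->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>
```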
Matteo
2015-08-28 20:26 GMT+02:00 Robert Kanter <[email protected]>:
> Hi Matteo,
>
> I'm not sure about the "Unable to delete unexpected local file/dir" issue,
> but I can speak to the other issues.
>
> 99% of the time, if Oozie doesn't notice a job finishing for ~10min, that
> means that the callbacks from Hadoop are not getting through to Oozie. (By
> default, Oozie only checks each job's status every 10min so it doesn't spam
> the RM. The reason you don't normally have to wait that long is that Oozie
> configures the Launcher Job to send a callback over HTTP when the job
> finishes.) IIRC, the AM log for the Launcher Job (or the actual job if it's an
> MR action) may give more details about any problems encountered with the
> callback. You can also verify that the URL is correct; it's set by the
> oozie.service.CallbackService.base.url config in oozie-site, which defaults
> to ${oozie.base.url}/callback.
>
> As for the Spark problem, you need to add the spark-assembly jar into the
> spark sharelib and then run the 'oozie admin -sharelibupdate' command.
> This is mentioned in the documentation:
> http://oozie.apache.org/docs/4.2.0/DG_SparkActionExtension.html#Spark_on_YARN
> Though you should be aware that there are some
> problems with custom jars being added to the Spark driver and executor (see
> OOZIE-2277 <https://issues.apache.org/jira/browse/OOZIE-2277>). While
> working on that, I was also able to get Spark-on-Yarn to work without the
> assembly jar; I think the key is to add the spark-yarn_2.10 jar.
>
>
> - Robert
>
>
> On Fri, Aug 28, 2015 at 12:49 AM, Matteo Luzzi <[email protected]>
> wrote:
>
> > Hi again,
> >
> > previously I had some configuration problems running oozie 4.2.0 against
> > hadoop 2.7.0 in local mode, but eventually I got it working. I am able
> > to execute all my use cases in local mode.
> > Now I'm deploying my system on an EC2 cluster. I'm using the same distro
> > of oozie that I built in local mode, since I am using exactly the same
> > component versions. I'm using just two nodes at the moment: on the master
> > node I have the namenode, resourcemanager, and jobhistoryserver daemons,
> > while on the slave I have the datanode and nodemanager daemons correctly
> > running (I can interact with HDFS and run the map-reduce example jobs). I
> > installed the oozie distro on the master node, and I am using the same
> > user for both hadoop and oozie (as I'm doing in local mode).
> >
> > I have two main issues:
> >
> > 1) The Oozie server seems up and running, and I can execute jobs through
> > it. The jobs are actually executed and completed by YARN, but on the
> > Oozie console they stay in RUNNING state, and only after 10 minutes do
> > they switch to SUCCEEDED or FAILED according to the result. Checking the
> > syslog of each job, I can see two suspicious lines:
> >
> > WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher:
> > Unable to delete unexpected local file/dir .action.xml.crc: insufficient
> > permissions?
> > ...
> > ERROR [uber-EventHandler] org.apache.hadoop.mapred.LocalContainerLauncher:
> > Returning, interrupted : java.lang.InterruptedException
> > The rest of the log seems normal: I can follow the execution of the job
> > from submission to completion.
> >
> > 2) I had this problem in local mode as well: when trying to execute a
> > spark action specifying master=yarn-client in the job.properties file, I
> > get this error in the stderr log file:
> >
> > Error: Could not load YARN classes. This copy of Spark may not have been
> > compiled with YARN support.
> > The error is clear, but I'm using the share libs that were shipped with
> > the latest oozie version, which is supposed to be able to execute spark
> > actions on yarn. Am I missing some additional libs?
> >
> >
> > Here are some of the configuration files I'm using.
> >
> > example job.properties:
> >
> > nameNode=hdfs://172.31.25.237:9000 (internal IP address of my EC2 master
> > node)
> > jobTracker=http://172.31.25.237:8032
> > queueName=default
> > examplesRoot=examples
> >
> > oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/java-main
> >
> > oozie-site.xml
> >
> > <property>
> > <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
> > <value>*=/opt/hadoop-2.7.0/hadoop-2.7.0/etc/hadoop</value>
> > </property>
> >
> > <property>
> > <name>oozie.processing.timezone</name>
> > <value>GMT+0200</value>
> > </property>
> >
> > <property>
> > <name>oozie.base.url</name>
> > <value>http://publicEC2ip:11000/oozie/</value>
> > </property>
> >
> >
> > --
> > Matteo Remo Luzzi
> >
>
--
Matteo Remo Luzzi