Hi Matteo,

I'm not sure about the "Unable to delete unexpected local file/dir" issue,
but I can speak to the other issues.

99% of the time, if Oozie doesn't notice a job finishing for ~10min, that
means that the callbacks from Hadoop are not getting through to Oozie.  (By
default, Oozie only checks each job's status every 10min so it doesn't spam
the RM.  You don't normally have to wait that long because
Oozie configures the Launcher Job to send a callback over HTTP when the job
finishes.)  IIRC, the AM log for the Launcher Job (or actual job if it's an
MR action) may give more details about any problems encountered with the
callback.  You can also verify that the URL is correct; it's set by the
oozie.service.CallbackService.base.url config in oozie-site, which defaults
to ${oozie.base.url}/callback.
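
A rough way to sanity-check this is to confirm that the Oozie base URL is
reachable from a worker node. The following is only a sketch: the hostname is
a placeholder, check_oozie_reachable is a hypothetical helper name, and it
assumes curl is installed on the NodeManager host.

```shell
# Run on a NodeManager host; the reply above attributes the delay to
# callbacks from Hadoop not reaching Oozie, so test from the cluster side.
# OOZIE_BASE_URL is a placeholder: use the oozie.base.url value from oozie-site.
OOZIE_BASE_URL="http://oozie-host.example.com:11000/oozie"

# The callback base URL defaults to ${oozie.base.url}/callback.
# A trailing slash on the base URL is stripped first to avoid "//callback".
CALLBACK_URL="${OOZIE_BASE_URL%/}/callback"

# Hypothetical helper: prints the HTTP status code returned by Oozie's admin
# status endpoint, or a curl error if the host/port is unreachable from here.
check_oozie_reachable() {
  curl -sS -o /dev/null -w '%{http_code}\n' --max-time 10 \
    "${OOZIE_BASE_URL%/}/v1/admin/status"
}

echo "Expected callback base URL: ${CALLBACK_URL}"
# check_oozie_reachable   # uncomment to run the check against a live server
```

If the check times out or is refused from the worker but works on the Oozie
host itself, the address in the base URL is probably not routable from inside
the cluster.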

As for the Spark problem, you need to add the spark-assembly jar into the
spark sharelib and then run the 'oozie admin -sharelibupdate' command.
This is mentioned here
<http://oozie.apache.org/docs/4.2.0/DG_SparkActionExtension.html#Spark_on_YARN>
in the documentation.  Be aware, though, that there are some known
problems with custom jars being added to the Spark driver and executor (see
OOZIE-2277 <https://issues.apache.org/jira/browse/OOZIE-2277>).  While
working on that, I was also able to get Spark-on-Yarn to work without the
assembly jar; I think the key is to add the spark-yarn_2.10 jar.
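
For reference, the sharelib steps might look roughly like the sketch below.
The jar location, the lib_TIMESTAMP directory name and the Oozie URL are all
placeholders to adapt to your installation, and update_spark_sharelib is just
a hypothetical wrapper name; it is not something Oozie ships.

```shell
# Placeholders (assumptions): adjust all three to your cluster.
OOZIE_URL="http://localhost:11000/oozie"
SPARK_ASSEMBLY_JAR="/path/to/spark-assembly.jar"
SHARELIB_SPARK_DIR="/user/oozie/share/lib/lib_TIMESTAMP/spark"

# Hypothetical wrapper: copy the assembly jar into the spark sharelib on HDFS,
# ask the Oozie server to reload the sharelib, then list what it now sees.
update_spark_sharelib() {
  hdfs dfs -put "$SPARK_ASSEMBLY_JAR" "$SHARELIB_SPARK_DIR/" &&
  oozie admin -oozie "$OOZIE_URL" -sharelibupdate &&
  oozie admin -oozie "$OOZIE_URL" -shareliblist spark
}

# update_spark_sharelib   # uncomment to run against a live cluster
```

The final listing should include the assembly jar if the update was picked up.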


- Robert


On Fri, Aug 28, 2015 at 12:49 AM, Matteo Luzzi <[email protected]>
wrote:

> Hi again,
>
> previously I had some configuration problems running oozie 4.2.0 against
> hadoop 2.7.0 in localmode, but eventually I got it working. I am able
> to execute all my use cases in localmode.
> Now I'm deploying my system on an EC2 cluster. I'm using the same distro of
> oozie that I built in localmode, since I am using exactly the same
> component versions. I'm using just two nodes at the moment: on the master
> node I have the namenode, resourcemanager and jobhistoryserver daemons, while
> on the slave I have the datanode and nodemanager daemons running correctly (I
> can interact with hdfs and run the map-reduce example jobs). I installed
> the oozie distro on the master node and I am using the same user for
> both hadoop and oozie (as I'm doing in local mode).
>
> I have two main issues:
>
> 1) Oozie server seems up and running, I can execute jobs through it. The
> jobs are actually executed and completed by yarn but on the oozie console
> they stay in RUNNING mode and only after 10 minutes they switch to
> SUCCEEDED or FAILED according to the result. Checking the syslog of each
> job, I can see two suspicious lines:
>
> WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher:
> Unable to delete unexpected local file/dir .action.xml.crc: insufficient
> permissions?
> ...
> ERROR [uber-EventHandler] org.apache.hadoop.mapred.LocalContainerLauncher:
> Returning, interrupted : java.lang.InterruptedException
>
> The rest of the log seems normal: I can follow the execution of the job
> from the submission till the completion.
>
> 2) I had this problem also in localmode: when trying to execute a spark
> action specifying the master=yarn-client in the job.properties file I get
> this error in the stderr log file:
>
> Error: Could not load YARN classes. This copy of Spark may not have been
> compiled with YARN support.
> The error is clear, but I'm using the share libs that were shipped with
> the latest oozie version, which is supposed to be able to execute spark
> actions on yarn. Am I missing some additional libs?
>
>
> Here are some of the configuration files I'm using.
>
> Example job.properties:
>
> nameNode=hdfs://172.31.25.237:9000 (internal IP address of my EC2 master
> node)
> jobTracker=http://172.31.25.237:8032
> queueName=default
> examplesRoot=examples
>
> oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/java-main
>
> oozie-site.xml
>
> <property>
>     <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
>     <value>*=/opt/hadoop-2.7.0/hadoop-2.7.0/etc/hadoop</value>
> </property>
>
> <property>
>     <name>oozie.processing.timezone</name>
>     <value>GMT+0200</value>
> </property>
>
> <property>
>     <name>oozie.base.url</name>
>     <value>http://publicEC2ip:11000/oozie/</value>
> </property>
>
>
> --
> Matteo Remo Luzzi
>
