I'm currently having trouble getting Oozie to work properly on my Hadoop
install.
Actually, I'm completely stuck, so any input is appreciated, as I'm a
complete beginner in all of this.
I use:
Hadoop 2.6.0 (with YARN), Oozie 4.0.1, Hive 1.0.0, Hue 3.7.1, Pig 0.12
It's a local install which I run in pseudo-distributed mode.
I installed everything from tarballs and configured it manually because, sadly,
the one-click install from Cloudera doesn't work on OS X.

Hadoop+Hive seem to work fine as far as I can tell, both in CLI and Hue.

The Pig editor in Hue doesn't quite work yet: I can access and use files from
HDFS, but I get an error when I try to access Hive tables with HCatalog
(ERROR 2245:<file script.pig, line 1, column 4> Cannot get schema from
loadFunc org.apache.hcatalog.pig.HCatLoader).

But right now it's more important that the Oozie scheduler works, which it
doesn't.
When I try to run, for example, a shell script in an Oozie workflow, I get this
error:

*Cannot run program "testscript.sh" (in directory
"/Volumes/WS2Data/hadoop_hdfs/tmp/nm-local-dir/usercache/admin/appcache/application_1427878722813_0003/container_1427878722813_0003_01_000002"):
error=2, No such file or directory*
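
For context: a shell action normally finds its script in the container's
working directory only if the workflow ships it there, which is usually done
with a <file> element next to <exec>. Below is a minimal sketch of such an
action, not the actual workflow from this setup. The workflow and action
names are hypothetical, and it assumes testscript.sh sits next to
workflow.xml in the application directory on HDFS.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-test-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Command run inside the YARN container's working directory -->
            <exec>testscript.sh</exec>
            <!-- Assumption: the script lives beside workflow.xml on HDFS.
                 <file> asks Hadoop to localize it into the container dir;
                 the part after '#' is the name it gets there. -->
            <file>testscript.sh#testscript.sh</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Without a <file> (or <archive>) entry, nothing copies the script into the
appcache container directory, which would be consistent with an "error=2,
No such file or directory" on <exec>.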

Now I'm trying to understand what's happening here: what is Hadoop trying
to cache in the appcache dir? The script? (There is no I/O involved in the
script itself; it's just a simple shell command.)

AFAIK it's Hadoop that caches in those directories, not Oozie, right? Then
why wouldn't Oozie be able to find the application container? I can run
MapReduce jobs with Hive without any problem; if Hadoop had a problem or
misconfiguration concerning the caching, that wouldn't work either, would it?


I basically followed this guide
http://gauravkohli.com/2014/08/26/apache-oozie-installation-on-hadoop-2-4-1/
to install Oozie, except I skipped the part where he reconfigures the
pom.xml for a different Hadoop version, because there just weren't any
repositories for 2.6.0.

I just built it as it came, for Hadoop version 2.3.0, with "mkdistro.sh -P
hadoop-2 -DskipTests", and then replaced the libs in the */libext* dir with
the ones from version 2.6.0.

After that I linked my **-site.xml* files into Oozie's */conf/hadoop-conf*
folder.

The Oozie server is up and responsive, and running a simple Pig script from
Hue, which uses Oozie, works fine too (with the above-mentioned exception).


Any help or ideas are appreciated.
