I'm currently having trouble getting Oozie to work properly on my Hadoop install. Actually, I'm completely stuck, so any input is appreciated, as I'm a complete beginner in all of this. I use: Hadoop 2.6.0 (with YARN), Oozie 4.0.1, Hive 1.0.0, Hue 3.7.1, Pig 0.12. It's a local install which I run in pseudo-distributed mode. I installed everything from tarballs and configured it manually, because sadly the one-click install from Cloudera doesn't work on OS X.
Hadoop and Hive seem to work fine as far as I can tell, both from the CLI and in Hue. The Pig editor in Hue doesn't quite work yet: I can access and use files from HDFS, but I get an error when I try to access Hive tables with HCatalog (ERROR 2245:<file script.pig, line 1, column 4> Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader).

But right now it's more important that the Oozie scheduler works, which it doesn't. When I try to run, for example, a shell script in an Oozie workflow, I get this error:

*Cannot run program "testscript.sh" (in directory "/Volumes/WS2Data/hadoop_hdfs/tmp/nm-local-dir/usercache/admin/appcache/application_1427878722813_0003/container_1427878722813_0003_01_000002"): error=2, No such file or directory*

Now I'm trying to understand what's happening here:

What is Hadoop trying to cache in the appcache directory? The script? (There is no I/O involved in the script itself; it's just a simple shell command.)

As far as I know, it's Hadoop that caches in those directories, not Oozie, right? Then why wouldn't Oozie be able to find the application container?

I can run MapReduce jobs with Hive without any problem. If Hadoop had a problem or misconfiguration concerning the caching, wouldn't those fail as well?

I basically followed this guide http://gauravkohli.com/2014/08/26/apache-oozie-installation-on-hadoop-2-4-1/ to install Oozie, except that I skipped the part where he reconfigures the pom.xml for a different Hadoop version, because there just weren't any repositories for 2.6.0. I built it as it came, for Hadoop version 2.3.0, with "mkdistro.sh -P hadoop-2 -DskipTests" and then replaced the libs in the */libext* dir with the ones from version 2.6.0. After that I linked my Hadoop *\*-site.xml* files into Oozie's */conf/hadoop-conf* folder.

The Oozie server is up and responsive, and running a simple Pig script from Hue, which uses Oozie, works fine too (with the above-mentioned exception).

Any help or idea is appreciated.
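
One thing I still need to rule out: as far as I understand, a shell action only finds the script in the container's working directory if the workflow ships it there with a `<file>` element. Below is a minimal sketch of such a workflow. It is not my actual workflow.xml: the workflow name, the node names, the `${jobTracker}` and `${nameNode}` properties, and the assumption that testscript.sh sits next to workflow.xml in HDFS are all placeholders.

```xml
<!-- Hypothetical sketch: all names are placeholders, not taken from my setup -->
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-test-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Command run inside the YARN container's working directory -->
            <exec>testscript.sh</exec>
            <!-- Copies the script into that working directory; the relative path
                 assumes the script is stored next to workflow.xml in HDFS -->
            <file>testscript.sh#testscript.sh</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

If the `<file>` line is missing, I would expect exactly the kind of "No such file or directory" error shown above, since the container directory would never receive the script, but I have not confirmed that this is the cause here.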
