Hi everybody, running a MapReduce job is not easy at all when third-party jars are involved. A good help on this topic is the article by Cloudera: http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
For some reason I still cannot use the -libjars argument to run an MR job with third-party jars, as described in the article's first option: the tasks fail with a java.lang.ClassNotFoundException because the classes of the third-party lib are not found.

The second option, including the referenced jars in the lib subdirectory of the submittable jar, actually works fine for me. I start a job from the shell like this:

./bin/hadoop jar /tmp/my.jar package.HBaseReader

Not the most elegant way, but it finally works.

Now I would like to start MR jobs from my web application running on Tomcat. Is there an elegant way to do this using third-party jars? The third option described in the article, installing the jars on every tasktracker, is IMHO not the best either, just like the second.

My second question: at the moment I use TextOutputFormat as the output format, which creates a file in the specified DFS directory, e.g. part-r-00000, so I can read it on the shell using:

./bin/hadoop fs -cat /tmp/requests/part-r-00000

How can I get the path to this output file after my job has finished, so I can process it further? And is there another way to collect the results of an MR job? A text file is good for humans, but IMHO parsing a text file for results is not the preferable way.

Thanks in advance,
Andre

PS: versions:
- Linux version 2.6.26-2-amd64 (Debian 2.6.26-25lenny1)
- hadoop-0.20.2-CDH3B4
- hbase-0.90.1-CDH3B4
- zookeeper-3.3.2-CDH3B4
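A note on the -libjars failure above: -libjars is parsed by GenericOptionsParser, so it is silently ignored unless the driver goes through ToolRunner; if the job is configured directly in main(), the option never reaches the job. A minimal sketch of a driver that picks it up, using the 0.20.x API (the class name and job name here are illustrative, not the actual code):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver; the real mapper/reducer and paths go in run().
public class HBaseReaderDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the Configuration already populated by
        // GenericOptionsParser, including the jars passed via -libjars.
        Job job = new Job(getConf(), "hbase-reader");
        job.setJarByClass(HBaseReaderDriver.class);
        // ... set mapper, reducer, input/output formats and paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner invokes GenericOptionsParser, which strips -libjars
        // (and -D, -files, ...) from args before run() sees them.
        System.exit(ToolRunner.run(new Configuration(), new HBaseReaderDriver(), args));
    }
}
```

With a driver like this, the job would be started as: ./bin/hadoop jar /tmp/my.jar package.HBaseReaderDriver -libjars /path/to/thirdparty.jar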
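On submitting from a webapp, where there is no hadoop jar shell wrapper: one option is to copy the third-party jars to HDFS once and attach them to the task classpath via DistributedCache. A hedged sketch, assuming the jar was already uploaded to a hypothetical HDFS path /libs/thirdparty.jar:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class WebappSubmitter {
    public static void submit() throws Exception {
        Configuration conf = new Configuration();
        // The jar must already live on HDFS; /libs/thirdparty.jar is an
        // assumed path used only for illustration.
        DistributedCache.addFileToClassPath(new Path("/libs/thirdparty.jar"), conf);
        Job job = new Job(conf, "from-tomcat");
        // ... set job classes, input/output formats and paths, then:
        job.submit(); // non-blocking, so the web request thread is not held
    }
}
```

This avoids installing jars on every tasktracker; the framework copies the cached jar to each task's classpath at runtime.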
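On the second question: the output directory is whatever was passed to FileOutputFormat.setOutputPath(), and each reducer writes one part-r-NNNNN file into it, so the results can be read back with the FileSystem API instead of shelling out to fs -cat. A sketch, assuming the /tmp/requests directory from above:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OutputReader {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path outputDir = new Path("/tmp/requests"); // the job's output directory
        for (FileStatus status : fs.listStatus(outputDir)) {
            Path p = status.getPath();
            // Skip non-data entries such as _SUCCESS and _logs.
            if (!p.getName().startsWith("part-")) continue;
            BufferedReader reader =
                new BufferedReader(new InputStreamReader(fs.open(p)));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line); // or parse key/value here
                }
            } finally {
                reader.close();
            }
        }
    }
}
```

If parsing text is the concern, SequenceFileOutputFormat writes binary key/value pairs that can be read back with SequenceFile.Reader, which avoids string parsing entirely.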