Do I have to copy the jar file manually into HDFS before I invoke
the hadoop jar command that starts my job?
Utkarsh Agarwal wrote:
I think you can use DistributedCache to specify the location of the jar
once you have it in HDFS.
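In code, that suggestion might look roughly like this; the HDFS path and
class name below are only examples, and it assumes the streaming jar was
first uploaded with something like "hadoop fs -put":

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class StreamingJarSetup {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(StreamingJarSetup.class);

    // Adds a jar that already lives in HDFS to the classpath of the
    // task JVMs via the DistributedCache.
    DistributedCache.addFileToClassPath(
        new Path("/libs/hadoop-0.20.2-streaming.jar"), conf);
  }
}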
On Wed, Mar 10, 2010 at 6:11 AM, Reik Schatz <[email protected]> wrote:
Hi, I am playing around with version 0.20.2 of Hadoop. I have written and
packaged a job using a custom Mapper and Reducer. The input format in my job
is set to StreamInputFormat, and I am also setting the property
stream.recordreader.class to org.apache.hadoop.streaming.StreamXmlRecordReader.
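The driver setup looks roughly like this (a simplified sketch using the old
mapred API; the wiring of my actual Mapper and Reducer classes is left out):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.streaming.StreamInputFormat;

public class EmailCountingJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(EmailCountingJob.class);

    // Input format and XML record reader from the streaming jar.
    conf.setInputFormat(StreamInputFormat.class);
    conf.set("stream.recordreader.class",
             "org.apache.hadoop.streaming.StreamXmlRecordReader");

    // (conf.setMapperClass(...) / conf.setReducerClass(...) for the
    // custom Mapper and Reducer go here.)

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}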
This is how I want to start my job:
hadoop jar custom-1.0-SNAPSHOT.jar EmailCountingJob /input /output
The problem is that in this case all classes from
hadoop-0.20.2-streaming.jar are missing at runtime (ClassNotFoundException).
I tried using -libjars as well, without luck:
hadoop jar -libjars PATH/hadoop-0.20.2-streaming.jar custom-1.0-SNAPSHOT.jar EmailCountingJob /input /output
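As far as I understand, -libjars is handled by GenericOptionsParser, so it is
probably only picked up if the driver goes through ToolRunner, roughly along
these lines (a sketch, not my actual code):

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class EmailCountingJob extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    // getConf() already reflects generic options such as -libjars,
    // because ToolRunner ran GenericOptionsParser over the arguments.
    JobConf conf = new JobConf(getConf(), EmailCountingJob.class);
    // ... same job setup as above, then submit ...
    return 0;
  }

  public static void main(String[] args) throws Exception {
    // run() only sees the leftover arguments, e.g. /input /output.
    System.exit(ToolRunner.run(new EmailCountingJob(), args));
  }
}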
Is there any way to use the streaming classes in your own jobs without
copying them into your project and packaging them into your own jar?
/Reik