Hi,

> I'd like to run a Hadoop (0.20.2) job
> from within another application, using ToolRunner.
>
> One class of this other application implements the Tool interface.
> The implemented run() method:
> * constructs a Job()
> * sets the input/output/mapper/reducer
> * sets the jar file by calling job.setJarByClass().
> * calls job.waitForCompletion()
>
> The question is: where should the jar file be made available?
> In the current local directory of the parent application? In the system
> directory in HDFS? ...?
>
> I'd like to find documentation and learn how this works.
If you are planning to use job.setJarByClass, the jar file only needs to be on your classpath locally, on the machine where you are running the application. To see how the jar file is located, you could look at o.a.h.mapred.JobConf.findContainingJar, which is passed the class you set in setJarByClass.

Thanks
Hemanth
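
P.S. For illustration, here is a rough sketch of the kind of classpath lookup that findContainingJar performs. This is not the Hadoop source: the class name JarLocator is made up for this example, and the details (URL handling, decoding) are simplified assumptions. The idea is to ask the class loader where the .class file of the given class comes from and, if that location is inside a jar, return the path of that jar.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLDecoder;
import java.util.Enumeration;

// Hypothetical helper, not part of Hadoop: locates the jar a class was loaded from.
public class JarLocator {

    /** Returns the local path of the jar containing clazz, or null if it is not in a jar. */
    public static String findContainingJar(Class<?> clazz) {
        ClassLoader loader = clazz.getClassLoader();
        if (loader == null) {
            // Classes from the bootstrap loader report a null class loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        // e.g. com.example.MyTool -> com/example/MyTool.class
        String resource = clazz.getName().replace('.', '/') + ".class";
        try {
            Enumeration<URL> urls = loader.getResources(resource);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                if ("jar".equals(url.getProtocol())) {
                    // Path looks like file:/path/to/app.jar!/com/example/MyTool.class
                    String path = url.getPath();
                    if (path.startsWith("file:")) {
                        path = path.substring("file:".length());
                    }
                    // Keep literal '+' characters; URLDecoder would turn them into spaces.
                    path = URLDecoder.decode(path.replace("+", "%2B"), "UTF-8");
                    int bang = path.indexOf('!');
                    return bang >= 0 ? path.substring(0, bang) : path;
                }
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        // The class was loaded from a directory (or another non-jar location).
        return null;
    }

    public static void main(String[] args) {
        // Prints the jar path when run from a jar, or "null" when run from a classes directory.
        System.out.println(findContainingJar(JarLocator.class));
    }
}
```

If this returns null for your Tool class (for example, because the parent application runs it from a classes directory rather than a jar), there is no jar on the classpath to find, which is worth checking before you submit the job.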