Hi Libo,
You can implement your driver code using ToolRunner.So that you can pass
your extra configuration through command line instead of editing your code
all the time.
Driver code
----------------
public class WordCount extends Configured implements Tool {
public static void main(String[] args) throws Exception {
int exitCode = ToolRunner.run(new Configuration(), new
WordCount(), args);
System.exit(exitCode);
}
public int run(String[] args) throws Exception {
if (args.length != 2) {
System.out.printf("Usage: %s [generic options] <input
dir> <output dir>\n", getClass().getSimpleName());
return -1;
}
Job job = new Job(getConf());
job.setJarByClass(WordCount.class);
job.setJobName("Word Count");
FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setMapperClass(WordMapper.class);
job.setReducerClass(SumReducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
boolean success = job.waitForCompletion(true);
return success ? 0 : 1;
}
}
command line
-------------------
$ hadoop jar myjar.jar MyDriver -D mapred.reduce.tasks=10 myinputdir
myoutputdir
This is a better practise.
Happy Hadooping.
--
*Thanks & Regards*
Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/