Hi Libo,
You can implement your driver code using ToolRunner, so that you can pass
extra configuration on the command line instead of editing your code
every time.

Driver code
----------------
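import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// WordMapper and SumReducer are your own mapper and reducer classes.
public class WordCount extends Configured implements Tool {

  public static void main(String[] args) throws Exception {
    // ToolRunner parses generic options such as -D key=value into the
    // Configuration and passes only the remaining arguments to run().
    int exitCode = ToolRunner.run(new Configuration(), new WordCount(), args);
    System.exit(exitCode);
  }

  @Override
  public int run(String[] args) throws Exception {
    if (args.length != 2) {
      System.out.printf("Usage: %s [generic options] <input dir> <output dir>\n",
          getClass().getSimpleName());
      return -1;
    }
    // getConf() returns the Configuration populated by ToolRunner.
    // On Hadoop 2.x and later, Job.getInstance(getConf()) replaces this
    // deprecated constructor.
    Job job = new Job(getConf());
    job.setJarByClass(WordCount.class);
    job.setJobName("Word Count");
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.setMapperClass(WordMapper.class);
    job.setReducerClass(SumReducer.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    boolean success = job.waitForCompletion(true);
    return success ? 0 : 1;
  }
}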

Command line
-------------------
$ hadoop jar myjar.jar WordCount -D mapred.reduce.tasks=10 myinputdir myoutputdir

This is better practice than hard-coding such settings in the driver.
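
To see why run() receives only two arguments from the command above, here
is a simplified plain-Java sketch of the kind of parsing ToolRunner does for
-D options. It is not Hadoop's implementation: the class
GenericOptionsSketch and its fields are hypothetical names, it handles only
the "-D key=value" form used above, and it assumes the generic options come
before the application arguments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; Hadoop's real parsing lives in
// GenericOptionsParser and supports more options than -D.
public class GenericOptionsSketch {
    // Properties collected from "-D key=value" pairs.
    final Map<String, String> properties = new LinkedHashMap<>();
    // Arguments left over for the application's run() method.
    final List<String> remaining = new ArrayList<>();

    static GenericOptionsSketch parse(String[] args) {
        GenericOptionsSketch parsed = new GenericOptionsSketch();
        int i = 0;
        // Consume leading "-D key=value" pairs (assumed to come first).
        while (i + 1 < args.length && args[i].equals("-D")) {
            String pair = args[i + 1];
            int eq = pair.indexOf('=');
            // Pairs without a key before '=' are ignored in this sketch.
            if (eq > 0) {
                parsed.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
            i += 2;
        }
        // Everything after the generic options belongs to the application.
        for (; i < args.length; i++) {
            parsed.remaining.add(args[i]);
        }
        return parsed;
    }

    public static void main(String[] args) {
        String[] commandLine = {"-D", "mapred.reduce.tasks=10", "myinputdir", "myoutputdir"};
        GenericOptionsSketch parsed = parse(commandLine);
        System.out.println("properties: " + parsed.properties);
        System.out.println("remaining:  " + parsed.remaining);
    }
}
```

With the command above, the property map holds mapred.reduce.tasks=10 and
the remaining list holds myinputdir and myoutputdir, so the
args.length != 2 check in run() passes.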


Happy Hadooping.


-- 
*Thanks & Regards*

Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham

http://www.unmeshasreeveni.blogspot.in/
