The Mapper and Reducer classes in org.apache.hadoop.mapreduce implement the
identity function, so you should be able to just do

job.setMapperClass(org.apache.hadoop.mapreduce.Mapper.class);
job.setReducerClass(org.apache.hadoop.mapreduce.Reducer.class);

without having to implement your own no-op classes.
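
One thing to watch out for: TextInputFormat hands the mapper <LongWritable,
Text> pairs (byte offset of the line, line contents), which is also why your
Mapper<Text, IntWritable, ...> below throws the ClassCastException: the
declared input types have to match what the InputFormat actually emits.
Here's an untested sketch against 0.20.2 (the class name is just for
illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class IdentityDriver {
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "identity");

    // TextInputFormat emits <LongWritable, Text>, and the identity
    // map/reduce pass those exact types through to the output.
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);

    // The new-API base classes act as the identity mapper/reducer.
    job.setMapperClass(Mapper.class);
    job.setReducerClass(Reducer.class);

    FileInputFormat.addInputPath(job, new Path("In"));
    FileOutputFormat.setOutputPath(job, new Path("Out"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}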

I recommend reading the javadoc for the differences between the old API and
the new API; for example, http://hadoop.apache.org/common/docs/r0.20.2/api/index.html
describes the different functionality of Mapper in the new API and its dual
use as the identity mapper.
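
For reference, the default map() in the new-API Mapper is essentially this
pass-through (paraphrasing the 0.20.2 source), which is what lets the base
class double as the identity mapper:

protected void map(KEYIN key, VALUEIN value, Context context)
    throws IOException, InterruptedException {
  // Forward the input pair unchanged.
  context.write((KEYOUT) key, (VALUEOUT) value);
}

Reducer's default reduce() does the same, writing every value in the
iterable back out under its key.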

Cheers,
--Keith

On Aug 5, 2011, at 1:15 PM, garpinc wrote:

> 
> I was following this tutorial, which targets version 0.19.1:
> 
> http://v-lad.org/Tutorials/Hadoop/23%20-%20create%20the%20project.html
> 
> I, however, wanted to use the latest version of the API, 0.20.2.
> 
> The original code in the tutorial had the following lines:
> conf.setMapperClass(org.apache.hadoop.mapred.lib.IdentityMapper.class);
> conf.setReducerClass(org.apache.hadoop.mapred.lib.IdentityReducer.class);
> 
> Both Identity classes are deprecated, so it seemed the solution was to create
> a mapper and reducer as follows:
> public static class NOOPMapper 
>      extends Mapper<Text, IntWritable, Text, IntWritable>{
> 
> 
>   public void map(Text key, IntWritable value, Context context
>                   ) throws IOException, InterruptedException {
> 
>       context.write(key, value);
> 
>   }
> }
> 
> public static class NOOPReducer 
>      extends Reducer<Text,IntWritable,Text,IntWritable> {
>   private IntWritable result = new IntWritable();
> 
>   public void reduce(Text key, Iterable<IntWritable> values, 
>                      Context context
>                      ) throws IOException, InterruptedException {
>     context.write(key, result);
>   }
> }
> 
> 
> And then ran it with this driver code:
> 
>   Configuration conf = new Configuration();
>   Job job = new Job(conf, "testdriver");
> 
>   job.setOutputKeyClass(Text.class);
>   job.setOutputValueClass(IntWritable.class);
> 
>   job.setInputFormatClass(TextInputFormat.class);
>   job.setOutputFormatClass(TextOutputFormat.class);
> 
>   FileInputFormat.addInputPath(job, new Path("In"));
>   FileOutputFormat.setOutputPath(job, new Path("Out"));
> 
>   job.setMapperClass(NOOPMapper.class);
>   job.setReducerClass(NOOPReducer.class);
> 
>   job.waitForCompletion(true);
> 
> 
> However, I get this message:
> java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
> cast to org.apache.hadoop.io.Text
>       at TestDriver$NOOPMapper.map(TestDriver.java:1)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 11/08/01 16:41:01 INFO mapred.JobClient:  map 0% reduce 0%
> 11/08/01 16:41:01 INFO mapred.JobClient: Job complete: job_local_0001
> 11/08/01 16:41:01 INFO mapred.JobClient: Counters: 0
> 
> 
> 
> Can anyone tell me what I need for this to work?
> 
> Attached is the full code:
> http://old.nabble.com/file/p32174859/TestDriver.java TestDriver.java 
> 
