Thanks Joey,

It works, but there is one thing I don't understand:

1: In the map:

 extends Mapper<Text, Text, Text, IntWritable>
so the output value is of type IntWritable.
2: In the reduce:
extends Reducer<Text,Text,Text,IntWritable>
so the input value is of type Text.

The map's output types should match the reduce's input types, correct? But
here
IntWritable != Text.

Yet the code runs without any error. Shouldn't it complain about a type mismatch?
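As far as I can tell, the compiler itself cannot catch this: the job is wired together from Class objects (job.setMapperClass, job.setCombinerClass), and Java erases generic parameters, so the mapper's output types are never checked against the combiner's input types at compile time. A minimal sketch in plain Java, using hypothetical stand-in classes (not Hadoop's real API):

```java
public class ErasureDemo {
    // Stand-in for a generic stage like Mapper<KIN,VIN,KOUT,VOUT>,
    // reduced to just the value types for illustration.
    static class Stage<VIN, VOUT> {}

    // Like job.setCombinerClass(...): the method only sees Class objects,
    // so the generic parameters of the two stages are never compared.
    @SuppressWarnings("rawtypes")
    static void wire(Class<? extends Stage> mapper, Class<? extends Stage> combiner) {
        // generic type information is erased here; nothing left to check
    }

    static class MyMapper extends Stage<String, Integer> {}   // emits Integer values
    static class MyCombiner extends Stage<String, String> {}  // expects String values

    public static void main(String[] args) {
        // Compiles and runs despite Integer != String; a mismatch like
        // this can only surface at runtime, the way Hadoop reports
        // "Type mismatch in value from map" as an IOException.
        wire(MyMapper.class, MyCombiner.class);
        System.out.println("wired without a compile-time type error");
    }
}
```

So any mismatch between the stages' types is a runtime check inside the framework, not a javac error.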

At 2011-07-29 22:49:31,"Joey Echeverria" <[email protected]> wrote:
>If you want to use a combiner, your map has to output the same types
>as your combiner outputs. In your case, modify your map to look like
>this:
>
>  public static class TokenizerMapper
>       extends Mapper<Text, Text, Text, IntWritable>{
>    public void map(Text key, Text value, Context context
>                    ) throws IOException, InterruptedException {
>        context.write(key, new IntWritable(1));
>    }
>  }
>
>>  11/07/29 22:22:22 INFO mapred.JobClient: Task Id : 
>> attempt_201107292131_0011_m_000000_2, Status : FAILED
>> java.io.IOException: Type mismatch in value from map: expected 
>> org.apache.hadoop.io.IntWritable, recieved org.apache.hadoop.io.Text
>>
>> But I already set IntWritable in 2 places,
>> 1: Reducer<Text,Text,Text,IntWritable>
>> 2:job.setOutputValueClass(IntWritable.class);
>>
>> So where am I wrong?
>>
>> public class MyTest {
>>
>>  public static class TokenizerMapper
>>       extends Mapper<Text, Text, Text, Text>{
>>    public void map(Text key, Text value, Context context
>>                    ) throws IOException, InterruptedException {
>>        context.write(key, value);
>>    }
>>  }
>>
>>  public static class IntSumReducer
>>       extends Reducer<Text,Text,Text,IntWritable> {
>>
>>    public void reduce(Text key, Iterable<Text> values,
>>                       Context context
>>                       ) throws IOException, InterruptedException {
>>       int count = 0;
>>       for (Text iw:values) {
>>            count++;
>>       }
>>      context.write(key, new IntWritable(count));
>>     }
>>   }
>>
>>  public static void main(String[] args) throws Exception {
>>    Configuration conf = new Configuration();
>> // the key/value separator must be configured on conf
>>    conf.set("key.value.separator.in.input.line", ",");
>>    String[] otherArgs = new GenericOptionsParser(conf, 
>> args).getRemainingArgs();
>>    if (otherArgs.length != 2) {
>>      System.err.println("Usage: wordcount <in> <out>");
>>      System.exit(2);
>>    }
>>    Job job = new Job(conf, "word count");
>>    job.setJarByClass(WordCount.class);
>>    job.setMapperClass(TokenizerMapper.class);
>>    job.setCombinerClass(IntSumReducer.class);
>> //    job.setReducerClass(IntSumReducer.class);
>>    job.setInputFormatClass(KeyValueTextInputFormat.class);
>>    // job.set("key.value.separator.in.input.line", ",");
>>    job.setOutputKeyClass(Text.class);
>>    job.setOutputValueClass(IntWritable.class);
>>    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
>>    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
>>    System.exit(job.waitForCompletion(true) ? 0 : 1);
>>  }
>> }
>>
>
>
>
>-- 
>Joseph Echeverria
>Cloudera, Inc.
>443.305.9434
