It should. What's the input value class for the reducer that you are setting in the Job?
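Java generics are erased at compile time, so the framework can only check the map output against the classes configured on the Job, not against the type parameters of your Mapper/Reducer. A minimal sketch of the relevant settings, with the types taken from your code below; this would go in main() after creating the Job:

    // Declare the types map() emits; with a combiner these must match
    // the combiner's input types.
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);

    // Final (reduce) output types, as you already have them:
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

If the map output classes are never set, they default to the final output classes. That is why the mismatch in your first version only surfaced at runtime, as the IOException in the task log, rather than at compile time.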
2011/7/30 Daniel,Wu <[email protected]>:
> Thanks Joey,
>
> It works, but there is one place I don't understand:
>
> 1: in the map
>    extends Mapper<Text, Text, Text, IntWritable>
>    so the output value is of type IntWritable
> 2: in the reduce
>    extends Reducer<Text,Text,Text,IntWritable>
>    so the input value is of type Text
>
> The output type of the map should be the same as the input type of the
> reduce, correct? But here IntWritable <> Text, and yet the code runs
> without any error. Shouldn't it complain about a type mismatch?
>
> At 2011-07-29 22:49:31, "Joey Echeverria" <[email protected]> wrote:
>> If you want to use a combiner, your map has to output the same types
>> as your combiner outputs. In your case, modify your map to look like
>> this:
>>
>>   public static class TokenizerMapper
>>        extends Mapper<Text, Text, Text, IntWritable> {
>>     public void map(Text key, Text value, Context context)
>>         throws IOException, InterruptedException {
>>       context.write(key, new IntWritable(1));
>>     }
>>   }
>>
>>> 11/07/29 22:22:22 INFO mapred.JobClient: Task Id :
>>> attempt_201107292131_0011_m_000000_2, Status : FAILED
>>> java.io.IOException: Type mismatch in value from map: expected
>>> org.apache.hadoop.io.IntWritable, recieved org.apache.hadoop.io.Text
>>>
>>> But I already set IntWritable in 2 places:
>>> 1: Reducer<Text,Text,Text,IntWritable>
>>> 2: job.setOutputValueClass(IntWritable.class);
>>>
>>> So where am I wrong?
>>>
>>> public class MyTest {
>>>
>>>   public static class TokenizerMapper
>>>        extends Mapper<Text, Text, Text, Text> {
>>>     public void map(Text key, Text value, Context context)
>>>         throws IOException, InterruptedException {
>>>       context.write(key, value);
>>>     }
>>>   }
>>>
>>>   public static class IntSumReducer
>>>        extends Reducer<Text,Text,Text,IntWritable> {
>>>     public void reduce(Text key, Iterable<Text> values, Context context)
>>>         throws IOException, InterruptedException {
>>>       int count = 0;
>>>       for (Text iw : values) {
>>>         count++;
>>>       }
>>>       context.write(key, new IntWritable(count));
>>>     }
>>>   }
>>>
>>>   public static void main(String[] args) throws Exception {
>>>     Configuration conf = new Configuration();
>>>     // the separator must be configured in conf
>>>     conf.set("key.value.separator.in.input.line", ",");
>>>     String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
>>>     if (otherArgs.length != 2) {
>>>       System.err.println("Usage: wordcount <in> <out>");
>>>       System.exit(2);
>>>     }
>>>     Job job = new Job(conf, "word count");
>>>     job.setJarByClass(WordCount.class);
>>>     job.setMapperClass(TokenizerMapper.class);
>>>     job.setCombinerClass(IntSumReducer.class);
>>>     // job.setReducerClass(IntSumReducer.class);
>>>     job.setInputFormatClass(KeyValueTextInputFormat.class);
>>>     // job.set("key.value.separator.in.input.line", ",");
>>>     job.setOutputKeyClass(Text.class);
>>>     job.setOutputValueClass(IntWritable.class);
>>>     FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
>>>     FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
>>>     System.exit(job.waitForCompletion(true) ? 0 : 1);
>>>   }
>>> }
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434
>
> --
> Join me at http://hadoopworkshop.eventbrite.com/
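For completeness, here is a sketch of the whole job with the types lined up end to end. It folds in Joey's mapper fix, declares the map output classes explicitly, and re-enables setReducerClass, which was commented out in your version; I'm assuming you do want the reduce phase to run. Note that because combiner output is fed back into the reduce, the reducer has to sum the IntWritable values rather than count iterations:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;

    public class MyTest {

      // Emits IntWritable values, matching the combiner's input type.
      public static class TokenizerMapper
           extends Mapper<Text, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
          context.write(key, ONE);
        }
      }

      // Sums the counts. Safe to use as both combiner and reducer
      // because its input and output types are identical.
      public static class IntSumReducer
           extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context)
            throws IOException, InterruptedException {
          int count = 0;
          for (IntWritable v : values) {
            count += v.get();  // sum values, don't count iterations
          }
          context.write(key, new IntWritable(count));
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("key.value.separator.in.input.line", ",");
        String[] otherArgs =
            new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
          System.err.println("Usage: wordcount <in> <out>");
          System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(MyTest.class);  // was WordCount.class in your code
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        // Map output types must match the combiner's input types.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }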
