Sorry for the earlier reply. Is your combiner outputting Text,Text key/value pairs?
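
On the quoted question of why the generic parameters on the Reducer did not trigger a complaint: Java erases generic type parameters at compile time, so the framework cannot see them at run time. The "Type mismatch in value from map" error in the quoted log comes from a run-time comparison of each emitted value's class against the class configured on the job. Here is a minimal plain-Java sketch of that kind of check. It is not Hadoop's actual code: `Collector` and `tryCollect` are hypothetical names, and Integer and String stand in for IntWritable and Text.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TypeCheckSketch {

    /** Hypothetical stand-in for a framework collector that receives map output. */
    static class Collector {
        private final Class<?> expectedValueClass; // the class configured on the job
        final List<Object> collected = new ArrayList<>();

        Collector(Class<?> expectedValueClass) {
            this.expectedValueClass = expectedValueClass;
        }

        /** Compares the run-time class of the value, not any generic declaration. */
        void collect(Object key, Object value) throws IOException {
            if (value.getClass() != expectedValueClass) {
                throw new IOException("Type mismatch in value: expected "
                        + expectedValueClass.getName() + ", got "
                        + value.getClass().getName());
            }
            collected.add(value);
        }
    }

    /** Returns null when the value is accepted, or the error message on a mismatch. */
    public static String tryCollect(Class<?> configuredValueClass, Object value) {
        try {
            new Collector(configuredValueClass).collect("key", value);
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Job configured for Integer values, but the "map" emits a String: rejected.
        System.out.println(tryCollect(Integer.class, "some text"));
        // Job configured for Integer values and the "map" emits an Integer: accepted.
        System.out.println(tryCollect(Integer.class, 1));
    }
}
```

In the real job, job.setOutputValueClass(...) sets the final output value class, and the map output value class defaults to it unless job.setMapOutputValueClass(...) is called. That is why the configured classes, rather than the Mapper<...> or Reducer<...> declarations, decide whether the check passes.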
On Wed, Aug 3, 2011 at 5:26 PM, madhu phatak <[email protected]> wrote:

> It should. Whats the input value class for reducer you are setting in Job?
>
> 2011/7/30 Daniel,Wu <[email protected]>
>
> Thanks Joey,
>>
>> It works, but one place I don't understand:
>>
>> 1: in the map
>>
>> extends Mapper<Text, Text, Text, IntWritable>
>> so the output value is of type IntWritable
>> 2: in the reduce
>> extends Reducer<Text,Text,Text,IntWritable>
>> So input value is of type Text.
>>
>> type of map output should be the same as input type of reduce, correct?
>> but here
>> IntWritable<>Text
>>
>> And the code can run without any error, shouldn't it complain type
>> mismatch?
>>
>> At 2011-07-29 22:49:31,"Joey Echeverria" <[email protected]> wrote:
>> >If you want to use a combiner, your map has to output the same types
>> >as your combiner outputs. In your case, modify your map to look like
>> >this:
>> >
>> >  public static class TokenizerMapper
>> >       extends Mapper<Text, Text, Text, IntWritable>{
>> >    public void map(Text key, Text value, Context context
>> >                    ) throws IOException, InterruptedException {
>> >        context.write(key, new IntWritable(1));
>> >    }
>> >  }
>> >
>> >> 11/07/29 22:22:22 INFO mapred.JobClient: Task Id : attempt_201107292131_0011_m_000000_2, Status : FAILED
>> >> java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.IntWritable, recieved org.apache.hadoop.io.Text
>> >>
>> >> But I already set IntWritable in 2 places,
>> >> 1: Reducer<Text,Text,Text,IntWritable>
>> >> 2:job.setOutputValueClass(IntWritable.class);
>> >>
>> >> So where am I wrong?
>> >>
>> >> public class MyTest {
>> >>
>> >>   public static class TokenizerMapper
>> >>        extends Mapper<Text, Text, Text, Text>{
>> >>     public void map(Text key, Text value, Context context
>> >>                     ) throws IOException, InterruptedException {
>> >>         context.write(key, value);
>> >>     }
>> >>   }
>> >>
>> >>   public static class IntSumReducer
>> >>        extends Reducer<Text,Text,Text,IntWritable> {
>> >>
>> >>     public void reduce(Text key, Iterable<Text> values,
>> >>                        Context context
>> >>                        ) throws IOException, InterruptedException {
>> >>        int count = 0;
>> >>        for (Text iw:values) {
>> >>             count++;
>> >>        }
>> >>        context.write(key, new IntWritable(count));
>> >>     }
>> >>   }
>> >>
>> >>   public static void main(String[] args) throws Exception {
>> >>     Configuration conf = new Configuration();
>> >>     // the configure of seprator should be done in conf
>> >>     conf.set("key.value.separator.in.input.line", ",");
>> >>     String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
>> >>     if (otherArgs.length != 2) {
>> >>       System.err.println("Usage: wordcount <in> <out>");
>> >>       System.exit(2);
>> >>     }
>> >>     Job job = new Job(conf, "word count");
>> >>     job.setJarByClass(WordCount.class);
>> >>     job.setMapperClass(TokenizerMapper.class);
>> >>     job.setCombinerClass(IntSumReducer.class);
>> >>     // job.setReducerClass(IntSumReducer.class);
>> >>     job.setInputFormatClass(KeyValueTextInputFormat.class);
>> >>     // job.set("key.value.separator.in.input.line", ",");
>> >>     job.setOutputKeyClass(Text.class);
>> >>     job.setOutputValueClass(IntWritable.class);
>> >>     FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
>> >>     FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
>> >>     System.exit(job.waitForCompletion(true) ? 0 : 1);
>> >>   }
>> >> }
>> >>
>> >
>> >--
>> >Joseph Echeverria
>> >Cloudera, Inc.
>> >443.305.9434
>> >
>
> --
> Join me at http://hadoopworkshop.eventbrite.com/
>

--
Join me at http://hadoopworkshop.eventbrite.com/
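
Joey's point in the quoted thread, that the map has to output the same types the combiner outputs, follows from where the combiner sits. It consumes map output and its own output is fed to the reducer, and the framework may run it zero, one, or several times. The following is a minimal plain-Java sketch of that contract, not Hadoop code. The class and method names are hypothetical, and String and Integer stand in for Text and IntWritable.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CombinerContractSketch {

    /** "Map" step: emit (key, 1) for every input key. */
    public static List<Map.Entry<String, Integer>> mapStep(List<String> keys) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String k : keys) {
            out.add(new AbstractMap.SimpleEntry<String, Integer>(k, 1));
        }
        return out;
    }

    /** "Reduce" step: sum the Integer values per key. */
    public static Map<String, Integer> reduceStep(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            totals.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return totals;
    }

    /**
     * "Combine" step: same summing logic, but the output has exactly the
     * same (String, Integer) shape as the input, so it can be skipped or
     * applied repeatedly without the reducer noticing.
     */
    public static List<Map.Entry<String, Integer>> combineStep(
            List<Map.Entry<String, Integer>> pairs) {
        return new ArrayList<>(reduceStep(pairs).entrySet());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = mapStep(Arrays.asList("a", "b", "a"));
        System.out.println(reduceStep(mapped));                           // combiner not run
        System.out.println(reduceStep(combineStep(mapped)));              // combiner run once
        System.out.println(reduceStep(combineStep(combineStep(mapped)))); // combiner run twice
    }
}
```

All three lines print {a=2, b=1}. The IntSumReducer in the quoted MyTest takes Text values and emits IntWritable values, so its output type differs from its input type and it cannot fill the combiner role as written.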
