I'm outputting a Text and a LongWritable from my mapper, so I told the job that my map output value class is Writable (the interface both of them implement):

    job.setMapOutputValueClass(Writable.class);
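For context, the driver setup looks roughly like this. This is a trimmed sketch: the MultipleInputs wiring and the TextEmittingMapper/LongEmittingMapper class names are stand-ins for my actual setup, which is equivalent.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CombineDriver {
        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "combine two inputs");
            job.setJarByClass(CombineDriver.class);

            // One mapper per input type; both emit values typed as Writable.
            // TextEmittingMapper and LongEmittingMapper are placeholders for
            // my real mapper classes.
            MultipleInputs.addInputPath(job, new Path(args[0]),
                    TextInputFormat.class, TextEmittingMapper.class);
            MultipleInputs.addInputPath(job, new Path(args[1]),
                    TextInputFormat.class, LongEmittingMapper.class);

            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Writable.class); // the shared interface

            FileOutputFormat.setOutputPath(job, new Path(args[2]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }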
I'm doing this because I have two different types of input files that I'm combining in one job. I could emit both value types as Text, but then I'd have to prepend a marker to each value to indicate what type of entry it is, instead of simply doing:

    if (value instanceof Text) {
        ...
    } else if (value instanceof LongWritable) {
        ...
    }
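In context, the reducer I have in mind looks like this (a sketch; the output key/value types and the per-type handling are placeholders):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.Reducer;

    // Values arrive typed as the common Writable interface and are
    // dispatched on their concrete runtime type.
    public class CombiningReducer extends Reducer<Text, Writable, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Writable> values, Context context)
                throws IOException, InterruptedException {
            for (Writable value : values) {
                if (value instanceof Text) {
                    // entry came from the Text-producing input
                } else if (value instanceof LongWritable) {
                    // entry came from the LongWritable-producing input
                }
            }
        }
    }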
This exception is thrown:

    java.io.IOException: Type mismatch in value from map: expected
    org.apache.hadoop.io.Writable, recieved org.apache.hadoop.io.LongWritable
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:812)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:504)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
The MapTask code (which is used even though I'm on the new API) shows that the classes are compared with !=:
http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/MapTask.java?view=log
    if (value.getClass() != valClass) {
        throw new IOException("Type mismatch in value from map: expected "
            + valClass.getName() + ", recieved "
            + value.getClass().getName());
    }
Does the check really need to be this strict? Couldn't it be a Class.isAssignableFrom() check instead?
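For illustration, this is the relaxed check I have in mind. It's my suggested change, not what MapTask actually contains, and it would accept any value whose class implements or extends the declared map output value class:

    // Proposed check (not the current MapTask code): accept subtypes and
    // implementors of the declared map output value class.
    if (!valClass.isAssignableFrom(value.getClass())) {
        throw new IOException("Type mismatch in value from map: expected "
            + valClass.getName() + ", received "
            + value.getClass().getName());
    }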