Hi, all
From the original MapReduce paper by Google, the signatures of the data
transformations are:
map: (k1, v1) -> list(k2, v2)
reduce: (k2, list(v2)) -> list(v2)
Here, the map output value type is v2, and the final value type is also v2.
What I want is for the final value type to be
different from the map output value type.
E.g. the output value type of the map function is BytesWritable, but the final
output value type is Text.
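To make the shape concrete, here is a minimal plain-Java sketch (no Hadoop involved) of the signature I have in mind, where the intermediate value type V2 and the final value type V3 are separate type parameters. Everything in it is hypothetical and only for illustration: the class TypedMapReduceSketch, the run and demo methods, and the use of byte[] and String as stand-ins for BytesWritable and Text. It also simplifies reduce to return a single value per key rather than a list.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;

final class TypedMapReduceSketch {

    // Runs map over every input pair, groups the intermediate values by key,
    // then runs reduce once per key. V2 (intermediate) and V3 (final) are
    // separate type parameters, so they are allowed to differ.
    static <K1, V1, K2 extends Comparable<K2>, V2, V3> Map<K2, V3> run(
            List<Map.Entry<K1, V1>> input,
            BiFunction<K1, V1, List<Map.Entry<K2, V2>>> map,
            BiFunction<K2, List<V2>, V3> reduce) {
        Map<K2, List<V2>> grouped = new TreeMap<>();
        for (Map.Entry<K1, V1> in : input) {
            for (Map.Entry<K2, V2> out : map.apply(in.getKey(), in.getValue())) {
                grouped.computeIfAbsent(out.getKey(), k -> new ArrayList<>())
                       .add(out.getValue());
            }
        }
        Map<K2, V3> result = new TreeMap<>();
        for (Map.Entry<K2, List<V2>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce.apply(e.getKey(), e.getValue()));
        }
        return result;
    }

    // map emits (word, byte[]); reduce emits (word, String).
    // byte[] stands in for BytesWritable and String for Text.
    static Map<String, String> demo() {
        List<Map.Entry<Long, String>> input = new ArrayList<>();
        input.add(new SimpleEntry<>(0L, "a b"));
        input.add(new SimpleEntry<>(4L, "a"));
        return TypedMapReduceSketch.<Long, String, String, byte[], String>run(
            input,
            (offset, line) -> {
                List<Map.Entry<String, byte[]>> out = new ArrayList<>();
                for (String w : line.split(" ")) {
                    out.add(new SimpleEntry<>(w, w.getBytes(StandardCharsets.UTF_8)));
                }
                return out;
            },
            (word, chunks) -> {
                // Sum the lengths of all byte arrays collected for this key.
                int total = 0;
                for (byte[] c : chunks) {
                    total += c.length;
                }
                return total + " bytes";
            });
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Nothing in this model forces the two value types to be the same; whether Hadoop lets me express that in a single job is the question.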
I declared the map function as <LongWritable, Text, Text, BytesWritable>
and the reduce function as <Text, BytesWritable, Text, Text>.
But when I run this, I get the following exception:
java.io.IOException: Type mismatch in value from map: expected
org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.BytesWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:850)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
at org.myorg.ReadDivdeMap.map(ReadDivdeMap.java:100)
at org.myorg.ReadDivdeMap.map(ReadDivdeMap.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
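As a guess at what is going on, here is a small plain-Java model (no Hadoop dependency) of the kind of runtime check that I assume produces this message. All names in it (MapOutputTypeCheckSketch, Conf, collect, tryCollect) are hypothetical stand-ins, not Hadoop classes. The assumption being modelled is that the collector compares the class of each emitted value with a configured map output value class, and that this setting falls back to the job's final output value class when it is not set separately.

```java
import java.io.IOException;

final class MapOutputTypeCheckSketch {

    // Hypothetical stand-in for a job configuration with two class settings.
    static final class Conf {
        Class<?> outputValueClass;     // final (reduce) output value type
        Class<?> mapOutputValueClass;  // intermediate value type, may be unset

        // Assumption: an unset map output value class falls back to
        // the final output value class.
        Class<?> effectiveMapOutputValueClass() {
            return mapOutputValueClass != null ? mapOutputValueClass : outputValueClass;
        }
    }

    // Hypothetical stand-in for the check applied to each value the map emits.
    static void collect(Conf conf, Object value) throws IOException {
        Class<?> expected = conf.effectiveMapOutputValueClass();
        if (value.getClass() != expected) {
            throw new IOException("Type mismatch in value from map: expected "
                + expected.getName() + ", received " + value.getClass().getName());
        }
    }

    // Returns null if the value is accepted, otherwise the error message.
    static String tryCollect(Conf conf, Object value) {
        try {
            collect(conf, value);
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.outputValueClass = String.class;     // plays the role of Text
        // Rejected: only the final output value class is configured.
        System.out.println(tryCollect(conf, new byte[0]));
        conf.mapOutputValueClass = byte[].class;  // plays the role of BytesWritable
        // Accepted: the intermediate type is now configured separately.
        System.out.println(tryCollect(conf, new byte[0]));
    }
}
```

If Hadoop behaves like this model, then configuring the intermediate type on its own (JobConf has a setMapOutputValueClass method) may be what is missing in my job setup, rather than a second job.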
It seems the map function in this job is expected to emit Text as its map output value
type, am I right?
So if I want to achieve this, must I use two MapReduce jobs?
Thanks all!
Best Regards
Jiamin Lu