Hi, Saliya,
The data transformation in MapReduce is:
*map* (k1, v1) -> list(k2, v2)
*reduce* (k2, list(v2)) -> list(k3, v3)
The map output is grouped by key and passed to the reducer as its input,
so your reduce function can only receive k2 and v2 as its input types. In
your case, the types should be:
k1 = Text | v1 = Text
k2 = Text | v2 = BytesWritable
k3 = Text | v3 = BytesWritable
Hence for your code, I think you can write:
In the job configuration:
JobConf conf = new JobConf(YourClass.class);
conf.setOutputKeyClass(k3.class);
conf.setOutputValueClass(v3.class);
Then declare your map class as:
class YourMapClass extends MapReduceBase
implements Mapper<k1, v1, k2, v2> {
....
}
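setOutputKeyClass and setOutputValueClass are also used for the map output
by default. If k2 or v2 is different from k3 or v3, then also set the map
output types in the job configuration: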
conf.setMapOutputKeyClass(k2.class);
conf.setMapOutputValueClass(v2.class);
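If it helps to see how the types flow from map to reduce, here is a rough plain-Java sketch that you can run without a cluster. It is not Hadoop code: String stands in for Text and byte[] for BytesWritable, and the TypeFlowSketch class, the Pair helper and the concatenating reduce are all made up for illustration. Real Hadoop also does not promise any particular order of the values within one key.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the MapReduce type flow; none of this is Hadoop API.
class TypeFlowSketch {

    // One (key, value) pair, as emitted by map or reduce.
    static final class Pair<K, V> {
        final K key;
        final V value;
        Pair(K key, V value) { this.key = key; this.value = value; }
    }

    // map: (k1 = String, v1 = String) -> list(k2 = String, v2 = byte[])
    static List<Pair<String, byte[]>> map(String key, String value) {
        List<Pair<String, byte[]>> out = new ArrayList<>();
        out.add(new Pair<>(key, value.getBytes(StandardCharsets.UTF_8)));
        return out;
    }

    // shuffle: collect every v2 under its k2, with keys in sorted order
    static Map<String, List<byte[]>> shuffle(List<Pair<String, byte[]>> mapped) {
        Map<String, List<byte[]>> grouped = new TreeMap<>();
        for (Pair<String, byte[]> p : mapped) {
            grouped.computeIfAbsent(p.key, k -> new ArrayList<>()).add(p.value);
        }
        return grouped;
    }

    // reduce: (k2, list(v2)) -> list(k3 = String, v3 = byte[])
    // This example simply concatenates the byte arrays for one key.
    static List<Pair<String, byte[]>> reduce(String key, List<byte[]> values) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (byte[] v : values) {
            joined.write(v, 0, v.length);
        }
        List<Pair<String, byte[]>> out = new ArrayList<>();
        out.add(new Pair<>(key, joined.toByteArray()));
        return out;
    }

    // Runs map over every input record, groups by key, then reduces each group.
    static List<Pair<String, byte[]>> run(List<Pair<String, String>> input) {
        List<Pair<String, byte[]>> mapped = new ArrayList<>();
        for (Pair<String, String> rec : input) {
            mapped.addAll(map(rec.key, rec.value));
        }
        List<Pair<String, byte[]>> result = new ArrayList<>();
        for (Map.Entry<String, List<byte[]>> e : shuffle(mapped).entrySet()) {
            result.addAll(reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Pair<String, String>> input = new ArrayList<>();
        input.add(new Pair<>("a", "x"));
        input.add(new Pair<>("b", "y"));
        input.add(new Pair<>("a", "z"));
        for (Pair<String, byte[]> p : run(input)) {
            System.out.println(p.key + " -> " + new String(p.value, StandardCharsets.UTF_8));
        }
    }
}
```

The point to notice is that reduce never sees k1 or v1: its parameters are fixed by what map emits, which is why the map output types need to be declared separately whenever they differ from the final output types.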
Hope this helps!
Best Regards
Jiamin Lu