A very basic question:
From the WordCount example below: I don't see why we need the "LongWritable
key" argument in the map function. Can anybody tell me what it is for?
As I understand it, the worker process reads the designated input split as a
series of strings, which the map function operates on to produce <key, value>
pairs, in this case emitted through the 'output' variable. So why would one
need "LongWritable key" as an argument to the map function?
Thank you,
Amit
<snip>
public static class MapClass extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output,
                  Reporter reporter) throws IOException {
    String line = value.toString();
    StringTokenizer itr = new StringTokenizer(line);
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      output.collect(word, one);
    }
  }
}
</snip>
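
For context, here is a rough sketch of what that key would carry, assuming the job reads its input with Hadoop's default TextInputFormat. In that case each call to map receives the byte offset at which the line starts in the file as the key, and the line's text as the value. WordCount simply ignores the key. The code below is plain Java, not Hadoop code: the class OffsetKeyDemo and the method lineOffsets are made-up names for illustration, and it assumes UTF-8 text with single-byte '\n' line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a line-oriented record reader; this is not Hadoop code.
class OffsetKeyDemo {

    // Returns, for each line of the input, the byte offset at which that line starts.
    // Assumes UTF-8 text and a one-byte '\n' terminator after each line.
    static List<Long> lineOffsets(String input) {
        List<Long> offsets = new ArrayList<>();
        long offset = 0;
        for (String line : input.split("\n")) {
            offsets.add(offset);
            // Advance past the bytes of this line plus its '\n' terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return offsets;
    }

    public static void main(String[] args) {
        String input = "hello world\nfoo bar\nbaz\n";
        String[] lines = input.split("\n");
        List<Long> offsets = lineOffsets(input);
        for (int i = 0; i < lines.length; i++) {
            // Each printed pair plays the role of (LongWritable key, Text value).
            System.out.println(offsets.get(i) + "\t" + lines[i]);
        }
    }
}
```

For this sample input the keys would be 0, 12 and 20. Other input formats may supply a different kind of key, which is presumably why the Mapper interface leaves the input key type generic.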