Pavel,
Since each map task processes only one region, a row is stored in only one
region, and all values for a given intermediate key go to a single reducer,
there will be no stale data in this situation. In your example, one reducer
would receive all 8 values for that row key and write the total once.
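
As a rough illustration of why one row key cannot be split across reducers,
here is a minimal sketch in plain Java (no Hadoop or HBase classes). The names
PartitionSketch, partitionFor and countPerKey are made up for this example. The
formula is the one used by Hadoop's default HashPartitioner, but a real job
hashes Text keys rather than String, so the actual partition numbers would
differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionSketch {

    // Same formula as Hadoop's default HashPartitioner: the partition depends
    // only on the key, so a given key always maps to the same reducer.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Simulated shuffle and reduce: route every emitted key to its reducer,
    // count occurrences there, then merge the per-reducer results.
    static Map<String, Long> countPerKey(List<String> emittedKeys, int numReducers) {
        List<Map<String, Long>> reducers = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            reducers.add(new HashMap<>());
        }
        for (String key : emittedKeys) {
            reducers.get(partitionFor(key, numReducers)).merge(key, 1L, Long::sum);
        }

        Map<String, Long> table = new HashMap<>();
        for (Map<String, Long> reducer : reducers) {
            for (Map.Entry<String, Long> e : reducer.entrySet()) {
                // A key showing up in two reducers would mean a split count.
                if (table.put(e.getKey(), e.getValue()) != null) {
                    throw new IllegalStateException("key split across reducers: " + e.getKey());
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Hypothetical map output: 8 values for one author, 2 for another.
        List<String> keys = Arrays.asList("pavel", "jd", "pavel", "pavel", "jd",
                "pavel", "pavel", "pavel", "pavel", "pavel");
        System.out.println(countPerKey(keys, 3));
    }
}
```

Whatever number of reducers you pick, the count for a key is produced by exactly
one of them, so there is never a 5 + 3 situation to reconcile.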
J-D
On Wed, Jul 30, 2008 at 10:09 AM, Pavel <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I feel I don't fully understand the MapReduce approach and would like to ask
> some questions (mainly about its reduce part). Below is a reduce job that
> counts the values for a given row key and inserts the result into another
> table using the same row key.
>
> What makes me doubt is that I cannot figure out how that code would work if
> several reducers are running. Is it possible that they will process values
> for the same row key and, as a consequence, write stale data into the table?
> Say reducerA has counted a total of 5 messages while reducerB has counted 3;
> would that end up as 8 in the resulting table?
>
> Thank you.
> Pavel
>
> public class MessagesTableReduce extends TableReduce<Text, LongWritable> {
>
>   public void reduce(Text key, Iterator<LongWritable> values,
>       OutputCollector<Text, MapWritable> output, Reporter reporter)
>       throws IOException {
>
>     System.out.println("REDUCE: processing messages for author: "
>         + key.toString());
>
>     int total = 0;
>     while (values.hasNext()) {
>       values.next();
>       total++;
>     }
>
>     MapWritable map = new MapWritable();
>     map.put(new Text("messages:sent"),
>         new ImmutableBytesWritable(String.valueOf(total).getBytes()));
>     output.collect(key, map);
>   }
> }
>