I'm trying to write a Reducer which will eliminate duplicates from the list of 
values before writing them out. I have the following code for my Reducer:

/*****************/
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class ClickStreamIndexerReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    public void reduce(Text dirName, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        Text value = new Text();
        Text lastValue = new Text();
        Iterator<Text> valuesIterator = values.iterator();

        while (valuesIterator.hasNext()) {
            value = valuesIterator.next();
            // Write the value only if it differs from the previous one.
            if (!value.equals(lastValue)) {
                context.write(dirName, value);
                lastValue = value;
            }
        }
    }
}
/*****************/

Right before "value = valuesIterator.next()" is called for the first time, both 
value and lastValue are empty, as expected. Then value is set to the first value 
while lastValue is still empty. After I write out value, I set lastValue to value. 
The first time through the outer while loop, everything goes as expected. 
However, the next time through, when "value = valuesIterator.next()" is called, 
value and lastValue both refer to the exact same object. On every later pass 
through the loop, setting value also changes lastValue to the same thing.
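
The behaviour described above can be reproduced without Hadoop. The sketch 
below assumes that the values iterator hands back one shared, mutable instance 
and refills it on every next() call, which is how Hadoop's reduce-side iterator 
behaves. MutableText, reusing(), and the ReuseDemo class are hypothetical 
stand-ins written for this illustration, not Hadoop classes. It compares 
keeping a reference (lastValue = value) with copying the contents 
(lastValue.set(value)); like the reducer, it only drops adjacent duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical stand-in for a mutable, reusable value such as Text. */
    static final class MutableText {
        private String s = "";
        void set(String v) { s = v; }
        void set(MutableText other) { s = other.s; }
        @Override public boolean equals(Object o) {
            return o instanceof MutableText && ((MutableText) o).s.equals(s);
        }
        @Override public int hashCode() { return s.hashCode(); }
        @Override public String toString() { return s; }
    }

    /** Iterator that refills ONE shared instance on every next() call. */
    static Iterator<MutableText> reusing(List<String> data) {
        final Iterator<String> it = data.iterator();
        final MutableText shared = new MutableText();
        return new Iterator<MutableText>() {
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public MutableText next() {
                shared.set(it.next());
                return shared;
            }
        };
    }

    /** Mirrors the reducer: keeps a reference, so lastValue aliases value. */
    static List<String> dedupByReference(List<String> input) {
        List<String> out = new ArrayList<>();
        MutableText lastValue = new MutableText();
        Iterator<MutableText> values = reusing(input);
        while (values.hasNext()) {
            MutableText value = values.next();
            if (!value.equals(lastValue)) {
                out.add(value.toString());
                lastValue = value; // same object as value from now on
            }
        }
        return out;
    }

    /** Copies the contents instead, so lastValue stays independent. */
    static List<String> dedupByCopy(List<String> input) {
        List<String> out = new ArrayList<>();
        MutableText lastValue = new MutableText();
        Iterator<MutableText> values = reusing(input);
        while (values.hasNext()) {
            MutableText value = values.next();
            if (!value.equals(lastValue)) {
                out.add(value.toString());
                lastValue.set(value); // copy contents, not the reference
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> in = Arrays.asList("a", "a", "b", "b", "c");
        System.out.println(dedupByReference(in)); // prints [a]
        System.out.println(dedupByCopy(in));      // prints [a, b, c]
    }
}
```

In the reference version, once lastValue points at the shared instance, 
value.equals(lastValue) compares the object with itself and is always true, so 
nothing after the first value is written. In the reducer, the analogous change 
would be lastValue.set(value), since Text provides a set(Text) method.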