i found out what my problem was. apparently, when you iterate over
Iterable<Type> values, the same instance of Type is reused on every
iteration. for example, in my reducer,

public void reduce(Key key, Iterable<Value> values, Context context)
throws IOException, InterruptedException {
 Iterator<Value> it = values.iterator();
 Value a = it.next();
 Value b = it.next();
}

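the variables a and b, both of type Value, will be the same object
instance! i suppose the iterator behaves this way as an optimization:
it refills one object on each call to next() instead of allocating a
new one. so if you need to keep a value past the next call to next(),
you have to copy it first.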
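
to make the effect concrete, here is a minimal sketch that runs without hadoop. it is only a simulation: Value, ReusingIterable, keepReferences and keepCopies are hypothetical names made up for this example, and ReusingIterable just imitates the reuse described above, it is not hadoop's actual iterator. it shows that storing the references keeps only the last value, while storing copies keeps all of them.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReusedValueDemo {
    // Hypothetical mutable value type, standing in for a reducer value.
    static final class Value {
        int n;
        Value(int n) { this.n = n; }
        Value(Value other) { this.n = other.n; } // copy constructor
    }

    // Hypothetical iterable that imitates the reuse: every call to next()
    // overwrites and returns the SAME shared Value instance.
    static final class ReusingIterable implements Iterable<Value> {
        private final int[] data;
        ReusingIterable(int... data) { this.data = data; }

        @Override
        public Iterator<Value> iterator() {
            return new Iterator<Value>() {
                private final Value shared = new Value(0);
                private int i = 0;

                @Override
                public boolean hasNext() { return i < data.length; }

                @Override
                public Value next() {
                    if (!hasNext()) throw new NoSuchElementException();
                    shared.n = data[i++]; // refill instead of allocating
                    return shared;
                }
            };
        }
    }

    // Buggy pattern: stores references, so every element is the shared object.
    static List<Integer> keepReferences(Iterable<Value> values) {
        List<Value> kept = new ArrayList<>();
        for (Value v : values) kept.add(v);
        List<Integer> out = new ArrayList<>();
        for (Value v : kept) out.add(v.n);
        return out;
    }

    // Fixed pattern: copies each value before the iterator advances.
    static List<Integer> keepCopies(Iterable<Value> values) {
        List<Value> kept = new ArrayList<>();
        for (Value v : values) kept.add(new Value(v));
        List<Integer> out = new ArrayList<>();
        for (Value v : kept) out.add(v.n);
        return out;
    }

    public static void main(String[] args) {
        Iterator<Value> it = new ReusingIterable(1, 2).iterator();
        Value a = it.next();
        Value b = it.next();
        System.out.println(a == b);                                       // true
        System.out.println(keepReferences(new ReusingIterable(1, 2, 3))); // [3, 3, 3]
        System.out.println(keepCopies(new ReusingIterable(1, 2, 3)));     // [1, 2, 3]
    }
}
```

in a real reducer the copy step depends on the value type: a copy constructor like the one above works if your Writable has one, and WritableUtils.clone(value, conf) is another option.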



On Thu, Apr 5, 2012 at 4:55 PM, Jane Wayne <jane.wayne2...@gmail.com> wrote:
> i am currently testing my map reduce job on Windows + Cygwin + Hadoop
> v0.20.205. for some strange reason, the list of values (i.e.
> Iterable<T> values) going into the reducer looks all wrong. i have
> tracked the map reduce process with logging statements (i.e. logged
> the input to the map, logged the output from the map, logged the
> partitioner, logged the input to the reducer). at all stages,
> everything looks correct except at the reducer.
>
> is there any way (using Windows + Cygwin) to view the local map
> outputs before they are shuffled/sorted to the reducer? i need to know
> why the values are incorrect.
