Jane,

Yes, and that's documented:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Reducer.html#reduce(K2,%20java.util.Iterator,%20org.apache.hadoop.mapred.OutputCollector,%20org.apache.hadoop.mapred.Reporter)

"The framework will reuse the key and value objects that are passed
into the reduce, therefore the application should clone the objects
they want to keep a copy of."
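
To make the quoted sentence concrete, here is a minimal sketch that only mimics the reuse behaviour in plain Java. It does not use any Hadoop classes: Value, ReusingIterator, keepReferences and keepCopies are hypothetical names made up for illustration, and the real framework iterator is implemented differently. The point is only that holding on to the returned reference gives you the last value repeated, while copying each value preserves them.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReuseDemo {
    /** Hypothetical mutable value type, standing in for a Writable. */
    static final class Value {
        int n;
        Value(int n) { this.n = n; }
        Value(Value other) { this.n = other.n; } // copy constructor
    }

    /** Hands back ONE shared instance and overwrites it on every next(). */
    static final class ReusingIterator implements Iterator<Value> {
        private final int[] data;
        private int pos = 0;
        private final Value shared = new Value(0);

        ReusingIterator(int... data) { this.data = data; }

        @Override
        public boolean hasNext() { return pos < data.length; }

        @Override
        public Value next() {
            if (!hasNext()) throw new NoSuchElementException();
            shared.n = data[pos++]; // mutate in place, no new object
            return shared;
        }
    }

    /** Buggy pattern: stores references, which all point at the shared object. */
    static List<Integer> keepReferences(Iterator<Value> it) {
        List<Value> kept = new ArrayList<>();
        while (it.hasNext()) kept.add(it.next());
        return toInts(kept);
    }

    /** Fixed pattern: copies each value before the iterator overwrites it. */
    static List<Integer> keepCopies(Iterator<Value> it) {
        List<Value> kept = new ArrayList<>();
        while (it.hasNext()) kept.add(new Value(it.next()));
        return toInts(kept);
    }

    private static List<Integer> toInts(List<Value> values) {
        List<Integer> out = new ArrayList<>();
        for (Value v : values) out.add(v.n);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(keepReferences(new ReusingIterator(1, 2, 3))); // [3, 3, 3]
        System.out.println(keepCopies(new ReusingIterator(1, 2, 3)));     // [1, 2, 3]
    }
}
```

In an actual reducer the copy would be made from the Writable itself rather than with a hand-written copy constructor; WritableUtils.clone(value, conf) is one option, assuming your value type is a Writable.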

On Fri, Apr 6, 2012 at 6:26 AM, Jane Wayne <jane.wayne2...@gmail.com> wrote:
> i found out what my problem was. apparently, when you iterate over
> Iterable<Type> values, that instance of Type is being used over and
> over. for example, in my reducer,
>
> public void reduce(Key key, Iterable<Value> values, Context context)
> throws IOException, InterruptedException {
>  Iterator<Value> it = values.iterator();
>  Value a = it.next();
>  Value b = it.next();
> }
>
> the variables, a and b of type Value, will be the same object
> instance! i suppose this behavior of the iterator is to optimize
> iterating so as to avoid the new operator.
>
>
>
> On Thu, Apr 5, 2012 at 4:55 PM, Jane Wayne <jane.wayne2...@gmail.com> wrote:
>> i am currently testing my map reduce job on Windows + Cygwin + Hadoop
>> v0.20.205. for some strange reason, the list of values (i.e.
>> Iterable<T> values) going into the reducer looks all wrong. i have
>> tracked the map reduce process with logging statements (i.e. logged
>> the input to the map, logged the output from the map, logged the
>> partitioner, logged the input to the reducer). at all stages,
>> everything looks correct except at the reducer.
>>
>> is there any way (using Windows + Cygwin) to view the local map
>> outputs before they are shuffled/sorted to the reducer? i need to know
>> why the values are incorrect.



-- 
Harsh J
