The Hadoop framework reuses Writable objects for the key and value arguments
it passes to mappers and reducers, so if your code stores a reference to one
of those objects instead of copying its contents, you can find yourself with
mysterious duplicate objects.  This has tripped me up a number of times.
Details on exactly what I encountered and how I fixed it are here

http://cornercases.wordpress.com/2011/03/14/serializing-complex-mapreduce-keys/

and here

http://cornercases.wordpress.com/2011/08/18/hadoop-object-reuse-pitfall-all-my-reducer-values-are-the-same/
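
A minimal sketch of the pitfall, using a hypothetical IntHolder class to
stand in for a reused Writable rather than the Hadoop classes themselves:
the framework mutates one shared object per call, so a list of stored
references ends up holding N copies of the last value, while copying the
contents out preserves each distinct value.

```java
import java.util.ArrayList;
import java.util.List;

public class ReusePitfall {
    // Stand-in for a Hadoop Writable: a mutable holder that gets reused.
    static class IntHolder {
        int value;
        void set(int v) { value = v; }
        int get() { return value; }
    }

    public static void main(String[] args) {
        // One object, repeatedly overwritten -- like the values passed
        // to a reducer's iterator.
        IntHolder reused = new IntHolder();
        List<IntHolder> byReference = new ArrayList<>();
        List<Integer> byCopy = new ArrayList<>();

        for (int v : new int[] {1, 2, 3}) {
            reused.set(v);               // framework overwrites the same object
            byReference.add(reused);     // WRONG: stores the shared reference
            byCopy.add(reused.get());    // RIGHT: copies the contents out
        }

        // Every stored reference now shows the last value written: 3 3 3
        for (IntHolder h : byReference) System.out.print(h.get() + " ");
        System.out.println();
        // The copies keep the distinct values: 1 2 3
        for (Integer v : byCopy) System.out.print(v + " ");
        System.out.println();
    }
}
```

With real Writables the equivalent copy is cloning the object (e.g. building
a new instance from the reused one's fields) before stashing it in a collection.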
