MapWritable.readFields needs to clear internal hash else instance accumulates 
entries forever
---------------------------------------------------------------------------------------------

                 Key: HADOOP-2244
                 URL: https://issues.apache.org/jira/browse/HADOOP-2244
             Project: Hadoop
          Issue Type: Bug
          Components: io
            Reporter: stack
             Fix For: 0.16.0


A common framework pattern is to get an instance of a Writable, usually by 
reflection, and then just keep calling readFields to make new 'instances' of 
the particular Writable.

For example, the spill-to-disk that is run at the end of a map task gets 
instances of map output keys and values and then loops over the (sorted) map 
output, calling readFields on those same instances to populate them before 
writing them out to the filesystem (see around line #470 in the spill method).

If the particular Writable is an instance of MapWritable, we currently get 
wrong results.  It has an internal hash map that is created on instantiation.  
Each time readFields is called, the newly deserialized entries are added to 
that map on top of whatever earlier calls left there, so a reused instance 
still carries entries from previous records.  readFields needs to clear the 
map before reading so it doesn't just keep growing ad infinitum.
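
A minimal sketch of the problem and the proposed reset, assuming a 
hypothetical stand-in class (SimpleMapWritable) rather than Hadoop's actual 
MapWritable.  The class, its clearOnRead flag, and the helper methods are 
illustrative only; the point is that one reused instance accumulates entries 
across readFields calls unless the map is cleared first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class ReuseSketch {

    /** Hypothetical map-backed record; NOT Hadoop's MapWritable. */
    static class SimpleMapWritable {
        private final Map<String, Integer> entries = new HashMap<>();
        private final boolean clearOnRead; // illustrative switch for the fix

        SimpleMapWritable(boolean clearOnRead) {
            this.clearOnRead = clearOnRead;
        }

        void put(String key, int value) {
            entries.put(key, value);
        }

        int size() {
            return entries.size();
        }

        void write(DataOutput out) throws IOException {
            out.writeInt(entries.size());
            for (Map.Entry<String, Integer> e : entries.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeInt(e.getValue());
            }
        }

        void readFields(DataInput in) throws IOException {
            if (clearOnRead) {
                entries.clear(); // the reset the issue asks for
            }
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                int value = in.readInt();
                entries.put(key, value);
            }
        }
    }

    /** Serializes a one-entry record to bytes. */
    static byte[] serialize(String key, int value) throws IOException {
        SimpleMapWritable record = new SimpleMapWritable(true);
        record.put(key, value);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        record.write(new DataOutputStream(bytes));
        return bytes.toByteArray();
    }

    /** Reads three one-entry records into ONE reused instance; returns its final size. */
    public static int sizeAfterReusedReads(boolean clearOnRead) {
        try {
            SimpleMapWritable reused = new SimpleMapWritable(clearOnRead);
            byte[][] records = {
                serialize("a", 1), serialize("b", 2), serialize("c", 3)
            };
            for (byte[] record : records) {
                reused.readFields(new DataInputStream(new ByteArrayInputStream(record)));
            }
            return reused.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without clear: " + sizeAfterReusedReads(false)); // 3
        System.out.println("with clear:    " + sizeAfterReusedReads(true));  // 1
    }
}
```

Without the clear, the last record read appears to have three entries even 
though it was serialized with one.  With the clear, each readFields call 
leaves only the entries of the record just read.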

