MapWritable.readFields needs to clear internal hash else instance accumulates
entries forever
---------------------------------------------------------------------------------------------
Key: HADOOP-2244
URL: https://issues.apache.org/jira/browse/HADOOP-2244
Project: Hadoop
Issue Type: Bug
Components: io
Reporter: stack
Fix For: 0.16.0
A common framework pattern is to get an instance of a Writable, usually by
reflection, and then just keep calling readFields to make new 'instances' of
the particular Writable.
For example, the spill-to-disk that is run at the end of a map task gets
instances of map output keys and values and then loops over the (sorted) map
output calling readFields to make instances to write out to the filesystem (See
around line #470 in the spill method).
If the particular Writable is an instance of MapWritable, the results are
currently wrong. MapWritable has an internal hash map that is created on
instantiation. Each time the readFields method is called, the newly
deserialized entries are added to that map without removing the old ones, so a
reused instance still carries the entries of every earlier record. The map
needs to be reset when readFields is called so it doesn't keep growing ad
infinitum.
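
The accumulation described above can be reproduced with a minimal sketch. It
does not use Hadoop's MapWritable: SimpleMapWritable, its clearOnRead flag, and
the helper methods are hypothetical stand-ins made up for illustration, built
only on java.io and java.util. The same instance is reused for three
single-entry records, once without and once with a reset at the top of
readFields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class ReadFieldsReuseSketch {

    /** Hypothetical stand-in for a map-backed Writable; NOT Hadoop's MapWritable. */
    static class SimpleMapWritable {
        // Created once on instantiation, like the internal hash map in the report.
        private final Map<String, String> entries = new HashMap<>();
        // Assumption for the demo: toggles the proposed reset on and off.
        private final boolean clearOnRead;

        SimpleMapWritable(boolean clearOnRead) {
            this.clearOnRead = clearOnRead;
        }

        void put(String key, String value) {
            entries.put(key, value);
        }

        int size() {
            return entries.size();
        }

        void write(DataOutput out) throws IOException {
            out.writeInt(entries.size());
            for (Map.Entry<String, String> e : entries.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        }

        void readFields(DataInput in) throws IOException {
            if (clearOnRead) {
                entries.clear(); // the reset the report asks for
            }
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                String value = in.readUTF();
                entries.put(key, value);
            }
        }
    }

    /** Serializes a one-entry map so there is something to read back. */
    static byte[] serialize(String key, String value) {
        try {
            SimpleMapWritable m = new SimpleMapWritable(true);
            m.put(key, value);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            m.write(new DataOutputStream(bytes));
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reuses ONE instance for three records and reports its final size. */
    static int sizeAfterReuse(boolean clearOnRead) {
        SimpleMapWritable reused = new SimpleMapWritable(clearOnRead);
        byte[][] records = {
            serialize("a", "1"), serialize("b", "2"), serialize("c", "3")
        };
        try {
            for (byte[] record : records) {
                reused.readFields(
                    new DataInputStream(new ByteArrayInputStream(record)));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reused.size();
    }

    public static void main(String[] args) {
        System.out.println("without clear: " + sizeAfterReuse(false)); // 3
        System.out.println("with clear:    " + sizeAfterReuse(true));  // 1
    }
}
```

Without the reset, the reused instance ends up holding all three entries even
though the last record contained only one; with the reset, it holds exactly the
entries of the record just read.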