[
https://issues.apache.org/jira/browse/HADOOP-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HADOOP-2244:
--------------------------
Attachment: hadoop-2244.patch
Reset the internal hash map every time readFields is called on MapWritable
> MapWritable.readFields needs to clear internal hash else instance accumulates
> entries forever
> ---------------------------------------------------------------------------------------------
>
> Key: HADOOP-2244
> URL: https://issues.apache.org/jira/browse/HADOOP-2244
> Project: Hadoop
> Issue Type: Bug
> Components: io
> Reporter: stack
> Fix For: 0.16.0
>
> Attachments: hadoop-2244.patch
>
>
> A common framework pattern is to get an instance of a Writable, usually by
> reflection, and then just keep calling readFields to make new 'instances' of
> the particular Writable.
> For example, the spill-to-disk that is run at the end of a map task gets
> instances of map output keys and values and then loops over the (sorted) map
> output calling readFields to make instances to write out to the filesystem
> (See around line #470 in the spill method).
> If the particular Writable is an instance of MapWritable, the results are
> currently wrong. Its internal hash map is created on instantiation and is
> never cleared, so each call to readFields adds the newly deserialized
> entries on top of those left over from earlier calls. readFields needs to
> reset the map so it does not keep growing indefinitely.
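
A minimal sketch of the reuse pattern and the proposed reset, for illustration only. SimpleMapWritable and its helper methods are hypothetical stand-ins built on java.io alone; this is not the Hadoop MapWritable class, its wire format, or the contents of hadoop-2244.patch. The one line that matters is the clear() at the top of readFields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a map-backed Writable (assumption: String keys
// and values, a simple count-prefixed wire format).
class SimpleMapWritable {
    // Created once on instantiation, like the internal map in the report.
    private final Map<String, String> instance = new HashMap<>();

    public void put(String key, String value) { instance.put(key, value); }
    public String get(String key) { return instance.get(key); }
    public int size() { return instance.size(); }

    // Write the entry count followed by each key/value pair.
    public void write(DataOutput out) throws IOException {
        out.writeInt(instance.size());
        for (Map.Entry<String, String> e : instance.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
    }

    // Replace this object's state with the deserialized record.
    public void readFields(DataInput in) throws IOException {
        // Reset first; without this, entries from earlier calls accumulate.
        instance.clear();
        int entries = in.readInt();
        for (int i = 0; i < entries; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            instance.put(key, value);
        }
    }

    // Helper: serialize one record to bytes.
    static byte[] serialize(SimpleMapWritable w) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            w.write(new DataOutputStream(bytes));
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper: reuse an existing instance for the next record, as the
    // framework does when it loops over map output.
    static void deserializeInto(SimpleMapWritable w, byte[] data) {
        try {
            w.readFields(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the clear() call in place, deserializing two single-entry records into the same instance leaves it holding only the second record's entry. Removing that line reproduces the reported behaviour, where the instance ends up with both entries.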