[ https://issues.apache.org/jira/browse/HADOOP-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley updated HADOOP-2399:
----------------------------------

    Attachment: 2399-3.patch

This patch fixes the value iterator to reuse the key and value between iterations. Aggregation was assuming that the reduce inputs were not reused, so I stringified the value. Is that ok, Runping?

I got a minor speedup of 2:33 instead of 2:37 on a simple 1 node word count.

> Input key and value to combiner and reducer should be reused
> ------------------------------------------------------------
>
>                 Key: HADOOP-2399
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2399
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.15.1
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: 2399-3.patch, reuse-obj-2.patch, reuse-obj.patch
>
>
> Currently, the input key and value are recreated on every iteration for input
> to the combiner and reducer. It would speed up the system substantially if we
> reused the keys and values. The downside is that it may break
> applications that count on holding references to previous keys and values,
> but I think it is worth doing.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
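
For readers unfamiliar with the hazard mentioned in the issue description, here is a minimal standalone sketch of what goes wrong when a caller holds references to values that an iterator reuses, and how stringifying each value avoids it. It does not use the Hadoop API and is not taken from the patch; MutableText, ReusingIterator, holdReferences and stringifyValues are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ReuseDemo {
    /** Hypothetical mutable value, standing in for a reusable key or value object. */
    static final class MutableText {
        private String content = "";
        void set(String s) { content = s; }
        @Override public String toString() { return content; }
    }

    /** Hypothetical iterator that returns the same instance from every next() call. */
    static final class ReusingIterator implements Iterator<MutableText> {
        private final Iterator<String> source;
        private final MutableText shared = new MutableText();
        ReusingIterator(List<String> data) { source = data.iterator(); }
        @Override public boolean hasNext() { return source.hasNext(); }
        @Override public MutableText next() {
            shared.set(source.next()); // overwrite in place instead of allocating
            return shared;
        }
    }

    /** Broken pattern: every held reference aliases the one shared object. */
    static List<String> holdReferences(List<String> data) {
        List<MutableText> held = new ArrayList<>();
        Iterator<MutableText> it = new ReusingIterator(data);
        while (it.hasNext()) {
            held.add(it.next());
        }
        List<String> out = new ArrayList<>();
        for (MutableText t : held) {
            out.add(t.toString()); // all entries show the last value seen
        }
        return out;
    }

    /** Safe pattern: take an immutable snapshot (here a String) of each value. */
    static List<String> stringifyValues(List<String> data) {
        List<String> out = new ArrayList<>();
        Iterator<MutableText> it = new ReusingIterator(data);
        while (it.hasNext()) {
            out.add(it.next().toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");
        System.out.println(holdReferences(input));  // prints [c, c, c]
        System.out.println(stringifyValues(input)); // prints [a, b, c]
    }
}
```

The trade-off is the one the description names: reuse saves an allocation per record, but any code that keeps a value past the current iteration has to copy it first.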