Cloning in Java is broken. It is not intuitive: volumes have been written on deep copy vs. shallow copy, and still Joe Q. User does not get it.
If Hadoop were to be truly "usable", can you comment on HADOOP-2399, which has brought the clone issue to the fore? IMHO, HADOOP-2399 is the root cause of all this. There is a better way to get the performance advantages of HADOOP-2399 without changing the accepted semantics of the Java Iterator. If that were fixed, this JIRA would be WONTFIX.

----- Original Message -----
From: Runping Qi (JIRA) <[EMAIL PROTECTED]>
To: [email protected] <[email protected]>
Sent: Wed Jul 02 18:44:44 2008
Subject: [jira] Updated: (HADOOP-3684) The data_join should allow the user to implement a customer cloning function

[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-3684:
-------------------------------
    Status: Open (was: Patch Available)

> The data_join should allow the user to implement a customer cloning function
> ----------------------------------------------------------------------------
>
> Key: HADOOP-3684
> URL: https://issues.apache.org/jira/browse/HADOOP-3684
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Fix For: 0.19.0
>
> Attachments: H-3684.txt
>
>
> Currently, the framework uses serialization/deserialization to clone the
> values passed to the reduce function.
> This amounts to a very heavy weight deep copy of the value objects.
> That is way too expensive. Although that may be a generic way to work for all
> possible value classes, thus good as a default way,
> the framework should allow the user to implement an application specific yet
> efficient cloning function.
>

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
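
For illustration only, the trade-off described in the quoted issue (a generic serialization round trip as the default clone, versus an application-specific cloning function) can be sketched in plain Java. This is not the data_join code or the H-3684.txt patch: it uses java.io serialization as a stand-in for whatever the framework really does, and the names CloneSketch, Record, cloneBySerialization, cloneRecord and cloneValue are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.UnaryOperator;

public class CloneSketch {

    /** Hypothetical value type standing in for whatever the reduce function receives. */
    public static final class Record implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String tag;
        public final int[] payload;

        public Record(String tag, int[] payload) {
            this.tag = tag;
            this.payload = payload;
        }
    }

    /** Generic default: deep copy via a serialization round trip. Works for any Serializable, but is heavyweight. */
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T cloneBySerialization(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("clone by serialization failed", e);
        }
    }

    /** Application-specific clone: copies only the mutable state (the array). */
    public static Record cloneRecord(Record r) {
        return new Record(r.tag, r.payload.clone());
    }

    /** Hypothetical hook: use the user's cloner when one is supplied, else the generic default. */
    public static <T extends Serializable> T cloneValue(T value, UnaryOperator<T> customCloner) {
        return customCloner != null ? customCloner.apply(value) : cloneBySerialization(value);
    }

    public static void main(String[] args) {
        Record original = new Record("a", new int[] {1, 2, 3});
        Record viaDefault = cloneValue(original, null);
        Record viaCustom = cloneValue(original, CloneSketch::cloneRecord);

        // Mutating the original must not affect either copy.
        original.payload[0] = 99;
        System.out.println(viaDefault.payload[0] + " " + viaCustom.payload[0]);
    }
}
```

Both paths yield a copy that no longer aliases the original's array; the custom cloner simply avoids the byte-stream round trip, which is the cost the issue objects to.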
