[ 
https://issues.apache.org/jira/browse/HADOOP-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605148#action_12605148
 ] 

Ralf Gutsche commented on HADOOP-2399:
--------------------------------------

This piece of code prints different output in Hadoop 0.17 than it did in 
Hadoop 0.16.

public void reduce(... Iterator<Writable> aValues...) throws IOException {
        // Holds references to the values handed out by the iterator.
        ArrayList<Writable> ret = new ArrayList<Writable>();

        System.out.println("First");
        while (aValues.hasNext()) {
                Writable val = aValues.next();
                System.out.println(val.toString());
                // Stores the reference only; no copy of the value is made.
                ret.add(val);
        }

        System.out.println("Second");
        for (Writable w : ret) {
                System.out.println(w.toString());
        }
}

In Hadoop 0.16, the values printed after First and Second were the same.
In Hadoop 0.17, the values printed after First are identical to those in 
Hadoop 0.16. However, all the records printed after Second are identical to 
each other, because every entry in ret refers to the same reused object.
Adding a clone (ret.add(val.clone())) will fix this, if the clone is 
implemented correctly.

I guess this is a consequence of this JIRA.
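
The effect described above can be reproduced without Hadoop. The following is 
only a minimal sketch: MutableText, reusingIterator, collectByReference and 
collectByCopy are hypothetical names invented for this illustration, and 
MutableText merely stands in for a Writable. It assumes the framework hands 
back one shared instance that is refilled on each call to next(), which is 
what the issue description suggests, and shows why storing the reference 
differs from storing a copy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical mutable value; a stand-in for a Writable, not a Hadoop class. */
    static final class MutableText {
        private String value;
        MutableText(String value) { this.value = value; }
        void set(String value) { this.value = value; }
        MutableText copy() { return new MutableText(value); }
        @Override public String toString() { return value; }
    }

    /** Returns an iterator that refills and returns the same instance every time. */
    static Iterator<MutableText> reusingIterator(List<String> records) {
        final Iterator<String> source = records.iterator();
        final MutableText shared = new MutableText("");
        return new Iterator<MutableText>() {
            @Override public boolean hasNext() { return source.hasNext(); }
            @Override public MutableText next() {
                shared.set(source.next()); // overwrite the single shared object
                return shared;
            }
        };
    }

    /** Stores the references as handed out; all entries alias one object. */
    static List<String> collectByReference(Iterator<MutableText> values) {
        List<MutableText> kept = new ArrayList<>();
        while (values.hasNext()) {
            kept.add(values.next());
        }
        return render(kept);
    }

    /** Stores an independent copy of each value before the next call overwrites it. */
    static List<String> collectByCopy(Iterator<MutableText> values) {
        List<MutableText> kept = new ArrayList<>();
        while (values.hasNext()) {
            kept.add(values.next().copy());
        }
        return render(kept);
    }

    /** Converts the kept values to strings after iteration has finished. */
    static List<String> render(List<MutableText> kept) {
        List<String> out = new ArrayList<>();
        for (MutableText t : kept) {
            out.add(t.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c");
        System.out.println(collectByReference(reusingIterator(records))); // [c, c, c]
        System.out.println(collectByCopy(reusingIterator(records)));      // [a, b, c]
    }
}
```

With a reusing iterator, the by-reference list ends up holding the last record 
three times, while the by-copy list keeps each record. In a real reducer the 
copy would have to be made in whatever way the concrete value class supports.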

> Input key and value to combiner and reducer should be reused
> ------------------------------------------------------------
>
>                 Key: HADOOP-2399
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2399
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.15.1
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.17.0
>
>         Attachments: 2399-3.patch, 2399-4.patch
>
>
> Currently, the input key and value are recreated on every iteration for input 
> to the combiner and reducer. It would speed up the system substantially if we 
> reused the keys and values. The downside of doing so is that it may break 
> applications that count on holding references to previous keys and values, 
> but I think it is worth doing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.