[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15787560#comment-15787560
 ] 

ASF GitHub Bot commented on MAPREDUCE-6827:
-------------------------------------------

GitHub user javeme opened a pull request:

    https://github.com/apache/hadoop/pull/177

    MAPREDUCE-6827. Failed to traverse Iterable values the second time in reduce() method
    
    The following code is a reduce() method (of WordCount):
    
    
        public static class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {

                // print some logs
                List<String> vals = new LinkedList<>();
                for (IntWritable i : values) {
                    vals.add(i.toString());
                }
                System.out.println(String.format(">>>> reduce(%s, [%s])",
                        key, String.join(", ", vals)));

                // sum of values
                int sum = 0;
                for (IntWritable i : values) {
                    sum += i.get();
                }
                System.out.println(String.format(">>>> reduced(%s, %s)",
                        key, sum));

                context.write(key, new IntWritable(sum));
            }
        }
    
    After running it, every sum was zero.

    Debugging showed that the second for-each loop never executed. The root
    cause is the return value of Iterable.iterator(): it returned the same
    instance on both calls made by the for-each loops, so the second loop
    received an iterator that was already exhausted. In general,
    Iterable.iterator() should return a new instance on each call, as
    ArrayList.iterator() does. This patch fixes the bug.
    
    Signed-off-by: Javeme <[email protected]>
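
To illustrate the root cause described above without a Hadoop cluster, here is a minimal plain-Java sketch. The names SharedIteratorIterable and countThenSum are hypothetical and invented for this illustration; they are not Hadoop classes, and the sketch does not reproduce Hadoop's actual implementation. It only shows what happens to a second for-each loop when iterator() hands back the same instance twice.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SinglePassIterableDemo {

    /** Hypothetical Iterable that returns the SAME iterator on every call. */
    static final class SharedIteratorIterable<T> implements Iterable<T> {
        private final Iterator<T> shared;

        SharedIteratorIterable(List<T> source) {
            this.shared = source.iterator();
        }

        @Override
        public Iterator<T> iterator() {
            // After one full traversal this iterator is exhausted for good.
            return shared;
        }
    }

    /** First pass counts the values, second pass sums them. */
    static int[] countThenSum(Iterable<Integer> values) {
        int count = 0;
        for (Integer ignored : values) {
            count++;
        }
        int sum = 0;
        for (Integer v : values) {
            sum += v;
        }
        return new int[] {count, sum};
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 1, 1);

        // Same iterator instance on both calls: the second loop sees nothing.
        int[] shared = countThenSum(new SharedIteratorIterable<>(data));
        System.out.println("shared iterator: count=" + shared[0] + ", sum=" + shared[1]);

        // A List returns a fresh iterator on each call: both loops run.
        int[] fresh = countThenSum(data);
        System.out.println("fresh iterator:  count=" + fresh[0] + ", sum=" + fresh[1]);
    }
}
```

With the shared iterator the count is 3 but the sum stays 0, which matches the "all sums were zero" symptom in the report; with a List both passes see all three values.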

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/javeme/hadoop foreach-bug-of-ValueIterable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #177
    
----
commit 6c323fdc1a0013938d09b09b2e16061910a92c97
Author: Javeme <[email protected]>
Date:   2016-12-30T11:39:20Z

    MAPREDUCE-6827. Failed to traverse Iterable values the second time in reduce() method
    
    Signed-off-by: Javeme <[email protected]>

----


> Failed to traverse Iterable values the second time in reduce() method
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6827
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6827
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 3.0.0-alpha1
>         Environment: hadoop2.7.3
>            Reporter: javaloveme
>
> Failed to traverse Iterable values the second time in reduce() method
> The following code is a reduce() method (of WordCount):
> {code:title=WordCount.java|borderStyle=solid}
>       public static class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
>           @Override
>           protected void reduce(Text key, Iterable<IntWritable> values, Context context)
>                   throws IOException, InterruptedException {
>               // print some logs
>               List<String> vals = new LinkedList<>();
>               for (IntWritable i : values) {
>                   vals.add(i.toString());
>               }
>               System.out.println(String.format(">>>> reduce(%s, [%s])",
>                       key, String.join(", ", vals)));
>               // sum of values
>               int sum = 0;
>               for (IntWritable i : values) {
>                   sum += i.get();
>               }
>               System.out.println(String.format(">>>> reduced(%s, %s)",
>                       key, sum));
>
>               context.write(key, new IntWritable(sum));
>           }
>       }
> {code}
> After running it, every sum was zero.
> Debugging showed that the second for-each loop never executed. The root
> cause is the return value of Iterable.iterator(): it returned the same
> instance on both calls made by the for-each loops, so the second loop
> received an iterator that was already exhausted. In general,
> Iterable.iterator() should return a new instance on each call, as
> ArrayList.iterator() does.
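
Independently of any patch, a reducer like the one quoted above can avoid the second traversal altogether by collecting the log text and the sum in the same loop. The following is a minimal sketch under stated assumptions: it uses plain Integer values instead of IntWritable so that it runs without Hadoop, and Summary and summarize are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SinglePassReduceSketch {

    /** Hypothetical holder for the two results the reducer wanted. */
    static final class Summary {
        final String joined;
        final int sum;

        Summary(String joined, int sum) {
            this.joined = joined;
            this.sum = sum;
        }
    }

    /** Builds the log text and the sum in one traversal of the values. */
    static Summary summarize(Iterable<Integer> values) {
        List<String> vals = new ArrayList<>();
        int sum = 0;
        for (Integer v : values) { // the only traversal
            vals.add(v.toString());
            sum += v;
        }
        return new Summary(String.join(", ", vals), sum);
    }

    public static void main(String[] args) {
        // "word" stands in for the reducer key; it is an assumption of this sketch.
        Summary s = summarize(Arrays.asList(1, 1, 1));
        System.out.println(String.format(">>>> reduce(%s, [%s])", "word", s.joined));
        System.out.println(String.format(">>>> reduced(%s, %s)", "word", s.sum));
    }
}
```

In a real reducer the loop variable would be an IntWritable, and the same shape applies with i.toString() and i.get(). Copying the primitive or the string inside the loop, rather than storing the IntWritable itself, is the safer choice because the Hadoop Reducer documentation notes that the framework reuses the key and value objects it passes to reduce().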



