[jira] [Commented] (MAPREDUCE-6067) native-task: spilled records counter is incorrect

Sean Zhong (JIRA) Wed, 03 Sep 2014 23:50:07 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121028#comment-14121028 ]


Sean Zhong commented on MAPREDUCE-6067:
---------------------------------------

{quote}
//        assertEquals("Native Reduce reduce group counter should equal orignal reduce group counter",
//            nativeReduceGroups.getValue(), normalReduceGroups.getValue());
{quote}

Hi Todd,

I made that change a year ago. The combiner is an optional step, so the result 
is correct whether it reduces the data by 50% or by 90%. As an optimization, 
we may therefore skip combining some keys and leave them to be handled by the 
reducer.
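
A minimal sketch of that point, in plain Java rather than Hadoop or native-task code: every class and method name below is hypothetical and chosen only for illustration. It sums counts per key and runs a combiner that handles none, some, or all of the keys. The number of records handed to the reducer differs in each case, but the reducer's output does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical illustration only; not part of Hadoop or the native-task code.
class PartialCombinerSketch {

    // One map-output record: a key and a partial count.
    static final class Rec {
        final String key;
        final long value;

        Rec(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    // Combine only the keys accepted by shouldCombine; records for all
    // other keys pass through unchanged and are left to the reducer.
    static List<Rec> combine(List<Rec> mapOutput, Predicate<String> shouldCombine) {
        Map<String, Long> sums = new TreeMap<>();
        List<Rec> out = new ArrayList<>();
        for (Rec r : mapOutput) {
            if (shouldCombine.test(r.key)) {
                sums.merge(r.key, r.value, Long::sum);
            } else {
                out.add(r);
            }
        }
        sums.forEach((k, v) -> out.add(new Rec(k, v)));
        return out;
    }

    // Reducer: sums every value it receives for each key.
    static Map<String, Long> reduce(List<Rec> input) {
        Map<String, Long> result = new TreeMap<>();
        for (Rec r : input) {
            result.merge(r.key, r.value, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Rec> mapOutput = List.of(
                new Rec("a", 1), new Rec("b", 1), new Rec("a", 1),
                new Rec("b", 1), new Rec("a", 1));

        List<Rec> none = combine(mapOutput, k -> false);          // combiner skipped
        List<Rec> some = combine(mapOutput, k -> k.equals("a"));  // only key "a" combined
        List<Rec> all = combine(mapOutput, k -> true);            // every key combined

        System.out.println("records to reducer: "
                + none.size() + ", " + some.size() + ", " + all.size());
        System.out.println("same reduce output: "
                + (reduce(none).equals(reduce(some)) && reduce(some).equals(reduce(all))));
    }
}
```

In this sketch, any counter that tracks records between the combiner and the reducer depends on how many keys were combined, while the final per-key sums stay the same.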

> native-task: spilled records counter is incorrect
> -------------------------------------------------
>
>                 Key: MAPREDUCE-6067
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6067
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: task
>            Reporter: Todd Lipcon
>            Assignee: Binglin Chang
>         Attachments: MAPREDUCE-6067.v1.patch, native-counters.html, 
> trunk-counters.html
>
>
> After running a terasort, I see the spilled records counter at 5028651606, 
> which is about half what I expected to see. Using the non-native collector I 
> see the expected count of 10000000000. It seems the correct number of records 
> were indeed spilled, because the job's output record count is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
