[ 
https://issues.apache.org/jira/browse/HADOOP-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656530#action_12656530
 ] 

he yongqiang commented on HADOOP-4845:
--------------------------------------

{quote}
Doesn't this still count a mix of compressed and uncompressed bytes? 
{quote}

Yes. For shuffleInMemory, the counter records the size of the decompressed 
data, and for shuffleToDisk, it records the size of the compressed data 
fetched.
If we need the exact compressed size for both, we may have to introduce a new 
field in MapOutput to record it. Currently the patch uses the size field of 
MapOutput, which holds the compressed size in bytes for shuffleToDisk and the 
uncompressed size in bytes for shuffleInMemory.
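
To illustrate the difference, here is a minimal sketch. It is not the actual 
ReduceCopier or MapOutput code: the class MapOutputSketch, the field 
compressedSize, and the helpers mixedBytes and compressedBytes are hypothetical 
names made up for this example. It assumes each fetched map output knows both 
its compressed and decompressed lengths.

```java
// Hypothetical sketch only; names and structure are assumptions, not Hadoop code.
class ShuffleCounterSketch {

    /** Simplified stand-in for MapOutput, with one extra (proposed) field. */
    static final class MapOutputSketch {
        final boolean inMemory;
        // Mirrors the existing size field: decompressed length when shuffled
        // into memory, compressed length when shuffled to disk.
        final long size;
        // Proposed extra field (hypothetical): compressed bytes actually fetched.
        final long compressedSize;

        MapOutputSketch(boolean inMemory, long compressedLen, long decompressedLen) {
            this.inMemory = inMemory;
            this.size = inMemory ? decompressedLen : compressedLen;
            this.compressedSize = compressedLen;
        }
    }

    /** Sums the size field, so the total mixes compressed and uncompressed bytes. */
    static long mixedBytes(MapOutputSketch[] outputs) {
        long total = 0;
        for (MapOutputSketch o : outputs) {
            total += o.size;
        }
        return total;
    }

    /** Sums the proposed field, so the total is compressed bytes only. */
    static long compressedBytes(MapOutputSketch[] outputs) {
        long total = 0;
        for (MapOutputSketch o : outputs) {
            total += o.compressedSize;
        }
        return total;
    }

    public static void main(String[] args) {
        MapOutputSketch[] outputs = {
            new MapOutputSketch(true, 40, 100),   // shuffleInMemory case
            new MapOutputSketch(false, 60, 150),  // shuffleToDisk case
        };
        System.out.println("mixed bytes      = " + mixedBytes(outputs));
        System.out.println("compressed bytes = " + compressedBytes(outputs));
    }
}
```

With these made-up lengths, the mixed total is 100 + 60 = 160, while the 
compressed-only total is 40 + 60 = 100. Whether the extra field is worth adding 
depends on how exact the counter needs to be.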



> Shuffle counter issues
> ----------------------
>
>                 Key: HADOOP-4845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4845
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Chris Douglas
>            Assignee: he yongqiang
>             Fix For: 0.20.0
>
>         Attachments: Hadoop-4845-3.patch
>
>
> HADOOP-4749 added a new counter tracking the bytes shuffled into the reduce. 
> It adds an accumulator to ReduceCopier instead of simply incrementing the new 
> counter and did not define a human-readable value in 
> src/mapred/org/apache/hadoop/mapred/Task_Counter.properties.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
