[ 
https://issues.apache.org/jira/browse/HADOOP-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655891#action_12655891
 ] 

Runping Qi commented on HADOOP-4845:
------------------------------------


We need a counter that accounts for the number of bytes FETCHED by each 
reduce by the end of shuffling.
If compression is turned on, that should be the number of bytes of the 
compressed data.

We should also estimate the compression ratio of the fetched compressed data 
and report it somehow.
We should also report the number of segments and the number of bytes written 
to the local disks at the end of shuffling.

At the end of reduce, we should know the number of records and bytes passed 
to the reduce.
That number of bytes may differ from the number of fetched bytes.
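
As a rough illustration of the bookkeeping described above, here is a minimal 
sketch in plain Java. It does not use the Hadoop Counters API, and every name 
in it (ShuffleStats, recordFetch, and so on) is hypothetical. It also assumes 
the compression ratio is reported as compressed bytes divided by decompressed 
bytes, which is only one possible definition.

```java
// Hypothetical sketch only: none of these names exist in Hadoop.
final class ShuffleStats {
    private long fetchedBytes;       // bytes as transferred (compressed if compression is on)
    private long decompressedBytes;  // size of the same data after decompression
    private long diskSegments;       // segments written to local disk during the shuffle
    private long diskBytes;          // bytes written to local disk during the shuffle
    private long reduceInputRecords; // records handed to the reduce
    private long reduceInputBytes;   // bytes handed to the reduce

    // Called once per fetched map output.
    void recordFetch(long wireBytes, long rawBytes) {
        fetchedBytes += wireBytes;
        decompressedBytes += rawBytes;
    }

    // Called once per segment written to local disk.
    void recordDiskSegment(long bytesWritten) {
        diskSegments++;
        diskBytes += bytesWritten;
    }

    // Called as input is handed to the reduce.
    void recordReduceInput(long records, long bytes) {
        reduceInputRecords += records;
        reduceInputBytes += bytes;
    }

    // Assumed definition: compressed / decompressed; 1.0 when nothing was fetched.
    double compressionRatio() {
        return decompressedBytes == 0 ? 1.0 : (double) fetchedBytes / decompressedBytes;
    }

    long fetchedBytes() { return fetchedBytes; }
    long diskSegments() { return diskSegments; }
    long diskBytes() { return diskBytes; }
    long reduceInputRecords() { return reduceInputRecords; }
    long reduceInputBytes() { return reduceInputBytes; }

    public static void main(String[] args) {
        ShuffleStats stats = new ShuffleStats();
        stats.recordFetch(100, 400);     // compressed map output
        stats.recordFetch(50, 100);
        stats.recordDiskSegment(100);    // one segment went to disk
        stats.recordReduceInput(10, 500);
        System.out.println("fetched bytes:      " + stats.fetchedBytes());
        System.out.println("compression ratio:  " + stats.compressionRatio());
        System.out.println("disk segments:      " + stats.diskSegments());
        System.out.println("disk bytes:         " + stats.diskBytes());
        System.out.println("reduce input bytes: " + stats.reduceInputBytes());
    }
}
```

Keeping the fetched bytes and the reduce input bytes as separate totals makes 
the possible difference between the two numbers visible in the report.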


> Shuffle counter issues
> ----------------------
>
>                 Key: HADOOP-4845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4845
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Chris Douglas
>             Fix For: 0.20.0
>
>
> HADOOP-4749 added a new counter tracking the bytes shuffled into the reduce. 
> It adds an accumulator to ReduceCopier instead of simply incrementing the new 
> counter and did not define a human-readable value in 
> src/mapred/org/apache/hadoop/mapred/Task_Counter.properties.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
