[
https://issues.apache.org/jira/browse/MAPREDUCE-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923767#action_12923767
]
Ravi Gummadi commented on MAPREDUCE-2135:
-----------------------------------------
Ah! That's wrong. Since the SpilledRecords counter gives the number of records
(and not bytes), it is not just the difference of FILE_BYTES_WRITTEN and
SpilledRecords. Since we don't have the number of bytes written during spills
(i.e. the bytes corresponding to SpilledRecords), it seems difficult to get the
actual mapoutputFileSize from the existing counters. Right?
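To make the unit mismatch concrete, here is a minimal sketch that uses the counter values quoted in the issue description. The dictionary keys are illustrative labels, not Hadoop API names, and the spill estimate rests on an assumption the counters do not confirm: that every byte in FILE_BYTES_WRITTEN is spill data.

```python
# Counter values copied from the map task report quoted in this issue.
# Keys are illustrative labels chosen for this sketch.
counters = {
    "FILE_BYTES_WRITTEN": 210,        # bytes
    "MAP_OUTPUT_BYTES": 19_200_000,   # bytes
    "MAP_OUTPUT_RECORDS": 2_400_000,  # records
    "SPILLED_RECORDS": 16,            # records, not bytes
}


def avg_bytes_per_record(total_bytes, records):
    """Average record size in bytes.

    Only meaningful when both counters describe the same set of records.
    """
    return total_bytes / records


# Both counters describe the raw map output, so this average is well defined.
avg_map_output = avg_bytes_per_record(
    counters["MAP_OUTPUT_BYTES"], counters["MAP_OUTPUT_RECORDS"]
)

# ASSUMPTION: all of FILE_BYTES_WRITTEN belongs to the spilled records.
# No existing counter reports spill bytes, so this is only a rough estimate.
est_spill = avg_bytes_per_record(
    counters["FILE_BYTES_WRITTEN"], counters["SPILLED_RECORDS"]
)

print("avg bytes per map output record:", avg_map_output)
print("estimated bytes per spilled record:", est_spill)
```

Subtracting SpilledRecords from FILE_BYTES_WRITTEN would mix bytes with records. A byte figure for the spills needs a per-record size, and the existing counters only allow the rough estimate above.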
> FILE_BYTES_WRITTEN counter in map task seems incorrect
> ------------------------------------------------------
>
> Key: MAPREDUCE-2135
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2135
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: task
> Reporter: Ravi Gummadi
>
> With MapReduce trunk, the FileSystem counter FILE_BYTES_WRITTEN is a lot
> less than the "Map output bytes" counter even when map output compression
> is OFF. I think FILE_BYTES_WRITTEN signifies the bytes written to the local
> file system, so it should be more than map output bytes (in the counters
> shown below, 210 vs. 19,200,000). Right?
> Here are some counters from a map task of the wordcount example:
> Counters for attempt_201010141448_0001_m_000000_0
> FileInputFormatCounters
> BYTES_READ 9,600,000
> FileSystemCounters
> FILE_BYTES_READ 92
> FILE_BYTES_WRITTEN 210
> HDFS_BYTES_READ 9,600,107
> Map-Reduce Framework
> Combine input records 2,400,000
> Combine output records 8
> CPU_MILLISECONDS 4,810
> Failed Shuffles 0
> GC time elapsed (ms) 73
> Map input records 600,000
> Map output bytes 19,200,000
> Map output records 2,400,000
> Merged Map outputs 0
> PHYSICAL_MEMORY_BYTES 131,518,464
> Spilled Records 16
> SPLIT_RAW_BYTES 107
> VIRTUAL_MEMORY_BYTES 581,021,696