Hi,
Is your input file compressed, or does its name end in .gz or a similar suffix?
This is an interesting case.
MAP_INPUT_BYTES is the number of bytes of uncompressed input consumed by
all the maps in the job. It is incremented every time a record is read from a
RecordReader and passed to the map's map() method by the framework [Hadoop:
The Definitive Guide, page 226].
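In case it helps while you investigate, here is a minimal sketch of reading the
input size defensively from a counter listing like the one you posted: it
prefers MAP_INPUT_BYTES and falls back to BYTES_READ under File Input Format
Counters when the former is absent. The class and method names (CounterLookup,
parse, inputBytes) are made up for illustration, it works on plain text rather
than the Hadoop API, and treating BYTES_READ as a substitute is my assumption,
not something I have confirmed for 1.1.2.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

public class CounterLookup {

    /** Parses group headings and "NAME: value" lines into group -> (counter -> value). */
    static Map<String, Map<String, Long>> parse(String dump) {
        Map<String, Map<String, Long>> groups = new LinkedHashMap<>();
        Map<String, Long> current = null;
        for (String raw : dump.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                // A line without a colon is treated as a group heading.
                current = new LinkedHashMap<>();
                groups.put(line, current);
            } else if (current != null) {
                String name = line.substring(0, colon).trim();
                long value = Long.parseLong(line.substring(colon + 1).trim());
                current.put(name, value);
            }
        }
        return groups;
    }

    /**
     * Returns MAP_INPUT_BYTES if present; otherwise falls back to BYTES_READ
     * (assumption: the two are comparable for plain, uncompressed text input).
     */
    static OptionalLong inputBytes(Map<String, Map<String, Long>> groups) {
        Long value = groups
                .getOrDefault("Map-Reduce Framework", Collections.<String, Long>emptyMap())
                .get("MAP_INPUT_BYTES");
        if (value == null) {
            value = groups
                    .getOrDefault("File Input Format Counters", Collections.<String, Long>emptyMap())
                    .get("BYTES_READ");
        }
        return value == null ? OptionalLong.empty() : OptionalLong.of(value);
    }

    public static void main(String[] args) {
        // A few lines copied from the counter listing in the quoted message.
        String dump = String.join("\n",
                "File Input Format Counters",
                "BYTES_READ: 1085357",
                "Map-Reduce Framework",
                "MAP_INPUT_RECORDS: 19446",
                "MAP_OUTPUT_BYTES: 1750111");
        OptionalLong bytes = inputBytes(parse(dump));
        System.out.println(bytes.isPresent()
                ? "input bytes = " + bytes.getAsLong()
                : "input size unknown");
    }
}
```

Returning an OptionalLong rather than 0 keeps "counter missing" distinct from
"zero bytes read", which seems to be the distinction that matters here.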
Please let us know if you find out anything further.
Regards.
Sent from my iPhone
On 2013-4-6, at 0:01, Philippe Signoret <[email protected]> wrote:
> I noticed recently that some Word Count jobs I've run are finishing with the
> MAP_INPUT_BYTES counter missing.
>
> I'm using Hadoop 1.1.2 with mostly default configuration with 5 nodes. The
> input was a single 100KB text file.
>
> Questions:
> Is it normal for any final counter values not to be present?
> Is MAP_INPUT_BYTES the best way to determine total input data size? (I do so
> programmatically, while it's running and after the job is complete.)
> The counters I did get:
>
> Job Counters
> TOTAL_LAUNCHED_REDUCES: 1
> SLOTS_MILLIS_MAPS: 6006
> FALLOW_SLOTS_MILLIS_REDUCES: 0
> FALLOW_SLOTS_MILLIS_MAPS: 0
> TOTAL_LAUNCHED_MAPS: 1
> DATA_LOCAL_MAPS: 1
> SLOTS_MILLIS_REDUCES: 9293
> File Output Format Counters
> BYTES_WRITTEN: 366752
> FileSystemCounters
> FILE_BYTES_READ: 505552
> HDFS_BYTES_READ: 1085517
> FILE_BYTES_WRITTEN: 1122685
> HDFS_BYTES_WRITTEN: 366752
> File Input Format Counters
> BYTES_READ: 1085357
> Map-Reduce Framework
> MAP_OUTPUT_MATERIALIZED_BYTES: 505552
> MAP_INPUT_RECORDS: 19446
> REDUCE_SHUFFLE_BYTES: 505552
> SPILLED_RECORDS: 70358
> MAP_OUTPUT_BYTES: 1750111
> CPU_MILLISECONDS: 5700
> COMMITTED_HEAP_BYTES: 401997824
> COMBINE_INPUT_RECORDS: 181151
> SPLIT_RAW_BYTES: 160
> REDUCE_INPUT_RECORDS: 35179
> REDUCE_INPUT_GROUPS: 35179
> COMBINE_OUTPUT_RECORDS: 35179
> PHYSICAL_MEMORY_BYTES: 378482688
> REDUCE_OUTPUT_RECORDS: 35179
> VIRTUAL_MEMORY_BYTES: 1139838976
> MAP_OUTPUT_RECORDS: 181151
>
> Here are most of the relevant screens from the JobTracker web interface:
> http://jsfiddle.net/Fguyy/2/embedded/result/
>
> Here is the JobTracker log (relevant time frame): http://pastebin.com/dvsMn4fB
>
> Thanks!
> Philippe
>
> -------------------------------
> Philippe Signoret
> Skype: philippesignoret
> +33 6 95 89 55 55