Hi,

     Is your input file compressed, or named with a .gz suffix or something
similar?
     This is an interesting case.
     MAP_INPUT_BYTES is the number of bytes of uncompressed input consumed by
all the maps in the job. It is incremented every time a record is read from a
RecordReader and passed to the map's map() method by the framework [Hadoop:
The Definitive Guide, page 226].
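
     As a rough sketch of how you might work around the missing counter when
checking input size programmatically: the Python snippet below parses a counter
dump in the same layout as the one quoted below and looks for MAP_INPUT_BYTES
first. The fallback order (BYTES_READ under File Input Format Counters, then
HDFS_BYTES_READ) is my assumption about which counters are reasonable
stand-ins, and parse_counters and input_bytes are hypothetical helper names,
not part of any Hadoop API.

```python
# Counter dump abbreviated from the job output quoted in this thread.
COUNTER_DUMP = """\
FileSystemCounters
 FILE_BYTES_READ:     505552
 HDFS_BYTES_READ:     1085517
File Input Format Counters
 BYTES_READ:  1085357
Map-Reduce Framework
 MAP_INPUT_RECORDS:   19446
 SPLIT_RAW_BYTES:     160
"""


def parse_counters(text):
    """Parse 'Group' / ' NAME: value' lines into {group: {name: int}}."""
    groups = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if ":" in line:
            # Counter line: split on the first colon into name and value.
            name, _, value = line.partition(":")
            groups.setdefault(current, {})[name.strip()] = int(value.strip())
        else:
            # A line without a colon starts a new counter group.
            current = line.strip()
            groups.setdefault(current, {})
    return groups


def input_bytes(groups):
    """Return (source, value), preferring MAP_INPUT_BYTES when present.

    The fallback order is an assumption, not documented Hadoop behaviour.
    """
    candidates = [
        ("Map-Reduce Framework", "MAP_INPUT_BYTES"),
        ("File Input Format Counters", "BYTES_READ"),
        ("FileSystemCounters", "HDFS_BYTES_READ"),
    ]
    for group, name in candidates:
        value = groups.get(group, {}).get(name)
        if value is not None:
            return f"{group}/{name}", value
    return None, None


counters = parse_counters(COUNTER_DUMP)
source, size = input_bytes(counters)
print(source, size)
```

     With your counters, the lookup falls through to BYTES_READ because
MAP_INPUT_BYTES is absent. Note that in your dump HDFS_BYTES_READ is larger
than BYTES_READ by exactly 160, the same value as SPLIT_RAW_BYTES, so the two
fallbacks are not interchangeable.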

   Please let us know if you find anything further.

Regards.



Sent from my iPhone

On 2013-4-6, at 0:01, Philippe Signoret <[email protected]> wrote:

> I noticed recently that some Word Count jobs I've run are finishing with the 
> MAP_INPUT_BYTES counter missing.
> 
> I'm using Hadoop 1.1.2 with mostly default configuration with 5 nodes. The 
> input was a single 100KB text file.
> 
> Questions:
> Is it normal for any final counter values not to be present?
> Is MAP_INPUT_BYTES the best way to determine total input data size? (I do so 
> programmatically, while it's running and after the job is complete.)
> The counters I did get:
> 
> Job Counters 
>  TOTAL_LAUNCHED_REDUCES:1
>  SLOTS_MILLIS_MAPS:   6006
>  FALLOW_SLOTS_MILLIS_REDUCES: 0
>  FALLOW_SLOTS_MILLIS_MAPS:    0
>  TOTAL_LAUNCHED_MAPS: 1
>  DATA_LOCAL_MAPS:     1
>  SLOTS_MILLIS_REDUCES:        9293
> File Output Format Counters 
>  BYTES_WRITTEN:               366752
> FileSystemCounters
>  FILE_BYTES_READ:     505552
>  HDFS_BYTES_READ:     1085517
>  FILE_BYTES_WRITTEN:  1122685
>  HDFS_BYTES_WRITTEN:  366752
> File Input Format Counters 
>  BYTES_READ:  1085357
> Map-Reduce Framework
>  MAP_OUTPUT_MATERIALIZED_BYTES:       505552
>  MAP_INPUT_RECORDS:   19446
>  REDUCE_SHUFFLE_BYTES:        505552
>  SPILLED_RECORDS:     70358
>  MAP_OUTPUT_BYTES:    1750111
>  CPU_MILLISECONDS:    5700
>  COMMITTED_HEAP_BYTES:        401997824
>  COMBINE_INPUT_RECORDS:       181151
>  SPLIT_RAW_BYTES:     160
>  REDUCE_INPUT_RECORDS:        35179
>  REDUCE_INPUT_GROUPS: 35179
>  COMBINE_OUTPUT_RECORDS:      35179
>  PHYSICAL_MEMORY_BYTES:       378482688
>  REDUCE_OUTPUT_RECORDS:       35179
>  VIRTUAL_MEMORY_BYTES:        1139838976
>  MAP_OUTPUT_RECORDS:  181151
> 
> Here are most of the relevant screens from the JobTracker web interface: 
> http://jsfiddle.net/Fguyy/2/embedded/result/
> 
> Here is the JobTracker log (relevant time frame): http://pastebin.com/dvsMn4fB
> 
> Thanks!
> Philippe
> 
> -------------------------------
> Philippe Signoret
> Skype: philippesignoret
> +33 6 95 89 55 55
