Nope, just a regular plain text file (.txt from Gutenberg). I'll keep looking into it and try to reproduce it consistently.
Thanks!
Philippe

On Apr 6, 2013 1:39 PM, "yypvsxf19870706" <[email protected]> wrote:

> Hi
>
> Is your input file compressed, or named with a suffix such as .gz?
> It is interesting.
>
> Map_input_bytes is the number of bytes of uncompressed input consumed
> by all the maps in the job. It is incremented every time a record is
> read from a RecordReader and passed to the map's map method by the
> framework. [Hadoop Definitive Guide page 226]
>
> Please let us know if you find anything further.
>
> Regards.
>
> Sent from my iPhone
>
> On 2013-4-6, at 0:01, Philippe Signoret <[email protected]> wrote:
>
> I noticed recently that some Word Count jobs I've run are finishing with
> the MAP_INPUT_BYTES counter missing.
>
> I'm using Hadoop 1.1.2 with mostly default configuration with 5 nodes. The
> input was a single 100KB text file.
>
> Questions:
>
> - Is it normal for any final counter values not to be present?
> - Is MAP_INPUT_BYTES the best way to determine total input data size?
>   (I do so programmatically, while it's running and after the job is
>   complete.)
>
> The counters I did get:
>
> Job Counters
>   TOTAL_LAUNCHED_REDUCES: 1
>   SLOTS_MILLIS_MAPS: 6006
>   FALLOW_SLOTS_MILLIS_REDUCES: 0
>   FALLOW_SLOTS_MILLIS_MAPS: 0
>   TOTAL_LAUNCHED_MAPS: 1
>   DATA_LOCAL_MAPS: 1
>   SLOTS_MILLIS_REDUCES: 9293
> File Output Format Counters
>   BYTES_WRITTEN: 366752
> FileSystemCounters
>   FILE_BYTES_READ: 505552
>   HDFS_BYTES_READ: 1085517
>   FILE_BYTES_WRITTEN: 1122685
>   HDFS_BYTES_WRITTEN: 366752
> File Input Format Counters
>   BYTES_READ: 1085357
> Map-Reduce Framework
>   MAP_OUTPUT_MATERIALIZED_BYTES: 505552
>   MAP_INPUT_RECORDS: 19446
>   REDUCE_SHUFFLE_BYTES: 505552
>   SPILLED_RECORDS: 70358
>   MAP_OUTPUT_BYTES: 1750111
>   CPU_MILLISECONDS: 5700
>   COMMITTED_HEAP_BYTES: 401997824
>   COMBINE_INPUT_RECORDS: 181151
>   SPLIT_RAW_BYTES: 160
>   REDUCE_INPUT_RECORDS: 35179
>   REDUCE_INPUT_GROUPS: 35179
>   COMBINE_OUTPUT_RECORDS: 35179
>   PHYSICAL_MEMORY_BYTES: 378482688
>   REDUCE_OUTPUT_RECORDS: 35179
>   VIRTUAL_MEMORY_BYTES: 1139838976
>   MAP_OUTPUT_RECORDS: 181151
>
> Here are most of the relevant screens from the JobTracker web interface:
> http://jsfiddle.net/Fguyy/2/embedded/result/
>
> Here is the JobTracker log (relevant time frame):
> http://pastebin.com/dvsMn4fB
>
> Thanks!
> Philippe
>
> -------------------------------
> *Philippe Signoret*
> Skype: philippesignoret
> +33 6 95 89 55 55
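On the second question, one possible way to cope with a missing MAP_INPUT_BYTES when reading the input size programmatically is to fall back to another byte counter that the job did report. The sketch below is only an illustration in plain Python, not Hadoop API code: the values are copied from the counter listing quoted above, while the function name `input_size` and the fallback order are assumptions of mine. In this particular run, HDFS_BYTES_READ exceeds BYTES_READ by 160, which happens to equal SPLIT_RAW_BYTES, so the two fallbacks are close but not identical.

```python
# Subset of the counters from the job output quoted above, grouped as printed.
COUNTERS = {
    "File Input Format Counters": {"BYTES_READ": 1085357},
    "FileSystemCounters": {"HDFS_BYTES_READ": 1085517},
    "Map-Reduce Framework": {"MAP_INPUT_RECORDS": 19446, "SPLIT_RAW_BYTES": 160},
}

# (group, counter) pairs tried in order. This ordering is an assumption for
# illustration, not a documented Hadoop recommendation.
CANDIDATES = [
    ("Map-Reduce Framework", "MAP_INPUT_BYTES"),
    ("File Input Format Counters", "BYTES_READ"),
    ("FileSystemCounters", "HDFS_BYTES_READ"),
]


def input_size(counters):
    """Return (counter_name, value) for the first candidate present, else None."""
    for group, name in CANDIDATES:
        value = counters.get(group, {}).get(name)
        if value is not None:
            return name, value
    return None


print(input_size(COUNTERS))  # prints ('BYTES_READ', 1085357)
```

Returning the counter name along with the value makes it visible which source was actually used, which matters if the fallbacks do not measure exactly the same thing.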
