Hello,

When a job has a Reduce phase, each mapper has to (sort and) materialize its KVs to local files so that the reducers can fetch them. This is where the FILE_BYTES_* counters come from on the map side. Similarly, each reducer fetches that data, stores it on local disk, and merge-sorts it again, so the same counters show up for the reduce phase as well.
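As a rough illustration of the flow described above, here is a toy Python model. It is not Hadoop code and does not use any Hadoop API: the function names are hypothetical, and the counter names are reused as plain dictionary keys. It also skips the intermediate spill and merge passes that a real job performs, so it only shows why local-file bytes get counted on both sides of the shuffle.

```python
import os
import tempfile


def map_side_spill(pairs, spill_dir):
    """Sort map output by key and materialize it to a local file (toy model)."""
    path = os.path.join(spill_dir, "map_output.txt")
    with open(path, "wb") as f:
        for key, value in sorted(pairs):
            f.write(f"{key}\t{value}\n".encode("utf-8"))
    # Bytes written to local disk by the "mapper".
    return path, os.path.getsize(path)


def reduce_side_fetch(path):
    """Read the materialized map output back and sum the values per key."""
    with open(path, "rb") as f:
        data = f.read()
    counts = {}
    for line in data.decode("utf-8").splitlines():
        key, value = line.split("\t")
        counts[key] = counts.get(key, 0) + int(value)
    # Bytes read from local disk by the "reducer".
    return counts, len(data)


def run_toy_job(words):
    """Run a one-mapper, one-reducer word count and report local-file bytes."""
    with tempfile.TemporaryDirectory() as spill_dir:
        path, written = map_side_spill([(w, 1) for w in words], spill_dir)
        counts, read = reduce_side_fetch(path)
    return counts, {"FILE_BYTES_WRITTEN": written, "FILE_BYTES_READ": read}
```

Calling run_toy_job(["b", "a", "b"]) returns the word counts together with the two byte totals. None of these bytes touch HDFS in the model, which is the distinction between the FILE_* and HDFS_* counters.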
In a map-only job, you should generally not see any FILE_BYTES_* counters.

On Wed, Jun 15, 2011 at 9:32 AM, hailong.yang1115 <hailong.yang1...@gmail.com> wrote:
> Dear all,
>
> I am trying to run the built-in wordcount example with nearly 15GB of input. When
> the Hadoop job finished, I got the following counters.
>
> Counter                               Map           Reduce            Total
> Job Counters
>   Launched reduce tasks                 0                0                1
>   Rack-local map tasks                  0                0               35
>   Launched map tasks                    0                0            2,318
>   Data-local map tasks                  0                0            2,283
> FileSystemCounters
>   FILE_BYTES_READ          22,863,580,656   17,654,943,341   40,518,523,997
>   HDFS_BYTES_READ         154,400,997,459                0  154,400,997,459
>   FILE_BYTES_WRITTEN       33,490,829,403   17,654,943,341   51,145,772,744
>   HDFS_BYTES_WRITTEN                    0    2,747,356,704    2,747,356,704
>
> My question is: what does the FILE_BYTES_READ counter mean? And what is the
> difference between FILE_BYTES_READ and HDFS_BYTES_READ? In my opinion, all
> the input is located in HDFS, so where does FILE_BYTES_READ come from during
> the map phase?
>
> Any help will be appreciated!
>
> Hailong
>
> 2011-06-15
>
> ***********************************************
> * Hailong Yang, PhD. Candidate
> * Sino-German Joint Software Institute,
> * School of Computer Science&Engineering, Beihang University
> * Phone: (86-010)82315908
> * Email: hailong.yang1...@gmail.com
> * Address: G413, New Main Building in Beihang University,
> *              No.37 XueYuan Road,HaiDian District,
> *              Beijing,P.R.China,100191
> ***********************************************

--
Harsh J