Hello,

When a job has a reduce phase, each mapper has to sort and materialize
its key-value pairs to local files so that the reducers can fetch them.
That is where the map-side FILE_BYTES_* counters come from. Similarly,
each reducer stores the map outputs it fetches on local disk and
merge-sorts them again, so the counters appear for the reduce phase as
well.

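In a map-only job, you should generally not see any FILE_BYTES_*
counters, since the map output is written straight to the job output
(HDFS here) without that local sort/materialize step.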
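
To make the local-disk round trip above concrete, here is a toy sketch in
plain Java. It is not Hadoop code: the class LocalSpillSketch, its methods,
and the two static fields standing in for FILE_BYTES_WRITTEN and
FILE_BYTES_READ are all made up for illustration, and everything runs in one
process with no network fetch or multi-pass merge.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only, not Hadoop's implementation. All names are hypothetical. */
final class LocalSpillSketch {
    // Stand-ins for the FILE_BYTES_WRITTEN / FILE_BYTES_READ counters.
    static long fileBytesWritten = 0;
    static long fileBytesRead = 0;

    /** "Map side": sort the emitted keys and materialize (key, 1) pairs to a local file. */
    static Path mapAndSpill(String input) {
        List<String> keys = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) {
            keys.add(word);
        }
        Collections.sort(keys);

        StringBuilder sb = new StringBuilder();
        for (String key : keys) {
            sb.append(key).append('\t').append(1).append('\n');
        }
        byte[] bytes = sb.toString().getBytes(StandardCharsets.UTF_8);
        try {
            Path spill = Files.createTempFile("spill", ".txt");
            Files.write(spill, bytes);
            fileBytesWritten += bytes.length; // local write, not HDFS
            return spill;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** "Reduce side": read the local file back and sum the counts per key. */
    static Map<String, Integer> reduceFromSpill(Path spill) {
        byte[] bytes;
        try {
            bytes = Files.readAllBytes(spill);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        fileBytesRead += bytes.length; // local read, not HDFS

        Map<String, Integer> counts = new TreeMap<>();
        for (String line : new String(bytes, StandardCharsets.UTF_8).split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            String[] kv = line.split("\t");
            counts.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Path spill = mapAndSpill("b a b");
        Map<String, Integer> counts = reduceFromSpill(spill);
        Files.delete(spill);
        System.out.println(counts + " written=" + fileBytesWritten
                + " read=" + fileBytesRead);
    }
}
```

None of the bytes counted here touch HDFS, which is the point: the
intermediate data lives only on local disk. In a real job the reducer also
writes what it fetches to its own local disk before merging, which this
sketch leaves out.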

On Wed, Jun 15, 2011 at 9:32 AM, hailong.yang1115
<hailong.yang1...@gmail.com> wrote:
>
> Dear all,
>
> I am trying to run the built-in wordcount example with nearly 15GB of
> input. When the Hadoop job finished, I got the following counters.
>
>
> Counter                                Map          Reduce           Total
> Job Counters
>   Launched reduce tasks                  0               0               1
>   Rack-local map tasks                   0               0              35
>   Launched map tasks                     0               0           2,318
>   Data-local map tasks                   0               0           2,283
> FileSystemCounters
>   FILE_BYTES_READ           22,863,580,656  17,654,943,341  40,518,523,997
>   HDFS_BYTES_READ          154,400,997,459               0 154,400,997,459
>   FILE_BYTES_WRITTEN        33,490,829,403  17,654,943,341  51,145,772,744
>   HDFS_BYTES_WRITTEN                     0   2,747,356,704   2,747,356,704
>
>
> My question is what does the FILE_BYTES_READ counter mean? And what is the 
> difference between FILE_BYTES_READ and HDFS_BYTES_READ? In my opinion, all 
> the input is located in HDFS, so where does FILE_BYTES_READ come from during 
> the map phase?
>
>
> Any help will be appreciated!
>
> Hailong
>
> 2011-06-15
>
>
>
> ***********************************************
> * Hailong Yang, PhD. Candidate
> * Sino-German Joint Software Institute,
> * School of Computer Science&Engineering, Beihang University
> * Phone: (86-010)82315908
> * Email: hailong.yang1...@gmail.com
> * Address: G413, New Main Building in Beihang University,
> *              No.37 XueYuan Road,HaiDian District,
> *              Beijing,P.R.China,100191
> ***********************************************
>



-- 
Harsh J
