Thanks for your reply, Harsh, but this is confusing me more :( I can't experiment with this because I'm using a single machine right now, and everything is reported as local bytes read/written.
Or can I? I'm using the line hdfs = FileSystem.get(getConf()); which I think means that the instance created is distributed. But the job counters never seem to use it for intermediate results (e.g. for reducers reading map outputs).

So if you can answer my question further, I would truly appreciate it!

Maha

On Feb 25, 2011, at 12:00 PM, Harsh J wrote:

> From what I could gather, all FileSystem instances put an entry
> into a static 'statistics' map. This map is used to update the
> counters for each Task. Hence, all operations done on the same HDFS
> URI by either the task or your application code must be counted as
> one. In fact, even if you are reading off another HDFS, only the
> scheme is matched, so it would aggregate to the same counter as
> well.
>
> I'm not very sure of this though. Perhaps writing a simple test should
> be adequate to learn the truth.
>
> On Sat, Feb 26, 2011 at 1:04 AM, maha <[email protected]> wrote:
>> Hello, please help me clear up my ideas!
>>
>> When a reducer reads map-output data remotely ... is that reflected in the
>> HDFS_BYTES_READ?
>>
>> Or is HDFS_BYTES_READ/WRITTEN only for the start and end of a job? I.e.
>> the first data read by the maps as input and the last data written by the
>> reducer as output for the user to see.
>>
>>
>> Thank you in advance,
>>
>> Maha
>
>
>
> --
> Harsh J
> www.harshj.com
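
For reference, the aggregation Harsh describes can be pictured with a small toy model. This is only a sketch of the idea, not Hadoop's code: SchemeStatsSketch, ToyFileSystem, BYTES_READ, and the namenode host names are all hypothetical, and it assumes (as Harsh suggests, without being sure) that the counter is keyed by URI scheme alone.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: NOT Hadoop's implementation. All names here are hypothetical.
class SchemeStatsSketch {
    // Stand-in for the static 'statistics' map: one byte counter per URI scheme.
    static final Map<String, AtomicLong> BYTES_READ = new ConcurrentHashMap<>();

    // Stand-in for a FileSystem instance bound to a single URI.
    static final class ToyFileSystem {
        private final AtomicLong counter;

        ToyFileSystem(URI uri) {
            // Instances are matched on scheme only, so host and port are ignored.
            counter = BYTES_READ.computeIfAbsent(uri.getScheme(), s -> new AtomicLong());
        }

        void read(long nBytes) {
            counter.addAndGet(nBytes);
        }
    }

    // Total bytes read so far through any instance with the given scheme.
    static long bytesRead(String scheme) {
        AtomicLong c = BYTES_READ.get(scheme);
        return c == null ? 0L : c.get();
    }

    public static void main(String[] args) {
        ToyFileSystem taskFs  = new ToyFileSystem(URI.create("hdfs://namenode-a:8020/"));
        ToyFileSystem appFs   = new ToyFileSystem(URI.create("hdfs://namenode-a:8020/"));
        ToyFileSystem otherFs = new ToyFileSystem(URI.create("hdfs://namenode-b:8020/"));
        ToyFileSystem localFs = new ToyFileSystem(URI.create("file:///"));

        taskFs.read(100);  // framework reading task input
        appFs.read(20);    // application code on the same HDFS
        otherFs.read(3);   // a different HDFS cluster, same scheme
        localFs.read(7);   // local file system, different scheme

        // All three hdfs instances land in one counter; file:// stays separate.
        System.out.println("hdfs=" + bytesRead("hdfs") + " file=" + bytesRead("file"));
    }
}
```

In this model, reads made through any hdfs:// instance end up in the same total, while file:// reads are kept apart. Whether Hadoop itself behaves this way is exactly what the simple test Harsh suggests would have to confirm.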
