Ah, this answers a lot about why some of my dynamic counters never show up and I have to bite my nails waiting to see what's going on until the end of the job - thanks.
Another question: what happens if a task fails? What happens to its counters? Do they disappear into the ether? Or do they get merged in with the counters from other tasks?

On Fri, Oct 19, 2012 at 9:50 AM, Bertrand Dechoux <[email protected]> wrote:

> And by default the number of counters is limited to 120 with the
> mapreduce.job.counters.limit property.
> They are useful for displaying short statistics about a job but should
> not be used for results (imho).
> I know people may misuse them but I haven't tried that, so I wouldn't be
> able to list the caveats.
>
> Regards
>
> Bertrand
>
> On Fri, Oct 19, 2012 at 4:35 PM, Michael Segel <[email protected]> wrote:
>
>> As I understand it... each Task has its own counters, which are
>> independently updated. As the tasks report back to the JT, they update
>> the counters' status, and the JT then aggregates them.
>>
>> In terms of performance, counters take up some memory in the JT, so
>> while it's OK to use them, if you abuse them you can run into issues.
>> As to limits... I guess that will depend on the amount of memory on the
>> JT machine, the size of the cluster (number of TTs) and the number of
>> counters.
>>
>> In terms of global accessibility... maybe.
>>
>> The reason I say maybe is that I'm not sure what you mean by globally
>> accessible. If a task creates and increments a dynamic counter, I know
>> that it will eventually be reflected in the JT. However, I do not
>> believe that a separate task could connect to the JT to check whether
>> the counter exists, or get a value, or even an accurate value, since the
>> updates are asynchronous. Not to mention that I don't believe the
>> counters are aggregated until the job ends. It would make sense for the
>> JT to maintain a unique counter per task until the tasks complete. (If a
>> task fails, it would have to delete that task's counters so that when
>> the task is restarted the correct count is maintained.) Note, I haven't
>> looked at the source code, so I may well be wrong.
>>
>> HTH
>> Mike
>>
>> On Oct 19, 2012, at 5:50 AM, Lin Ma <[email protected]> wrote:
>>
>> Hi guys,
>>
>> I have some quick questions regarding Hadoop counters:
>>
>> - Is a (custom defined) Hadoop counter globally accessible (for both
>> read and write) by all Mappers and Reducers in a job?
>> - What are the performance implications and best practices of using
>> Hadoop counters? I am not sure whether using Hadoop counters too heavily
>> will degrade the performance of the whole job.
>>
>> regards,
>> Lin
>>
>
> --
> Bertrand Dechoux

--
Jay Vyas
http://jayunit100.blogspot.com
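For reference, below is a minimal sketch of defining custom counters in a Mapper and reading the aggregated values back in the driver, using the org.apache.hadoop.mapreduce API discussed in this thread. The class name CounterExample, the RecordQuality enum, and the "ByLength" dynamic counter group are illustrative only, not from the thread; the point is that tasks increment counters locally, the values flow to the JT with task status reports, and the totals are only dependable after the job completes.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CounterExample {

    // Enum-backed counters; Hadoop groups them under the enum's class name.
    public enum RecordQuality { GOOD, MALFORMED }

    public static class CountingMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            if (value.toString().trim().isEmpty()) {
                // The increment happens locally in the task attempt; the value
                // only reaches the JT with periodic task status reports.
                context.getCounter(RecordQuality.MALFORMED).increment(1);
            } else {
                context.getCounter(RecordQuality.GOOD).increment(1);
                // A "dynamic" counter addressed by group/name strings; each
                // distinct name counts toward the job counter limit.
                context.getCounter("ByLength",
                        value.getLength() > 80 ? "long" : "short").increment(1);
                context.write(value, new LongWritable(1));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "counter example");
        job.setJarByClass(CounterExample.class);
        job.setMapperClass(CountingMapper.class);
        job.setNumReduceTasks(0);                    // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        boolean ok = job.waitForCompletion(true);

        // Aggregated values are only reliable once the job has finished;
        // counters from failed or killed task attempts are not included.
        long malformed = job.getCounters()
                            .findCounter(RecordQuality.MALFORMED).getValue();
        System.out.println("Malformed records: " + malformed);
        System.exit(ok ? 0 : 1);
    }
}

The driver-side getCounters() call is the usual way to consume counters as job-level statistics; a task trying to read another task's counter mid-job would not see a consistent value, which matches the asynchronous-update caveat above.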
