Looking at the blog:
"
Counters

Counters represent global counters, defined either by the Map/Reduce 
framework or applications. Applications can define arbitrary Counters 
and update them in the map and/or reduce methods. These counters are 
then globally aggregated by the framework.

Counters are appropriate for tracking few, important, global bits of 
information. They are definitely not meant to aggregate very 
fine-grained statistics of applications.

Counters are very expensive since the JobTracker has to maintain 
every counter of every map/reduce task for the entire duration of the 
application.

Grid Pattern: Applications should not use more than 10, 15 or 25 custom 
counters."
I have to question the limitation. It seems arbitrary.
I agree that counters add overhead, but suppose I wanted to run the 
word count M/R as a map-only job and use counters to capture a count 
per word?
At what point does the cost of the counters exceed the cost of the reduce phase?
-Mike
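What Mike proposes amounts to one counter per distinct word, so the counter namespace grows with the vocabulary rather than staying small and fixed — exactly the fine-grained use the blog warns against. A simulation of that pattern (plain Python, not the real Hadoop API):

```python
from collections import Counter

def map_only_wordcount(lines):
    """Simulate a map-only job that emits one counter per word
    (the pattern under discussion; not actual Hadoop code)."""
    counters = Counter()
    for line in lines:
        for word in line.split():
            counters[word] += 1  # one counter name per distinct word
    return counters

docs = ["the quick brown fox", "the lazy dog", "the fox"]
counts = map_only_wordcount(docs)
print(counts["the"])  # 3
print(len(counts))    # 6 distinct counters -- grows with vocabulary
```

With a real corpus the distinct-word count easily reaches hundreds of thousands, which is where the per-task bookkeeping on the master dominates any saving from skipping the reduce.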

----------------------------------------
> Date: Tue, 29 Mar 2011 10:28:19 +0100
> From: [email protected]
> To: [email protected]
> Subject: Re: do counters seriously degrade performance?
>
> On 28/03/11 23:34, JunYoung Kim wrote:
> > hi,
> >
> > this link is about good practices for Hadoop usage:
> >
> > http://developer.yahoo.com/blogs/hadoop/posts/2010/08/apache_hadoop_best_practices_a/
> >  by Arun C Murthy
> >
> > if I want to use about 50,000 counters for a job, will it cause a 
> > serious performance hit?
> >
>
> Yes, you will use up lots of JT memory and so put limits on the overall
> size of your cluster.
>
> If you have a small cluster and can crank the JT's memory settings up
> to 48 GB this isn't going to be an issue, but as Y! are topping out
> at those numbers anyway, lots of counters just overload them.
>
>