On 29/03/11 16:30, Michael Segel wrote:


"Grid Pattern: Applications should not use more than 10, 15 or 25 custom 
counters."
I have to question the limitation. It seems arbitrary.
I agree that counters add additional overhead, but suppose I wanted to run the 
word count m/r as a map only job and use counters as a way to capture a count 
per word?
At what point does the cost of the counter(s) exceed the cost of the reduce job?

It's not a performance issue, it's total JT (JobTracker) memory. With too many counters, your JT goes OOM, and then it's cluster restart time: all outstanding jobs have to restart, etc.


The cost of a large cluster outage is greater than the cost of the reduce job.

On a small (not Yahoo!-sized) cluster, if your JT process has enough memory, you can have more counters, as there is less work to lose and more memory to spare in the JT.
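
To make the memory point concrete, here is a rough sketch in plain Python. It is not Hadoop code: a `Counter` stands in for a job's custom counters, and the function and counter names are made up for illustration. It only shows that a fixed set of counters stays the same size whatever the input, while a counter per word grows with the vocabulary, and it is that growing set the JT would have to hold in memory.

```python
from collections import Counter


def fixed_counters(lines):
    """A bounded set of custom counters (hypothetical names).

    The number of distinct counters does not depend on the input size.
    """
    counters = Counter()
    for line in lines:
        words = line.split()
        counters["LINES"] += 1
        counters["WORDS"] += len(words)
        if not words:
            counters["EMPTY_LINES"] += 1
    return counters


def per_word_counters(lines):
    """One counter per distinct word, as in the map-only word count idea.

    The number of distinct counters grows with the vocabulary.
    """
    counters = Counter()
    for line in lines:
        for word in line.split():
            counters["WORD_" + word.lower()] += 1
    return counters


if __name__ == "__main__":
    sample = ["the quick brown fox", "jumps over the lazy dog", ""]
    print("fixed counters:   ", len(fixed_counters(sample)))
    print("per-word counters:", len(per_word_counters(sample)))
```

On real text the second number tracks the number of distinct words, which is exactly the kind of data a reduce step is meant to carry instead.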
