Jeremy,

Here's my shot at it (pardon the quick crappy code): https://gist.github.com/3828246

Basically, you can achieve it in two ways.

Requirement: All tasks must increment the designated "max" counter only AFTER the max has been computed (i.e. in cleanup).

1. All tasks may use the same counter name. Later, we pull the per-task counters and determine the max at the client. (This is my quick and dirty implementation.)

2. All tasks may use their own task ID (the number part) in the counter name, but use the same group. Later, we fetch all counters for that group and iterate over them to find the max. This is cleaner, and doesn't end up using deprecated APIs the way the first approach does.

Does this help?

On Wed, Oct 3, 2012 at 8:47 PM, Jeremy Lewi <[email protected]> wrote:
> Hi hadoop-users,
>
> I'm curious if there is an implementation somewhere of a counter which
> tracks the maximum of some value across all mappers or reducers?
>
> Thanks
> J

--
Harsh J
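
P.S. For what it's worth, here is a rough plain-Java sketch of the second approach (one counter per task number, all in the same group, max computed at the client). It does not touch the Hadoop API at all: a Map stands in for the counter group, and the class name MaxCounterSketch, the method names, and the sample values are all made up for illustration. It is not the code from the gist.

```java
import java.util.Map;
import java.util.OptionalLong;

// Plain-Java simulation only; no Hadoop classes are used.
// All names here are hypothetical and chosen for illustration.
class MaxCounterSketch {

    // Stands in for what a task would do in cleanup(): the local max is
    // computed first, and only then is the task's own counter incremented,
    // exactly once, by that max. The counter name is the task number.
    // (In a real job this step would presumably go through something like
    // context.getCounter(groupName, counterName).increment(localMax);
    // that call is not exercised here.)
    static void publishTaskMax(Map<String, Long> group, int taskNumber, long[] values) {
        if (values.length == 0) {
            return; // a task that saw no records publishes nothing
        }
        long localMax = values[0];
        for (long v : values) {
            localMax = Math.max(localMax, v);
        }
        // Counters start at zero, so a single increment leaves the counter
        // holding the task's max.
        group.merge(Integer.toString(taskNumber), localMax, Long::sum);
    }

    // Stands in for the client side: iterate over every counter in the
    // group and keep the largest value. Empty if no task published.
    static OptionalLong clientSideMax(Map<String, Long> group) {
        return group.values().stream().mapToLong(Long::longValue).max();
    }
}
```

Each simulated task calls publishTaskMax once with its own task number, and the client calls clientSideMax on the shared map after all tasks finish. The one-increment-per-task rule is what makes this work: if a task incremented its counter more than once, the counter would hold a sum rather than a max.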
