On Oct 30, 2008, at 9:03 AM, Joel Welling wrote:
I'm writing a Hadoop Pipes application, and I need to generate a bunch
of integers that are unique across all map tasks. If each map task has
a unique integer ID, I can make sure my integers are unique by including
that integer ID. I have this theory that each map task has a unique
identifier associated with some configuration parameter, but I don't
know the name of that parameter.
Is there an integer associated with each task? If so, how do I get
it? While we're at it, is there a way to get the total number of map
tasks?
There is a unique identifier for each task and even each task attempt.
From:
http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Task+Execution+%26+Environment
mapred.tip.id    The task id
mapred.task.id   The task attempt id
Since you probably don't care about re-execution, you probably want
mapred.tip.id.
If you just want the numeric part of mapred.tip.id as an integer, you
can use mapred.task.partition, which will be a number from 0 to M - 1
for maps or 0 to R - 1 for reduces, where M and R are the numbers of
map and reduce tasks.
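
As a rough sketch of how the partition number could be folded into
generated integers: put it in the high 32 bits of a 64-bit value and a
per-task counter in the low 32 bits. The Conf map, UniqueIdGenerator,
and makeGenerator below are hypothetical names made up for illustration
and are not part of the Pipes API. In a real Pipes task the value of
mapred.task.partition would presumably be read from the task context's
job configuration instead of a std::map.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the job configuration visible to a task.
using Conf = std::map<std::string, std::string>;

// Hands out 64-bit integers that are unique across tasks, assuming each
// task is constructed with a different partition number.
// Layout: high 32 bits = partition, low 32 bits = per-task counter.
class UniqueIdGenerator {
public:
    explicit UniqueIdGenerator(std::uint32_t partition)
        : base_(static_cast<std::uint64_t>(partition) << 32), counter_(0) {}

    std::uint64_t next() {
        // Assumption: a single task needs at most 2^32 ids.
        assert(counter_ <= 0xFFFFFFFFull);
        return base_ | counter_++;
    }

private:
    std::uint64_t base_;
    std::uint64_t counter_;
};

// Builds a generator from the mapred.task.partition entry.
// Throws std::out_of_range if the key is missing.
UniqueIdGenerator makeGenerator(const Conf& conf) {
    unsigned long partition = std::stoul(conf.at("mapred.task.partition"));
    return UniqueIdGenerator(static_cast<std::uint32_t>(partition));
}
```

Two attempts of the same task share a partition number and so would
produce the same integers, which is fine as long as re-execution does
not matter to you.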
-- Owen