Is there a unique ID associated with each task?

2008-10-30 Thread Joel Welling
Hi folks;
  I'm writing a Hadoop Pipes application, and I need to generate a bunch
of integers that are unique across all map tasks.  If each map task has
a unique integer ID, I can make sure my integers are unique by including
that integer ID.  I have this theory that each map task has a unique
identifier associated with some configuration parameter, but I don't
know the name of that parameter.
  Is there an integer associated with each task?  If so, how do I get
it?  While we're at it, is there a way to get the total number of map
tasks?

Thanks,
-Joel



Re: Is there a unique ID associated with each task?

2008-10-30 Thread Owen O'Malley


On Oct 30, 2008, at 9:03 AM, Joel Welling wrote:

> I'm writing a Hadoop Pipes application, and I need to generate a bunch
> of integers that are unique across all map tasks.  If each map task has
> a unique integer ID, I can make sure my integers are unique by including
> that integer ID.  I have this theory that each map task has a unique
> identifier associated with some configuration parameter, but I don't
> know the name of that parameter.
> Is there an integer associated with each task?  If so, how do I get
> it?  While we're at it, is there a way to get the total number of map
> tasks?


There is a unique identifier for each task, and even for each task
attempt.  From:


http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Task+Execution+%26+Environment

mapred.tip.id   The task id
mapred.task.id  The task attempt id

Since you probably don't care about re-execution, you probably want
mapred.tip.id, which stays the same across attempts of a task.


If you just want the task number from mapred.tip.id as an integer, you
can use mapred.task.partition, which will be a number from 0 to M-1 for
maps or 0 to R-1 for reduces, where M and R are the numbers of map and
reduce tasks.
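
As a rough illustration of how the partition number could be used for the
original problem, here is a minimal sketch, not taken from Hadoop itself.
UniqueIdGenerator is a hypothetical helper; it assumes each map task
constructs it with its own mapred.task.partition value and that a 64-bit
result is acceptable. It packs the partition into the high 32 bits and a
per-task counter into the low 32 bits, so it does not need to know M.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of Hadoop Pipes.
// Produces 64-bit integers that are unique across tasks, provided every
// task is constructed with a distinct partition number.
class UniqueIdGenerator {
public:
    // 'partition' is assumed to be the value of mapred.task.partition.
    // In a Pipes mapper it would presumably be read from the job
    // configuration, e.g. context.getJobConf()->getInt("mapred.task.partition");
    // that call is an assumption here and is not exercised by this sketch.
    explicit UniqueIdGenerator(std::uint32_t partition)
        : partition_(partition), counter_(0) {}

    // Returns (partition << 32) | counter, then advances the counter.
    std::uint64_t next() {
        // The counter must fit in the low 32 bits to stay collision-free.
        assert(counter_ < (static_cast<std::uint64_t>(1) << 32));
        std::uint64_t id =
            (static_cast<std::uint64_t>(partition_) << 32) | counter_;
        ++counter_;
        return id;
    }

private:
    std::uint32_t partition_;  // high 32 bits of every id
    std::uint64_t counter_;    // low 32 bits, incremented per id
};
```

A mapper would create one generator per task and call next() each time it
needs a fresh integer. Note that a re-executed attempt of the same task
has the same partition, so it would regenerate the same integers, which
matches the assumption above that re-execution does not matter.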


-- Owen