Hi,

>
> The optimization of one Hadoop job I'm running would benefit from knowing
> the maximum number of map slots in the Hadoop cluster.
>
> This number can be obtained (if my understanding is correct) by:
>
> * parsing the mapred-site.xml file to get the
>   mapred.tasktracker.map.tasks.maximum value (assuming it is set, of
>   course)
>
> * parsing the slaves file to get the maximum number of compute nodes in the
> cluster
>
> * multiplying the 2 values
>
> My question is:
> I would like to learn about *all* possible ways to get this information
> through API calls (either the Hadoop Common API or the Hadoop MapReduce
> API), i.e. obtaining it through a Job object, through a Configuration
> object,...
>
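
For reference, the file-based estimate you describe could be sketched
roughly as below. This is only an illustration using the plain JDK, not
anything shipped with Hadoop: the class name MapSlotEstimate, its method
names and the fallback parameter are made up for this example, and it
assumes every node listed in the slaves file runs a TaskTracker with the
same mapred-site.xml.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: estimates map slots from config file contents.
final class MapSlotEstimate {
    static final String KEY = "mapred.tasktracker.map.tasks.maximum";

    // Returns the configured per-node map slots, or fallback if the key is absent.
    static int slotsPerNode(String mapredSiteXml, int fallback) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(mapredSiteXml)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                if (KEY.equals(text(p, "name"))) {
                    return Integer.parseInt(text(p, "value"));
                }
            }
            return fallback;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot read mapred-site.xml content", e);
        }
    }

    // Text of the first child element with the given tag, or null if missing.
    private static String text(Element parent, String tag) {
        NodeList n = parent.getElementsByTagName(tag);
        return n.getLength() == 0 ? null : n.item(0).getTextContent().trim();
    }

    // Counts host entries, ignoring blank lines and '#' comments.
    static int countNodes(List<String> slavesLines) {
        int count = 0;
        for (String line : slavesLines) {
            int hash = line.indexOf('#');
            String host = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!host.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Usage: java MapSlotEstimate <mapred-site.xml> <slaves> <fallback-slots>
    public static void main(String[] args) throws IOException {
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), "UTF-8");
        List<String> slaves = Files.readAllLines(Paths.get(args[1]));
        int perNode = slotsPerNode(xml, Integer.parseInt(args[2]));
        System.out.println(perNode * countNodes(slaves));
    }
}
```

Note that this only tells you what the files on one machine say; it
cannot see nodes that are down or configured differently.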

The easiest way I can think of is
org.apache.hadoop.mapred.ClusterStatus.getMaxMapTasks(). You can get an
instance of ClusterStatus from JobClient.getClusterStatus().
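
A minimal sketch of that call chain, using the old org.apache.hadoop.mapred
API. It assumes the Hadoop jars and your cluster's configuration files are
on the classpath and that the JobTracker is reachable; the class name
MaxMapSlots is just a placeholder.

```java
import java.io.IOException;

import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class MaxMapSlots {
    public static void main(String[] args) throws IOException {
        // JobConf reads mapred-site.xml from the classpath (assumed to be present).
        JobConf conf = new JobConf();
        JobClient client = new JobClient(conf);
        try {
            // Ask the JobTracker for its current view of the cluster.
            ClusterStatus status = client.getClusterStatus();
            System.out.println("max map slots:    " + status.getMaxMapTasks());
            System.out.println("max reduce slots: " + status.getMaxReduceTasks());
            System.out.println("task trackers:    " + status.getTaskTrackers());
        } finally {
            client.close();
        }
    }
}
```

As far as I know, the value comes from the TaskTrackers currently
registered with the JobTracker, so it can be lower than what the slaves
file suggests if some nodes are down.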

Thanks
hemanth
