Is your cluster running MR1 or MR2? On MR1, the CapacityScheduler
would allow you to do this if you used appropriately sized
memory-based requests (see http://search-hadoop.com/m/gnFs91yIg1e).
On MR2 (depending on the YARN scheduler's resource-request limit
configuration), you can have your job request containers large enough
in both CPU and memory to soak up all of a node's advertised
resources, so that only one container runs on a host at any given
time.
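For the MR2 case, a rough sketch of what that looks like in
mapred-site.xml, assuming a node that advertises 8192 MB and 4 vcores
to YARN (the actual values come from your NodeManagers'
yarn.nodemanager.resource.* settings, and
yarn.scheduler.maximum-allocation-mb must be at least as large):

```xml
<!-- Sketch: size each map task to a whole node so only one container
     fits per host. The 8192/4 values are assumptions for illustration;
     match them to your nodes' advertised capacity. -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>8192</value>
</property>
<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>4</value>
</property>
```

Note that vcores are only enforced if the scheduler is configured to
consider CPU (e.g. the DominantResourceCalculator); with the default
memory-only calculator, the memory request alone does the work.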

On Wed, Jan 29, 2014 at 3:30 AM, Keith Wiley <[email protected]> wrote:
> I'm running a program which in the streaming layer automatically multithreads 
> and does so by automatically detecting the number of cores on the machine.  I 
> realize this model is somewhat in conflict with Hadoop, but nonetheless, 
> that's what I'm doing.  Thus, for even resource utilization, it would be nice 
> to not only assign one mapper per core, but only one mapper per machine.  I 
> realize that if I saturate the cluster none of this really matters, but 
> consider the following example for clarity: 4-core nodes, 10-node cluster, 
> thus 40 slots, fully configured across mappers and reducers (40 slots of 
> each).  Say I run this program with just two mappers.  It would run much more 
> efficiently (in essentially half the time) if I could force the two mappers 
> to go to slots on two separate machines instead of running the risk that 
> Hadoop may assign them both to the same machine.
>
> Can this be done?
>
> Thanks.
>
> ________________________________________________________________________________
> Keith Wiley     [email protected]     keithwiley.com    
> music.keithwiley.com
>
> "Yet mark his perfect self-contentment, and hence learn his lesson, that to be
> self-contented is to be vile and ignorant, and that to aspire is better than 
> to
> be blindly and impotently happy."
>                                            --  Edwin A. Abbott, Flatland
> ________________________________________________________________________________
>



-- 
Harsh J
