I'm running a program that multithreads in its streaming layer by 
automatically detecting the number of cores on the machine.  I realize this 
model is somewhat in conflict with Hadoop's, but nonetheless, that's what I'm 
doing.  Thus, for even resource utilization, it would be nice to assign not 
merely one mapper per core, but at most one mapper per machine.  I realize 
that if I saturate the cluster none of this really matters, but consider the 
following example for clarity: 4-core nodes, a 10-node cluster, thus 40 slots, 
fully configured across mappers and reducers (40 slots of each).  Say I run 
this program with just two mappers.  It would run much more efficiently (in 
essentially half the time) if I could force the two mappers onto slots on two 
separate machines instead of running the risk that Hadoop might assign them 
both to the same machine.
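The closest knob I've found is the per-TaskTracker cap on map slots (this is a 
sketch assuming the classic Hadoop 1.x slot model and its 
mapred.tasktracker.map.tasks.maximum property):

```xml
<!-- mapred-site.xml on each TaskTracker node; sketch assuming the
     classic (Hadoop 1.x) slot model -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
  <description>At most one map task per TaskTracker, i.e. per machine.</description>
</property>
```

But as far as I can tell that is a static, cluster-wide setting that would cap 
every job at one mapper per machine, which is not what I want for ordinary 
jobs; I'm hoping there's a per-job equivalent.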

Can this be done?

Thanks.

________________________________________________________________________________
Keith Wiley     [email protected]     keithwiley.com    music.keithwiley.com

"Yet mark his perfect self-contentment, and hence learn his lesson, that to be
self-contented is to be vile and ignorant, and that to aspire is better than to
be blindly and impotently happy."
                                           --  Edwin A. Abbott, Flatland
________________________________________________________________________________