Yeah, it isn't, not even remotely, but thanks. On Jan 28, 2014, at 14:06 , Bryan Beaudreault wrote:
> If this cluster is being used exclusively for this goal, you could just set
> the mapred.tasktracker.map.tasks.maximum to 1.
>
>
> On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley <[email protected]> wrote:
>
> I'm running a program which in the streaming layer automatically multithreads
> and does so by automatically detecting the number of cores on the machine. I
> realize this model is somewhat in conflict with Hadoop, but nonetheless,
> that's what I'm doing. Thus, for even resource utilization, it would be nice
> to not only assign one mapper per core, but only one mapper per machine. I
> realize that if I saturate the cluster none of this really matters, but
> consider the following example for clarity: 4-core nodes, 10-node cluster,
> thus 40 slots, fully configured across mappers and reducers (40 slots of
> each). Say I run this program with just two mappers. It would run much more
> efficiently (in essentially half the time) if I could force the two mappers
> to go to slots on two separate machines instead of running the risk that
> Hadoop may assign them both to the same machine.
>
> Can this be done?
>
> Thanks.
>
> ________________________________________________________________________________
> Keith Wiley        [email protected]        keithwiley.com        music.keithwiley.com
>
> "Luminous beings are we, not this crude matter."
>                                            -- Yoda
> ________________________________________________________________________________
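
For reference, Bryan's suggestion corresponds to a per-node TaskTracker setting in classic (MR1-era) Hadoop. A minimal sketch of the mapred-site.xml entry, assuming the dedicated-cluster scenario he describes — this caps every node at one concurrent map slot, so it only fits if no other jobs need the cluster:

```xml
<!-- mapred-site.xml on each TaskTracker node (MR1 / classic MapReduce). -->
<!-- Limits each node to a single concurrent map task, so one streaming -->
<!-- mapper gets the whole machine and can multithread across all cores. -->
<!-- TaskTrackers must be restarted for the change to take effect.       -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
```

Note this is a cluster-wide, per-node limit, not a per-job one; it cannot be set from the job configuration, which is why it only works when the cluster is used exclusively for this workload.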
