In theory this should work: find the part of the Hadoop code that calculates the number of cores and patch it to always return one.
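For JVM-based tools, the "patch the core count" idea usually comes down to a single call: multithreaded programs typically size their worker pools via Runtime.getRuntime().availableProcessors(). A minimal sketch of the pattern (the class name and the patched variant are illustrative, not Hadoop's actual code):

```java
// Sketch: how a multithreaded tool typically auto-sizes its worker pool.
// The "patch" suggested above amounts to replacing the detected count
// with a constant 1.
public class PoolSize {
    static int detectedCores() {
        // Unpatched: ask the JVM how many cores are visible.
        return Runtime.getRuntime().availableProcessors();
        // Patched variant would be: return 1;
    }

    public static void main(String[] args) {
        System.out.println("worker threads = " + detectedCores());
    }
}
```

Note that if the streaming program is native code rather than JVM-based, the equivalent call would be something like sysconf(_SC_NPROCESSORS_ONLN), and the same one-line patch applies.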
On Wed, Jan 29, 2014 at 3:41 AM, Keith Wiley <[email protected]> wrote:

> Yeah, it isn't, not even remotely, but thanks.
>
> On Jan 28, 2014, at 14:06, Bryan Beaudreault wrote:
>
> > If this cluster is being used exclusively for this goal, you could just
> > set mapred.tasktracker.map.tasks.maximum to 1.
> >
> > On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley <[email protected]> wrote:
> >
> > > I'm running a program which in the streaming layer automatically
> > > multithreads, and does so by automatically detecting the number of
> > > cores on the machine. I realize this model is somewhat in conflict
> > > with Hadoop, but nonetheless, that's what I'm doing. Thus, for even
> > > resource utilization, it would be nice to not only assign one mapper
> > > per core, but only one mapper per machine. I realize that if I
> > > saturate the cluster none of this really matters, but consider the
> > > following example for clarity: 4-core nodes, 10-node cluster, thus
> > > 40 slots, fully configured across mappers and reducers (40 slots of
> > > each). Say I run this program with just two mappers. It would run
> > > much more efficiently (in essentially half the time) if I could force
> > > the two mappers to go to slots on two separate machines instead of
> > > running the risk that Hadoop may assign them both to the same machine.
> > >
> > > Can this be done?
> > >
> > > Thanks.
>
> ________________________________________________________________________________
> Keith Wiley     [email protected]     keithwiley.com     music.keithwiley.com
>
> "Luminous beings are we, not this crude matter."
>                                            --  Yoda
> ________________________________________________________________________________
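For reference, Bryan's suggestion corresponds to the following TaskTracker setting in mapred-site.xml (this is the MRv1 property name; the sketch assumes a classic JobTracker/TaskTracker cluster, and the change requires a TaskTracker restart to take effect):

```xml
<!-- mapred-site.xml: cap each TaskTracker at one concurrent map task,
     so no node ever runs two of the multithreaded mappers at once -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
```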
