On Mon, 06 Jun 2011 09:34:56 -0400, <[email protected]> wrote:
> Yeah, that's a good point.
>
> I wonder though, what the load on the tracker nodes (port et. al) would
> be if a inter-rack fiber switch at 10's of GBS' is getting maxed.
>
> Seems to me that if there is that much traffic being mitigate across
> racks, that the tracker node (or whatever node it is) would overload
> first?

It could happen, but I don't think it always would. For example: the tracker is on rack A; it sees that the best place to put reducer R is on rack B; it sees that R still needs a few hellabytes from mapper M on rack C; it tells M to send its data to R. The switches on B and C get throttled, leaving A free to handle other things.

In fact, it almost makes me wonder whether the ideal setup is not only to have each of the main control daemons on its own node, but to put THOSE nodes on their own rack and keep all the data elsewhere.
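To put rough numbers on the rack A/B/C example, here is a toy tally of how many bytes cross each rack's switch. It is only a sketch: the byte counts are made up, `switch_load` is a hypothetical helper rather than anything in Hadoop, and it assumes each rack has a single uplink switch that sees all traffic entering or leaving the rack.

```python
from collections import Counter

def switch_load(transfers):
    """Tally bytes crossing each rack's switch.

    transfers: list of (src_rack, dst_rack, nbytes) tuples.
    Traffic that stays inside one rack is ignored (assumption:
    it never touches the inter-rack uplink).
    """
    load = Counter()
    for src, dst, nbytes in transfers:
        if src != dst:
            load[src] += nbytes
            load[dst] += nbytes
    return load

# Hypothetical figures: a 1 KB control message from the tracker on
# rack A to mapper M on rack C, then 500 GB of shuffle data from
# M (rack C) to reducer R (rack B).
control = [("A", "C", 1024)]
shuffle = [("C", "B", 500 * 10**9)]

load = switch_load(control + shuffle)
for rack in sorted(load):
    print(rack, load[rack])
```

Under those assumptions the switch on A carries only the control message, while B and C carry the whole shuffle, which is the point of the example: the bulk data never has to pass through the tracker's rack.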
