Hi Stephen,

The true answer depends on the types of jobs you're running. As a
back-of-the-envelope calculation, I might figure something like this:

60 nodes total = 30 nodes per rack
Each node might process about 100MB/sec of data
In the case of a sort job where the intermediate data is the same size as
the input data, that means each node needs to shuffle 100MB/sec of data
In aggregate, each rack is then producing about 3GB/sec of data
However, given an even spread of reducers across the racks, each rack will
need to send half of that, 1.5GB/sec, to reducers running on the other rack.
Since the connection is full duplex, that means you need 1.5GB/sec of
bisection bandwidth for this theoretical job. So that's 12Gbps.
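
The arithmetic above can be sketched in a few lines of Python. This is only
a rough model under the same assumptions (even reducer spread, decimal units
where 1GB = 1000MB); the function and parameter names are illustrative and
not part of any Hadoop API.

```python
def cross_rack_gbps(total_nodes=60, racks=2, node_mb_per_sec=100.0,
                    shuffle_ratio=1.0):
    """Estimate the outbound shuffle traffic per rack, in Gbps.

    shuffle_ratio is intermediate data size / input data size
    (1.0 for the sort-like job described above).
    """
    nodes_per_rack = total_nodes / racks
    # Shuffle data produced by one rack, in MB/sec.
    rack_shuffle_mb = nodes_per_rack * node_mb_per_sec * shuffle_ratio
    # With reducers spread evenly, this fraction lives on other racks.
    remote_fraction = (racks - 1) / racks
    outbound_mb = rack_shuffle_mb * remote_fraction
    # MB/sec -> Gbps (8 bits per byte, 1000 Mb per Gb).
    return outbound_mb * 8 / 1000


print(cross_rack_gbps())  # 12.0
```

Changing shuffle_ratio or node_mb_per_sec shows how quickly the requirement
moves with the job mix.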

However, the above calculations are probably somewhat of an upper bound. A
large number of jobs have significant data reduction during the map phase,
either by some kind of filtering/selection going on in the Mapper itself, or
by good usage of Combiners. Additionally, intermediate data compression can
cut the intermediate data transfer by a significant factor. Lastly, although
your disks can probably provide 100MB/sec of sustained throughput, it's rare to see
an MR job that can sustain disk-speed IO through the entire pipeline. So,
I'd say my estimate is at least a factor of 2 too high.
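
One way to play with these discounts is to treat each as a multiplier on the
12Gbps upper bound. The sketch below is hypothetical: the three factor names
are made up for illustration, and any values you plug in are guesses for
your own workload, not measurements.

```python
def adjusted_gbps(upper_bound_gbps, map_output_ratio=1.0,
                  compression_ratio=1.0, pipeline_efficiency=1.0):
    """Scale an upper-bound bandwidth estimate by workload discounts.

    map_output_ratio:    map output size / map input size (filtering, Combiners)
    compression_ratio:   compressed size / uncompressed intermediate size
    pipeline_efficiency: fraction of raw disk speed the job actually sustains
    All three are hypothetical knobs in the range (0, 1].
    """
    return (upper_bound_gbps * map_output_ratio
            * compression_ratio * pipeline_efficiency)


# A combined factor of 2, as estimated above, halves the requirement.
print(adjusted_gbps(12.0, pipeline_efficiency=0.5))  # 6.0
```

With a combined discount of 2 the requirement comes out at 6Gbps, which is
what six gigabit uplinks provide.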

So, the simple answer is that 4-6Gbps is most likely just fine for most
practical jobs. If you want to be extra safe, many inexpensive switches can
operate in a "stacked" configuration where the bandwidth between them is
essentially backplane speed. That should scale you to 96 nodes with plenty
of headroom.

-Todd

On Tue, May 26, 2009 at 3:10 AM, stephen mulcahy
<stephen.mulc...@deri.org> wrote:

> Hi,
>
> Has anyone here investigated what level of bisection bandwidth is needed
> for a Hadoop cluster which spans more than one rack?
>
> I'm currently sizing and planning a new Hadoop cluster and I'm wondering
> what the performance implications will be if we end up with a cluster spread
> across two racks. I'd expect we'll have one 48-port gigabit switch in each
> 42u rack. If we end up with 60 systems spread across these two switches -
> how much bandwidth should I have between the racks?
>
> I'll have 6 gigabit ports available for links between racks - i.e. up to 6
> Gbps. Would this be sufficient bisection bandwidth for Hadoop or should I be
> considering increased bandwidth between racks (maybe using fibre links
> between the switches or introducing another switch)?
>
> Thanks for any thoughts on this.
>
> -stephen
>
> --
> Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
> NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
> http://di2.deri.ie    http://webstar.deri.ie    http://sindice.com
>
