I would say this is quite a difficult choice.  Our cluster did need more
bandwidth, but faster links to the nodes were not what made the big
difference.  What helped was getting better switches with better backplanes:
the fabric made the difference.

I've also seen some workloads where job design is critical.  For example, if
your mappers are spinning through the data, a big enough cluster can easily
overwhelm the namenode and jobtracker.  It is probably too early for you to
know such things about your workload.  If this becomes a problem, you may
need to adjust your apps.

Overall, I think good-quality top-of-rack switches with good uplinks to
distribution switches can make your cluster fly.  That is relatively cheap
compared to 10G throughout, and I've seen that more CPUs work well for _my_
workload (I always need more mappers and reducers, but it is quite rare that
the network is saturated now).
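
To put rough numbers on the uplink point, here is a minimal sketch of the
oversubscription arithmetic for one rack.  The rack layout (40 nodes, two
10GbE uplinks) and the function names are made up for illustration, not taken
from any real cluster; plug in your own figures.

```python
def oversubscription(nodes_per_rack, nic_gbps, uplinks, uplink_gbps):
    """Ratio of total node-facing bandwidth to total uplink bandwidth."""
    downlink = nodes_per_rack * nic_gbps
    uplink = uplinks * uplink_gbps
    return downlink / uplink


def cross_rack_gbps_per_node(nodes_per_rack, nic_gbps, uplinks, uplink_gbps):
    """Off-rack bandwidth per node if every node in the rack sends at once."""
    fair_share = uplinks * uplink_gbps / nodes_per_rack
    return min(nic_gbps, fair_share)


# Hypothetical rack: 40 nodes, two 10GbE uplinks to the distribution layer.
ratio_1g = oversubscription(40, 1, 2, 10)            # 1GbE NICs
ratio_10g = oversubscription(40, 10, 2, 10)          # 10GbE NICs, same uplinks
share_1g = cross_rack_gbps_per_node(40, 1, 2, 10)
share_10g = cross_rack_gbps_per_node(40, 10, 2, 10)

print(ratio_1g, ratio_10g, share_1g, share_10g)
```

With these assumed numbers, moving the nodes to 10GbE without touching the
uplinks leaves the worst-case off-rack share per node unchanged; only the
oversubscription ratio gets worse.  That is the sense in which the fabric can
matter more than the NICs.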

$0.02

-Matt





On Tue, Jun 28, 2011 at 5:13 PM, Russell Jurney <[email protected]> wrote:

> Price the cost of 1GbE->10GbE vs. more nodes, using data from monitoring
> your cluster during peak load.  It should be clear which is a better value.
>
> Russ
>
> On Tue, Jun 28, 2011 at 4:05 PM, Mathias Herberts <
> [email protected]> wrote:
>
> > On Wed, Jun 29, 2011 at 01:02, Matei Zaharia <[email protected]> wrote:
> > > Ideally, to evaluate whether you want to go for 10GbE NICs, you would
> > > profile your target Hadoop workload and see whether it's
> > > communication-bound. Hadoop jobs can definitely be communication-bound
> > > if you shuffle a lot of data between map and reduce, but I've also seen
> > > a lot of clusters that are CPU-bound (due to decompression, running
> > > python, or just running expensive user code) or disk-IO-bound. You
> > > might be surprised at what your bottleneck is.
> >
> > From my experience, jobs that shuffle lots of data are also very often
> > slowed down by the sort phase; compressing mappers' output is a first
> > step to improve performance. Given the cost of a 10GbE infrastructure
> > with no oversubscription, I'd monitor bandwidth usage very closely prior
> > to investing in that kind of network gear.
> >
>
