On Wed, Jun 29, 2011 at 01:02, Matei Zaharia <[email protected]> wrote:
> Ideally, to evaluate whether you want to go for 10GbE NICs, you would profile
> your target Hadoop workload and see whether it's communication-bound. Hadoop
> jobs can definitely be communication-bound if you shuffle a lot of data
> between map and reduce, but I've also seen a lot of clusters that are
> CPU-bound (due to decompression, running python, or just running expensive
> user code) or disk-IO-bound. You might be surprised at what your bottleneck
> is.

From my experience, jobs that shuffle lots of data are also very often slowed down by the sort phase; compressing the mappers' output is a first step to improve performance. Given the cost of a 10GbE infrastructure with no oversubscription, I'd monitor bandwidth usage very closely before investing in that kind of network gear.
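
As a rough illustration of what "monitoring bandwidth usage" could look like on a single Linux node, here is a minimal sketch that turns two samples of the interface byte counters from /proc/net/dev into a link utilization figure. The function names, the interface name eth0 and the 1 Gb/s link speed are assumptions made for the example, not anything specific to Hadoop; in practice you would more likely get the same numbers from whatever monitoring system you already run.

```python
import time


def parse_counters(proc_net_dev_text, iface):
    """Return (rx_bytes, tx_bytes) for iface from the text of /proc/net/dev."""
    for line in proc_net_dev_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == iface:
            fields = rest.split()
            # Field 0 is received bytes, field 8 is transmitted bytes.
            return int(fields[0]), int(fields[8])
    raise KeyError("interface not found: " + iface)


def link_utilization(bytes_start, bytes_end, interval_s, link_gbps=1.0):
    """Fraction of link capacity used between two byte-counter samples.

    Counter wrap-around is not handled in this sketch.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    bits_sent = (bytes_end - bytes_start) * 8
    capacity_bits = link_gbps * 1e9 * interval_s
    return bits_sent / capacity_bits


def sample(iface="eth0", interval_s=10.0, link_gbps=1.0):
    """Sample /proc/net/dev twice and return (rx_util, tx_util)."""
    with open("/proc/net/dev") as f:
        rx0, tx0 = parse_counters(f.read(), iface)
    time.sleep(interval_s)
    with open("/proc/net/dev") as f:
        rx1, tx1 = parse_counters(f.read(), iface)
    return (link_utilization(rx0, rx1, interval_s, link_gbps),
            link_utilization(tx0, tx1, interval_s, link_gbps))
```

Running sample() on a few nodes during the shuffle phase of a representative job should show whether the existing links are anywhere near saturated. If utilization stays well below 1.0 while the job is slow, the bottleneck is probably elsewhere (CPU, disk or the sort).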
