Multiple cores vs multiple nodes

Safdar Kureishy Sun, 01 Jul 2012 04:14:37 -0700

Hi,

I have a reasonably simple question that I thought I'd post to this list
because I don't have enough experience with hardware to figure this out
myself.


Let's assume that I have 2 separate cluster setups for slave nodes. The
master node is a separate machine *outside* these clusters:
*Setup A*: 28 nodes, each with a 2-core CPU, 8 GB RAM and 1 SATA drive (1
TB)
*Setup B*: 7 nodes, each with an 8-core CPU, 32 GB RAM and 4 SATA drives (1
TB each)

Note that I have maintained the same *core:memory:spindle* ratio above. In
essence, setup B has the same overall processing + memory + spindle
capacity, but achieved with a quarter as many nodes.
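
As a quick sanity check on the arithmetic above, here is a minimal Python
sketch that totals cores, RAM and spindles for each setup. The class and
field names are hypothetical and used only for illustration; the numbers are
the ones listed for Setup A and Setup B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setup:
    """Hypothetical description of one slave-node cluster layout."""
    nodes: int
    cores_per_node: int
    ram_gb_per_node: int
    drives_per_node: int  # each drive is 1 TB

    def totals(self):
        """Return (total cores, total RAM in GB, total spindles)."""
        return (
            self.nodes * self.cores_per_node,
            self.nodes * self.ram_gb_per_node,
            self.nodes * self.drives_per_node,
        )

    def per_core(self):
        """Return (GB of RAM per core, spindles per core)."""
        return (
            self.ram_gb_per_node / self.cores_per_node,
            self.drives_per_node / self.cores_per_node,
        )


setup_a = Setup(nodes=28, cores_per_node=2, ram_gb_per_node=8, drives_per_node=1)
setup_b = Setup(nodes=7, cores_per_node=8, ram_gb_per_node=32, drives_per_node=4)

for name, setup in (("A", setup_a), ("B", setup_b)):
    cores, ram_gb, spindles = setup.totals()
    print(f"Setup {name}: {cores} cores, {ram_gb} GB RAM, {spindles} spindles")
```

Both setups come out to 56 cores, 224 GB of RAM and 28 spindles, i.e. 4 GB of
RAM and half a spindle per core in either case.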

Ignoring the *cost* of each node above, and assuming 10Gb Ethernet
connectivity and the same speed-per-core across nodes in both the scenarios
above, are Setup A and Setup B equivalent to each other in the context of
setting up a Hadoop cluster? Or will the relative performance be different?
Excluding the network connectivity between the nodes, what would be some
other criteria that might give one setup an edge over the other, for
regular Hadoop jobs?

Also, assuming the same type of Hadoop jobs on both clusters, how different
would the load experienced by the master node be for each setup above?

Thanks in advance,
Safdar
