Pierre,

As discussed in other recent threads, it depends.
The most sensible thing for Hadoop nodes is to find a sweet spot for
price/performance.
In general that means keeping a balance between compute power, disks,
and network bandwidth, while also factoring in racks, space, operating
costs, etc.

How much storage capacity are you thinking of when you target "about 120
data nodes"?

If you had, for example, 60 quad-socket nodes with 12 x 2 TB disks (or
more) each, I would suspect you would be bottlenecked on your 1 GbE
network connections.
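As a rough back-of-the-envelope check (assuming ~100 MB/s sustained
sequential throughput per SATA drive, which you should verify against
your actual disks):

  12 disks x ~100 MB/s = ~1,200 MB/s aggregate local disk bandwidth per node
  1 GbE                = ~125 MB/s theoretical, closer to ~110 MB/s in practice

So a single GbE link can move roughly a tenth of what the disks in one
node can stream, which matters for anything shuffle- or replication-heavy.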

Another thing to consider is how many nodes you put per rack. If these 60
nodes were 2U and you fit 20 nodes in a rack, then losing one top-of-rack
switch means losing 1/3 of the capacity of your cluster.
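Related to that: with multiple racks you will want to make HDFS rack-aware
(topology.script.file.name in core-site.xml on 0.20.x), so block replicas
get spread across racks and a dead switch doesn't take every copy of a
block offline at once. Here is a minimal sketch of such a topology script,
assuming a hypothetical node-<rack>-<slot> hostname scheme (adjust the
parsing to whatever naming you actually use):

#!/usr/bin/env python
# Minimal rack-awareness topology script sketch.
# Hadoop passes one or more hostnames/IPs as arguments and expects
# one rack path per argument on stdout.
# Assumes hostnames like "node-<rack>-<slot>"; adjust to your naming.
import sys

DEFAULT_RACK = "/default-rack"

def rack_of(host):
    parts = host.split("-")
    if len(parts) >= 3 and parts[0] == "node":
        return "/rack-" + parts[1]
    return DEFAULT_RACK

for arg in sys.argv[1:]:
    sys.stdout.write(rack_of(arg) + "\n")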

Yet another consideration is how easily you want to be able to expand your
cluster incrementally. Until you run Hadoop 0.23, you probably want all
your nodes to be roughly similar in capacity.

Cheers,

Joep

On Fri, Dec 16, 2011 at 3:50 AM, Cussol <pierre.cus...@cnes.fr> wrote:

>
>
> In my company, we intend to set up a Hadoop cluster to run analytics
> applications. This cluster would have about 120 data nodes: dual-socket
> servers with a GbE interconnect. We are also exploring a solution with 60
> quad-socket servers. How do quad-socket and dual-socket servers compare
> in a Hadoop cluster?
>
> Any help?
>
> pierre
