Uhm... If I may add something...

Joep is correct. There are a lot of factors that will affect your cluster
design.
And there have been a lot of threads on this topic, because hardware prices
change frequently along with advances in technology, as well as non-commodity
solutions aimed at niche spaces.
Plus this is the biggest decision that you can't easily change; you are
forced to live with it...
(I think there's a potential blog post in this...)

Going from memory at 4:30 am is not a good thing to do, but I believe that a
standard rack has 42U of space, so you can fit 20 2U boxes in your rack and
still have room for your ToR switch. There is also the issue of power and
cooling, which may take up space too...
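To make that rack math concrete, here's a quick back-of-the-envelope sketch. The 42U rack, 2U node, and 1U ToR switch figures come from the paragraph above; anything your rack loses to PDUs, cable management, or airflow is on top of this:

```python
# Back-of-the-envelope rack capacity (assumptions: 42U rack, 2U nodes, 1U ToR switch).
RACK_U = 42
NODE_U = 2
TOR_SWITCH_U = 1

usable_u = RACK_U - TOR_SWITCH_U          # 41U left after the ToR switch
nodes_per_rack = usable_u // NODE_U       # whole 2U nodes that fit
leftover_u = usable_u - nodes_per_rack * NODE_U

print(f"{nodes_per_rack} nodes per rack, {leftover_u}U to spare")
# -> 20 nodes per rack, 1U to spare
```

Real deployments usually reserve a few more U for power distribution and cooling, so treat 20 as the ceiling, not the plan.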

The one common question that no one seems to ask is... "What are your
constraints?"
For some it may be physical space, for others budget... power... hardware
availability...
That one question will have a big impact on your cluster design, and nobody
asks it.

With respect to quad socket vs dual socket...

There was a post on Cloudera's site which recommended 2 drives per core, so
with 16 cores you would want 32 spindles. To maximize your data density, you
will want 3.5" drives. I don't think that you can fit 16 3.5" drives in a 2U
box, let alone 32... Note that I didn't even consider 24 cores...
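Spelling out that rule of thumb (the 2-drives-per-core ratio is the Cloudera recommendation quoted above; the core counts are just illustrative):

```python
# Spindle count from the 2-drives-per-core rule of thumb.
def spindles_needed(cores, drives_per_core=2):
    """Drives suggested for a node with the given core count."""
    return cores * drives_per_core

for cores in (8, 16, 24):
    print(cores, "cores ->", spindles_needed(cores), "spindles")
# 16 cores -> 32 spindles, i.e. more 3.5" drives than a 2U chassis holds.
```

At 24 cores the rule asks for 48 spindles per node, which is why dense quad-socket boxes get disk-starved long before they get CPU-starved.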

And as Joep points out, with this much disk, 1GbE, even port bonded, isn't
going to cut it...
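A rough sanity check on that network point. The ~100 MB/s sequential figure per SATA spindle and the two bonded ports are my assumptions for illustration, not numbers from this thread; your drives and NICs will vary:

```python
# Rough comparison of aggregate local disk bandwidth vs. network bandwidth.
# Assumptions: ~100 MB/s sequential per spindle; 1GbE tops out near 125 MB/s raw.
SPINDLE_MBPS = 100
GBE_MBPS = 125

spindles = 12          # e.g. the 12 x 2 TB config Joep mentions
bonded_ports = 2       # two bonded 1GbE links

disk_bw = spindles * SPINDLE_MBPS    # what the disks can stream
net_bw = bonded_ports * GBE_MBPS     # what the bonded NICs can carry

print(f"disks: {disk_bw} MB/s vs network: {net_bw} MB/s")
# -> disks: 1200 MB/s vs network: 250 MB/s
```

Even bonded, the network covers only about a fifth of what 12 spindles can stream, which is the bottleneck Joep is warning about.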

Lots of things to think about...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 16, 2011, at 10:47 AM, "J. Rottinghuis" <jrottingh...@gmail.com> wrote:

> Pierre,
> 
> As discussed in recent other threads, it depends.
> The most sensible thing for Hadoop nodes is to find a sweet spot for
> price/performance.
> In general that will mean keeping a balance between compute power, disks,
> and network bandwidth, and factor in racks, space, operating costs etc.
> 
> How much storage capacity are you thinking of when you target "about 120
> data nodes"?
> 
> If you had, for example, 60 quad core nodes with 12 * 2 TB disks (or more), I
> would suspect you would be bottlenecked on your 1GbE network connections.
> 
> Other things to consider: how many nodes per rack? If these 60 nodes
> were 2U and you'd fit 20 nodes in a rack, then losing one top-of-rack
> switch means losing 1/3 of the capacity of your cluster.
> 
> Yet another consideration is how easily you want to be able to expand your
> cluster incrementally? Until you run Hadoop 0.23 you probably want all your
> nodes to be roughly similar in capacity.
> 
> Cheers,
> 
> Joep
> 
> On Fri, Dec 16, 2011 at 3:50 AM, Cussol <pierre.cus...@cnes.fr> wrote:
> 
>> 
>> 
>> In my company, we intend to set up a Hadoop cluster to run analytics
>> applications. This cluster would have about 120 data nodes with dual-socket
>> servers and a GbE interconnect. We are also exploring a solution with 60
>> quad-socket servers. How do quad-socket and dual-socket servers compare in
>> a Hadoop cluster?
>> 
>> any help ?
>> 
>> pierre
>> --
>> View this message in context:
>> http://old.nabble.com/Hadoop-and-hardware-tp32987374p32987374.html
>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>> 
>> 
