Quick question about data node hardware. I've read a few articles that cover the basics, including Cloudera's recommendations here:
http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/

The article is from early 2010, but I'm assuming the general guidelines haven't deviated much from the recommended baselines. I'm skewing my build towards the "Compute optimized" side of the spectrum, which calls for a 1:1 core-to-spindle ratio and more RAM per node for in-memory caching. The other important consideration is low(ish) power consumption. With that in mind, I have specced out the following (per node):

Chassis: 1U Supermicro chassis with 2x 1Gb/sec ethernet ports (http://www.supermicro.com/products/system/1u/5017/sys-5017c-mtf.cfm) (~500USD)
Memory: 32GB Unbuffered ECC RAM (~280USD)
Disks: 4x 2TB Hitachi Ultrastar 7200RPM SAS Drives (~960USD)
CPU: 1x Intel E3-1230-v2 (3.3GHz 4 Core / 8 Thread 69W) (~240USD)

The backplane will consist of a dedicated high-powered switch (not sure which one yet), with each node using link aggregation across its two 1Gb/sec ports.

Does this look reasonable? We are looking into buying 4-5 of these nodes for our initial test bench for under $10,000, and plan to expand to about 50-100 nodes by next year.
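
For reference, here is a rough sanity check of the numbers above as a small Python sketch. It uses only the approximate per-node prices I listed; the switch, cabling, and shipping are not included, and the variable names are just for illustration.

```python
# Approximate per-node component prices in USD, as listed in the spec above.
# The switch and any cabling are NOT included (no model chosen yet).
node_parts = {
    "chassis": 500,
    "memory": 280,
    "disks": 960,   # 4x 2TB drives in total
    "cpu": 240,
}

physical_cores = 4       # E3-1230-v2: 4 cores / 8 threads
spindles = 4             # 4 drives per node
raw_tb_per_node = 4 * 2  # 4 drives x 2TB each
budget = 10_000          # target ceiling for the initial test bench

node_cost = sum(node_parts.values())


def cluster_cost(nodes: int) -> int:
    """Total hardware cost for the given number of identical nodes."""
    return nodes * node_cost


if __name__ == "__main__":
    print(f"Per node: ${node_cost}")
    print(f"Core-to-spindle ratio: {physical_cores}:{spindles}")
    for n in (4, 5):
        total = cluster_cost(n)
        status = "within" if total <= budget else "over"
        print(f"{n} nodes: ${total} ({status} budget), "
              f"{n * raw_tb_per_node}TB raw disk")
```

That works out to roughly $1,980 per node, so 5 nodes come to about $9,900, which leaves very little room for the switch inside the $10,000 target.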


