Great, thank you for such detailed information.
By the way, what type of disk controller do you use?

Thanks,
Oleg.

On Tue, Oct 2, 2012 at 6:34 AM, Alexander Pivovarov <[email protected]> wrote:

> Hi Oleg
>
> Cloudera and Dell set up the following cluster for my company.
> The company receives 1.5 TB of raw data per day.
>
> 38 data nodes + 2 Name Nodes
>
> Data Node:
> Dell PowerEdge C2100 series
> 2 x XEON x5670
> 48 GB RAM ECC (12x4GB 1333MHz)
> 12 x 2 TB 7200 RPM SATA HDD (with hot swap) JBOD
> Intel Gigabit ET Dual port PCIe x4
> Redundant Power Supply
> Hadoop CDH3
> max map tasks 24
> max reduce tasks 8
>
> Name Node and Secondary Name Node are similar, but with
> 96GB RAM (not sure why)
> 6x600GB 15K RPM Serial SCSI
> RAID10
>
> Another config is here,
> page 298:
>
> http://books.google.com/books?id=Wu_xeGdU4G8C&pg=PA298&lpg=PA298&dq=hadoop+jbod&source=bl&ots=i7xVQBPb_w&sig=8mhq-MtpkRcTiRB1ioKciMxIasg&hl=en&sa=X&ei=AGtqUMK6D8T10gHD4ICQAQ&ved=0CEMQ6AEwAg#v=onepage&q=hadoop%20jbod&f=false
>
> You probably need just 1 computer with 10 x 2 TB SATA HDD.
>
> On Mon, Oct 1, 2012 at 6:02 PM, Oleg Ruchovets <[email protected]> wrote:
>
> > Hi,
> > We are at a very early stage of our Hadoop project and want to do a POC.
> >
> > We have ~5-6 terabytes of raw data and we are going to execute some
> > aggregations.
> >
> > We plan to use 8-10 machines.
> >
> > Questions:
> >
> > 1) Which hardware should we use:
> >    a) How many disks, and which disks are better to use?
> >    b) How much RAM?
> >    c) How many CPUs?
> >
> > 2) Please share best practices and tips/tricks for making good use of
> >    hardware in Hadoop projects.
> >
> > Thanks in advance
> > Oleg.
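
P.S. A rough way to sanity-check the disk numbers quoted above is a back-of-the-envelope capacity calculation. This is only a sketch: the replication factor of 3 is an assumption (the usual HDFS default, not stated in the thread), the helper names are made up for illustration, and it ignores temp/shuffle space, compression, and filesystem overhead.

```python
import math

TB_PER_DISK = 2   # from the thread: 2 TB SATA drives
REPLICATION = 3   # assumption: HDFS default replication, not stated in the thread


def raw_capacity_tb(nodes, disks_per_node, tb_per_disk=TB_PER_DISK):
    """Total raw disk space across the data nodes, in TB."""
    return nodes * disks_per_node * tb_per_disk


def hdfs_footprint_tb(data_tb, replication=REPLICATION):
    """Disk space consumed once every block is stored `replication` times."""
    return data_tb * replication


def disks_needed(data_tb, tb_per_disk=TB_PER_DISK, replication=REPLICATION):
    """Smallest whole number of disks that holds the replicated data."""
    return math.ceil(hdfs_footprint_tb(data_tb, replication) / tb_per_disk)


# Alexander's cluster: 38 data nodes x 12 disks x 2 TB, 1.5 TB of new data per day.
cluster_raw = raw_capacity_tb(38, 12)
days_until_full = cluster_raw / hdfs_footprint_tb(1.5)

# The POC data set: upper estimate of 6 TB.
poc_footprint = hdfs_footprint_tb(6)
poc_disks = disks_needed(6)

print(f"cluster raw capacity: {cluster_raw} TB")
print(f"days of ingest until full: {days_until_full:.0f}")
print(f"POC footprint at 3x replication: {poc_footprint} TB ({poc_disks} disks)")
```

Under these assumptions, 6 TB at 3x replication takes 18 TB, which fits in the 20 TB of raw space that 10 x 2 TB drives provide. Note that this is a capacity-only estimate: HDFS places the replicas of a block on different DataNodes, so a single machine would hold only one copy (pass `replication=1` to model that).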
