Hi Oleg,

Cloudera and Dell set up the following cluster for my company.
The company receives 1.5 TB of raw data per day.

38 data nodes + 2 Name Nodes

Data Node:
Dell PowerEdge C2100 series
2 x Xeon X5670
48 GB ECC RAM (12 x 4 GB, 1333 MHz)
12 x 2 TB 7200 RPM SATA HDD (hot swap), JBOD
Intel Gigabit ET Dual port PCIe x4
Redundant Power Supply
Hadoop CDH3
max map tasks 24
max reduce tasks 8
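
As a rough sanity check on the data node numbers above, here is a back-of-the-envelope sketch in Python. The replication factor of 3 is an assumption (it is the HDFS default, but the spec above does not say what this cluster uses), no space is set aside for intermediate data, and the function names are only illustrative.

```python
# Back-of-the-envelope sizing for the data nodes listed above.
# Assumptions (not part of the cluster spec): HDFS replication factor 3,
# decimal terabytes, and no disk space reserved for intermediate data.

DATA_NODES = 38
DISKS_PER_NODE = 12
DISK_TB = 2
DAILY_INGEST_TB = 1.5
REPLICATION = 3  # assumed; this is the HDFS default

RAM_GB = 48
MAP_SLOTS = 24
REDUCE_SLOTS = 8


def raw_capacity_tb(nodes, disks_per_node, disk_tb):
    """Total disk space across all data nodes, before replication."""
    return nodes * disks_per_node * disk_tb


def usable_capacity_tb(raw_tb, replication):
    """Space left for unique data once every block is stored `replication` times."""
    return raw_tb / replication


def days_of_retention(usable_tb, daily_ingest_tb):
    """How many days of ingest fit before the cluster is full."""
    return usable_tb / daily_ingest_tb


def ram_per_slot_gb(ram_gb, map_slots, reduce_slots):
    """RAM available per task slot if all slots are busy (ignores OS and daemons)."""
    return ram_gb / (map_slots + reduce_slots)


raw = raw_capacity_tb(DATA_NODES, DISKS_PER_NODE, DISK_TB)
usable = usable_capacity_tb(raw, REPLICATION)
days = days_of_retention(usable, DAILY_INGEST_TB)
per_slot = ram_per_slot_gb(RAM_GB, MAP_SLOTS, REDUCE_SLOTS)

print(f"raw capacity:     {raw} TB")
print(f"usable capacity:  {usable:.0f} TB at replication {REPLICATION}")
print(f"retention:        {days:.0f} days at {DAILY_INGEST_TB} TB/day")
print(f"RAM per slot:     {per_slot} GB")
```

Changing REPLICATION or the slot counts shows quickly how sensitive the retention and per-task memory figures are to those settings.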

The Name Node and Secondary Name Node are similar, but with:
96 GB RAM (not sure why)
6 x 600 GB 15K RPM SAS (Serial Attached SCSI)
RAID10


Another configuration is described here, on page 298:
http://books.google.com/books?id=Wu_xeGdU4G8C&pg=PA298&lpg=PA298&dq=hadoop+jbod&source=bl&ots=i7xVQBPb_w&sig=8mhq-MtpkRcTiRB1ioKciMxIasg&hl=en&sa=X&ei=AGtqUMK6D8T10gHD4ICQAQ&ved=0CEMQ6AEwAg#v=onepage&q=hadoop%20jbod&f=false


For 5-6 TB you probably need just 1 computer with 10 x 2 TB SATA HDDs.
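
A quick way to check that estimate is the sketch below. The 25% headroom for intermediate and output data is my assumption, not a measured figure, and `fits_on_one_machine` is just an illustrative helper. On a single machine HDFS can only keep one copy of each block, so the replication factor is effectively 1.

```python
# Rough check: does a data set fit on a single machine's disks?
# Assumptions: decimal terabytes, and a headroom fraction reserved for
# intermediate/output data (25% here is a guess, not a measurement).


def fits_on_one_machine(data_tb, disks, disk_tb, replication=1, headroom=0.25):
    """Return True if data_tb, stored `replication` times, fits in the
    disk space that remains after reserving `headroom` of raw capacity."""
    raw_tb = disks * disk_tb
    available_tb = raw_tb * (1 - headroom)
    needed_tb = data_tb * replication
    return needed_tb <= available_tb


# 6 TB of raw data on 10 x 2 TB disks, single copy of each block.
single_copy = fits_on_one_machine(data_tb=6, disks=10, disk_tb=2)
print("6 TB, replication 1:", single_copy)

# The same data at replication 3 would need 18 TB, which exceeds the
# 15 TB left after headroom, so it would call for more disks or machines.
three_copies = fits_on_one_machine(data_tb=6, disks=10, disk_tb=2, replication=3)
print("6 TB, replication 3:", three_copies)
```

If you do want replication 3 for the POC, the same helper shows that you would need either less headroom or a few more disks spread over several machines.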



On Mon, Oct 1, 2012 at 6:02 PM, Oleg Ruchovets <oruchov...@gmail.com> wrote:

> Hi,
>   We are at a very early stage of our Hadoop project and want to do a POC.
>
> We have ~5-6 terabytes of raw data and we are going to execute some
> aggregations.
>
> We plan to use 8-10 machines.
>
> Questions:
>
>   1) Which hardware should we use:
>     a) How many disks, and which disks are better to use?
>     b) How much RAM?
>     c) How many CPUs?
>
>
>   2) Please share best practices and tips / tricks related to utilising
> hardware for Hadoop projects.
>
> Thanks in advance
> Oleg.
>
