Oleg & Lior,

A couple of questions and a couple of suggestions to ponder:
A)  When you say 20 Name Servers, I assume you are talking about 20 Task
Servers.
B)  What type are your M/R jobs? Compute intensive vs. storage intensive?
C)  What is your data growth?
D)  With the current jobs, are you saturating RAM? CPU? Or storage?
Ganglia/Hadoop metrics should tell.
E)  Also, are your jobs long running or short tasks?
Suggestions:
A)  Your name node could be 32 GB RAM with a 2 TB disk. Make sure it is an
enterprise-class server, and also back it up to an NFS mount.
B)  Also have a decent machine as the checkpoint name node. It could be
similar to the task nodes.
C)  I assume by Master Machine, you mean Job Tracker. It could be similar
to the Task Trackers - 16/24 GB memory, with 4-8 TB disk.
D)  As Jean-Daniel pointed out, 500GB disks (with more spindles) are what I
would also recommend. But it also depends on your primary data, intermediate
data and final data size. 1 or 2 TB disks are also fine, because they give
you more storage. I assume you have the default replication of 3.
E)  A 1Gb dedicated network would be good. As there are only ~25 machines,
you can hang them off of a good Gb switch. Consider 10Gb in the future if
there is too much intermediate data traffic.
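
To make the disk sizing point above concrete, here is a rough back-of-the-envelope sketch in Python. Only the 50 GB/day of logs, the 4-5 GB/day of HBase inserts and the replication of 3 come from this thread. The one-year retention, the 25% headroom for intermediate data, the 70% maximum disk fill and the 4 TB per node are assumptions picked purely for illustration, and the function names are hypothetical. Plug in your own numbers, especially once you know your data growth.

```python
import math

# Back-of-the-envelope HDFS capacity estimate.
# ASSUMPTIONS (not from the thread): retention period, intermediate-data
# headroom, maximum disk fill ratio, and usable disk per node.

def raw_storage_tb(daily_gb, retention_days, replication=3, temp_overhead=0.25):
    """Raw disk (TB) needed to keep `retention_days` of data in HDFS."""
    logical_gb = daily_gb * retention_days        # data before replication
    replicated_gb = logical_gb * replication      # HDFS block replicas
    total_gb = replicated_gb * (1 + temp_overhead)  # room for intermediate M/R data
    return total_gb / 1000.0

def nodes_needed(raw_tb, disk_tb_per_node, max_fill=0.7):
    """Data nodes needed if each disk is only filled up to `max_fill`."""
    return math.ceil(raw_tb / (disk_tb_per_node * max_fill))

# 50 GB/day of logs plus up to 5 GB/day of HBase inserts (from the thread).
daily_gb = 50 + 5
need_tb = raw_storage_tb(daily_gb, retention_days=365)  # assumed: keep one year
print("raw storage needed: %.1f TB" % need_tb)
print("nodes at 4 TB each: %d" % nodes_needed(need_tb, disk_tb_per_node=4))
```

If the result is dominated by retention rather than daily volume, larger disks make sense. If the jobs turn out to be I/O bound, more and smaller spindles per node are the better trade.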
Cheers
<k/>

On 11/21/10, "Oleg Ruchovets" <[email protected]> wrote:

>Hi all,
>After testing HBase for a few months with very light configurations (5
>machines, 2 TB disk, 8 GB RAM), we are now planning for production.
>Our Load -
>1) 50GB log files to process per day by Map/Reduce jobs.
>2)  Insert 4-5GB to 3 tables in hbase.
>3) Run 10-20 scans per day (scanning about 20 regions in a table).
>All this should run in parallel.
>Our current configuration can't cope with this load and we are having many
>stability issues.
>
>This is what we have in mind :
>1. Master machine - 32 GB, 4 TB, Two quad core CPUs.
>2. Name node - 16 GB, 2TB, Two quad core CPUs.
>we plan to have up to 20 name servers (starting with 5).
>
>We already read
>http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
>
>We would appreciate your feedback on our proposed configuration.
>
>
>Regards Oleg & Lior

