Oleg & Lior,

A couple of questions and a couple of suggestions to ponder:

A) When you say 20 name servers, I assume you are talking about 20 task servers?
B) What type are your M/R jobs? Compute intensive vs. storage intensive?
C) What is your data growth?
D) With the current jobs, are you saturating RAM? CPU? Or storage? Ganglia/Hadoop metrics should tell.
E) Also, are your jobs long running or short tasks?

Suggestions:

A) Your name node could be 32 GB, 2 TB disk. Make sure it is an enterprise-class server, and also back up to an NFS mount.
B) Also have a decent machine as the checkpoint name node. It could be similar to the task nodes.
C) I assume by Master Machine you mean Job Tracker. It could be similar to the Task Trackers: 16/24 GB memory, with 4-8 TB disk.
D) As Jean-Daniel pointed out, 500 GB disks (with more spindles) are what I would also recommend. But it also depends on your primary data, intermediate data and final data size. 1 or 2 TB disks are also fine, because they give you more storage. I assume you have the default replication of 3.
E) A 1Gb dedicated network would be good. As there are only ~25 machines, you can hang them off of a good Gb switch. Consider 10Gb in the future if there is too much intermediate data traffic.

Cheers
<k/>
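P.S. To make the disk sizing point concrete, here is a rough back-of-the-envelope sketch in Python. It is only an illustration, not a sizing tool: the function names are mine, the 25% headroom for intermediate M/R data is a guess you should replace with what your metrics show, it assumes you keep everything you ingest, and it treats 1 TB as 1000 GB. The 55 GB/day figure is just your 50 GB of logs plus the upper end of your 4-5 GB of HBase inserts.

```python
def raw_storage_gb(daily_ingest_gb, days, replication=3, overhead=0.25):
    """Raw HDFS capacity (GB) needed to retain `days` worth of ingest.

    replication: HDFS replication factor (3 is the default).
    overhead:    extra fraction reserved for intermediate and final
                 M/R output; 0.25 is an assumed placeholder.
    """
    logical_gb = daily_ingest_gb * days
    return logical_gb * replication * (1 + overhead)


def days_until_full(nodes, disk_tb_per_node, daily_ingest_gb,
                    replication=3, overhead=0.25):
    """Days of ingest a cluster can hold before its raw disk is full."""
    raw_capacity_gb = nodes * disk_tb_per_node * 1000  # 1 TB taken as 1000 GB
    per_day_gb = raw_storage_gb(daily_ingest_gb, 1, replication, overhead)
    return raw_capacity_gb / per_day_gb


if __name__ == "__main__":
    daily_gb = 50 + 5  # logs + HBase inserts, from the original mail
    print("raw GB consumed per day:", raw_storage_gb(daily_gb, 1))
    print("raw TB for one year:", raw_storage_gb(daily_gb, 365) / 1000)
    print("days until 20 nodes x 4 TB fill up:",
          round(days_until_full(20, 4, daily_gb)))
```

Under those assumptions the answer is driven almost entirely by retention and replication, so it is worth plugging in your real data growth before choosing between 500 GB and 1-2 TB disks.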
On 11/21/10, "Oleg Ruchovets" <[email protected]> wrote:

>Hi all,
>After testing HBase for few months with very light configurations (5
>machines, 2 TB disk, 8 GB RAM), we are now planing for production.
>Our Load -
>1) 50GB log files to process per day by Map/Reduce jobs.
>2) Insert 4-5GB to 3 tables in hbase.
>3) Run 10-20 scans per day (scanning about 20 regions in a table).
>All this should run in parallel.
>Our current configuration can't cope with this load and we are having many
>stability issues.
>
>This is what we have in mind :
>1. Master machine - 32 GB, 4 TB, Two quad core CPUs.
>2. Name node - 16 GB, 2TB, Two quad core CPUs.
>we plan to have up to 20 name servers (starting with 5).
>
>We already read
>http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
>
>We would appreciate your feedback on our proposed configuration.
>
>
>Regards Oleg & Lior
