Hi all,

After testing HBase for a few months with a very light configuration (5 machines, 2 TB disk, 8 GB RAM), we are now planning for production.

Our load:
1) 50 GB of log files to process per day by Map/Reduce jobs.
2) Insert 4-5 GB into 3 tables in HBase.
3) Run 10-20 scans per day (scanning about 20 regions in a table).

All of this should run in parallel. Our current configuration can't cope with this load, and we are having many stability issues.
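To put the load in numbers, here is a rough back-of-the-envelope sketch of the raw disk growth it implies. It is only an estimate under stated assumptions, not a measurement: it assumes the raw logs are kept on HDFS, that the 4-5 GB of inserts is a daily figure, that HDFS uses its default replication factor of 3, and that only about 75% of raw disk is usable. The function names and the `usable_fraction` parameter are made up for illustration, and compression, compactions, and Map/Reduce temporary space are ignored.

```python
# Back-of-the-envelope storage estimate for the load described above.
LOG_GB_PER_DAY = 50           # from the post: log files processed per day
HBASE_INSERT_GB_PER_DAY = 5   # assumption: upper end of 4-5 GB, taken as a daily figure
REPLICATION = 3               # assumption: HDFS default replication factor


def raw_gb_per_day(log_gb, hbase_gb, replication):
    """Raw disk consumed per day across the cluster (no compression)."""
    return (log_gb + hbase_gb) * replication


def days_until_full(total_disk_tb, daily_raw_gb, usable_fraction=0.75):
    """Days until the usable share of the cluster's disks is filled.

    usable_fraction is an assumed safety margin, not an HBase/HDFS setting.
    """
    usable_gb = total_disk_tb * 1024 * usable_fraction
    return usable_gb / daily_raw_gb


daily = raw_gb_per_day(LOG_GB_PER_DAY, HBASE_INSERT_GB_PER_DAY, REPLICATION)
print(f"raw growth: {daily} GB/day")

# Assumption: the current test cluster is 5 machines with 2 TB each.
print(f"days until full: {days_until_full(5 * 2, daily):.0f}")
```

Changing the replication factor or the retention of the raw logs changes the result roughly linearly, so these two inputs are worth pinning down first.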
This is what we have in mind:
1. Master machine - 32 GB RAM, 4 TB disk, two quad-core CPUs.
2. Name node - 16 GB RAM, 2 TB disk, two quad-core CPUs.

We plan to have up to 20 name servers (starting with 5).

We have already read http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/ .

We would appreciate your feedback on our proposed configuration.

Regards,
Oleg & Lior
