Re: Hadoop/HBase hardware requirement

Todd Lipcon Sun, 21 Nov 2010 16:45:08 -0800

On Sun, Nov 21, 2010 at 5:53 AM, Oleg Ruchovets <[email protected]> wrote:


> Hi all,
> After testing HBase for a few months with very light configurations (5
> machines, 2 TB disk, 8 GB RAM), we are now planning for production.
> Our Load -
> 1) 50GB log files to process per day by Map/Reduce jobs.
> 2) Insert 4-5 GB into 3 tables in HBase.
>

Are these insertions the output of the MR jobs?

If so, I would strongly recommend the bulk load functionality. It is
somewhere between 10x and 100x more efficient than direct API usage.
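
To give a rough idea of why this is cheaper: a bulk load has the MR job
write its output already sorted and split along the table's region
boundaries, so the finished files can be handed to the region servers
instead of sending every row through the normal write path. The sketch
below is only a toy model of that partition-and-sort step in plain
Python. It is not the HBase API, and the function name, row keys and
region start keys are all made up for illustration.

```python
from bisect import bisect_right


def partition_for_bulk_load(rows, region_start_keys):
    """Group (row_key, value) pairs by owning region, sorted by row key.

    region_start_keys must be sorted; the first region starts at the
    empty string, which sorts before every other key.
    """
    buckets = {start: [] for start in region_start_keys}
    for key, value in rows:
        # The owning region is the one with the greatest start key <= key.
        idx = bisect_right(region_start_keys, key) - 1
        buckets[region_start_keys[idx]].append((key, value))
    # Each sorted bucket stands in for one pre-built file for one region.
    return {start: sorted(cells) for start, cells in buckets.items()}


# Hypothetical job output and region boundaries.
rows = [("m42", "c"), ("a17", "a"), ("z03", "d"), ("k99", "b")]
files = partition_for_bulk_load(rows, ["", "k", "t"])
for start, cells in files.items():
    print(repr(start), cells)
```

In a real cluster the sorting and partitioning is done by the MR
framework and the files are written in HBase's own store file format;
the point here is only that the work happens once, in bulk, rather
than row by row.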


> 3) Run 10-20 scans per day (scanning about 20 regions in a table).
> All this should run in parallel.
> Our current configuration can't cope with this load and we are having many
> stability issues.
>
> This is what we have in mind :
> 1. Master machine - 32 GB, 4 TB, Two quad core CPUs.
> 2. Name node - 16 GB, 2TB, Two quad core CPUs.
> we plan to have up to 20 name servers (starting with 5).
>
> We already read
>
> http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
> .
>
> We would appreciate your feedback on our proposed configuration.
>
>
> Regards Oleg & Lior
>



-- 
Todd Lipcon
Software Engineer, Cloudera
