Please find the data requirements for our use case below:

Raw data processing
----------------------------------
1. Data is populated into HDFS; after ETL, around 3 billion puts per day into HBase
2. Oldest data after X days to be deleted from HBase

Aggregates processing
----------------------------------
3 billion reads per day ...
Large scan or reads
KV size around 1 KB
Daily processing, raw and aggregates, via M/R jobs
Hive queries in future, but not of immediate focus

On Feb 5, 2014 12:48 AM, "Vladimir Rodionov" <[email protected]> wrote:

> Yes,
>
> 1. What is the expected avg and peak load in writes/updates/deletes/reads?
> 2. What is the average size of a KV?
> 3. Reads/small scans/medium/large scan %%
> 4. Do you plan M/R jobs, Hive query?
>
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [email protected]
>
> ________________________________________
> From: Nick Xie [[email protected]]
> Sent: Tuesday, February 04, 2014 10:02 AM
> To: [email protected]
> Subject: Re: Regarding Hardware configuration for HBase cluster
>
> I guess you'd better describe a little bit more about your applications.
> Does the data increase over the time at all?
>
> Nick
>
>
> On Tue, Feb 4, 2014 at 5:22 AM, suresh babu <[email protected]> wrote:
>
> > Hi folks,
> >
> > We are trying to setup HBase cluster for the following requirement:
> >
> > We have to maintain data of size around 800TB,
> >
> > For the above requirement, please suggest me the best hardware configuration
> > details like
> >
> > 1) how many disks to consider for machine and the capacity of disks, for
> > example, 16/24 disks per node with 1/2TB capacity per each disk
> >
> > 2) which compression method is suited for production environment, space is
> > not a major limitation, but speed is of prime concern for my use case
> >
> > 3) how many CPU Cores should be configured for each node/machine? Or
> > ideal ratio of number of cores to the number of disks, for example
> > 1core/1disk?
> >
> > Regards,
> > Kaushik
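
For a rough sense of scale, the figures given at the top of this message (3 billion puts per day, KV size around 1 KB, around 800 TB in total) can be turned into average rates with a few lines of Python. This is only a back-of-the-envelope sketch under stated assumptions: 1 KB is taken as 1024 bytes, the 800 TB is treated as logical data before replication and compression, and an HDFS replication factor of 3 is assumed (the HDFS default, not something stated in this thread). The function name `sizing_estimate` is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sizing_estimate(puts_per_day, kv_size_bytes, total_capacity_bytes, replication=3):
    """Back-of-the-envelope ingest figures; averages only, peaks are not modelled."""
    # Average write rate if the load were spread evenly over the day.
    avg_puts_per_sec = puts_per_day / SECONDS_PER_DAY
    # Logical bytes written per day, before replication and compression.
    raw_bytes_per_day = puts_per_day * kv_size_bytes
    # Bytes landing on disk per day with the ASSUMED replication factor.
    replicated_bytes_per_day = raw_bytes_per_day * replication
    # Days of ingest needed to reach the target logical data size.
    days_to_fill = total_capacity_bytes / raw_bytes_per_day
    return {
        "avg_puts_per_sec": avg_puts_per_sec,
        "raw_bytes_per_day": raw_bytes_per_day,
        "replicated_bytes_per_day": replicated_bytes_per_day,
        "days_to_fill": days_to_fill,
    }


TIB = 1024 ** 4

estimate = sizing_estimate(
    puts_per_day=3_000_000_000,     # from this thread
    kv_size_bytes=1024,             # "around 1 KB", taken as 1024 bytes
    total_capacity_bytes=800 * TIB, # "around 800TB", assumed to be logical size
)

for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

The peak load asked about earlier in the thread is not covered here; the per-second figure is a daily average and real bursts may be several times higher.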
