Hi suresh babu: how many data nodes do you have?
2014-02-07 suresh babu <[email protected]>:
> refreshing the thread,
>
> Can you please suggest any inputs for the hardware configuration (for the
> below mentioned use case).
>
> On Wed, Feb 5, 2014 at 10:31 AM, suresh babu <[email protected]> wrote:
>
> > Please find the data requirements for our use case below :
> >
> > Raw data processing
> > ----------------------------------
> > 1. Data is populated into hdfs , after etl around 3 billion puts per day
> > in to hbase
> >
> > 2. Oldest data after X days to be deleted from hbase
> >
> > Aggregates processing
> > ----------------------------------
> > 3 billion reads per day ... Large scan or reads
> >
> > KV size around 1 KB
> > Daily Processing, raw and aggregates, via M/R jobs
> > Hive queries in future, but not of immediate focus
> >
> > On Feb 5, 2014 12:48 AM, "Vladimir Rodionov" <[email protected]> wrote:
> >
> >> Yes,
> >>
> >> 1. What is the expected avg and peak load in writes/updates/deletes/reads?
> >> 2. What is the average size of a KV?
> >> 3. Reads/small scans/medium/large scan %%
> >> 4. Do you plan M/R jobs, Hive query?
> >>
> >> Best regards,
> >> Vladimir Rodionov
> >> Principal Platform Engineer
> >> Carrier IQ, www.carrieriq.com
> >> e-mail: [email protected]
> >>
> >> ________________________________________
> >> From: Nick Xie [[email protected]]
> >> Sent: Tuesday, February 04, 2014 10:02 AM
> >> To: [email protected]
> >> Subject: Re: Regarding Hardware configuration for HBase cluster
> >>
> >> I guess you'd better describe a little bit more about your applications.
> >> Does the data increase over the time at all?
> >>
> >> Nick
> >>
> >> On Tue, Feb 4, 2014 at 5:22 AM, suresh babu <[email protected]> wrote:
> >>
> >> > Hi folks,
> >> >
> >> > We are trying to setup HBase cluster for the following requirement:
> >> >
> >> > We have to maintain data of size around 800TB,
> >> >
> >> > For the above requirement, please suggest me the best hardware
> >> > configuration details like
> >> >
> >> > 1) how many disks to consider for machine and the capacity of disks, for
> >> > example, 16/24 disks per node with 1/2TB capacity per each disk
> >> >
> >> > 2) which compression method is suited for production environment, space
> >> > is not a major limitation, but speed is of prime concern for my use case
> >> >
> >> > 3) how many CPU Cores should be configured for each node/machine ? Or
> >> > ideal ratio of number of cores to the number of disks, for example
> >> > 1core/1disk ?
> >> >
> >> > Regards,
> >> > Kaushik

--
Best Regards
亦思科技 is-land Systems Inc.
Tel:03-5630345 Ext.14
Fax:03-5631345
e-MAIL:[email protected]
何永安 Yung An He
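
For anyone following the thread, the figures quoted above (800TB of data, about 3 billion puts per day, KV size around 1 KB, 16/24 disks of 1/2TB per node) can be turned into a rough first estimate. The sketch below is only back-of-the-envelope arithmetic, not a recommendation. It assumes an HDFS replication factor of 3, that the 800TB figure is before replication, and decimal units (1 KB = 1000 bytes, 1 TB = 10^12 bytes). None of these are stated in the thread. It also ignores compression, HBase storage overhead, and free-space headroom, so real node counts would differ. The function names are made up for illustration.

```python
import math

# Figures quoted in the thread.
PUTS_PER_DAY = 3_000_000_000   # "around 3 billion puts per day"
KV_SIZE_BYTES = 1_000          # "KV size around 1 KB" (decimal KB is an assumption)
DATASET_TB = 800               # "data of size around 800TB"

# Assumptions, NOT from the thread.
REPLICATION = 3                # assumed HDFS replication factor
SECONDS_PER_DAY = 86_400


def avg_puts_per_second(puts_per_day):
    """Average write rate if puts were spread evenly over the day."""
    return puts_per_day / SECONDS_PER_DAY


def daily_ingest_tb(puts_per_day, kv_size_bytes):
    """Logical data written per day in decimal TB, before replication."""
    return puts_per_day * kv_size_bytes / 1e12


def nodes_needed(dataset_tb, disks_per_node, disk_tb, replication=REPLICATION):
    """Minimum node count by raw disk capacity alone (no headroom, no compression)."""
    raw_tb = dataset_tb * replication
    per_node_tb = disks_per_node * disk_tb
    return math.ceil(raw_tb / per_node_tb)


if __name__ == "__main__":
    print("avg puts/s:", round(avg_puts_per_second(PUTS_PER_DAY)))
    print("daily ingest (TB):", daily_ingest_tb(PUTS_PER_DAY, KV_SIZE_BYTES))
    # The disk layouts mentioned in the original question.
    for disks, size_tb in [(16, 1), (16, 2), (24, 1), (24, 2)]:
        print(f"{disks} x {size_tb}TB per node ->",
              nodes_needed(DATASET_TB, disks, size_tb), "nodes")
```

The average rate hides the peak load that Vladimir asked about, so the real write capacity would have to be sized against peak, not against this number.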
