Ramu,

If your working set fits into 192GB, you may get an additional boost by utilizing the OS page cache, or you can wait for the 0.98 release, which introduces a new bucket cache implementation (a port of the Facebook L2 cache). You can also try the vanilla bucket cache in 0.96 (not released yet, but due soon). Both caches store data off-heap, but the Facebook version can store encoded and compressed data while the vanilla bucket cache cannot. So there are options for utilizing the available RAM efficiently, at least in the upcoming HBase releases. If your data set does not fit in RAM, then your only hope is your 24 SAS drives, and what you get will depend on your RAID settings, disk I/O performance, and HDFS configuration (I think the latest Hadoop is preferable here).
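For reference, here is a minimal sketch of how the vanilla BucketCache gets wired up, as I read the 0.96 code; treat the exact property names and sizing below as assumptions to verify against whichever release you end up on.

In hbase-env.sh, reserve direct memory for the off-heap engine:

    # Sets -XX:MaxDirectMemorySize for the region server JVM (5G is an assumed value)
    export HBASE_OFFHEAPSIZE=5G

In hbase-site.xml:

    <property>
      <!-- "heap", "offheap", or "file:/path" for a file-backed cache -->
      <name>hbase.bucketcache.ioengine</name>
      <value>offheap</value>
    </property>
    <property>
      <!-- cache capacity in MB; a value below 1.0 is read as a fraction of heap -->
      <name>hbase.bucketcache.size</name>
      <value>4096</value>
    </property>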
The OS page cache is the most vulnerable and volatile: it cannot be controlled and can easily be polluted, either by other processes or by HBase itself (a long scan, for example). With the block cache you have more control, but the first truly usable *official* implementation is going to be part of the 0.98 release.

As far as I understand, your use case would definitely be covered by something similar to BigTable's ScanCache (RowCache), but there is no such cache in HBase yet. One major advantage of a RowCache over the BlockCache (apart from being much more efficient in RAM usage) is resilience to region compactions: each minor compaction partially invalidates a region's data in the BlockCache, and a major compaction invalidates that region's data completely. That would not be the case with a RowCache (were it implemented); see the P.S. at the end of this message for a rough application-side approximation.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: [email protected]

________________________________________
From: Ramu M S [[email protected]]
Sent: Monday, October 07, 2013 5:25 PM
To: [email protected]
Subject: Re: HBase Random Read latency > 100ms

Vladimir,

Yes, I am fully aware of the HDD limitations and the wrong configuration with respect to RAID. Unfortunately, the hardware is leased from others for this work, and I wasn't consulted on the h/w specification for the tests I am doing now. Even the RAID cannot be turned off or set to RAID-0.

The production system is sized according to the Hadoop needs (100 nodes with 16-core CPUs, 192 GB RAM, and 24 x 600GB SAS drives; the RAID cannot be completely turned off, so we are creating one virtual disk per physical disk, with the VD RAID level set to RAID-0). These systems are still not available. If you have any suggestions on the production setup, I will be glad to hear them.

Also, as pointed out earlier, we are planning to use HBase as an in-memory KV store to access the latest data; that's why the RAM in this configuration was considered huge. But it looks like we would run into more problems than gains from this. Keeping that aside, I was trying to get the maximum out of the current cluster. Or, as you said, is 500-1000 OPS the max I could get out of this setup?

Regards,
Ramu
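P.S. Since no RowCache exists in HBase yet, here is a purely hypothetical sketch of how one could approximate it at the application layer: a small LRU map of whole-row Results in front of the usual gets. The class name, sizing policy, and invalidation hook are all my own assumptions, not anything shipping in HBase. Unlike BlockCache entries, these entries survive compactions, because only your own writes invalidate them.

    import java.io.IOException;
    import java.util.Arrays;
    import java.util.LinkedHashMap;
    import java.util.Map;

    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.Result;

    // Hypothetical application-side row cache; not an HBase feature.
    public class RowCachingReader {
      private final HTableInterface table;
      private final Map<String, Result> cache;

      public RowCachingReader(HTableInterface table, final int maxEntries) {
        this.table = table;
        // Access-ordered LinkedHashMap gives a simple LRU; a real cache
        // would bound by bytes, not entry count.
        this.cache = new LinkedHashMap<String, Result>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<String, Result> eldest) {
            return size() > maxEntries;
          }
        };
      }

      public synchronized Result get(byte[] row) throws IOException {
        String key = Arrays.toString(row);
        Result cached = cache.get(key);
        if (cached != null) {
          return cached;                          // hit: no RPC, no disk seek
        }
        Result fetched = table.get(new Get(row)); // miss: normal HBase read path
        cache.put(key, fetched);
        return fetched;
      }

      // Must be called on every write to a row to keep the cache consistent;
      // note that compactions, unlike writes, never touch these entries.
      public synchronized void invalidate(byte[] row) {
        cache.remove(Arrays.toString(row));
      }
    }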
