Hi All,

Sorry, there was a mistake in the tests (the clients were not reduced; I
forgot to change the parameter before running the tests).
With 8 clients:

SCR Enabled:  Average latency is 25 ms, IO wait % is around 8.
SCR Disabled: Average latency is 10 ms, IO wait % is around 2.

SCR disabled still gives better results, which confuses me. Can anyone
clarify?

Also, I tried setting the parameter Lars suggested
(hbase.regionserver.checksum.verify = true) with SCR disabled. Average
latency is around 9.8 ms, a fraction lower.

Thanks,
Ramu

On Wed, Oct 9, 2013 at 3:32 PM, Ramu M S <[email protected]> wrote:

> Hi All,
>
> I just ran only 8 parallel clients.
>
> With SCR Enabled:  Average latency is 80 ms, IO wait % is around 8.
> With SCR Disabled: Average latency is 40 ms, IO wait % is around 2.
>
> I always thought that with SCR enabled, a client co-located with the
> DataNode can read HDFS file blocks directly. This gives a performance
> boost to distributed clients that are aware of locality.
>
> Is my understanding wrong, or does it not apply to my scenario?
>
> Meanwhile, I will try setting the parameter suggested by Lars and post
> the results.
>
> Thanks,
> Ramu
>
>
> On Wed, Oct 9, 2013 at 2:29 PM, lars hofhansl <[email protected]> wrote:
>
>> Good call.
>> You could try enabling hbase.regionserver.checksum.verify, which will
>> cause HBase to do its own checksums rather than relying on HDFS (and
>> which saves 1 IO per block get).
>>
>> I do think you can expect the index blocks to be cached at all times.
>>
>> -- Lars
>> ________________________________
>> From: Vladimir Rodionov <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Tuesday, October 8, 2013 8:44 PM
>> Subject: RE: HBase Random Read latency > 100ms
>>
>>
>> Upd.
>>
>> Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file
>> IOs (data + .crc) in the worst case. I think if the Bloom filter is
>> enabled, then it is going to be 6 file IOs in the worst case (large
>> data set); therefore you will have not 5 IO requests in the queue but
>> up to 20-30 IO requests in the queue.
>> This definitely explains > 100 ms avg latency.
>>
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [email protected]
>>
>> ________________________________________
>> From: Vladimir Rodionov
>> Sent: Tuesday, October 08, 2013 7:24 PM
>> To: [email protected]
>> Subject: RE: HBase Random Read latency > 100ms
>>
>> Ramu,
>>
>> You have 8 server boxes and 10 clients. You have 40 requests in
>> parallel - 5 per RS/DN?
>>
>> That puts 5 random-read requests in the IO queue of your single RAID1.
>> With an average read latency of 10 ms, 5 requests in the queue will
>> give us 30 ms. Add some overhead for HDFS + HBase and you probably have
>> your issue explained?
>>
>> Your bottleneck is your disk system, I think. When you serve most
>> requests from disk, as in your large data set scenario, make sure you
>> have an adequate disk subsystem and that it is configured properly.
>> The Block Cache and the OS page cache cannot help you in this case, as
>> the working data set is larger than both caches.
>>
>> The good performance numbers in the small data set scenario are
>> explained by the fact that the data fits into the OS page cache and
>> the Block Cache - you do not read data from disk even if you disable
>> the block cache.
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [email protected]
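For reference, a minimal sketch of the file-IO arithmetic described above.
All inputs (the per-Get block counts, the .crc doubling, the Bloom filter
reads, and the 10 ms per-seek figure) are assumptions taken from this
thread, not measured values:

// Rough queueing arithmetic for random Gets, following the reasoning above.
// Every input below is an assumption from the thread, not a measurement.
public class GetLatencyEstimate {
    public static void main(String[] args) {
        int clients = 10, threadsPerClient = 4, regionServers = 8;
        int hdfsReadsPerGet = 2;   // index block + data block (worst case)
        int crcFactor = 2;         // each HDFS read may also touch a .crc file
        int bloomReads = 2;        // extra file IOs if Bloom blocks are read from disk
        double diskReadMs = 10.0;  // assumed latency of one random disk read

        int getsPerServer = clients * threadsPerClient / regionServers;  // 5
        int fileIOsPerGet = hdfsReadsPerGet * crcFactor + bloomReads;    // up to ~6
        int queueDepth = getsPerServer * fileIOsPerGet;                  // ~30 IOs queued

        // With a single RAID1 volume serving one random read at a time, a
        // request near the back of the queue waits roughly queueDepth * diskReadMs.
        System.out.printf("~%d file IOs queued per server, worst-case wait ~%.0f ms%n",
                queueDepth, queueDepth * diskReadMs);
    }
}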
>>
>> ________________________________________
>> From: Ramu M S [[email protected]]
>> Sent: Tuesday, October 08, 2013 6:00 PM
>> To: [email protected]
>> Subject: Re: HBase Random Read latency > 100ms
>>
>> Hi All,
>>
>> After a few suggestions from the earlier mails, I changed the following:
>>
>> 1. Heap size to 16 GB
>> 2. Block size to 16 KB
>> 3. HFile size to 8 GB (the table now has 256 regions, 32 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> I have clients running on 10 machines, each with 4 threads, so 40 in
>> total. This is the same in all tests.
>>
>> Result:
>> 1. Average latency is still > 100 ms.
>> 2. Heap occupancy is around 2-2.5 GB in all RS.
>>
>> A few more tests were carried out yesterday.
>>
>> TEST 1: Small data set (100 million records, each 724 bytes)
>> ============================================================
>> Configuration:
>> 1. Heap size 1 GB
>> 2. Block size 16 KB
>> 3. HFile size 1 GB (the table now has 128 regions, 16 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> I disabled the Block Cache on the table to make sure I read everything
>> from disk most of the time.
>>
>> Result:
>> 1. Average latency is 8 ms and throughput went up to 6K/sec per RS.
>> 2. With the Block Cache enabled again, I got an average latency of
>>    around 2 ms and a throughput of 10K/sec per RS. Heap occupancy
>>    around 650 MB.
>> 3. With the heap increased to 16 GB and the Block Cache still enabled,
>>    I got an average latency of around 1 ms and a throughput of 20K/sec
>>    per RS. Heap occupancy around 2-2.5 GB in all RS.
>>
>> TEST 2: Large data set (1.8 billion records, each 724 bytes)
>> ============================================================
>> Configuration:
>> 1. Heap size 1 GB
>> 2. Block size 16 KB
>> 3. HFile size 1 GB (the table now has 2048 regions, 256 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> Result:
>> 1. Average latency is > 500 ms to start with and gradually decreases,
>>    but even after around 100 million reads it is still > 100 ms.
>> 2. Block Cache = TRUE/FALSE does not make any difference here. Even the
>>    heap size (1 GB / 16 GB) does not make any difference.
>> 3. Heap occupancy is around 2-2.5 GB under the 16 GB heap and around
>>    650 MB under the 1 GB heap.
>>
>> GC time in all of the scenarios is around 2 ms/second, as shown in
>> Cloudera Manager.
>>
>> Reading most of the items from disk in the small data set scenario
>> gives better results and very low latencies.
>>
>> The number of regions per RS and the HFile size make a huge difference
>> in my cluster. Keeping a maximum of 100 regions per RS (most of the
>> discussions suggest this), should I restrict the HFile size to 1 GB,
>> thus reducing the storage capacity (from 700 GB to 100 GB per RS)?
>>
>> Please advise.
>>
>> Thanks,
>> Ramu
>>
>>
>> On Wed, Oct 9, 2013 at 4:58 AM, Vladimir Rodionov
>> <[email protected]> wrote:
>>
>> > What are your current heap and block cache sizes?
>> >
>> > Best regards,
>> > Vladimir Rodionov
>> > Principal Platform Engineer
>> > Carrier IQ, www.carrieriq.com
>> > e-mail: [email protected]
>> >
>> > ________________________________________
>> > From: Ramu M S [[email protected]]
>> > Sent: Monday, October 07, 2013 10:55 PM
>> > To: [email protected]
>> > Subject: Re: HBase Random Read latency > 100ms
>> >
>> > Hi All,
>> >
>> > Average latency is still around 80 ms.
>> > I have done the following:
>> >
>> > 1. Enabled Snappy compression
>> > 2. Reduced the HFile size to 8 GB
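For reference, a rough sketch of how the table-level settings mentioned in
these tests (16 KB block size, Snappy compression, block cache disabled for
the on-disk test, 8 GB max HFile size) might be applied through the Java
client API. The table and family names are placeholders, and the import
paths are from the 0.94-era API; they differ slightly in 0.96:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CreateTestTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // "usertable" and "cf" are placeholder names, not from the thread.
        HTableDescriptor table = new HTableDescriptor("usertable");
        table.setMaxFileSize(8L * 1024 * 1024 * 1024);          // split regions at ~8 GB

        HColumnDescriptor cf = new HColumnDescriptor("cf");
        cf.setBlocksize(16 * 1024);                             // 16 KB data blocks
        cf.setCompressionType(Compression.Algorithm.SNAPPY);    // Snappy compression
        cf.setBlockCacheEnabled(false);                         // force reads from disk, as in TEST 1
        table.addFamily(cf);

        admin.createTable(table);
        admin.close();
    }
}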
>> >
>> > Should I attribute these results to a bad disk configuration, or is
>> > there anything else to investigate?
>> >
>> > - Ramu
>> >
>> >
>> > On Tue, Oct 8, 2013 at 10:56 AM, Ramu M S <[email protected]> wrote:
>> >
>> > > Vladimir,
>> > >
>> > > Thanks for the insights into the future caching features. Looks
>> > > very interesting.
>> > >
>> > > - Ramu
>> > >
>> > >
>> > > On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <
>> > > [email protected]> wrote:
>> > >
>> > >> Ramu,
>> > >>
>> > >> If your working set of data fits into 192 GB you may get an
>> > >> additional boost by utilizing the OS page cache, or wait for the
>> > >> 0.98 release, which introduces a new bucket cache implementation
>> > >> (a port of the Facebook L2 cache). You can try the vanilla bucket
>> > >> cache in 0.96 (not released yet, but due soon). Both caches store
>> > >> data off-heap, but the Facebook version can store encoded and
>> > >> compressed data, and the vanilla bucket cache cannot. So there are
>> > >> some options for efficiently utilizing the available RAM (at least
>> > >> in upcoming HBase releases). If your data set does not fit in RAM,
>> > >> then your only hope is your 24 SAS drives, depending on your RAID
>> > >> settings, disk IO performance, and HDFS configuration (I think the
>> > >> latest Hadoop is preferable here).
>> > >>
>> > >> The OS page cache is the most vulnerable and volatile; it cannot
>> > >> be controlled and can easily be polluted either by some other
>> > >> process or by HBase itself (a long scan).
>> > >> With the Block Cache you have more control, but the first truly
>> > >> usable *official* implementation is going to be part of the 0.98
>> > >> release.
>> > >>
>> > >> As far as I understand, your use case would definitely be covered
>> > >> by something similar to BigTable's ScanCache (RowCache), but there
>> > >> is no such cache in HBase yet.
>> > >> One major advantage of a RowCache vs. the BlockCache (apart from
>> > >> being much more efficient in RAM usage) is resilience to region
>> > >> compactions. Each minor region compaction partially invalidates a
>> > >> region's data in the BlockCache, and a major compaction invalidates
>> > >> the region's data completely. This would not be the case with a
>> > >> RowCache (were it implemented).
>> > >>
>> > >> Best regards,
>> > >> Vladimir Rodionov
>> > >> Principal Platform Engineer
>> > >> Carrier IQ, www.carrieriq.com
>> > >> e-mail: [email protected]
>> > >>
>> > >> ________________________________________
>> > >> From: Ramu M S [[email protected]]
>> > >> Sent: Monday, October 07, 2013 5:25 PM
>> > >> To: [email protected]
>> > >> Subject: Re: HBase Random Read latency > 100ms
>> > >>
>> > >> Vladimir,
>> > >>
>> > >> Yes, I am fully aware of the HDD limitations and the wrong RAID
>> > >> configuration. Unfortunately, the hardware is leased from others
>> > >> for this work and I wasn't consulted on the h/w specification for
>> > >> the tests that I am doing now. The RAID cannot even be turned off
>> > >> or set to RAID-0.
>> > >>
>> > >> The production system is specified according to the Hadoop needs
>> > >> (100 nodes with 16-core CPUs, 192 GB RAM, 24 x 600 GB SAS drives;
>> > >> RAID cannot be completely turned off, so we are creating one
>> > >> virtual disk containing only one physical disk and setting the VD
>> > >> RAID level to RAID-0). These systems are still not available. If
>> > >> you have any suggestion on the production setup, I will be glad to
>> > >> hear it.
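For reference, a small sketch of one way to limit the block-cache pollution
described above: point Gets keep caching enabled, while a long scan opts out
of the block cache so it does not evict the hot random-read set. The table
and row names are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class CacheFriendlyReads {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "usertable");      // placeholder table name

        // Random point read: let its blocks stay in the block cache.
        Get get = new Get(Bytes.toBytes("row-000123"));    // placeholder key
        get.setCacheBlocks(true);
        Result r = table.get(get);

        // Long scan: skip caching its blocks so it does not flush the cache.
        Scan scan = new Scan();
        scan.setCacheBlocks(false);
        scan.setCaching(1000);   // fetch rows in batches to cut RPC round trips
        ResultScanner scanner = table.getScanner(scan);
        scanner.close();

        table.close();
    }
}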
>> > >>
>> > >> Also, as pointed out earlier, we are planning to use HBase as an
>> > >> in-memory KV store as well, to access the latest data. That's why
>> > >> the RAM was considered huge in this configuration. But it looks
>> > >> like we would run into more problems than gains from this.
>> > >>
>> > >> Keeping that aside, I was trying to get the maximum out of the
>> > >> current cluster. Or, as you said, is 500-1000 OPS the max I could
>> > >> get out of this setup?
>> > >>
>> > >> Regards,
>> > >> Ramu
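For reference, the in-memory KV store idea mentioned above is usually
approximated in HBase by marking a column family IN_MEMORY, which only
raises the priority of its blocks in the block cache rather than pinning
them; they can still be evicted under memory pressure. A rough sketch, with
placeholder table and family names:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class InMemoryFamily {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // Mark the "latest" family as in-memory: its blocks get higher
        // priority in the LRU block cache but are not guaranteed to stay.
        HColumnDescriptor cf = new HColumnDescriptor("latest");   // placeholder name
        cf.setInMemory(true);

        admin.disableTable("usertable");                 // placeholder table name
        admin.modifyColumn("usertable", cf);
        admin.enableTable("usertable");
        admin.close();
    }
}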
