Hi All,

Sorry, there was a mistake in the tests (the clients were not reduced; I
forgot to change the parameter before running the tests).
With 8 clients:

SCR Enabled:  Average latency is 25 ms, IO wait % is around 8.
SCR Disabled: Average latency is 10 ms, IO wait % is around 2.

SCR disabled still gives better results, which confuses me. Can anyone
clarify?

Also, I tried setting the parameter Lars suggested
(hbase.regionserver.checksum.verify = true) with SCR disabled. Average
latency is around 9.8 ms, a fraction lower.

Thanks,
Ramu

On Wed, Oct 9, 2013 at 3:32 PM, Ramu M S <[email protected]> wrote:

> Hi All,
>
> I just ran only 8 parallel clients.
>
> With SCR Enabled:  Average latency is 80 ms, IO wait % is around 8.
> With SCR Disabled: Average latency is 40 ms, IO wait % is around 2.
>
> I always thought that with SCR enabled, a client co-located with the
> DataNode can read HDFS file blocks directly. This gives a performance
> boost to distributed clients that are aware of locality.
>
> Is my understanding wrong, or does it not apply to my scenario?
>
> Meanwhile, I will try setting the parameter suggested by Lars and post
> the results.
>
> Thanks,
> Ramu
>
>
> On Wed, Oct 9, 2013 at 2:29 PM, lars hofhansl <[email protected]> wrote:
>
>> Good call.
>> You could try enabling hbase.regionserver.checksum.verify, which will
>> cause HBase to do its own checksums rather than relying on HDFS (and
>> which saves 1 IO per block get).
>>
>> I do think you can expect the index blocks to be cached at all times.
>>
>> -- Lars
>> ________________________________
>> From: Vladimir Rodionov <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Tuesday, October 8, 2013 8:44 PM
>> Subject: RE: HBase Random Read latency > 100ms
>>
>>
>> Upd.
>>
>> Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file
>> IOs (data + .crc) in the worst case. I think if the Bloom filter is
>> enabled, then it is going to be 6 file IOs in the worst case (large
>> data set); therefore you will have not 5 IO requests in the queue but
>> up to 20-30 IO requests in the queue.
>> This definitely explains > 100 ms avg latency.
>>
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [email protected]
>>
>> ________________________________________
>> From: Vladimir Rodionov
>> Sent: Tuesday, October 08, 2013 7:24 PM
>> To: [email protected]
>> Subject: RE: HBase Random Read latency > 100ms
>>
>> Ramu,
>>
>> You have 8 server boxes and 10 clients. You have 40 requests in
>> parallel - 5 per RS/DN?
>>
>> That puts 5 random-read requests in the IO queue of your single RAID1.
>> With an average read latency of 10 ms, 5 requests in the queue will
>> give us 30 ms. Add some overhead for HDFS + HBase and you probably have
>> your issue explained?
>>
>> Your bottleneck is your disk system, I think. When you serve most
>> requests from disk, as in your large data set scenario, make sure you
>> have an adequate disk subsystem and that it is configured properly.
>> The Block Cache and the OS page cache cannot help you in this case, as
>> the working data set is larger than both caches.
>>
>> The good performance numbers in the small data set scenario are
>> explained by the fact that the data fits into the OS page cache and
>> the Block Cache - you do not read data from disk even if you disable
>> the block cache.
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [email protected]
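For reference, a minimal sketch of the file-IO arithmetic described above.
All inputs (the per-Get block counts, the .crc doubling, the Bloom filter
reads, and the 10 ms per-seek figure) are assumptions taken from this
thread, not measured values:

// Rough queueing arithmetic for random Gets, following the reasoning above.
// Every input below is an assumption from the thread, not a measurement.
public class GetLatencyEstimate {
    public static void main(String[] args) {
        int clients = 10, threadsPerClient = 4, regionServers = 8;
        int hdfsReadsPerGet = 2;   // index block + data block (worst case)
        int crcFactor = 2;         // each HDFS read may also touch a .crc file
        int bloomReads = 2;        // extra file IOs if Bloom blocks are read from disk
        double diskReadMs = 10.0;  // assumed latency of one random disk read

        int getsPerServer = clients * threadsPerClient / regionServers;  // 5
        int fileIOsPerGet = hdfsReadsPerGet * crcFactor + bloomReads;    // up to ~6
        int queueDepth = getsPerServer * fileIOsPerGet;                  // ~30 IOs queued

        // With a single RAID1 volume serving one random read at a time, a
        // request near the back of the queue waits roughly queueDepth * diskReadMs.
        System.out.printf("~%d file IOs queued per server, worst-case wait ~%.0f ms%n",
                queueDepth, queueDepth * diskReadMs);
    }
}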
>>
>> ________________________________________
>> From: Ramu M S [[email protected]]
>> Sent: Tuesday, October 08, 2013 6:00 PM
>> To: [email protected]
>> Subject: Re: HBase Random Read latency > 100ms
>>
>> Hi All,
>>
>> After a few suggestions from the earlier mails, I changed the following:
>>
>> 1. Heap size to 16 GB
>> 2. Block size to 16 KB
>> 3. HFile size to 8 GB (the table now has 256 regions, 32 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> I have clients running on 10 machines, each with 4 threads, so 40 in
>> total. This is the same in all tests.
>>
>> Result:
>> 1. Average latency is still > 100 ms.
>> 2. Heap occupancy is around 2-2.5 GB in all RS.
>>
>> A few more tests were carried out yesterday.
>>
>> TEST 1: Small data set (100 million records, each 724 bytes)
>> ============================================================
>> Configuration:
>> 1. Heap size 1 GB
>> 2. Block size 16 KB
>> 3. HFile size 1 GB (the table now has 128 regions, 16 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> I disabled the Block Cache on the table to make sure I read everything
>> from disk most of the time.
>>
>> Result:
>> 1. Average latency is 8 ms and throughput went up to 6K/sec per RS.
>> 2. With the Block Cache enabled again, I got an average latency of
>>    around 2 ms and a throughput of 10K/sec per RS. Heap occupancy
>>    around 650 MB.
>> 3. With the heap increased to 16 GB and the Block Cache still enabled,
>>    I got an average latency of around 1 ms and a throughput of 20K/sec
>>    per RS. Heap occupancy around 2-2.5 GB in all RS.
>>
>> TEST 2: Large data set (1.8 billion records, each 724 bytes)
>> ============================================================
>> Configuration:
>> 1. Heap size 1 GB
>> 2. Block size 16 KB
>> 3. HFile size 1 GB (the table now has 2048 regions, 256 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> Result:
>> 1. Average latency is > 500 ms to start with and gradually decreases,
>>    but even after around 100 million reads it is still > 100 ms.
>> 2. Block Cache = TRUE/FALSE does not make any difference here. Even the
>>    heap size (1 GB / 16 GB) does not make any difference.
>> 3. Heap occupancy is around 2-2.5 GB under the 16 GB heap and around
>>    650 MB under the 1 GB heap.
>>
>> GC time in all of the scenarios is around 2 ms/second, as shown in
>> Cloudera Manager.
>>
>> Reading most of the items from disk in the small data set scenario
>> gives better results and very low latencies.
>>
>> The number of regions per RS and the HFile size make a huge difference
>> in my cluster. Keeping a maximum of 100 regions per RS (most of the
>> discussions suggest this), should I restrict the HFile size to 1 GB,
>> thus reducing the storage capacity (from 700 GB to 100 GB per RS)?
>>
>> Please advise.
>>
>> Thanks,
>> Ramu
>>
>>
>> On Wed, Oct 9, 2013 at 4:58 AM, Vladimir Rodionov
>> <[email protected]> wrote:
>>
>> > What are your current heap and block cache sizes?
>> >
>> > Best regards,
>> > Vladimir Rodionov
>> > Principal Platform Engineer
>> > Carrier IQ, www.carrieriq.com
>> > e-mail: [email protected]
>> >
>> > ________________________________________
>> > From: Ramu M S [[email protected]]
>> > Sent: Monday, October 07, 2013 10:55 PM
>> > To: [email protected]
>> > Subject: Re: HBase Random Read latency > 100ms
>> >
>> > Hi All,
>> >
>> > Average latency is still around 80 ms.
>> > I have done the following:
>> >
>> > 1. Enabled Snappy compression
>> > 2. Reduced the HFile size to 8 GB
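For reference, a rough sketch of how the table-level settings mentioned in
these tests (16 KB block size, Snappy compression, block cache disabled for
the on-disk test, 8 GB max HFile size) might be applied through the Java
client API. The table and family names are placeholders, and the import
paths are from the 0.94-era API; they differ slightly in 0.96:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CreateTestTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // "usertable" and "cf" are placeholder names, not from the thread.
        HTableDescriptor table = new HTableDescriptor("usertable");
        table.setMaxFileSize(8L * 1024 * 1024 * 1024);          // split regions at ~8 GB

        HColumnDescriptor cf = new HColumnDescriptor("cf");
        cf.setBlocksize(16 * 1024);                             // 16 KB data blocks
        cf.setCompressionType(Compression.Algorithm.SNAPPY);    // Snappy compression
        cf.setBlockCacheEnabled(false);                         // force reads from disk, as in TEST 1
        table.addFamily(cf);

        admin.createTable(table);
        admin.close();
    }
}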
>> >
>> > Should I attribute these results to a bad disk configuration, or is
>> > there anything else to investigate?
>> >
>> > - Ramu
>> >
>> >
>> > On Tue, Oct 8, 2013 at 10:56 AM, Ramu M S <[email protected]> wrote:
>> >
>> > > Vladimir,
>> > >
>> > > Thanks for the insights into the future caching features. Looks
>> > > very interesting.
>> > >
>> > > - Ramu
>> > >
>> > >
>> > > On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <
>> > > [email protected]> wrote:
>> > >
>> > >> Ramu,
>> > >>
>> > >> If your working set of data fits into 192 GB you may get an
>> > >> additional boost by utilizing the OS page cache, or wait for the
>> > >> 0.98 release, which introduces a new bucket cache implementation
>> > >> (a port of the Facebook L2 cache). You can try the vanilla bucket
>> > >> cache in 0.96 (not released yet, but due soon). Both caches store
>> > >> data off-heap, but the Facebook version can store encoded and
>> > >> compressed data, and the vanilla bucket cache cannot. So there are
>> > >> some options for efficiently utilizing the available RAM (at least
>> > >> in upcoming HBase releases). If your data set does not fit in RAM,
>> > >> then your only hope is your 24 SAS drives, depending on your RAID
>> > >> settings, disk IO performance, and HDFS configuration (I think the
>> > >> latest Hadoop is preferable here).
>> > >>
>> > >> The OS page cache is the most vulnerable and volatile; it cannot
>> > >> be controlled and can easily be polluted either by some other
>> > >> process or by HBase itself (a long scan).
>> > >> With the Block Cache you have more control, but the first truly
>> > >> usable *official* implementation is going to be part of the 0.98
>> > >> release.
>> > >>
>> > >> As far as I understand, your use case would definitely be covered
>> > >> by something similar to BigTable's ScanCache (RowCache), but there
>> > >> is no such cache in HBase yet.
>> > >> One major advantage of a RowCache vs. the BlockCache (apart from
>> > >> being much more efficient in RAM usage) is resilience to region
>> > >> compactions. Each minor region compaction partially invalidates a
>> > >> region's data in the BlockCache, and a major compaction invalidates
>> > >> the region's data completely. This would not be the case with a
>> > >> RowCache (were it implemented).
>> > >>
>> > >> Best regards,
>> > >> Vladimir Rodionov
>> > >> Principal Platform Engineer
>> > >> Carrier IQ, www.carrieriq.com
>> > >> e-mail: [email protected]
>> > >>
>> > >> ________________________________________
>> > >> From: Ramu M S [[email protected]]
>> > >> Sent: Monday, October 07, 2013 5:25 PM
>> > >> To: [email protected]
>> > >> Subject: Re: HBase Random Read latency > 100ms
>> > >>
>> > >> Vladimir,
>> > >>
>> > >> Yes, I am fully aware of the HDD limitations and the wrong RAID
>> > >> configuration. Unfortunately, the hardware is leased from others
>> > >> for this work and I wasn't consulted on the h/w specification for
>> > >> the tests that I am doing now. The RAID cannot even be turned off
>> > >> or set to RAID-0.
>> > >>
>> > >> The production system is specified according to the Hadoop needs
>> > >> (100 nodes with 16-core CPUs, 192 GB RAM, 24 x 600 GB SAS drives;
>> > >> RAID cannot be completely turned off, so we are creating one
>> > >> virtual disk containing only one physical disk and setting the VD
>> > >> RAID level to RAID-0). These systems are still not available. If
>> > >> you have any suggestion on the production setup, I will be glad to
>> > >> hear it.
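For reference, a small sketch of one way to limit the block-cache pollution
described above: point Gets keep caching enabled, while a long scan opts out
of the block cache so it does not evict the hot random-read set. The table
and row names are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class CacheFriendlyReads {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "usertable");      // placeholder table name

        // Random point read: let its blocks stay in the block cache.
        Get get = new Get(Bytes.toBytes("row-000123"));    // placeholder key
        get.setCacheBlocks(true);
        Result r = table.get(get);

        // Long scan: skip caching its blocks so it does not flush the cache.
        Scan scan = new Scan();
        scan.setCacheBlocks(false);
        scan.setCaching(1000);   // fetch rows in batches to cut RPC round trips
        ResultScanner scanner = table.getScanner(scan);
        scanner.close();

        table.close();
    }
}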
>> > >>
>> > >> Also, as pointed out earlier, we are planning to use HBase as an
>> > >> in-memory KV store as well, to access the latest data. That's why
>> > >> the RAM was considered huge in this configuration. But it looks
>> > >> like we would run into more problems than gains from this.
>> > >>
>> > >> Keeping that aside, I was trying to get the maximum out of the
>> > >> current cluster. Or, as you said, is 500-1000 OPS the max I could
>> > >> get out of this setup?
>> > >>
>> > >> Regards,
>> > >> Ramu
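For reference, the in-memory KV store idea mentioned above is usually
approximated in HBase by marking a column family IN_MEMORY, which only
raises the priority of its blocks in the block cache rather than pinning
them; they can still be evicted under memory pressure. A rough sketch, with
placeholder table and family names:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class InMemoryFamily {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // Mark the "latest" family as in-memory: its blocks get higher
        // priority in the LRU block cache but are not guaranteed to stay.
        HColumnDescriptor cf = new HColumnDescriptor("latest");   // placeholder name
        cf.setInMemory(true);

        admin.disableTable("usertable");                 // placeholder table name
        admin.modifyColumn("usertable", cf);
        admin.enableTable("usertable");
        admin.close();
    }
}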
