bq. #3. Custom compaction

Stripe compaction will be in the upcoming 0.98.0 release. See HBASE-7667 (Support stripe compaction).
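Once 0.98.0 is out, the stripe store engine can be enabled per table from the shell. A sketch based on the HBASE-7667 design doc; the property names are assumptions and may differ in the released version:

```
hbase> alter 'TABLE1', CONFIGURATION => {
  'hbase.hstore.engine.class' =>
    'org.apache.hadoop.hbase.regionserver.StripeStoreEngine',
  'hbase.hstore.blockingStoreFiles' => '100'}
```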
Cheers

On Fri, Jan 31, 2014 at 5:29 PM, Vladimir Rodionov <[email protected]> wrote:

> #1 Use GZ compression instead of SNAPPY - usually it gives you an additional 1.5x.
>
> A block cache hit rate of 50% is actually very low, and it is strange. On every GET there are at least 3 accesses to the block cache: get INDEX block, get BLOOM block, get DATA block. Therefore, anything below 66% is effectively nothing.
>
> #2 Try to increase the block cache size and see what happens.
>
> Your Bloom filter does not actually work, because you have zillions of versions. In this case, the only thing which can help you is a major compaction of regions... or better -
>
> #3 Custom compaction, which will create store files that are non-overlapping by timestamp. Yes, it's hard.
>
> #4 Disable the CRC32 check in HDFS and enable inline CRC in HBase - this will save you 50% of IOPS.
> https://issues.apache.org/jira/browse/HBASE-5074
>
> #5 Enable short circuit reads (see the HBase book on short circuit reads).
>
> #6 For your use case, it is probably a good idea to try SSDs.
>
> and finally,
>
> #7 The rule of thumb is to have your hot data set in RAM. Does not fit? Increase RAM, increase the number of servers.
>
> btw, what is the average size of a GET result, and do you really touch every key in your data set with the same probability?
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [email protected]
>
> ________________________________________
> From: Jan Schellenberger [[email protected]]
> Sent: Friday, January 31, 2014 3:12 PM
> To: [email protected]
> Subject: Slow Get Performance (or how many disk I/O does it take for one non-cached read?)
>
> I am running a cluster and getting slow performance - about 50 reads/sec/node, or about 800 reads/sec for the cluster. The data is too big to fit into memory, and my access pattern is completely random reads, which is presumably difficult for HBase. Is my read speed reasonable?
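Vladimir's 66% figure above can be checked with quick arithmetic: if the INDEX and BLOOM blocks touched by a GET are always cached but the DATA block never is, two out of every three block-cache accesses still hit. A minimal sketch; the per-GET access counts are assumptions taken from his description:

```python
# Each GET touches ~3 blocks in the cache: INDEX, BLOOM, DATA.
# Even if every DATA block read misses (pure random reads over a
# data set larger than RAM), the INDEX and BLOOM blocks are small
# and should stay resident, so those two accesses hit.
accesses_per_get = 3
guaranteed_hits = 2  # INDEX + BLOOM

floor_ratio = guaranteed_hits / accesses_per_get
print(f"expected hit-ratio floor: {floor_ratio:.1%}")  # ~66.7%

# A reported 50% ratio is below this floor, which suggests even
# INDEX/BLOOM blocks are being evicted -- the cache is too small.
```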
> I feel like the typical read speeds I've seen reported are much higher.
>
> Hardware/Software Configuration:
> 17 nodes + 1 master
> 8 cores
> 24 GB RAM
> 4 x 1TB 3.5" hard drives (I know this is low for HBase - we're working on getting more disks)
> running Cloudera CDH 4.3 with HBase 0.94.6
> Most configurations are default, except I'm using a 12GB heap per region server and the block cache is 0.4 instead of 0.25; neither of these two things makes much of a difference. I am NOT having a GC issue. Latencies are around 40ms and the 99th percentile is 200ms.
>
> Dataset Description:
> 6 tables, ~300GB each uncompressed or ~120GB each compressed <- compression speeds things up a bit.
> I just ran a major compaction, so block locality is 100%.
> Each table has a single column family and a single column ("c:d").
> Keys are short strings, ~10-20 characters.
> Values are short JSON, ~500 characters.
> 100% Gets, no Puts.
> I am heavily using timestamping. maxversions is set to Integer.MAX_VALUE. My gets retrieve a maximum of 200 versions. A typical row has < 10 versions on average, though; < 1% of queries would max out at 200 versions returned.
>
> Here are the table configurations (I've also tried Snappy compression):
>
> {NAME => 'TABLE1', FAMILIES => [{NAME => 'c', DATA_BLOCK_ENCODING => 'NONE',
> BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '2147483647',
> COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647',
> KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}]}
>
> I am using the master node to query (with 20 threads) and get about 800 Gets/second. Each worker node is completely swamped by disk I/O - I'm seeing 80 io/sec with iostat for each of the 4 disks, with a throughput of about 10MB/sec each. So it's reading roughly 120kB/transfer and taking about 8 hard disk I/Os per Get request. Does that seem reasonable?
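Jan's per-Get I/O estimate can be reproduced from the iostat numbers he quotes; all inputs below are taken from his message:

```python
# Per-node numbers from the message above.
io_per_sec_per_disk = 80
disks_per_node = 4
gets_per_sec_per_node = 38      # the uncompressed-table run
throughput_mb_per_disk = 10     # MB/s per disk

node_iops = io_per_sec_per_disk * disks_per_node              # 320 IOPS/node
ios_per_get = node_iops / gets_per_sec_per_node               # ~8.4
kb_per_transfer = throughput_mb_per_disk * 1024 / io_per_sec_per_disk  # ~128 KB

print(f"{ios_per_get:.1f} disk I/Os per Get, ~{kb_per_transfer:.0f} KB per transfer")
```

Note that ~128 KB per transfer is about two 64 KB HFile blocks, consistent with the BLOCKSIZE => '65536' in the table definition.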
> I've read the HFile spec, and I feel that if the block index is loaded into memory, it should take 1 hard disk read to fetch the proper block with my row.
>
> The region servers have a blockCacheHitRatio of about 33% (no compression) or 50% (Snappy compression).
>
> Here are some regionserver stats while I'm running queries. This is the uncompressed table version, and queries are only 38/sec:
>
> requestsPerSecond=38, numberOfOnlineRegions=212, numberOfStores=212, numberOfStorefiles=212,
> storefileIndexSizeMB=0, rootIndexSizeKB=190, totalStaticIndexSizeKB=172689, totalStaticBloomSizeKB=79692,
> memstoreSizeMB=0, mbInMemoryWithoutWAL=0, numberOfPutsWithoutWAL=0,
> readRequestsCount=1865459, writeRequestsCount=0, compactionQueueSize=0, flushQueueSize=0,
> usedHeapMB=4565, maxHeapMB=12221,
> blockCacheSizeMB=4042.53, blockCacheFreeMB=846.07, blockCacheCount=62176,
> blockCacheHitCount=5389770, blockCacheMissCount=9909385, blockCacheEvictedCount=2744919,
> blockCacheHitRatio=35%, blockCacheHitCachingRatio=65%,
> hdfsBlocksLocalityIndex=99, slowHLogAppendCount=0,
> fsReadLatencyHistogramMean=1570049.34, fsReadLatencyHistogramCount=1239690.00,
> fsReadLatencyHistogramMedian=20859045.50, fsReadLatencyHistogram75th=35791318.75,
> fsReadLatencyHistogram95th=97093132.05, fsReadLatencyHistogram99th=179688655.93,
> fsReadLatencyHistogram999th=312277183.40,
> fsPreadLatencyHistogramMean=35548585.63, fsPreadLatencyHistogramCount=2803268.00,
> fsPreadLatencyHistogramMedian=37662144.00, fsPreadLatencyHistogram75th=55991186.50,
> fsPreadLatencyHistogram95th=116227275.50, fsPreadLatencyHistogram99th=173173999.27,
> fsPreadLatencyHistogram999th=273812341.79,
> fsWriteLatencyHistogramMean=1523660.72, fsWriteLatencyHistogramCount=1225000.00,
> fsWriteLatencyHistogramMedian=226540.50, fsWriteLatencyHistogram75th=380366.00,
> fsWriteLatencyHistogram95th=2193516.80, fsWriteLatencyHistogram99th=4290208.93,
> fsWriteLatencyHistogram999th=6926850.53
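The 35% blockCacheHitRatio in the stats above follows directly from the hit/miss counters posted with it:

```python
# Counters copied from the regionserver stats above.
hits = 5389770
misses = 9909385

ratio = hits / (hits + misses)
print(f"blockCacheHitRatio = {ratio:.0%}")  # 35%

# Well below the ~66% one would expect if INDEX and BLOOM blocks
# stayed cached and only DATA blocks missed.
```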
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Slow-Get-Performance-or-how-many-disk-I-O-does-it-take-for-one-non-cached-read-tp4055545.html
> Sent from the HBase User mailing list archive at Nabble.com.
