Danil Lipovoy commented on HBASE-23887:
---------------------------------------

All tests below were done on: _AMD Ryzen 7 2700X Eight-Core Processor (3150 MHz, 16 threads)._

The logic of the autoscaling (described [here|https://issues.apache.org/jira/browse/HBASE-23887?focusedCommentId=17110503&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17110503]):
{code:java}
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
  if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
    if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
      return;
    }
  }
  ...
{code}
And this is how cacheDataBlockPercent is calculated:
{code:java}
public void run() {
  ...
  LruBlockCache cache = this.cache.get();
  if (cache == null) {
    break;
  }
  bytesFreed = cache.evict();
  long stopTime = System.currentTimeMillis();
  // We have to control how long cache.evict() has been working.
  // When the BlockCache is under heavy eviction, this helps avoid
  // putting too many blocks into the BlockCache while evict()
  // is working very actively.
  if (stopTime - startTime <= 1000 * 10 - 1) {
    // Less than 10 seconds have passed: just keep summing up.
    mbFreedSum += bytesFreed / 1024 / 1024;
  } else {
    freedDataOverheadPercent =
        (int) (mbFreedSum * 100 / cache.heavyEvictionBytesSizeLimit) - 100;
    if (mbFreedSum > cache.heavyEvictionBytesSizeLimit) {
      heavyEvictionCount++;
      if (heavyEvictionCount > cache.heavyEvictionCountLimit) {
        if (freedDataOverheadPercent > 100) {
          cache.cacheDataBlockPercent -= 3;
        } else if (freedDataOverheadPercent > 50) {
          cache.cacheDataBlockPercent -= 1;
        } else if (freedDataOverheadPercent < 30) {
          cache.cacheDataBlockPercent += 1;
        }
      }
    } else if (mbFreedSum > cache.heavyEvictionBytesSizeLimit * 0.5
        && cache.cacheDataBlockPercent < 50) {
      // Helps prevent a premature escape caused by an accidental fluctuation.
      cache.cacheDataBlockPercent += 5;
    } else {
      heavyEvictionCount = 0;
      cache.cacheDataBlockPercent = 100;
    }
    LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, "
        + "heavy eviction counter: {}, "
        + "current caching DataBlock (%): {}",
      mbFreedSum, freedDataOverheadPercent, heavyEvictionCount,
      cache.cacheDataBlockPercent);
    mbFreedSum = 0;
    startTime = stopTime;
  }
}
{code}
I prepared 4 tables:
* tbl1 - 200 mln records, 100 bytes each. Total size 30 Gb.
* tbl2 - 20 mln records, 500 bytes each. Total size 10.4 Gb.
* tbl3 - 100 mln records, 100 bytes each. Total size 15.4 Gb.
* tbl4 - the same as tbl3, but used for testing reads in batches (batchSize=100).

Workload scenario "u":
_operationcount=50000000 (for tbl4 just 500000, because of the batch of 100)_
_readproportion=1_
_requestdistribution=uniform_

Workload scenario "z":
_operationcount=50000000 (for tbl4 just 500000, because of the batch of 100)_
_readproportion=1_
_requestdistribution=zipfian_

Workload scenario "l":
_operationcount=50000000 (for tbl4 just 500000, because of the batch of 100)_
_readproportion=1_
_requestdistribution=latest_

Then I ran all tables with all scenarios on the original version (4*3=12 tests in total) and another 12 tests with the feature enabled:
*hbase.lru.cache.heavy.eviction.count.limit* = 3
*hbase.lru.cache.heavy.eviction.mb.size.limit* = 200

Performance results:
!requests_100p.png!
Note that on the second graph the lines have a step at the beginning - that is the autoscaling at work.
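Before going to the logs, here is a tiny standalone sketch of what the offset check in cacheBlock() above does. This is plain Java, not HBase code (the class and method names are made up for illustration), using the three offsets from the example in the issue description below and cacheDataBlockPercent = 50:
{code:java}
// Standalone illustration (not HBase code): which block offsets are admitted
// to the cache when cacheDataBlockPercent = 50.
public class CacheFilterSketch {

  // Same condition as in cacheBlock() above: a block is skipped when the
  // last two digits of its offset are >= the configured percent.
  static boolean shouldCache(long offset, int cacheDataBlockPercent) {
    return offset % 100 < cacheDataBlockPercent;
  }

  public static void main(String[] args) {
    // Offsets taken from the example in the issue description below.
    long[] offsets = {124, 198, 223};
    for (long offset : offsets) {
      System.out.println(offset + " % 100 = " + (offset % 100) + " -> "
          + (shouldCache(offset, 50) ? "cache" : "skip"));
    }
    // Prints: 124 -> cache, 198 -> skip, 223 -> cache
  }
}
{code}
Because block offsets are effectively evenly distributed modulo 100, this admits roughly cacheDataBlockPercent percent of the data blocks without keeping any extra state.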
Let's look at the RegionServer log:
LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 <- no load, do nothing
LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 1, current caching DataBlock (%): 100 <- reading has started, but *count.limit* has not been reached yet
LruBlockCache: BlockCache evicted (MB): 6958, overhead (%): 3379, heavy eviction counter: 2, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 8117, overhead (%): 3958, heavy eviction counter: 3, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 8713, overhead (%): 4256, heavy eviction counter: 4, current caching DataBlock (%): 97 <- *count.limit* has been reached, decrease by 3%
LruBlockCache: BlockCache evicted (MB): 8723, overhead (%): 4261, heavy eviction counter: 5, current caching DataBlock (%): 94
LruBlockCache: BlockCache evicted (MB): 8318, overhead (%): 4059, heavy eviction counter: 6, current caching DataBlock (%): 91
LruBlockCache: BlockCache evicted (MB): 7722, overhead (%): 3761, heavy eviction counter: 7, current caching DataBlock (%): 88
LruBlockCache: BlockCache evicted (MB): 7840, overhead (%): 3820, heavy eviction counter: 8, current caching DataBlock (%): 85
LruBlockCache: BlockCache evicted (MB): 8032, overhead (%): 3916, heavy eviction counter: 9, current caching DataBlock (%): 82
LruBlockCache: BlockCache evicted (MB): 7687, overhead (%): 3743, heavy eviction counter: 10, current caching DataBlock (%): 79
LruBlockCache: BlockCache evicted (MB): 7458, overhead (%): 3629, heavy eviction counter: 11, current caching DataBlock (%): 76
LruBlockCache: BlockCache evicted (MB): 7343, overhead (%): 3571, heavy eviction counter: 12, current caching DataBlock (%): 73
LruBlockCache: BlockCache evicted (MB): 6769, overhead (%): 3284, heavy eviction counter: 13, current caching DataBlock (%): 70
LruBlockCache: BlockCache evicted (MB): 6655, overhead (%): 3227, heavy eviction counter: 14, current caching DataBlock (%): 67
LruBlockCache: BlockCache evicted (MB): 6080, overhead (%): 2940, heavy eviction counter: 15, current caching DataBlock (%): 64
LruBlockCache: BlockCache evicted (MB): 5851, overhead (%): 2825, heavy eviction counter: 16, current caching DataBlock (%): 61
LruBlockCache: BlockCache evicted (MB): 5277, overhead (%): 2538, heavy eviction counter: 17, current caching DataBlock (%): 58
LruBlockCache: BlockCache evicted (MB): 4933, overhead (%): 2366, heavy eviction counter: 18, current caching DataBlock (%): 55
LruBlockCache: BlockCache evicted (MB): 4359, overhead (%): 2079, heavy eviction counter: 19, current caching DataBlock (%): 52
LruBlockCache: BlockCache evicted (MB): 4015, overhead (%): 1907, heavy eviction counter: 20, current caching DataBlock (%): 49
LruBlockCache: BlockCache evicted (MB): 3556, overhead (%): 1678, heavy eviction counter: 21, current caching DataBlock (%): 46
LruBlockCache: BlockCache evicted (MB): 3097, overhead (%): 1448, heavy eviction counter: 22, current caching DataBlock (%): 43
LruBlockCache: BlockCache evicted (MB): 2638, overhead (%): 1219, heavy eviction counter: 23, current caching DataBlock (%): 40
LruBlockCache: BlockCache evicted (MB): 2179, overhead (%): 989, heavy eviction counter: 24, current caching DataBlock (%): 37
LruBlockCache: BlockCache evicted (MB): 1835, overhead (%): 817, heavy eviction counter: 25, current caching DataBlock (%): 34
LruBlockCache: BlockCache evicted (MB): 1491, overhead (%): 645, heavy eviction counter: 26, current caching DataBlock (%): 31
LruBlockCache: BlockCache evicted (MB): 1032, overhead (%): 416, heavy eviction counter: 27, current caching DataBlock (%): 28
LruBlockCache: BlockCache evicted (MB): 688, overhead (%): 244, heavy eviction counter: 28, current caching DataBlock (%): 25
LruBlockCache: BlockCache evicted (MB): 458, overhead (%): 129, heavy eviction counter: 29, current caching DataBlock (%): 22
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 30, current caching DataBlock (%): 23 <- wow, too low! up 1%
LruBlockCache: BlockCache evicted (MB): 114, overhead (%): -43, heavy eviction counter: 30, current caching DataBlock (%): 28 <- accidental fluctuation? plus 5
LruBlockCache: BlockCache evicted (MB): 344, overhead (%): 72, heavy eviction counter: 31, current caching DataBlock (%): 27 <- now ok, continue slowing down
LruBlockCache: BlockCache evicted (MB): 344, overhead (%): 72, heavy eviction counter: 32, current caching DataBlock (%): 26
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 33, current caching DataBlock (%): 27
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 34, current caching DataBlock (%): 28
LruBlockCache: BlockCache evicted (MB): 344, overhead (%): 72, heavy eviction counter: 35, current caching DataBlock (%): 27
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 36, current caching DataBlock (%): 28
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 37, current caching DataBlock (%): 29
LruBlockCache: BlockCache evicted (MB): 458, overhead (%): 129, heavy eviction counter: 38, current caching DataBlock (%): 26
LruBlockCache: BlockCache evicted (MB): 344, overhead (%): 72, heavy eviction counter: 39, current caching DataBlock (%): 25
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 40, current caching DataBlock (%): 26

This shows how the eviction process works. We can see the same place on the second graph (at the beginning):
!eviction_100p.png!
And of course there is much less GC when we use the feature:
!gc_100p.png!
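The trajectory in the log above can be reproduced with a toy simulation of the feedback loop. To be clear, this is a sketch, not the real eviction thread: the per-period eviction volumes in main() are invented, and the two limits are the test values from above (200 MB, count limit 3):
{code:java}
// Toy simulation (not the actual eviction thread) of the autoscaling rules
// shown in run() above.
public class AutoScaleSketch {

  static final long MB_SIZE_LIMIT = 200; // hbase.lru.cache.heavy.eviction.mb.size.limit
  static final int COUNT_LIMIT = 3;      // hbase.lru.cache.heavy.eviction.count.limit

  int cacheDataBlockPercent = 100;
  int heavyEvictionCount = 0;

  // One ~10-second period; mbFreedSum is how many MB were evicted in it.
  void onPeriod(long mbFreedSum) {
    long overheadPercent = mbFreedSum * 100 / MB_SIZE_LIMIT - 100;
    if (mbFreedSum > MB_SIZE_LIMIT) {
      heavyEvictionCount++;
      if (heavyEvictionCount > COUNT_LIMIT) {
        if (overheadPercent > 100) {
          cacheDataBlockPercent -= 3;  // far above the limit: step down fast
        } else if (overheadPercent > 50) {
          cacheDataBlockPercent -= 1;  // moderately above: step down slowly
        } else if (overheadPercent < 30) {
          cacheDataBlockPercent += 1;  // close to the limit: creep back up
        }
      }
    } else if (mbFreedSum > MB_SIZE_LIMIT * 0.5 && cacheDataBlockPercent < 50) {
      cacheDataBlockPercent += 5;      // likely just a fluctuation, do not reset
    } else {
      heavyEvictionCount = 0;          // heavy reading is over: cache everything
      cacheDataBlockPercent = 100;
    }
    System.out.println("evicted (MB): " + mbFreedSum + ", overhead (%): "
        + overheadPercent + ", caching DataBlock (%): " + cacheDataBlockPercent);
  }

  public static void main(String[] args) {
    AutoScaleSketch s = new AutoScaleSketch();
    // Invented load: idle, ramp-up, heavy eviction, settling, load ends.
    long[] mbFreedPerPeriod = {0, 229, 6958, 8117, 8713, 2179, 688, 344, 229, 0};
    for (long mb : mbFreedPerPeriod) {
      s.onPeriod(mb);
    }
  }
}
{code}
Running it shows the same qualitative behavior as the log: the percent stays at 100 until the counter exceeds the limit, steps down by 3 while evictions are far above the limit, oscillates by +/-1 near it, and resets to 100 once the load disappears.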
So, I collected all the YCSB results into tables:

| |*original*|*feature*|*%*|
|tbl1-u (ops/sec)|33,191|46,587|140|
|tbl2-u (ops/sec)|41,959|62,695|149|
|tbl3-u (ops/sec)|41,485|61,407|148|
|tbl4-u (ops/sec)|382|638|167|
|tbl1-z (ops/sec)|51,077|60,264|118|
|tbl2-z (ops/sec)|57,103|70,809|124|
|tbl3-z (ops/sec)|59,796|69,426|116|
|tbl4-z (ops/sec)|500|724|145|
|tbl1-l (ops/sec)|71,857|77,682|108|
|tbl2-l (ops/sec)|74,836|82,893|111|
|tbl3-l (ops/sec)|74,573|78,871|106|
|tbl4-l (ops/sec)|647|821|127|

| |*original*|*feature*|*%*|
|tbl1-u AverageLatency(us)|1,503|1,071|71|
|tbl2-u AverageLatency(us)|1,189|795|67|
|tbl3-u AverageLatency(us)|1,203|812|68|
|tbl4-u AverageLatency(us)|65,285|39,134|60|
|tbl1-z AverageLatency(us)|976|827|85|
|tbl2-z AverageLatency(us)|873|704|81|
|tbl3-z AverageLatency(us)|834|718|86|
|tbl4-z AverageLatency(us)|49,831|34,435|69|
|tbl1-l AverageLatency(us)|694|641|92|
|tbl2-l AverageLatency(us)|666|601|90|
|tbl3-l AverageLatency(us)|668|632|95|
|tbl4-l AverageLatency(us)|38,501|30,342|79|

| |*original*|*feature*|*%*|
|tbl1-u 95thPercentileLatency(us)|2,231|2,071|93|
|tbl2-u 95thPercentileLatency(us)|1,134|1,044|92|
|tbl3-u 95thPercentileLatency(us)|1,274|1,136|89|
|tbl4-u 95thPercentileLatency(us)|340,991|54,111|16|
|tbl1-z 95thPercentileLatency(us)|1,459|1,521|104|
|tbl2-z 95thPercentileLatency(us)|891|896|101|
|tbl3-z 95thPercentileLatency(us)|931|968|104|
|tbl4-z 95thPercentileLatency(us)|316,159|55,135|17|
|tbl1-l 95thPercentileLatency(us)|992|997|101|
|tbl2-l 95thPercentileLatency(us)|773|746|97|
|tbl3-l 95thPercentileLatency(us)|801|833|104|
|tbl4-l 95thPercentileLatency(us)|67,583|54,143|80|

> BlockCache performance improve by reduce eviction rate
> ------------------------------------------------------
>
>                 Key: HBASE-23887
>                 URL: https://issues.apache.org/jira/browse/HBASE-23887
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache, Performance
>            Reporter: Danil Lipovoy
>            Priority: Minor
>         Attachments: 1582787018434_rs_metrics.jpg, 1582801838065_rs_metrics_new.png, BC_LongRun.png, BlockCacheEvictionProcess.gif, cmp.png, evict_BC100_vs_BC23.png, eviction_100p.png, gc_100p.png, read_requests_100pBC_vs_23pBC.png, requests_100p.png
>
> Hi!
> This is my first time here, please correct me if something is wrong.
> I want to propose a way to improve performance when the data in the HFiles is much larger than the BlockCache (a usual story in BigData). The idea: cache only part of the DATA blocks. This is good because the LruBlockCache keeps working and we save a huge amount of GC.
> Sometimes we have more data than can fit into the BlockCache, and that causes a high eviction rate. In this case we can skip caching block N and instead cache block N+1. We would evict block N quite soon anyway, which is why skipping it is good for performance.
> Example:
> Imagine we have a little cache that can fit only 1 block, and we are trying to read 3 blocks with offsets:
> 124
> 198
> 223
> The current way: we put block 124, then put 198 and evict 124, then put 223 and evict 198. A lot of work (5 actions).
> With the feature: the last few digits of the offsets are evenly distributed from 0 to 99. When we take them modulo 100 we get:
> 124 -> 24
> 198 -> 98
> 223 -> 23
> This lets us split them. Some part, for example those below 50 (if we set *hbase.lru.cache.data.block.percent* = 50), go into the cache, and the others are skipped. It means we will not try to handle block 198 at all and save the CPU for other work. As a result we put block 124, then put 223 and evict 124 (3 actions).
> See the picture in the attachments with the test below. Requests per second are higher, GC is lower.
>
> The key point of the code:
> Added the parameter *hbase.lru.cache.data.block.percent*, which is 100 by default.
>
> But if we set it to 1-99, the following logic kicks in:
>
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
>       return;
>     }
>   }
>   ...
>   // the same code as usual
> }
> {code}
>
> Other parameters help to control when this logic is enabled, so it works only while heavy reading is going on:
> hbase.lru.cache.heavy.eviction.count.limit - how many eviction runs in a row have to happen before we start skipping blocks
> hbase.lru.cache.heavy.eviction.bytes.size.limit - how many bytes have to be evicted each run before we start skipping blocks
> By default: if eviction runs 10 times (100 seconds) and evicts more than 10 MB each time, then we start to skip 50% of data blocks.
> When the heavy eviction process ends, the new logic switches off and all blocks are put into the BlockCache again.
>
> Description of the test:
> 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 4 RegionServers
> 4 tables by 64 regions by 1.88 Gb data in each = 600 Gb total (only FAST_DIFF)
> Total BlockCache Size = 48 Gb (8% of the data in HFiles)
> Random read in 20 threads
>
> I am going to make a Pull Request; I hope this is the right way to make a contribution to this cool product.
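For completeness, the knobs discussed above are ordinary HBase configuration properties. A minimal sketch of setting them with the values used in the tests in this comment (the names are the ones used here; note the description above spells the second one hbase.lru.cache.heavy.eviction.bytes.size.limit, since the patch evolved). They are server-side settings, so in practice they would go into hbase-site.xml on the RegionServers rather than be set programmatically:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Illustration only: shows the parameter names and the values used in the
// tests above; on a real cluster these belong in hbase-site.xml.
public class TuningSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Start skipping DataBlocks only after more than 3 heavy eviction runs in a row.
    conf.setInt("hbase.lru.cache.heavy.eviction.count.limit", 3);
    // A run counts as "heavy" when more than 200 MB are evicted per period.
    conf.setLong("hbase.lru.cache.heavy.eviction.mb.size.limit", 200L);
  }
}
{code}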