[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110503#comment-17110503 ]
Danil Lipovoy edited comment on HBASE-23887 at 5/18/20, 6:05 PM:
-----------------------------------------------------------------
Hi guys!

I was thinking about a weak point - we have to set *hbase.lru.cache.data.block.percent* and we can't be sure the value is right (we can only approximate it). What do you think about the following approach: use just the parameter *hbase.lru.cache.heavy.eviction.bytes.size.limit* and calculate online how many bytes are evicted above it, then use this information to decide how aggressive the reduction should be.

For example, we set *hbase.lru.cache.heavy.eviction.bytes.size.limit* = 200 Mb. When heavy reading starts, the real eviction volume can be 500 Mb per 10 seconds (total). Then we calculate 500 * 100 / 200 = 250%. If the value is more than 100%, we start to skip 5% of blocks. In the next 10 seconds the real eviction volume should be ~475 Mb (500 * 0.95). Calculate the value: 475 * 100 / 200 = 238%. Still too much, so we skip 5% more -> 10% of blocks. In the next 10 seconds the real eviction volume should be ~450 Mb (500 * 0.9), and so on. After 8 such iterations we get down to 300 Mb, which is 150%. From there we can use a less aggressive reduction: just 1% instead of 5%. So we skip 41% and get 295 Mb (500 * 0.59). Calculate the value: 295 * 100 / 200 = 148%. We keep reducing until we reach 130%. There we can stop, because if we dropped below 100% then *hbase.lru.cache.heavy.eviction.count.limit* would be set to 0 and reset everything to the initial state (no skipping at all).

||Time (sec)||Evicted (Mb)||Skip (%)||Above limit (%)|| ||
|0|500|0|250| |
|10|475|5|238|< 5% reduction|
|20|450|10|225| |
|30|425|15|213| |
|40|400|20|200| |
|50|375|25|188| |
|60|350|30|175| |
|70|325|35|163| |
|80|300|40|150| |
|90|295|41|148|< start by 1%|
|100|290|42|145| |
|110|285|43|143| |
|120|280|44|140| |
|130|275|45|138| |
|140|270|46|135| |
|150|265|47|133| |
|160|260|48|130|< enough|
|170|260|48|130| |

What do you think?
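The feedback loop above can be sketched as a toy simulation (illustrative Java only, not HBase code; the class, method, and constant names are all hypothetical, and the 500 Mb baseline and 200 Mb limit are the numbers from the example):

```java
// Toy simulation of the proposed adaptive skip-percent feedback loop.
// All names are hypothetical; only the arithmetic mirrors the comment.
public class AdaptiveSkipSketch {
    // hbase.lru.cache.heavy.eviction.bytes.size.limit from the example
    static final double LIMIT_MB = 200.0;
    // eviction volume per 10-second period with no skipping
    static final double BASE_EVICTED_MB = 500.0;

    // Runs the loop until the eviction volume is within 130% of the limit
    // and returns the skip percent it settles at.
    public static int settleSkipPercent() {
        int skip = 0;
        while (true) {
            double evicted = BASE_EVICTED_MB * (100 - skip) / 100.0;
            double aboveLimit = evicted * 100.0 / LIMIT_MB; // e.g. 250 means 250%
            if (aboveLimit > 150) {
                skip += 5;       // far above the limit: aggressive reduction
            } else if (aboveLimit > 130) {
                skip += 1;       // close to the target band: gentle reduction
            } else {
                return skip;     // at or below 130%: stop adjusting
            }
        }
    }

    public static void main(String[] args) {
        // Settles at skip = 48%, i.e. evicting 260 Mb = 130% of the limit,
        // matching the last rows of the table above.
        System.out.println(settleSkipPercent());
    }
}
```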
> BlockCache performance improve by reduce eviction rate
> ------------------------------------------------------
>
>                 Key: HBASE-23887
>                 URL: https://issues.apache.org/jira/browse/HBASE-23887
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache, Performance
>            Reporter: Danil Lipovoy
>            Priority: Minor
>         Attachments: 1582787018434_rs_metrics.jpg, 1582801838065_rs_metrics_new.png, BC_LongRun.png, BlockCacheEvictionProcess.gif, cmp.png, evict_BC100_vs_BC23.png, read_requests_100pBC_vs_23pBC.png
>
> Hi!
> I am here for the first time, please correct me if something is wrong.
> I want to propose how to improve performance when the data in HFiles is much larger than the BlockCache (a usual story in Big Data). The idea: cache only part of the DATA blocks. This is good because LruBlockCache keeps working and we save a huge amount of GC.
> Sometimes we have more data than can fit into the BlockCache, and this causes a high rate of evictions. In this case we can skip caching block N and instead cache the (N+1)th block. We would evict block N quite soon anyway, and that is why skipping it is good for performance.
> Example:
> Imagine we have a little cache that can fit only 1 block, and we are trying to read 3 blocks with offsets:
> 124
> 198
> 223
> Current way: we put block 124, then put 198 and evict 124, then put 223 and evict 198. A lot of work (5 actions).
> With the feature: the last few digits of the offsets are evenly distributed from 0 to 99. When we take the offset modulo 100 we get:
> 124 -> 24
> 198 -> 98
> 223 -> 23
> This lets us sort them. Some part, for example those below 50 (if we set *hbase.lru.cache.data.block.percent* = 50), go into the cache, and we skip the others. It means we will not try to handle block 198 and we save CPU for other work. As a result we put block 124, then put 223 and evict 124 (3 actions).
> See the picture in the attachment with the test below. Requests per second is higher, GC is lower.
>
> The key point of the code:
> Added the parameter *hbase.lru.cache.data.block.percent*, which by default = 100.
>
> But if we set it to 1-99, then the following logic will work:
>
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
>       return;
>     }
>   }
>   ...
>   // the same code as usual
> }
> {code}
>
> Other parameters help to control when this logic is enabled, so that it works only while heavy reading is going on:
> hbase.lru.cache.heavy.eviction.count.limit - sets how many times the eviction process has to run before we start to avoid putting data into the BlockCache.
> hbase.lru.cache.heavy.eviction.bytes.size.limit - sets how many bytes have to be evicted each time before we start to avoid putting data into the BlockCache.
> By default: if 10 times in a row (100 seconds) more than 10 MB was evicted each time, then we start to skip 50% of data blocks.
> When the heavy eviction process ends, the new logic switches off and all blocks are put into the BlockCache again.
>
> Description of the test:
> 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 4 RegionServers
> 4 tables by 64 regions by 1.88 Gb data in each = 600 Gb total (only FAST_DIFF)
> Total BlockCache Size = 48 Gb (8% of data in HFiles)
> Random read in 20 threads
>
> I am going to make a Pull Request; I hope this is the right way to make a contribution to this cool product.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
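The worked example from the description (a one-block cache, offsets 124/198/223, *hbase.lru.cache.data.block.percent* = 50) can be checked with a toy model. This is illustrative Java only, not the actual LruBlockCache code; the class, the method name, and the one-block-cache model are all hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the offset-modulo filter: a cache that holds a single block,
// counting put/evict actions with and without the skip logic.
public class ModuloFilterSketch {
    static int countActions(long[] offsets, int cacheDataBlockPercent) {
        Deque<Long> cache = new ArrayDeque<>(); // models a one-block cache
        int actions = 0;
        for (long offset : offsets) {
            if (cacheDataBlockPercent != 100 && offset % 100 >= cacheDataBlockPercent) {
                continue; // skip this data block entirely, as in cacheBlock()
            }
            if (cache.size() == 1) {
                cache.poll();   // evict the previously cached block
                actions++;
            }
            cache.add(offset);  // put the new block
            actions++;
        }
        return actions;
    }

    public static void main(String[] args) {
        long[] offsets = {124, 198, 223};
        // percent = 100: put 124, put 198 + evict 124, put 223 + evict 198
        System.out.println(countActions(offsets, 100)); // 5 actions
        // percent = 50: 198 % 100 = 98 >= 50, so 198 is skipped
        System.out.println(countActions(offsets, 50));  // 3 actions
    }
}
```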