[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281245#comment-17281245 ]

Danil Lipovoy edited comment on HBASE-23887 at 2/8/21, 5:34 PM:
[~vjasani] could you please take a look at the PR [https://github.com/apache/hbase/pull/2934] ?

was (Author: pustota):
[~vjasani] could you please take a look at the PR [https://github.com/apache/hbase/pull/2934] ? There were some problems with the rebase of the previous PR, so I made a new one.

> BlockCache performance improve by reduce eviction rate
> --
>
> Key: HBASE-23887
> URL: https://issues.apache.org/jira/browse/HBASE-23887
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache, Performance
> Reporter: Danil Lipovoy
> Assignee: Danil Lipovoy
> Priority: Minor
> Attachments: 1582787018434_rs_metrics.jpg,
> 1582801838065_rs_metrics_new.png, BC_LongRun.png,
> BlockCacheEvictionProcess.gif, BlockCacheEvictionProcess.gif, PR#1257.diff,
> cmp.png, evict_BC100_vs_BC23.png, eviction_100p.png, eviction_100p.png,
> eviction_100p.png, gc_100p.png, graph.png, image-2020-06-07-08-11-11-929.png,
> image-2020-06-07-08-19-00-922.png, image-2020-06-07-12-07-24-903.png,
> image-2020-06-07-12-07-30-307.png, image-2020-06-08-17-38-45-159.png,
> image-2020-06-08-17-38-52-579.png, image-2020-06-08-18-35-48-366.png,
> image-2020-06-14-20-51-11-905.png, image-2020-06-22-05-57-45-578.png,
> image-2020-09-23-09-48-59-714.png, image-2020-09-23-10-06-11-189.png,
> ratio.png, ratio2.png, read_requests_100pBC_vs_23pBC.png, requests_100p.png,
> requests_100p.png, requests_new2_100p.png, requests_new_100p.png, scan.png,
> scan_and_gets.png, scan_and_gets2.png, wave.png, ycsb_logs.zip
>
>
> Hi!
> It is my first time here, so please correct me if something is wrong.
> All the latest information is here:
> [https://docs.google.com/document/d/1X8jVnK_3lp9ibpX6lnISf_He-6xrHZL0jQQ7hoTV0-g/edit?usp=sharing]
> I want to propose a way to improve performance when the data in HFiles is much larger than the
> BlockCache (a usual story in BigData). The idea is to cache only part of the DATA
> blocks. It is good because LruBlockCache starts to work and saves a huge amount
> of GC.
> Sometimes we have more data than can fit into the BlockCache, and this causes a
> high rate of evictions. In this case we can skip caching block N and instead
> cache the (N+1)th block. We would evict block N quite soon anyway, which is why
> skipping it is good for performance.
> ---
> Some information below isn't actual
> ---
>
>
> Example:
> Imagine we have a little cache that can fit only 1 block, and we are trying to
> read 3 blocks with offsets:
> 124
> 198
> 223
> Current way: we put block 124, then put 198, evict 124, put 223, evict
> 198. A lot of work (5 actions).
> With the feature: the last few digits of an offset are evenly distributed from 0 to 99. When we
> take the offset modulo 100 we get:
> 124 -> 24
> 198 -> 98
> 223 -> 23
> This lets us sort the blocks. Some part, for example those below 50 (if we set
> *hbase.lru.cache.data.block.percent* = 50), go into the cache, and we skip the
> others. It means we will not try to handle block 198, saving the CPU for
> other work. As a result we put block 124, then put 223, evict 124 (3
> actions).
> See the picture in the attachment with the test below. Requests per second are higher,
> GC is lower.
>
> The key point of the code:
> Added the parameter *hbase.lru.cache.data.block.percent*, which by default =
> 100.
>
> But if we set it to 1-99, then the following logic applies:
>
>
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
>       return; // skip caching this data block
>     }
>   }
>   ...
>   // the same code as usual
> }
> {code}
>
> Other parameters help control when this logic is enabled, so that it only
> works while heavy reading is going on:
> hbase.lru.cache.heavy.eviction.count.limit - how many times the eviction process
> has to run before we start avoiding putting data into the BlockCache
> hbase.lru.cache.heavy.eviction.bytes.size.limit - how many bytes have to be
> evicted each time before we start avoiding putting data into the BlockCache
> By default: if 10 times in a row (100 seconds) more than 10 MB was evicted each time,
> then we start to skip 50% of data blocks.
> When the heavy eviction process ends, the new logic switches off and all blocks are
> put into the BlockCache again.
>
> Description of the test:
> 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 4 RegionServers
> 4 tables by 64 regions by 1.88 Gb data in each = 600 Gb total (only FAST_DIFF)
> Total BlockCache Size = 48 Gb (8 % of data in HFiles)
> Random read in 20 threads
>
> I am going to ma
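Putting the two mechanisms above together, here is a minimal, self-contained Java sketch of the described behavior: the heavy-eviction detector that switches the admission filter on, plus the offset-modulo filter itself. Field and method names here are illustrative, not the actual HBase ones.

```java
// A sketch of the proposed adaptive admission logic. Names are illustrative,
// not the actual HBase ones; the real logic lives inside LruBlockCache.
public class AdaptiveAdmissionSketch {
    int cacheDataBlockPercent = 100;           // hbase.lru.cache.data.block.percent (100 = admit all)
    final int heavyEvictionCountLimit = 10;    // hbase.lru.cache.heavy.eviction.count.limit
    final long heavyEvictionBytesLimit = 10L * 1024 * 1024; // ...heavy.eviction.bytes.size.limit
    int heavyEvictionCount = 0;

    /** Called after each eviction pass with the number of bytes it freed. */
    public void onEvictionPass(long bytesFreed) {
        if (bytesFreed > heavyEvictionBytesLimit) {
            if (++heavyEvictionCount >= heavyEvictionCountLimit) {
                cacheDataBlockPercent = 50;    // heavy reading: skip half of the data blocks
            }
        } else {
            heavyEvictionCount = 0;
            cacheDataBlockPercent = 100;       // pressure is gone: admit everything again
        }
    }

    /** The admission check from the snippet above: admit unless the filter is
     *  on and this data block's offset falls into the skipped part of 0-99. */
    public boolean admit(long offset, boolean isDataBlock) {
        return cacheDataBlockPercent == 100
            || !isDataBlock
            || offset % 100 < cacheDataBlockPercent;
    }
}
```

With the example offsets: after ten heavy eviction passes, `admit(124, true)` and `admit(223, true)` return true (24 and 23 are below 50) while `admit(198, true)` returns false (98 is not), which is exactly the 3-actions-instead-of-5 scenario.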
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260764#comment-17260764 ]

Viraj Jasani edited comment on HBASE-23887 at 1/8/21, 5:43 AM:
---
[~pustota] Let's just make this an extended LruCache rather than making changes in LruCache directly (as suggested earlier by many reviewers). This is what you can do:
# In hbase-server, create a new class AdaptiveLruBlockCache (package org.apache.hadoop.hbase.io.hfile) that implements FirstLevelBlockCache, and keep it InterfaceAudience.Private (just like LruBlockCache).
# Copy the entire code from LruBlockCache to AdaptiveLruBlockCache (just update references of LruBlockCache to AdaptiveLruBlockCache).
# Add Javadoc to the AdaptiveLruBlockCache class (provide all details about the perf improvement, how it is 300% faster, for which kind of distribution one should choose it, etc).
# BlockCacheFactory has the method createFirstLevelCache(); add one more option there for the adaptive block cache, with some value (say "adaptiveLRU"). In that method, the value "LRU" initializes LruBlockCache and "TinyLFU" initializes TinyLfuBlockCache; similarly, "adaptiveLRU" should initialize AdaptiveLruBlockCache.
# Make all the changes that you have done in the current PR#1257 to LruBlockCache in AdaptiveLruBlockCache instead, and keep LruBlockCache unchanged.
# Provide some documents in dev-support/design-docs, as Sean has mentioned above. You can also refer to those docs in the Javadoc of AdaptiveLruBlockCache.

If you follow this, I don't think you will need to make any changes in CombinedBlockCache, InclusiveCombinedBlockCache, or CacheConfig. Let's get your changes in as a new Lru BlockCache at least. If you feel certain configurable changes should also be done in the LruBlockCache class, we can consider that as part of a separate Jira. We do not wish to block your changes.
However, since this is changing the way we cache (of course it is an improved version), it had better go in as a configurable opt-in feature. With the changes mentioned in step 4 above, the user can choose this new FirstLevelBlockCache implementation (an improved LruBlockCache) by providing the value "adaptiveLRU" to the config "hfile.block.cache.policy", and that's it. As an example, please take a look at how TinyLfuBlockCache is implemented and how it is instantiated (as a configurable cache). It does not require any changes in CombinedBlockCache or InclusiveCombinedBlockCache because we are just providing a new L1 cache. Let me know what you think. Thanks for working on this. I know it's been a lot of time; let's get your changes in.
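As a rough illustration of step 4 above, the factory dispatch could look like the following sketch. The stub types below only mirror the names used in this comment; the real createFirstLevelCache in BlockCacheFactory reads the policy from the HBase Configuration and may differ in detail.

```java
// Hypothetical sketch of step 4: teaching the first-level cache factory to
// recognize "adaptiveLRU". The interface and classes are stand-in stubs, not
// the real HBase types.
interface FirstLevelBlockCache { }

class LruBlockCacheStub implements FirstLevelBlockCache { }
class TinyLfuBlockCacheStub implements FirstLevelBlockCache { }
class AdaptiveLruBlockCacheStub implements FirstLevelBlockCache { }

public class BlockCacheFactorySketch {
    /** Picks the L1 implementation from the "hfile.block.cache.policy" value. */
    public static FirstLevelBlockCache createFirstLevelCache(String policy) {
        switch (policy) {
            case "LRU":         return new LruBlockCacheStub();
            case "TinyLFU":     return new TinyLfuBlockCacheStub();
            case "adaptiveLRU": return new AdaptiveLruBlockCacheStub(); // the new opt-in cache
            default:
                throw new IllegalArgumentException("Unknown block cache policy: " + policy);
        }
    }
}
```

Because the selection happens in one place, the new cache stays a pure opt-in: clusters that never set "adaptiveLRU" keep the unchanged LruBlockCache.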
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210050#comment-17210050 ]

Danil Lipovoy edited comment on HBASE-23887 at 10/8/20, 8:04 AM:
---
Looks like nobody wants to develop HBASE-25123 (I think it would make the code much more complicated), so it is easier and simpler to just merge this PR. All the complicated math is in one function, EvictionThread->run(), and it is well documented.
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204523#comment-17204523 ]

Danil Lipovoy edited comment on HBASE-23887 at 9/30/20, 10:52 AM:
--
Thank you for your interest) I created HBASE-25123, and maybe somebody will implement the possibility to set the classes of the L1 realisation.

was (Author: pustota):
Thank you for your interest) But it looks like the feature will never be merged, because it has dragged on for 7 months and nothing has happened. We are just discussing what would be good to do (more performance tests, resolving conflicts, etc) and I can't see the end of it. I don't understand how open source development works: sometimes somebody creates an issue like HBASE-24915, which saves 1% CPU, and it is applied without discussion in 1 day. But in this case, although it saves a huge amount of CPU, something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor" ;)) So I created HBASE-25123, and maybe somebody will implement the possibility to set the classes of the L1 realisation.
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204523#comment-17204523 ] Danil Lipovoy edited comment on HBASE-23887 at 9/30/20, 10:49 AM: -- Thank you for your interest) But it looks like the feature will never merged because it has dragged on for 7 month and nothing happen. We are just discussing what would be good to do (more performance tests, resolve conflicts etc) and I can't see the end of that. I don't understand how open source development works, sometimes some created issue like HBASE-24915 which save 1% CPU and it is apply without discussion in 1 day. But in this case although is saving huge amount of CPU but something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor";)) So I created HBASE-25123 and maybe somebody will release possibility to set classes L1 realisation. was (Author: pustota): Thank you for your interest) But it looks like the feature will never merged because it has dragged on for 7 month and more then dozen developers watching on this and nothing happen. We are just discussing what would be good to do (more performance tests, resolve conflicts etc) and I can't see the end of that. I don't understand how open source development works, sometimes some created issue like HBASE-24915 which save 1% CPU and it is apply without discussion in 1 day. But in this case although is saving huge amount of CPU but something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor";)) So I created HBASE-25123 and maybe somebody will release possibility to set classes L1 realisation. 
> BlockCache performance improve by reduce eviction rate > -- > > Key: HBASE-23887 > URL: https://issues.apache.org/jira/browse/HBASE-23887 > Project: HBase > Issue Type: Improvement > Components: BlockCache, Performance >Reporter: Danil Lipovoy >Assignee: Danil Lipovoy >Priority: Minor > Attachments: 1582787018434_rs_metrics.jpg, > 1582801838065_rs_metrics_new.png, BC_LongRun.png, > BlockCacheEvictionProcess.gif, BlockCacheEvictionProcess.gif, cmp.png, > evict_BC100_vs_BC23.png, eviction_100p.png, eviction_100p.png, > eviction_100p.png, gc_100p.png, graph.png, image-2020-06-07-08-11-11-929.png, > image-2020-06-07-08-19-00-922.png, image-2020-06-07-12-07-24-903.png, > image-2020-06-07-12-07-30-307.png, image-2020-06-08-17-38-45-159.png, > image-2020-06-08-17-38-52-579.png, image-2020-06-08-18-35-48-366.png, > image-2020-06-14-20-51-11-905.png, image-2020-06-22-05-57-45-578.png, > image-2020-09-23-09-48-59-714.png, image-2020-09-23-10-06-11-189.png, > ratio.png, ratio2.png, read_requests_100pBC_vs_23pBC.png, requests_100p.png, > requests_100p.png, requests_new2_100p.png, requests_new_100p.png, scan.png, > scan_and_gets.png, scan_and_gets2.png, wave.png, ycsb_logs.zip > > > Hi! > I first time here, correct me please if something wrong. > All latest information is here: > [https://docs.google.com/document/d/1X8jVnK_3lp9ibpX6lnISf_He-6xrHZL0jQQ7hoTV0-g/edit?usp=sharing] > I want propose how to improve performance when data in HFiles much more than > BlockChache (usual story in BigData). The idea - caching only part of DATA > blocks. It is good becouse LruBlockCache starts to work and save huge amount > of GC. > Sometimes we have more data than can fit into BlockCache and it is cause a > high rate of evictions. In this case we can skip cache a block N and insted > cache the N+1th block. Anyway we would evict N block quite soon and that why > that skipping good for performance. 
> --- > Some information below isn't actual > --- > > > Example: > Imagine we have little cache, just can fit only 1 block and we are trying to > read 3 blocks with offsets: > 124 > 198 > 223 > Current way - we put the block 124, then put 198, evict 124, put 223, evict > 198. A lot of work (5 actions). > With the feature - last few digits evenly distributed from 0 to 99. When we > divide by modulus we got: > 124 -> 24 > 198 -> 98 > 223 -> 23 > It helps to sort them. Some part, for example below 50 (if we set > *hbase.lru.cache.data.block.percent* = 50) go into the cache. And skip > others. It means we will not try to handle the block 198 and save CPU for > other job. In the result - we put block 124, then put 223, evict 124 (3 > actions). > See the picture in attachment with test below. Requests per second is higher, > GC is lower. > > The key point of the code: > Added the parameter: *hbase.lru.cache.data.block.percent* which by default = > 100 > > But if we set it 1-99, then will work the next logic: > > > {code:java} > public void cacheBlo
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204523#comment-17204523 ] Danil Lipovoy edited comment on HBASE-23887 at 9/30/20, 10:44 AM: -- Thank you for your interest) But it looks like the feature will never merged because it has dragged on for 7 month and more then dozen developers watching on this and nothing happen. We are just discussing what would be good to do (more performance tests, resolve conflicts etc) and I can't see the end of that. I don't understand how open source development works, sometimes some created issue like HBASE-24915 which save 1% CPU and it is apply without discussion in 1 day. But in this case although is saving huge amount of CPU but something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor";)) So I created HBASE-25123 and maybe somebody will release possibility to set classes L1 realisation. was (Author: pustota): Thank you for your interest) But it looks like the feature will never merged because it has dragged on for 7 month and more then dozen developers watching on this and nothing happen. We are just discussing what would be good to do (more performance tests, resolve conflicts etc) and I can't see the end of that. I don't understand how open source development works, sometimes some created issue like HBASE-24915 which save 1% CPU and it is apply without discussion in 1 day. But in this case although is saving huge amount of CPU but something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor";)) So I created HBASE-25123 and maybe somebody released possibility to set classes L1 realisation. 
> BlockCache performance improve by reduce eviction rate > -- > > Key: HBASE-23887 > URL: https://issues.apache.org/jira/browse/HBASE-23887 > Project: HBase > Issue Type: Improvement > Components: BlockCache, Performance >Reporter: Danil Lipovoy >Assignee: Danil Lipovoy >Priority: Minor > Attachments: 1582787018434_rs_metrics.jpg, > 1582801838065_rs_metrics_new.png, BC_LongRun.png, > BlockCacheEvictionProcess.gif, BlockCacheEvictionProcess.gif, cmp.png, > evict_BC100_vs_BC23.png, eviction_100p.png, eviction_100p.png, > eviction_100p.png, gc_100p.png, graph.png, image-2020-06-07-08-11-11-929.png, > image-2020-06-07-08-19-00-922.png, image-2020-06-07-12-07-24-903.png, > image-2020-06-07-12-07-30-307.png, image-2020-06-08-17-38-45-159.png, > image-2020-06-08-17-38-52-579.png, image-2020-06-08-18-35-48-366.png, > image-2020-06-14-20-51-11-905.png, image-2020-06-22-05-57-45-578.png, > image-2020-09-23-09-48-59-714.png, image-2020-09-23-10-06-11-189.png, > ratio.png, ratio2.png, read_requests_100pBC_vs_23pBC.png, requests_100p.png, > requests_100p.png, requests_new2_100p.png, requests_new_100p.png, scan.png, > scan_and_gets.png, scan_and_gets2.png, wave.png, ycsb_logs.zip > > > Hi! > I first time here, correct me please if something wrong. > All latest information is here: > [https://docs.google.com/document/d/1X8jVnK_3lp9ibpX6lnISf_He-6xrHZL0jQQ7hoTV0-g/edit?usp=sharing] > I want propose how to improve performance when data in HFiles much more than > BlockChache (usual story in BigData). The idea - caching only part of DATA > blocks. It is good becouse LruBlockCache starts to work and save huge amount > of GC. > Sometimes we have more data than can fit into BlockCache and it is cause a > high rate of evictions. In this case we can skip cache a block N and insted > cache the N+1th block. Anyway we would evict N block quite soon and that why > that skipping good for performance. 
> ---
> Some information below is out of date
> ---
>
> Example:
> Imagine we have a little cache that can fit only 1 block, and we are trying to read 3 blocks with offsets:
> 124
> 198
> 223
> The current way: we put block 124, then put 198, evict 124, put 223, evict 198. A lot of work (5 actions).
> With the feature: the last few digits of the offsets are evenly distributed from 0 to 99. When we take the offsets modulo 100 we get:
> 124 -> 24
> 198 -> 98
> 223 -> 23
> This helps to partition them. Some part, for example those below 50 (if we set *hbase.lru.cache.data.block.percent* = 50), go into the cache, and we skip the others. That means we will not try to handle block 198 at all and save the CPU for other work. As a result we put block 124, then put 223, evict 124 (3 actions).
> See the picture in the attachment with the test below. Requests per second are higher, GC is lower.
>
> The key point of the code:
> Added the parameter *hbase.lru.cache.data.block.percent*, which by default = 100.
> But if we set it to 1-99, the following logic kicks in:
>
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData())
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent)
>       return;
>   ...
>   // the same code as usual
> }
> {code}
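The admission check described above can be sketched as a small standalone class (the class and method names here are illustrative, not the actual HBase patch; only the `offset % 100` rule and the 100-means-cache-everything default come from the description):

```java
// Hypothetical sketch of the offset-modulo cache admission rule.
public class CacheAdmissionSketch {
    private final int cacheDataBlockPercent; // 1-100; 100 = cache every block

    public CacheAdmissionSketch(int cacheDataBlockPercent) {
        this.cacheDataBlockPercent = cacheDataBlockPercent;
    }

    /** Returns true when the block at this offset should go into the cache. */
    public boolean shouldCache(long offset, boolean isDataBlock) {
        // Only DATA blocks are filtered; index/bloom blocks are always cached,
        // and 100% mode disables the filter entirely.
        if (cacheDataBlockPercent == 100 || !isDataBlock) {
            return true;
        }
        // Block offsets are roughly uniform mod 100, so this admits about
        // cacheDataBlockPercent percent of data blocks.
        return offset % 100 < cacheDataBlockPercent;
    }

    public static void main(String[] args) {
        CacheAdmissionSketch sketch = new CacheAdmissionSketch(50);
        // The offsets from the example above:
        System.out.println(sketch.shouldCache(124, true)); // true  (124 % 100 = 24 < 50)
        System.out.println(sketch.shouldCache(198, true)); // false (198 % 100 = 98 >= 50)
        System.out.println(sketch.shouldCache(223, true)); // true  (223 % 100 = 23 < 50)
    }
}
```

With percent = 50, block 198 is never handled, which is exactly the "3 actions instead of 5" outcome in the example.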
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204523#comment-17204523 ] Danil Lipovoy edited comment on HBASE-23887 at 9/30/20, 10:43 AM: --
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. We are just discussing what would be good to do (more performance tests, resolving conflicts, etc.), and I can't see the end of it. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case, although it saves a huge amount of CPU, something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor" ;)) So I created HBASE-25123 and maybe somebody will release the possibility to set the L1 implementation class.
was (Author: pustota):
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. We are just discussing what would be good for me to do (more performance tests, resolving conflicts, etc.), and I can't see the end of it. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case, although it saves a huge amount of CPU, something went wrong and I have no idea why.
Maybe because HBASE-24915 has priority "major" but I set just "minor" ;))
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204523#comment-17204523 ] Danil Lipovoy edited comment on HBASE-23887 at 9/30/20, 10:29 AM: --
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. We are just discussing what would be good for me to do (more performance tests, resolving conflicts, etc.), and I can't see the end of it. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case, although it saves a huge amount of CPU, something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor" ;))
was (Author: pustota):
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case something went wrong and I have no idea why.
Maybe because HBASE-24915 has priority "major" but I set just "minor" ;))
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204523#comment-17204523 ] Danil Lipovoy edited comment on HBASE-23887 at 9/30/20, 10:31 AM: --
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. We are just discussing what would be good for me to do (more performance tests, resolving conflicts, etc.), and I can't see the end of it. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case, although it saves a huge amount of CPU, something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor" ;))
was (Author: pustota):
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. We are just discussing what would be good for me to do (more performance tests, resolving conflicts, etc.), and I can't see the end of it. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case, although it saves a huge amount of CPU, something went wrong and I have no idea why.
Maybe because HBASE-24915 has priority "major" but I set just "minor" ;))
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204523#comment-17204523 ] Danil Lipovoy edited comment on HBASE-23887 at 9/30/20, 7:48 AM: -
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case something went wrong and I have no idea why. Maybe because HBASE-24915 has priority "major" but I set just "minor" ;))
was (Author: pustota):
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case something went wrong and I have no idea why.
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204523#comment-17204523 ] Danil Lipovoy edited comment on HBASE-23887 at 9/30/20, 7:41 AM: -
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. I don't understand how open-source development works: sometimes someone creates an issue like HBASE-24915, which saves 1% of CPU, and it is applied without discussion in 1 day. But in this case something went wrong and I have no idea why.
was (Author: pustota):
Thank you for your interest! But it looks like the feature will never be merged: it has dragged on for 7 months, more than a dozen developers are watching it, and nothing happens. I don't understand how open-source development works: sometimes someone creates an issue like [HBASE-24915|https://issues.apache.org/jira/browse/HBASE-24915], which saves 1% of CPU, and it is applied without discussion in 3 days. But in this case something went wrong and I have no idea why.
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204279#comment-17204279 ] Sean Busbey edited comment on HBASE-23887 at 9/29/20, 8:44 PM: ---
I don't want to keep pushing work on you, but if this change is adopted as an opt-in feature for a kind of cache, then having that Google doc as a markdown or pdf file in the {{dev-support/design-docs}} area will help a lot when folks need to reason about whether using this feature is worthwhile.
I think this is an interesting approach to skewed key reads. I would expect it to help with zipfian workloads (like YCSB is supposed to produce) because the "should we bother to cache a new block" check is essentially trying to approximate the likelihood that a new read is from the tail of the distribution rather than from the set of frequent items. If something is from the tail, then it's not worth thrashing to chase a cache hit that is very unlikely to come later.
was (Author: busbey):
I don't want to keep pushing work on you, but if this change is adopted as an opt-in feature for a kind of cache, then having that Google doc as a markdown or pdf file in the {{dev-support/design-docs}} area will help a lot when folks need to reason about whether using this feature is worthwhile.
I think this is an interesting approach to skewed key reads. I would expect it to help with zipfian workloads (like YCSB is supposed to produce) because the "should we bother to cache a new block" check is essentially trying to approximate the likelihood that a new read is from the tail of the distribution rather than from the set of frequent items.
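That tail-of-the-distribution reasoning can be checked with a toy experiment. Everything below (the sizes, the seed, the 1/rank sampler, and the `key % 100` admission rule) is an illustrative assumption, not HBase code: it replays a skewed, zipf-like key stream against a small LRU cache, once admitting every miss and once admitting only 30% of misses, and counts how many cache insertions (i.e. eviction churn) each policy causes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

public class ZipfAdmissionSketch {
    static final int KEYS = 10_000;   // distinct blocks
    static final int CAPACITY = 100;  // tiny LRU cache
    static final int READS = 100_000; // length of the access stream

    /** Fixed-capacity LRU built on LinkedHashMap's access order. */
    static Map<Integer, Boolean> lru() {
        return new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > CAPACITY;
            }
        };
    }

    /** Inverse-CDF sampling with weight ~ 1/(rank+1): heavily skewed to low ranks. */
    static int sample(Random rnd, double[] cum) {
        double u = rnd.nextDouble() * cum[cum.length - 1];
        int lo = 0, hi = cum.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cum[mid] < u) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Replays the stream; returns {hits, insertions} for the given admission percent. */
    static long[] run(int admitPercent) {
        double[] cum = new double[KEYS];
        double total = 0;
        for (int r = 0; r < KEYS; r++) { total += 1.0 / (r + 1); cum[r] = total; }
        Random rnd = new Random(42); // fixed seed for repeatability
        Map<Integer, Boolean> cache = lru();
        long hits = 0, inserts = 0;
        for (int i = 0; i < READS; i++) {
            int key = sample(rnd, cum);
            if (cache.get(key) != null) {
                hits++;
            } else if (key % 100 < admitPercent) { // admission filter on a miss
                cache.put(key, true);
                inserts++;
            }
        }
        return new long[] {hits, inserts};
    }

    public static void main(String[] args) {
        long[] full = run(100), part = run(30);
        System.out.println("admit 100%: hits=" + full[0] + ", inserts=" + full[1]);
        System.out.println("admit  30%: hits=" + part[0] + ", inserts=" + part[1]);
    }
}
```

The point of the sketch is the insertion count: since most reads fall in the tail, skipping a fixed fraction of misses sharply reduces churn, which is where the GC and CPU savings in this issue come from.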
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200618#comment-17200618 ] Danil Lipovoy edited comment on HBASE-23887 at 9/23/20, 4:04 PM: -
I am going to write an article about this and have done a Cassandra vs HBase test. I think it could be interesting.
I ran YCSB from 2 hosts (800 threads in total) against tables of the following sizes:
HBase - 300 GB on HDFS (100 GB of pure data)
Cassandra - 250 GB (replication factor = 3)
That means the volumes are approximately the same (HBase a little bigger).
The HBase parameters:
_dfs.client.short.circuit.num = 5 - this is another improvement of mine, https://issues.apache.org/jira/browse/HDFS-15202 - it helps to speed HBase up further_
_hbase.lru.cache.heavy.eviction.count.limit = 30 - it means the patch will kick in after 30 evictions (~5 minutes)_
_hbase.lru.cache.heavy.eviction.mb.size.limit = 300 - a good target for eviction_
So, I aggregated the YCSB logs and put them into Excel:
!image-2020-09-23-10-06-11-189.png!
At the beginning Cassandra is faster than HBase. When _heavy.eviction.count.limit_ passed 30, the feature was enabled and performance became the same.
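For reference, a sketch of how the two cache parameters above would be set in hbase-site.xml (the property names and values are taken from this comment; the snippet itself is illustrative):

{code:xml}
<!-- Sketch of the test configuration described above. -->
<property>
  <name>hbase.lru.cache.heavy.eviction.count.limit</name>
  <!-- enable the feature after ~30 eviction cycles (~5 minutes) -->
  <value>30</value>
</property>
<property>
  <name>hbase.lru.cache.heavy.eviction.mb.size.limit</name>
  <!-- target eviction volume per cycle, in MB -->
  <value>300</value>
</property>
{code}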
How it looks in the RegionServer log:
_2020-09-22 18:31:47,561 INFO org...: BlockCache evicted (MB): 7238, overhead (%): 2312, heavy eviction counter: 21, current caching DataBlock (%): 100_
_2020-09-22 18:31:57,808 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 22, current caching DataBlock (%): 100_
_2020-09-22 18:32:08,051 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 23, current caching DataBlock (%): 100_
_2020-09-22 18:32:18,155 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 24, current caching DataBlock (%): 100_
_2020-09-22 18:32:28,479 INFO org...: BlockCache evicted (MB): 7238, overhead (%): 2312, heavy eviction counter: 25, current caching DataBlock (%): 100_
_2020-09-22 18:32:38,754 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 26, current caching DataBlock (%): 100_
_2020-09-22 18:32:49,334 INFO org...: BlockCache evicted (MB): 7238, overhead (%): 2312, heavy eviction counter: 27, current caching DataBlock (%): 100_
_2020-09-22 18:32:59,712 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 28, current caching DataBlock (%): 100_
_2020-09-22 18:33:10,061 INFO org...: BlockCache evicted (MB): 7238, overhead (%): 2312, heavy eviction counter: 29, current caching DataBlock (%): 100_
_2020-09-22 18:33:20,220 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 30, current caching DataBlock (%): 100 <- the feature is enabled here because we reached hbase.lru.cache.heavy.eviction.count.limit = 30_
_2020-09-22 18:33:30,314 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 31, current caching DataBlock (%): 85_
_2020-09-22 18:33:41,390 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 32, current caching DataBlock (%): 70_
_2020-09-22 18:33:52,281 INFO org...: BlockCache evicted (MB): 5687, overhead (%): 1795, heavy eviction counter: 33, current caching DataBlock (%): 55_
_2020-09-22 18:34:03,394 INFO org...: BlockCache evicted (MB): 4136, overhead (%): 1278, heavy eviction counter: 34, current caching DataBlock (%): 43_
_2020-09-22 18:34:15,088 INFO org...: BlockCache evicted (MB): 2585, overhead (%): 761, heavy eviction counter: 35, current caching DataBlock (%): 36_
_2020-09-22 18:34:27,752 INFO org...: BlockCache evicted (MB): 1551, overhead (%): 417, heavy eviction counter: 36, current caching DataBlock (%): 32_
_2020-09-22 18:34:45,233 INFO org...: BlockCache evicted (MB): 940, overhead (%): 213, heavy eviction counter: 37, current caching DataBlock (%): 30_
_2020-09-22 18:34:55,364 INFO org...: BlockCache evicted (MB): 289, overhead (%): -4, heavy eviction counter: 37, current caching DataBlock (%): 31_
_2020-09-22 18:35:05,466 INFO org...: BlockCache evicted (MB): 240, overhead (%): -20, heavy eviction counter: 37, current caching DataBlock (%): 34_
_2020-09-22 18:35:15,564 INFO org...: BlockCache evicted (MB): 254, overhead (%): -16, heavy eviction counter: 37, current caching DataBlock (%): 36_
_2020-09-22 18:35:25,670 INFO org...: BlockCache evicted (MB): 279, overhead (%): -7, heavy eviction counter: 37, current caching DataBlock (%): 37_
_2020-09-22 18:35:35,801 INFO org...: BlockCache evicted (MB): 294, overhead (%): -2, heavy eviction counter: 37, current caching DataBlock (%): 38_
_2020-09-22 18:35:45,918 INFO org...: BlockCache evicted (MB): 309, overhead (%): 3, heavy eviction counter: 38, current caching DataBlock (%): 38_
_2020-09-22 18:35:56,027 INFO org...: BlockCache evicted (MB): 253
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200618#comment-17200618 ] Danil Lipovoy edited comment on HBASE-23887 at 9/23/20, 7:16 AM: - I am going to write an article about this, and I have done a comparison test of Cassandra vs HBase; I think it could be interesting. I ran YCSB from 2 hosts (800 threads in total) against tables of the following sizes:

HBase - 300 GB on HDFS (100 GB of pure data)
Cassandra - 250 GB (replication factor = 3)

So the volumes are approximately the same (HBase a little bigger). The HBase parameters:

_dfs.client.short.circuit.num = 5 - another improvement of mine, https://issues.apache.org/jira/browse/HDFS-15202, which helps to speed HBase up further_
_hbase.lru.cache.heavy.eviction.count.limit = 30 - the patch kicks in after 30 evictions (~5 minutes)_
_hbase.lru.cache.heavy.eviction.mb.size.limit = 300 - a good target for eviction_

I aggregated the YCSB logs and put them into Excel:

!image-2020-09-23-10-06-11-189.png!

At the beginning Cassandra is faster than HBase. When the heavy eviction counter passes _heavy.eviction.count.limit_ = 30, the improvement is enabled and performance becomes the same.
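For reference, the non-default BlockCache settings used in this test would look roughly like this in hbase-site.xml. This is a sketch based on the property names proposed in this issue; the values are the ones quoted in the comment above.

```xml
<!-- Sketch of the test configuration described above -->
<property>
  <name>hbase.lru.cache.heavy.eviction.count.limit</name>
  <!-- enable the feature after 30 heavy eviction periods (~5 minutes) -->
  <value>30</value>
</property>
<property>
  <name>hbase.lru.cache.heavy.eviction.mb.size.limit</name>
  <!-- target volume of evicted bytes per period, in MB -->
  <value>300</value>
</property>
```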
This is how it looks in the RegionServer log:

_2020-09-22 18:31:47,561 INFO org...: BlockCache evicted (MB): 7238, overhead (%): 2312, heavy eviction counter: 21, current caching DataBlock (%): 100_
_2020-09-22 18:31:57,808 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 22, current caching DataBlock (%): 100_
_2020-09-22 18:32:08,051 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 23, current caching DataBlock (%): 100_
_2020-09-22 18:32:18,155 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 24, current caching DataBlock (%): 100_
_2020-09-22 18:32:28,479 INFO org...: BlockCache evicted (MB): 7238, overhead (%): 2312, heavy eviction counter: 25, current caching DataBlock (%): 100_
_2020-09-22 18:32:38,754 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 26, current caching DataBlock (%): 100_
_2020-09-22 18:32:49,334 INFO org...: BlockCache evicted (MB): 7238, overhead (%): 2312, heavy eviction counter: 27, current caching DataBlock (%): 100_
_2020-09-22 18:32:59,712 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 28, current caching DataBlock (%): 100_
_2020-09-22 18:33:10,061 INFO org...: BlockCache evicted (MB): 7238, overhead (%): 2312, heavy eviction counter: 29, current caching DataBlock (%): 100_
_2020-09-22 18:33:20,220 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 30, current caching DataBlock (%): 100_ <- the feature is enabled here because hbase.lru.cache.heavy.eviction.count.limit = 30 has been reached
_2020-09-22 18:33:30,314 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 31, current caching DataBlock (%): 85_
_2020-09-22 18:33:41,390 INFO org...: BlockCache evicted (MB): 6721, overhead (%): 2140, heavy eviction counter: 32, current caching DataBlock (%): 70_
_2020-09-22 18:33:52,281 INFO org...: BlockCache evicted (MB): 5687, overhead (%): 1795, heavy eviction counter: 33, current caching DataBlock (%): 55_
_2020-09-22 18:34:03,394 INFO org...: BlockCache evicted (MB): 4136, overhead (%): 1278, heavy eviction counter: 34, current caching DataBlock (%): 43_
_2020-09-22 18:34:15,088 INFO org...: BlockCache evicted (MB): 2585, overhead (%): 761, heavy eviction counter: 35, current caching DataBlock (%): 36_
_2020-09-22 18:34:27,752 INFO org...: BlockCache evicted (MB): 1551, overhead (%): 417, heavy eviction counter: 36, current caching DataBlock (%): 32_
_2020-09-22 18:34:45,233 INFO org...: BlockCache evicted (MB): 940, overhead (%): 213, heavy eviction counter: 37, current caching DataBlock (%): 30_
_2020-09-22 18:34:55,364 INFO org...: BlockCache evicted (MB): 289, overhead (%): -4, heavy eviction counter: 37, current caching DataBlock (%): 31_
_2020-09-22 18:35:05,466 INFO org...: BlockCache evicted (MB): 240, overhead (%): -20, heavy eviction counter: 37, current caching DataBlock (%): 34_
_2020-09-22 18:35:15,564 INFO org...: BlockCache evicted (MB): 254, overhead (%): -16, heavy eviction counter: 37, current caching DataBlock (%): 36_
_2020-09-22 18:35:25,670 INFO org...: BlockCache evicted (MB): 279, overhead (%): -7, heavy eviction counter: 37, current caching DataBlock (%): 37_
_2020-09-22 18:35:35,801 INFO org...: BlockCache evicted (MB): 294, overhead (%): -2, heavy eviction counter: 37, current caching DataBlock (%): 38_
_2020-09-22 18:35:45,918 INFO org...: BlockCache evicted (MB): 309, overhead (%): 3, heavy eviction counter: 38, current caching DataBlock (%): 38_
_2020-09-22 18:35:56,027 INFO org...: BlockCache evicted (MB): 253, overhead (%): -16, heavy eviction counter: 38, current caching DataBlock (%): 40_
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128349#comment-17128349 ] Danil Lipovoy edited comment on HBASE-23887 at 6/22/20, 5:57 AM: - Is this ok for the summary doc? Sorry for the many mistakes, my English is quite bad. I hope someone will correct the text.

Sometimes we read much more data than can fit into the BlockCache, and this causes a high rate of evictions. That in turn leads to heavy Garbage Collector work: a lot of blocks are put into the BlockCache but never read, while a lot of CPU resources are spent cleaning them up.

!image-2020-06-22-05-57-45-578.png!

We can avoid this situation via the following parameters:

*hbase.lru.cache.heavy.eviction.count.limit* - sets how many times the eviction process has to run before we start skipping puts into the BlockCache. By default it is 2147483647, which effectively disables the feature: eviction runs about every 5-10 seconds (depending on the workload), and 2147483647 * 10 / 60 / 60 / 24 / 365 = ~680 years, so only after that time would it start to work. Setting this parameter to 0 enables the feature immediately. If the workload mixes short reads with long-term massive reads, this parameter can separate them. For example, if we know our short reads usually last about 1 minute, we can set the parameter to about 10, so the feature is enabled only for long, massive reads (after ~100 seconds). That way short reads keep everything in the cache (except for evictions, of course), while long-term heavy reads trigger the feature after some time and get better performance.

*hbase.lru.cache.heavy.eviction.mb.size.limit* - sets how many megabytes we want to be put into (and evicted from) the BlockCache per period. The feature tries to reach this value and maintain it, so it works as a kind of auto-scaling. Don't set it too small, because that leads to a premature exit from this mode. For powerful CPUs (about 20-40 physical cores) it could be about 400-500 MB, for an average system (~10 cores) 200-300 MB, and for weak systems (2-5 cores) 50-100 MB may be good. How it works: we set the limit, and after each ~10-second period we calculate how many megabytes were freed:

Overhead (%) = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100

For example, if we set the limit = 500 and 2000 MB were evicted, the overhead is: 2000 * 100 / 500 - 100 = 300%. The feature then reduces the percentage of cached data blocks to bring the evicted volume closer to 100% of the limit (500 MB). If fewer bytes are freed than the limit, the overhead is negative; for example, if 200 MB were freed: 200 * 100 / 500 - 100 = -60%. The feature then increases the percentage of cached blocks, again aiming at 100% of the limit (500 MB). The current situation can be found in the RegionServer log:

_BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100_ < no eviction, 100% of blocks are cached
_BlockCache evicted (MB): 2000, overhead (%): 300, heavy eviction counter: 1, current caching DataBlock (%): 97_ < eviction begins, caching is reduced

This helps to tune the system and find out which value is better. Don't try to reach 0% overhead; that is impossible. An overhead of 30-100% is quite good and prevents a premature exit from this mode.

*hbase.lru.cache.heavy.eviction.overhead.coefficient* - sets how fast we want to get the result. If we know the reading will stay heavy for a long time, we don't have to wait and can increase the coefficient to get good performance sooner. If we aren't sure, a lower value adapts more slowly but prevents a premature exit from this mode. So a higher coefficient gives better performance when heavy reading is stable, while a lower value adjusts better to changing read patterns. For example, suppose we set the coefficient = 0.01. This means the overhead (see above) is multiplied by 0.01, and the result is the number of percentage points by which data-block caching is reduced. For example, if the overhead = 300% and the coefficient = 0.01, the percentage of cached blocks is reduced by 3%. Similar logic applies when the overhead is negative (overshooting): maybe it is just a short-term fluctuation, so we try to stay in this mode, which helps avoid a premature exit. The backpressure has simple logic: the more overshooting, the more blocks are cached.

!image-2020-06-08-18-35-48-366.png!

Finally, here is how reducing the percentage of cached blocks works. Imagine we have a very small cache that can fit only 1 block, and we try to read 3 blocks with offsets: 124, 198, 223. Without the feature, or when *hbase.lru.cache.heavy.eviction.count.limit* = 2147483647, we put block 124, then put 198, evict 124, put 223, evict 198. A lot of work (5 actions).
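The auto-tuning arithmetic described in this comment can be sketched in a few lines of Java. This is a hypothetical simplification, not the actual patch code: the class and method names are mine, and the back-pressure multiplier (here 10) is an assumed strength chosen so the numbers match the example log lines; the real patch scales back pressure differently.

```java
// Hypothetical sketch of the heavy-eviction auto-tuning arithmetic.
// Names and the BACKPRESSURE multiplier are illustrative, not from the patch.
public class HeavyEvictionTuner {
  static final long COUNT_LIMIT = 30;       // hbase.lru.cache.heavy.eviction.count.limit
  static final long MB_SIZE_LIMIT = 500;    // hbase.lru.cache.heavy.eviction.mb.size.limit
  static final double COEFFICIENT = 0.01;   // hbase.lru.cache.heavy.eviction.overhead.coefficient
  static final double BACKPRESSURE = 10.0;  // assumed back-pressure strength (illustrative)

  long heavyEvictionCounter = 0;
  int cacheDataBlockPercent = 100;

  // Overhead (%) = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100
  static long overheadPercent(long freedMb, long limitMb) {
    return freedMb * 100 / limitMb - 100;
  }

  // Called once per ~10-second eviction period with the MB freed in that period.
  void onEvictionPeriod(long freedMb) {
    long overhead = overheadPercent(freedMb, MB_SIZE_LIMIT);
    if (overhead > 0) {
      heavyEvictionCounter++;
      if (heavyEvictionCounter > COUNT_LIMIT) {
        // Heavy eviction: cache fewer data blocks, proportionally to the overhead.
        cacheDataBlockPercent -= (int) Math.round(overhead * COEFFICIENT);
      }
    } else {
      // Overshooting: freed less than the limit, so cache more blocks again.
      cacheDataBlockPercent += (int) Math.round(-overhead * COEFFICIENT * BACKPRESSURE);
    }
    cacheDataBlockPercent = Math.max(1, Math.min(100, cacheDataBlockPercent));
  }

  public static void main(String[] args) {
    HeavyEvictionTuner t = new HeavyEvictionTuner();
    t.heavyEvictionCounter = 30;  // pretend the count limit was already reached
    t.onEvictionPeriod(2000);     // overhead = 300%, so caching drops from 100% to 97%
    System.out.println("current caching DataBlock (%): " + t.cacheDataBlockPercent);
  }
}
```

With these numbers the sketch reproduces the worked example above: 2000 MB evicted against a 500 MB limit gives 300% overhead, and with coefficient 0.01 the caching percentage drops by 3 points.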
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135251#comment-17135251 ] Danil Lipovoy edited comment on HBASE-23887 at 6/14/20, 7:04 PM: - And one more test. Previously there were two different tables, one for scans and another for gets. This time the table was the same:

!scan_and_gets2.png!

The ratio looks different because the same blocks are being read.

evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 2170, ratio 1.09, overhead (%): 985, heavy eviction counter: 1, current caching DataBlock (%): 91 < start
evicted (MB): 3763, ratio 1.08, overhead (%): 1781, heavy eviction counter: 2, current caching DataBlock (%): 76
evicted (MB): 3306, ratio 1.07, overhead (%): 1553, heavy eviction counter: 3, current caching DataBlock (%): 61
evicted (MB): 2508, ratio 1.06, overhead (%): 1154, heavy eviction counter: 4, current caching DataBlock (%): 50
evicted (MB): 1824, ratio 1.04, overhead (%): 812, heavy eviction counter: 5, current caching DataBlock (%): 42
evicted (MB): 1482, ratio 1.03, overhead (%): 641, heavy eviction counter: 6, current caching DataBlock (%): 36
evicted (MB): 1140, ratio 1.01, overhead (%): 470, heavy eviction counter: 7, current caching DataBlock (%): 32
evicted (MB): 913, ratio 1.0, overhead (%): 356, heavy eviction counter: 8, current caching DataBlock (%): 29
evicted (MB): 912, ratio 0.89, overhead (%): 356, heavy eviction counter: 9, current caching DataBlock (%): 26
evicted (MB): 684, ratio 0.76, overhead (%): 242, heavy eviction counter: 10, current caching DataBlock (%): 24
evicted (MB): 684, ratio 0.61, overhead (%): 242, heavy eviction counter: 11, current caching DataBlock (%): 22
evicted (MB): 456, ratio 0.51, overhead (%): 128, heavy eviction counter: 12, current caching DataBlock (%): 21
evicted (MB): 456, ratio 0.42, overhead (%): 128, heavy eviction counter: 13, current caching DataBlock (%): 20
evicted (MB): 456, ratio 0.33, overhead (%): 128, heavy eviction counter: 14, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 15, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 16, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.31, overhead (%): 71, heavy eviction counter: 17, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.3, overhead (%): 14, heavy eviction counter: 18, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.29, overhead (%): 14, heavy eviction counter: 19, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.27, overhead (%): 14, heavy eviction counter: 20, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.25, overhead (%): 14, heavy eviction counter: 21, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.24, overhead (%): 14, heavy eviction counter: 22, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.22, overhead (%): 14, heavy eviction counter: 23, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.21, overhead (%): 14, heavy eviction counter: 24, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.2, overhead (%): 14, heavy eviction counter: 25, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.17, overhead (%): 14, heavy eviction counter: 26, current caching DataBlock (%): 19
evicted (MB): 456, ratio 0.17, overhead (%): 128, heavy eviction counter: 27, current caching DataBlock (%): 18 < added gets (but the table is the same)
evicted (MB): 456, ratio 0.15, overhead (%): 128, heavy eviction counter: 28, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.13, overhead (%): 71, heavy eviction counter: 29, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.11, overhead (%): 71, heavy eviction counter: 30, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.09, overhead (%): 71, heavy eviction counter: 31, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.08, overhead (%): 14, heavy eviction counter: 32, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.07, overhead (%): 14, heavy eviction counter: 33, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.06, overhead (%): 14, heavy eviction counter: 34, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.05, overhead (%): 14, heavy eviction counter: 35, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.05, overhead (%): 14, heavy eviction counter: 36, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.04, overhead (%): 14, heavy eviction counter: 37, current caching DataBlock (%): 17
evicted (MB): 109, ratio 0.04,
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135184#comment-17135184 ] Danil Lipovoy edited comment on HBASE-23887 at 6/14/20, 2:51 PM: - And the log of the second try (the feature is enabled):

evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): -905, ratio 0.0, overhead (%): -552, heavy eviction counter: 0, current caching DataBlock (%): 100 < start of the test
evicted (MB): 4676, ratio 1.07, overhead (%): 2238, heavy eviction counter: 1, current caching DataBlock (%): 85
evicted (MB): 4561, ratio 1.05, overhead (%): 2180, heavy eviction counter: 2, current caching DataBlock (%): 70
evicted (MB): 3535, ratio 1.04, overhead (%): 1667, heavy eviction counter: 3, current caching DataBlock (%): 55
evicted (MB): 2508, ratio 1.02, overhead (%): 1154, heavy eviction counter: 4, current caching DataBlock (%): 44
evicted (MB): 1824, ratio 0.88, overhead (%): 812, heavy eviction counter: 5, current caching DataBlock (%): 36
evicted (MB): 1255, ratio 0.61, overhead (%): 527, heavy eviction counter: 6, current caching DataBlock (%): 31
evicted (MB): 912, ratio 0.41, overhead (%): 356, heavy eviction counter: 7, current caching DataBlock (%): 28
evicted (MB): 684, ratio 0.32, overhead (%): 242, heavy eviction counter: 8, current caching DataBlock (%): 26
evicted (MB): 570, ratio 0.29, overhead (%): 185, heavy eviction counter: 9, current caching DataBlock (%): 25
evicted (MB): 342, ratio 0.24, overhead (%): 71, heavy eviction counter: 10, current caching DataBlock (%): 25
evicted (MB): 342, ratio 0.19, overhead (%): 71, heavy eviction counter: 11, current caching DataBlock (%): 25
evicted (MB): 342, ratio 0.14, overhead (%): 71, heavy eviction counter: 12, current caching DataBlock (%): 25
evicted (MB): 228, ratio 0.12, overhead (%): 14, heavy eviction counter: 13, current caching DataBlock (%): 25
evicted (MB): 228, ratio 0.1, overhead (%): 14, heavy eviction counter: 14, current caching DataBlock (%): 25
evicted (MB): 228, ratio 0.08, overhead (%): 14, heavy eviction counter: 15, current caching DataBlock (%): 25
evicted (MB): 228, ratio 0.07, overhead (%): 14, heavy eviction counter: 16, current caching DataBlock (%): 25
evicted (MB): 223, ratio 0.06, overhead (%): 11, heavy eviction counter: 17, current caching DataBlock (%): 25
evicted (MB): 107, ratio 0.06, overhead (%): -47, heavy eviction counter: 17, current caching DataBlock (%): 30 < back pressure
evicted (MB): 456, ratio 0.16, overhead (%): 128, heavy eviction counter: 18, current caching DataBlock (%): 29
evicted (MB): 456, ratio 0.19, overhead (%): 128, heavy eviction counter: 19, current caching DataBlock (%): 28
evicted (MB): 456, ratio 0.2, overhead (%): 128, heavy eviction counter: 20, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.19, overhead (%): 71, heavy eviction counter: 21, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.17, overhead (%): 71, heavy eviction counter: 22, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.16, overhead (%): 71, heavy eviction counter: 23, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.14, overhead (%): 71, heavy eviction counter: 24, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.13, overhead (%): 71, heavy eviction counter: 25, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.12, overhead (%): 14, heavy eviction counter: 26, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.12, overhead (%): 14, heavy eviction counter: 27, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.11, overhead (%): 14, heavy eviction counter: 28, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.11, overhead (%): 14, heavy eviction counter: 29, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.1, overhead (%): 14, heavy eviction counter: 30, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.1, overhead (%): 14, heavy eviction counter: 31, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.1, overhead (%): 14, heavy eviction counter: 32, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.09, overhead (%): 14, heavy eviction counter: 33, current caching DataBlock (%): 27
evicted (MB): 1026, ratio 0.42, overhead (%): 413, heavy eviction counter: 34, current caching DataBlock (%): 23 < added gets
evicted (MB): 1140, ratio 0.66, overhead (%): 470, heavy eviction counter: 35, current caching DataBlock (%): 19
evicted (MB): 913, ratio 0.75, overhead (
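The per-block gating that produces the "current caching DataBlock (%)" behaviour in these logs is the check from the patch's `cacheBlock` (quoted in the issue description): data blocks whose offset modulo 100 falls above the current percentage are simply not cached. A standalone sketch of that decision, with a hypothetical `shouldCache` helper of my own naming (in the real code the check is inlined in `LruBlockCache.cacheBlock`):

```java
// Standalone sketch of the data-block caching gate described in this issue.
// The shouldCache name is illustrative; the real check lives in LruBlockCache.
public class DataBlockGate {
  // Block offsets are roughly evenly distributed modulo 100, so caching only
  // offsets with (offset % 100) below the current percentage caches about
  // cacheDataBlockPercent% of data blocks and skips the rest entirely,
  // saving the put/evict work for blocks that would be evicted soon anyway.
  static boolean shouldCache(long blockOffset, boolean isDataBlock, int cacheDataBlockPercent) {
    if (cacheDataBlockPercent != 100 && isDataBlock) {
      return blockOffset % 100 < cacheDataBlockPercent;
    }
    return true; // index/bloom blocks and the 100% case are always cached
  }

  public static void main(String[] args) {
    // Offsets 124, 198, 223 from the worked example: with 50% caching,
    // 124 (-> 24) and 223 (-> 23) are cached, 198 (-> 98) is skipped.
    for (long off : new long[] {124, 198, 223}) {
      System.out.println(off + " -> " + shouldCache(off, true, 50));
    }
  }
}
```

This matches the 3-actions-instead-of-5 example: blocks 124 and 223 go into the cache, block 198 is never handled.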
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135184#comment-17135184 ] Danil Lipovoy edited comment on HBASE-23887 at 6/14/20, 2:49 PM:
-
And the log of the second try (the feature is enabled):
evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): -905, ratio 0.0, overhead (%): -552, heavy eviction counter: 0, current caching DataBlock (%): 100 < start the test
evicted (MB): 4676, ratio 1.07, overhead (%): 2238, heavy eviction counter: 1, current caching DataBlock (%): 85
evicted (MB): 4561, ratio 1.05, overhead (%): 2180, heavy eviction counter: 2, current caching DataBlock (%): 70
evicted (MB): 3535, ratio 1.04, overhead (%): 1667, heavy eviction counter: 3, current caching DataBlock (%): 55
evicted (MB): 2508, ratio 1.02, overhead (%): 1154, heavy eviction counter: 4, current caching DataBlock (%): 44
evicted (MB): 1824, ratio 0.88, overhead (%): 812, heavy eviction counter: 5, current caching DataBlock (%): 36
evicted (MB): 1255, ratio 0.61, overhead (%): 527, heavy eviction counter: 6, current caching DataBlock (%): 31
evicted (MB): 912, ratio 0.41, overhead (%): 356, heavy eviction counter: 7, current caching DataBlock (%): 28
evicted (MB): 684, ratio 0.32, overhead (%): 242, heavy eviction counter: 8, current caching DataBlock (%): 26
evicted (MB): 570, ratio 0.29, overhead (%): 185, heavy eviction counter: 9, current caching DataBlock (%): 25
evicted (MB): 342, ratio 0.24, overhead (%): 71, heavy eviction counter: 10, current caching DataBlock (%): 25
evicted (MB): 342, ratio 0.19, overhead (%): 71, heavy eviction counter: 11, current caching DataBlock (%): 25
evicted (MB): 342, ratio 0.14, overhead (%): 71, heavy eviction counter: 12, current caching DataBlock (%): 25
evicted (MB): 228, ratio 0.12, overhead (%): 14, heavy eviction counter: 13, current caching DataBlock (%): 25
evicted (MB): 228, ratio 0.1, overhead (%): 14, heavy eviction counter: 14, current caching DataBlock (%): 25
evicted (MB): 228, ratio 0.08, overhead (%): 14, heavy eviction counter: 15, current caching DataBlock (%): 25
evicted (MB): 228, ratio 0.07, overhead (%): 14, heavy eviction counter: 16, current caching DataBlock (%): 25
evicted (MB): 223, ratio 0.06, overhead (%): 11, heavy eviction counter: 17, current caching DataBlock (%): 25
evicted (MB): 107, ratio 0.06, overhead (%): -47, heavy eviction counter: 17, current caching DataBlock (%): 30 < back pressure
evicted (MB): 456, ratio 0.16, overhead (%): 128, heavy eviction counter: 18, current caching DataBlock (%): 29
evicted (MB): 456, ratio 0.19, overhead (%): 128, heavy eviction counter: 19, current caching DataBlock (%): 28
evicted (MB): 456, ratio 0.2, overhead (%): 128, heavy eviction counter: 20, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.19, overhead (%): 71, heavy eviction counter: 21, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.17, overhead (%): 71, heavy eviction counter: 22, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.16, overhead (%): 71, heavy eviction counter: 23, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.14, overhead (%): 71, heavy eviction counter: 24, current caching DataBlock (%): 27
evicted (MB): 342, ratio 0.13, overhead (%): 71, heavy eviction counter: 25, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.12, overhead (%): 14, heavy eviction counter: 26, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.12, overhead (%): 14, heavy eviction counter: 27, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.11, overhead (%): 14, heavy eviction counter: 28, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.11, overhead (%): 14, heavy eviction counter: 29, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.1, overhead (%): 14, heavy eviction counter: 30, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.1, overhead (%): 14, heavy eviction counter: 31, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.1, overhead (%): 14, heavy eviction counter: 32, current caching DataBlock (%): 27
evicted (MB): 228, ratio 0.09, overhead (%): 14, heavy eviction counter: 33, current caching DataBlock (%): 27
evicted (MB): 1026, ratio 0.42, overhead (%): 413, heavy eviction counter: 34, current caching DataBlock (%): 23 < added gets
evicted (MB): 1140, ratio 0.66, overhead (%): 470, heavy eviction counter: 35, current caching DataBlock (%): 19
evicted (MB): 913, ratio 0.75, overhead (
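The falling "current caching DataBlock (%)" column in the log above is a feedback loop: each period, the overhead is computed and the caching percent is adjusted by overhead times the coefficient, as described in the summary comments later in this thread. A toy simulation of that loop, assuming eviction volume scales linearly with the caching percent (the class name, the linear pressure model, and all numbers here are invented for illustration; this is not the patch code):

```java
// Toy simulation of the auto-scaling loop. Not the actual HBase code:
// the pressure model and all names are invented for illustration.
public class EvictionLoopDemo {

    // Returns the caching percent after the given number of eviction
    // periods, under constant read pressure.
    static int simulate(int periods) {
        long limitMb = 500;      // hbase.lru.cache.heavy.eviction.mb.size.limit
        double coef = 0.01;      // hbase.lru.cache.heavy.eviction.overhead.coefficient
        int percent = 100;       // current caching DataBlock (%)
        long pressureMb = 2000;  // MB the workload would evict at 100% caching

        for (int i = 0; i < periods; i++) {
            // Assumption: evicted volume scales linearly with the percent.
            long evictedMb = pressureMb * percent / 100;
            // Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100
            long overhead = evictedMb * 100 / limitMb - 100;
            // Cut the caching percent by overhead * coefficient points.
            percent -= (int) (overhead * coef);
            percent = Math.max(1, Math.min(100, percent));
        }
        return percent;
    }

    public static void main(String[] args) {
        // The loop settles once the overhead drops below 100%, the
        // smallest step this coefficient can still act on - consistent
        // with the "30-100% overhead is quite good" advice in this thread.
        System.out.println(simulate(60));
    }
}
```

The simulated percent falls steadily and then stabilizes, which is the same qualitative shape as the 100 → 85 → 70 → ... → 25 descent in the log (the real numbers differ because the real workload is not a constant linear pressure).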
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135181#comment-17135181 ] Danil Lipovoy edited comment on HBASE-23887 at 6/14/20, 2:47 PM:
-
[~bharathv] I found out why the scan was slow - the bottleneck was the network (I was sending requests from another PC). Now I scan locally, so the network is no longer the problem. I ran this test scenario:
1. Scan (25 threads, batch = 100)
2. After 5 minutes, add multi-gets (25 threads, batch = 100)
3. After 5 minutes, switch the multi-gets off (only the scan again)
During the test I checked the ratio between the single and multi caches (added it to the log file). The first run was with *hbase.lru.cache.heavy.eviction.count.limit* = 1 (the feature disabled) and the second with the limit = 0.
!scan_and_gets.png!
Take a look at the ratio (single/multi). Log file of the first run (the feature is disabled):
evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): -2721, ratio 0.0, overhead (%): -1460, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 114, ratio 5.78, overhead (%): -43, heavy eviction counter: 0, current caching DataBlock (%): 100 < start the test
evicted (MB): 5704, ratio 1.07, overhead (%): 2752, heavy eviction counter: 1, current caching DataBlock (%): 100
evicted (MB): 5245, ratio 1.06, overhead (%): 2522, heavy eviction counter: 2, current caching DataBlock (%): 100
evicted (MB): 4902, ratio 1.06, overhead (%): 2351, heavy eviction counter: 3, current caching DataBlock (%): 100
evicted (MB): 4788, ratio 1.06, overhead (%): 2294, heavy eviction counter: 4, current caching DataBlock (%): 100
evicted (MB): 5132, ratio 1.06, overhead (%): 2466, heavy eviction counter: 5, current caching DataBlock (%): 100
evicted (MB): 5018, ratio 1.07, overhead (%): 2409, heavy eviction counter: 6, current caching DataBlock (%): 100
evicted (MB): 5244, ratio 1.06, overhead (%): 2522, heavy eviction counter: 7, current caching DataBlock (%): 100
evicted (MB): 5019, ratio 1.07, overhead (%): 2409, heavy eviction counter: 8, current caching DataBlock (%): 100
evicted (MB): 4902, ratio 1.06, overhead (%): 2351, heavy eviction counter: 9, current caching DataBlock (%): 100
evicted (MB): 4904, ratio 1.06, overhead (%): 2352, heavy eviction counter: 10, current caching DataBlock (%): 100
evicted (MB): 5017, ratio 1.06, overhead (%): 2408, heavy eviction counter: 11, current caching DataBlock (%): 100
evicted (MB): 4563, ratio 1.06, overhead (%): 2181, heavy eviction counter: 12, current caching DataBlock (%): 100
evicted (MB): 4338, ratio 1.06, overhead (%): 2069, heavy eviction counter: 13, current caching DataBlock (%): 100
evicted (MB): 4789, ratio 1.06, overhead (%): 2294, heavy eviction counter: 14, current caching DataBlock (%): 100
evicted (MB): 4902, ratio 1.06, overhead (%): 2351, heavy eviction counter: 15, current caching DataBlock (%): 100
evicted (MB): 5130, ratio 1.06, overhead (%): 2465, heavy eviction counter: 16, current caching DataBlock (%): 100
evicted (MB): 5017, ratio 1.06, overhead (%): 2408, heavy eviction counter: 17, current caching DataBlock (%): 100
evicted (MB): 4795, ratio 1.06, overhead (%): 2297, heavy eviction counter: 18, current caching DataBlock (%): 100
evicted (MB): 4905, ratio 1.07, overhead (%): 2352, heavy eviction counter: 19, current caching DataBlock (%): 100
evicted (MB): 4911, ratio 1.06, overhead (%): 2355, heavy eviction counter: 20, current caching DataBlock (%): 100
evicted (MB): 5019, ratio 1.06, overhead (%): 2409, heavy eviction counter: 21, current caching DataBlock (%): 100
evicted (MB): 5134, ratio 1.06, overhead (%): 2467, heavy eviction counter: 22, current caching DataBlock (%): 100
evicted (MB): 5016, ratio 1.06, overhead (%): 2408, heavy eviction counter: 23, current caching DataBlock (%): 100
evicted (MB): 4450, ratio 1.06, overhead (%): 2125, heavy eviction counter: 24, current caching DataBlock (%): 100
evicted (MB): 4904, ratio 1.07, overhead (%): 2352, heavy eviction counter: 25, current caching DataBlock (%): 100
evicted (MB): 4561, ratio 1.06, overhead (%): 2180, heavy eviction counter: 26, current caching DataBlock (%): 100
evicted (MB): 4334, ratio 1.06, overhead (%): 2067, heavy eviction counter: 27, current caching DataBlock (%): 100
evicted (MB): 4789, ratio 1.06, overhead (%): 2294, heavy eviction counter: 28, current caching DataBlock (%): 100
evicted (MB): 4792, ratio 1.06, overhead (%): 2296, heavy eviction counter: 29, current caching DataBlock (%): 100
evicted (MB): 4903, ratio 1.07, overhead (%): 2351, heavy eviction counter: 30, current caching DataBlock (%): 100
evicted (MB): 4791,
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128349#comment-17128349 ] Danil Lipovoy edited comment on HBASE-23887 at 6/10/20, 5:40 AM:
-
Is it ok for the summary doc? Sorry for the mistakes, my English is not great. Hope someone can correct the text.
—
Sometimes we read much more data than can fit into the BlockCache, which causes a high rate of evictions. This in turn leads to heavy Garbage Collector work: a lot of blocks are put into the BlockCache but never read, yet a lot of CPU is spent cleaning them up.
!BlockCacheEvictionProcess.gif!
(I will update the name of the param in the gif later)
We can avoid this situation via these parameters:
*hbase.lru.cache.heavy.eviction.count.limit* - sets how many times the eviction process has to run before we start skipping puts into the BlockCache. By default it is 2147483647, which effectively disables the feature: eviction runs about every 5-10 seconds (depending on the workload), and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years, so the feature would only start working after that time. Setting the parameter to 0 makes the feature work right away. If we sometimes have short reads of the same data and sometimes long-term reads, this parameter lets us separate the two. For example, if we know our short reads usually last about 1 minute, we can set the parameter to about 10 so that the feature engages only for long, massive reads (after ~100 seconds). Then short reads still get everything into the cache (apart from evictions, of course), while long-term heavy reads enable the feature after some time and get better performance.
*hbase.lru.cache.heavy.eviction.mb.size.limit* - sets how many megabytes we would like to put into the BlockCache (and evict from it) per period. The feature will try to reach this value and maintain it.
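The count-limit gating described above can be sketched in a few lines of Java (the class, field, and method names here are hypothetical, not the actual patch code; what happens to the counter when eviction calms down is elided):

```java
// Hypothetical sketch of the heavy-eviction counter gate.
// Names are illustrative, not the actual HBase implementation.
public class HeavyEvictionGate {
    private final long countLimit;   // hbase.lru.cache.heavy.eviction.count.limit
    private long heavyEvictionCounter = 0;

    public HeavyEvictionGate(long countLimit) {
        this.countLimit = countLimit;
    }

    // Called after each eviction run (roughly every 5-10 seconds).
    // evictedMb: MB freed by the run; sizeLimitMb: the
    // hbase.lru.cache.heavy.eviction.mb.size.limit value.
    public void onEvictionRun(long evictedMb, long sizeLimitMb) {
        if (evictedMb > sizeLimitMb) {
            heavyEvictionCounter++;  // another "heavy" eviction period
        }
    }

    // Skipping of DATA-block puts engages only after the counter
    // passes the limit, i.e. after sustained heavy reading.
    public boolean featureActive() {
        return heavyEvictionCounter > countLimit;
    }
}
```

With the default limit of 2147483647 the gate practically never opens (the 680-year arithmetic above), which is why the default effectively disables the feature; with 0 it opens after the first heavy eviction period.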
Don't set it too small, because that leads to a premature exit from this mode. For powerful CPUs (about 20-40 physical cores) it could be about 400-500 MB, for an average system (~10 cores) 200-300 MB, and some weak systems (2-5 cores) may be fine with 50-100 MB.
How it works: we set the limit, and after each ~10-second period we calculate how many bytes were freed:
Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100
For example, if we set the limit = 500 and 2000 MB were evicted, the overhead is: 2000 * 100 / 500 - 100 = 300%. The feature then reduces the percentage of cached DATA blocks to bring the evicted volume closer to 100% of the limit (500 MB) - a kind of auto-scaling. If fewer bytes were freed than the limit, the overhead is negative; for example, if 200 MB were freed: 200 * 100 / 500 - 100 = -60%. The feature will then increase the percentage of cached blocks, again aiming at 100% of the limit (500 MB). The current situation can be seen in the RegionServer log:
_BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100_ < no eviction, 100% of blocks are cached
_BlockCache evicted (MB): 2000, overhead (%): 300, heavy eviction counter: 1, current caching DataBlock (%): 97_ < eviction begins, caching is reduced
This helps to tune the system and find a good value. Don't try to reach 0% overhead - that is impossible. An overhead of 30-100% is quite good and prevents a premature exit from this mode.
*hbase.lru.cache.heavy.eviction.overhead.coefficient* - sets how fast we want to reach the target. If we know our heavy reading will go on for a long time, we don't want to wait and can increase the coefficient to get good performance sooner. If we are not sure, we can go slower, which also helps prevent a premature exit from this mode. So a higher coefficient gives better performance when heavy reading is stable, while a lower value adapts better to changing read patterns. For example, say we set the coefficient = 0.01. The overhead (see above) is multiplied by 0.01, and the result is how much to reduce the percentage of cached blocks: if the overhead = 300% and the coefficient = 0.01, the percentage of cached blocks is reduced by 3%. Similar logic applies when the overhead is negative (overshooting): maybe it is just a short-term fluctuation, so we try to stay in this mode. This helps avoid a premature exit during short-term fluctuations. Backpressure follows simple logic: the more overshooting, the more blocks are cached.
!image-2020-06-08-18-35-48-366.png!
Finally, how reducing the percentage of cached blocks works. Imagine we have a very small cache that can fit only 1 block, and we are trying to read 3 blocks with offsets:
124
198
223
Without the feature, or when *hbase.lru.cache.heavy.eviction.count.limit* = 2147483647, we will put the block: 12
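The overhead formula, the coefficient-based cut, and the offset-modulus skip described in this comment can be sketched together in Java (a toy sketch; the class and method names are hypothetical, and the exact backpressure scaling in the patch may differ from the simple symmetric rule shown here):

```java
// Toy sketch of the overhead formula and the percent adjustment.
// Names are illustrative, not the actual patch code.
public class CachingPercentTuner {
    // Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100
    static long overheadPercent(long freedMb, long limitMb) {
        return freedMb * 100 / limitMb - 100;
    }

    // Positive overhead shrinks the share of cached DATA blocks by
    // overhead * coefficient percentage points; negative overhead
    // (overshooting) grows it back. The real backpressure rule may
    // use a different scaling - this is the simplest symmetric form.
    static int adjust(int cachingPercent, long overheadPercent, double coefficient) {
        int delta = (int) (overheadPercent * coefficient);
        int next = cachingPercent - delta;
        return Math.max(1, Math.min(100, next)); // keep within 1..100
    }

    // Skip rule: a DATA block whose offset modulo 100 lands at or
    // above the current percent is simply not cached at all.
    static boolean shouldCache(long blockOffset, int cachingPercent) {
        return cachingPercent == 100 || blockOffset % 100 < cachingPercent;
    }
}
```

With the numbers from the text: overheadPercent(2000, 500) gives 300, adjust(100, 300, 0.01) cuts the caching share from 100% to 97% (matching the log line above), and with a 50% share the blocks at offsets 124 and 223 are cached (24 and 23 < 50) while 198 is skipped (98 >= 50).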
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128349#comment-17128349 ] Danil Lipovoy edited comment on HBASE-23887 at 6/8/20, 3:38 PM: Is it ok for the summary doc? Sorry for a lot of mistakes, my english quite bad. Hope someone would correct the text. — Sometimes we are reading much more data than can fit into BlockCache and it is the cause a high rate of evictions. This in turn leads to heavy Garbage Collector works. So a lot of blocks put into BlockCache but never read, but spending a lot of CPU resources for cleaning. !BlockCacheEvictionProcess.gif! (I will actualize the name of param in the gif later) We could avoid this sitiuation via parameters: *hbase.lru.cache.heavy.eviction.count.limit* - set how many times have to run eviction process that start to avoid of putting data to BlockCache. By default it is 2147483647 and actually equals to disable feature of increasing performance. Because eviction runs about every 5 - 10 second (it depends of workload) and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years. Just after that time it will start to work. We can set this parameter to 0 and get working the feature right now. But if we have some times short reading the same data and some times long-term reading - we can divide it by this parameter. For example we know that our short reading used to about 1 minutes, than we have to set the parameter about 10 and it will enable the feature only for long time massive reading (after ~100 seconds). So when we use short-reading and wanted all of them it the cache we will have it (except of evicted of course). When we use long-term heavy reading the featue will enabled after some time and bring better performance. *hbase.lru.cache.heavy.eviction.mb.size.limit* - set how many bytes desirable putting into BlockCache (and evicted from it). The feature will try to reach this value and maintan it. 
Don't try to set it too small because it lead to premature exit from this mode. For powerful CPU (about 20-40 physical cores) it could be about 400-500 MB. Average system (~10 cores) 200-300 MB. Some weak system (2-5 cores) maybe good with 50-100 MB. How it works: we set the limit and after each ~10 second caluclate how many bytes were freed. Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100; For example we set the limit = 500 and were evicted 2000 MB. Overhead is: 2000 * 100 / 500 - 100 = 300% The feature is going to reduce a percent caching data blocks and fit evicted bytes closer to 100% (500 MB). So kind of an auto-scaling. If freed bytes less then the limit we have got negative overhead, for example if were freed 200 MB: 200 * 100 / 500 - 100 = -60% The feature will increase the percent of caching blocks and fit evicted bytes closer to 100% (500 MB). The current situation we can found in the log of RegionServer: _BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100_ < no eviction, 100% blocks is caching _BlockCache evicted (MB): 2000, overhead (%): 300, heavy eviction counter: 1, current caching DataBlock (%): 97_ < eviction begin, reduce of caching blocks It help to tune your system and find out what value is better set. Don't try to reach 0% overhead, it is impossible. Quite good 30-100% overhead, it prevent premature exit from this mode. *hbase.lru.cache.heavy.eviction.overhead.coefficient* - set how fast we want to get the result. If we know that our heavy reading for a long time, we don't want to wait and can increase the coefficient and get good performance sooner. But if we don't sure we can do it slowly and it could prevent premature exit from this mode. So, when the coefficient is higher we can get better performance when heavy reading is stable. But when reading is changing we can adjust to it and set the coefficient to lower value. For example, we set the coefficient = 0.01. 
It means the overhead (see above) will be multiplied by 0.01 and the result is value of reducing percent caching blocks. For example, if the overhead = 300% and the coefficient = 0.01, than percent of chaching blocks will reduce by 3%. Similar logic when overhead has got negative value (overshooting). Mayby it is just short-term fluctuation and we will try to stay in this mode. It help avoid permature exit during short-term fluctuation. Backpressure has simple logic: more overshooting - more caching blocks. !image-2020-06-08-18-35-48-366.png! Finally, how to work reducing percent of caching blocks. Imagine we have very little cache, where can fit only 1 block and we are trying to read 3 blocks with offsets: 124 198 223 Without the feature, or when *hbase.lru.cache.heavy.eviction.count.limit* = 2147483647 we will put the block: 124, then put 198, evict 124, put 223,
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128349#comment-17128349 ] Danil Lipovoy edited comment on HBASE-23887 at 6/8/20, 3:36 PM: Is it ok for the summury doc? — Sometimes we are reading much more data than can fit into BlockCache and it is the cause a high rate of evictions. This in turn leads to heavy Garbage Collector works. So a lot of blocks put into BlockCache but never read, but spending a lot of CPU resources for cleaning. !BlockCacheEvictionProcess.gif! (I will actualize the name of param in the gif later) We could avoid this sitiuation via parameters: *hbase.lru.cache.heavy.eviction.count.limit* - set how many times have to run eviction process that start to avoid of putting data to BlockCache. By default it is 2147483647 and actually equals to disable feature of increasing performance. Because eviction runs about every 5 - 10 second (it depends of workload) and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years. Just after that time it will start to work. We can set this parameter to 0 and get working the feature right now. But if we have some times short reading the same data and some times long-term reading - we can divide it by this parameter. For example we know that our short reading used to about 1 minutes, than we have to set the parameter about 10 and it will enable the feature only for long time massive reading (after ~100 seconds). So when we use short-reading and wanted all of them it the cache we will have it (except of evicted of course). When we use long-term heavy reading the featue will enabled after some time and bring better performance. *hbase.lru.cache.heavy.eviction.mb.size.limit* - set how many bytes desirable putting into BlockCache (and evicted from it). The feature will try to reach this value and maintan it. Don't try to set it too small because it lead to premature exit from this mode. 
For powerful CPU (about 20-40 physical cores) it could be about 400-500 MB. Average system (~10 cores) 200-300 MB. Some weak system (2-5 cores) maybe good with 50-100 MB. How it works: we set the limit and after each ~10 second caluclate how many bytes were freed. Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100; For example we set the limit = 500 and were evicted 2000 MB. Overhead is: 2000 * 100 / 500 - 100 = 300% The feature is going to reduce a percent caching data blocks and fit evicted bytes closer to 100% (500 MB). So kind of an auto-scaling. If freed bytes less then the limit we have got negative overhead, for example if were freed 200 MB: 200 * 100 / 500 - 100 = -60% The feature will increase the percent of caching blocks and fit evicted bytes closer to 100% (500 MB). The current situation we can found in the log of RegionServer: _BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100_ < no eviction, 100% blocks is caching _BlockCache evicted (MB): 2000, overhead (%): 300, heavy eviction counter: 1, current caching DataBlock (%): 97_ < eviction begin, reduce of caching blocks It help to tune your system and find out what value is better set. Don't try to reach 0% overhead, it is impossible. Quite good 30-100% overhead, it prevent premature exit from this mode. *hbase.lru.cache.heavy.eviction.overhead.coefficient* - set how fast we want to get the result. If we know that our heavy reading for a long time, we don't want to wait and can increase the coefficient and get good performance sooner. But if we don't sure we can do it slowly and it could prevent premature exit from this mode. So, when the coefficient is higher we can get better performance when heavy reading is stable. But when reading is changing we can adjust to it and set the coefficient to lower value. For example, we set the coefficient = 0.01. 
It means the overhead (see above) will be multiplied by 0.01 and the result is value of reducing percent caching blocks. For example, if the overhead = 300% and the coefficient = 0.01, than percent of chaching blocks will reduce by 3%. Similar logic when overhead has got negative value (overshooting). Mayby it is just short-term fluctuation and we will try to stay in this mode. It help avoid permature exit during short-term fluctuation. Backpressure has simple logic: more overshooting - more caching blocks. !image-2020-06-08-18-35-48-366.png! Finally, how to work reducing percent of caching blocks. Imagine we have very little cache, where can fit only 1 block and we are trying to read 3 blocks with offsets: 124 198 223 Without the feature, or when *hbase.lru.cache.heavy.eviction.count.limit* = 2147483647 we will put the block: 124, then put 198, evict 124, put 223, evict 198 A lot of work (5 actions and 2 evictions). With the feature and *hbase.lru
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128349#comment-17128349 ] Danil Lipovoy edited comment on HBASE-23887 at 6/8/20, 3:29 PM: Is it ok for the summury doc? — Sometimes we are reading much more data than can fit into BlockCache and it is the cause a high rate of evictions. This in turn leads to heavy Garbage Collector works. So a lot of blocks put into BlockCache but never read, but spending a lot of CPU resources for cleaning. !BlockCacheEvictionProcess.gif! (I will actualize the name of param in the gif later) We could avoid this sitiuation via parameters: *hbase.lru.cache.heavy.eviction.count.limit* - set how many times have to run eviction process that start to avoid of putting data to BlockCache. By default it is 2147483647 and actually equals to disable feature of increasing performance. Because eviction runs about every 5 - 10 second (it depends of workload) and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years. Just after that time it will start to work. We can set this parameter to 0 and get working the feature right now. But if we have some times short reading the same data and some times long-term reading - we can divide it by this parameter. For example we know that our short reading used to about 1 minutes, than we have to set the parameter about 10 and it will enable the feature only for long time massive reading (after ~100 seconds). So when we use short-reading and wanted all of them it the cache we will have it (except of evicted of course). When we use long-term heavy reading the featue will enabled after some time and bring better performance. *hbase.lru.cache.heavy.eviction.mb.size.limit* - set how many bytes desirable putting into BlockCache (and evicted from it). The feature will try to reach this value and maintan it. Don't try to set it too small because it lead to premature exit from this mode. 
For powerful CPU (about 20-40 physical cores) it could be about 400-500 MB. Average system (~10 cores) 200-300 MB. Some weak system (2-5 cores) maybe good with 50-100 MB. How it works: we set the limit and after each ~10 second caluclate how many bytes were freed. Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100; For example we set the limit = 500 and were evicted 2000 MB. Overhead is: 2000 * 100 / 500 - 100 = 300% The feature is going to reduce a percent caching data blocks and fit evicted bytes closer to 100% (500 MB). So kind of an auto-scaling. If freed bytes less then the limit we have got negative overhead, for example if were freed 200 MB: 200 * 100 / 500 - 100 = -60% The feature will increase the percent of caching blocks and fit evicted bytes closer to 100% (500 MB). The current situation we can found in the log of RegionServer: _BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100_ < no eviction, 100% blocks is caching _BlockCache evicted (MB): 2000, overhead (%): 300, heavy eviction counter: 1, current caching DataBlock (%): 97_ < eviction begin, reduce of caching blocks It help to tune your system and find out what value is better set. Don't try to reach 0% overhead, it is impossible. Quite good 30-100% overhead, it prevent premature exit from this mode. *hbase.lru.cache.heavy.eviction.overhead.coefficient* - set how fast we want to get the result. If we know that our heavy reading for a long time, we don't want to wait and can increase the coefficient and get good performance sooner. But if we don't sure we can do it slowly and it could prevent premature exit from this mode. So, when the coefficient is higher we can get better performance when heavy reading is stable. But when reading is changing we can adjust to it and set the coefficient to lower value. For example, we set the coefficient = 0.01. 
It means the overhead (see above) will be multiplied by 0.01 and the result is value of reducing percent caching blocks. For example, if the overhead = 300% and the coefficient = 0.01, than percent of chaching blocks will reduce by 3%. Similar logic when overhead has got negative value (overshooting). Mayby it is just short-term fluctuation and we will ty to stay in this mode. It help avoid permature exit during short-term fluctuation. Backpressure has simple logic: more overshooting - more caching blocks. !image-2020-06-08-17-38-52-579.png! Finally, how to work reducing percent of caching blocks. Imagine we have very little cache, where can fit only 1 block and we are trying to read 3 blocks with offsets: 124 198 223 Without the feature, or when *hbase.lru.cache.heavy.eviction.count.limit* = 2147483647 we will put the block: 124, then put 198, evict 124, put 223, evict 198 A lot of work (5 actions and 2 evictions). With the feature and *hbase.lru.
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128349#comment-17128349 ] Danil Lipovoy edited comment on HBASE-23887 at 6/8/20, 3:23 PM: Is it ok for the summury doc? — Sometimes we are reading much more data than can fit into BlockCache and it is the cause a high rate of evictions. This in turn leads to heavy Garbage Collector works. So a lot of blocks put into BlockCache but never read, but spending a lot of CPU resources for cleaning. !BlockCacheEvictionProcess.gif! (I will actualize the name of param in the gif later) We could avoid this sitiuation via parameters: *hbase.lru.cache.heavy.eviction.count.limit* - set how many times have to run eviction process that avoid of putting data to BlockCache. By default it is 2147483647 and actually equals to disable feature of increasing performance. Because eviction runs about every 5 - 10 second (it depends of workload) and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years. Just after that time it will start to work. We can set this parameter to 0 and get working the feature right now. But if we have some times short reading the same data and some times long-term reading - we can divide it by this parameter. For example we know that our short reading used to about 1 minutes, than we have to set the parameter about 10 and it will enable the feature only for long time massive reading (after ~100 seconds). So when we use short-reading and wanted all of them it the cache we will have it (except of evicted of course). When we use long-term heavy reading the featue will enabled after some time and bring better performance. *hbase.lru.cache.heavy.eviction.mb.size.limit* - set how many bytes desirable putting into BlockCache (and evicted from it). The feature will try to reach this value and maintan it. Don't try to set it too small because it lead to premature exit from this mode. 
For powerful CPU (about 20-40 physical cores) it could be about 400-500 MB. Average system (~10 cores) 200-300 MB. Some weak system (2-5 cores) maybe good with 50-100 MB. How it works: we set the limit and after each ~10 second caluclate how many bytes were freed. Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100; For example we set the limit = 500 and were evicted 2000 MB. Overhead is: 2000 * 100 / 500 - 100 = 300% The feature is going to reduce a percent caching data blocks and fit evicted bytes closer to 100% (500 MB). So kind of an auto-scaling. If freed bytes less then the limit we have got negative overhead, for example if were freed 200 MB: 200 * 100 / 500 - 100 = -60% The feature will increase the percent of caching blocks and fit evicted bytes closer to 100% (500 MB). The current situation we can found in the log of RegionServer: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 < no eviction, 100% blocks is caching BlockCache evicted (MB): 2000, overhead (%): 300, heavy eviction counter: 1, current caching DataBlock (%): 97 < eviction begin, reduce of caching blocks It help to tune your system and find out what value is better set. Don't try to reach 0% overhead, it is impossible. Quite good 30-100% overhead, it prevent premature exit from this mode. *hbase.lru.cache.heavy.eviction.overhead.coefficient* - set how fast we want to get the result. If we know that our heavy reading for a long time, we don't want to wait and can increase the coefficient and get good performance sooner. But if we don't sure we can do it slowly and it could prevent premature exit from this mode. So, when the coefficient is higher we can get better performance when heavy reading is stable. But when reading is changing we can adjust to it and set the coefficient to lower value. For example, we set the coefficient = 0.01. 
It means the overhead (see above) will be multiplied by 0.01 and the result is value of reducing percent caching blocks. For example, if the overhead = 300% and the coefficient = 0.01, than percent of chaching blocks will reduce by 3%. Similar logic when overhead has got negative value (overshooting). Mayby it is just short-term fluctuation and we will ty to stay in this mode. It help avoid permature exit during short-term fluctuation. Backpressure has simple logic: more overshooting - more caching blocks. !image-2020-06-08-17-38-52-579.png! Finally, how to work reducing percent of caching blocks. Imagine we have very little cache, where can fit only 1 block and we are trying to read 3 blocks with offsets: 124 198 223 Without the feature, or when *hbase.lru.cache.heavy.eviction.count.limit* = 2147483647 we will put the block: 124, then put 198, evict 124, put 223, evict 198 A lot of work (5 actions and 2 evictions). With the feature and hbase.lru.cache.heavy.ev
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128349#comment-17128349 ] Danil Lipovoy edited comment on HBASE-23887 at 6/8/20, 2:42 PM: Is it OK for the summary doc?
---
Sometimes we read more data than can fit into the BlockCache, and this causes a high rate of evictions. This in turn leads to heavy Garbage Collector work: a lot of blocks are put into the BlockCache but never read, so a lot of CPU resources are spent on cleaning. !BlockCacheEvictionProcess.gif! (I will update the name of the param in the gif later)

We can avoid this situation via these parameters:

*hbase.lru.cache.heavy.eviction.count.limit* - sets how many times the eviction process has to run before we start skipping puts into the BlockCache. By default it is 2147483647, which effectively disables the feature, because eviction runs about every 5-10 seconds (it depends on the workload) and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years; only after that time would it start to work. We can set this parameter to 0 and have the feature working right away. But if we sometimes have short reads of the same data and sometimes long-term reads, we can separate the two cases with this parameter. For example, if we know our short reads usually take about 1 minute, we can set the parameter to about 10 and the feature will kick in only for long massive reads (after ~100 seconds). So when we use short reads and want all of them in the cache, we will have that (except for evicted blocks, of course). When we use long-term heavy reading, the feature will be enabled after some time and bring better performance.

*hbase.lru.cache.heavy.eviction.mb.size.limit* - sets how many bytes we want to be put into the BlockCache (and evicted from it). The feature will try to reach this value and maintain it. Don't set it too small, because that leads to a premature exit from this mode.
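A quick check of the 680-years arithmetic for the default *count.limit* above (the class name is mine):

```java
public class CountLimitMath {
    // With the default count.limit (Integer.MAX_VALUE eviction cycles) and an
    // eviction cycle of roughly 10 seconds, the feature would only kick in after:
    static long yearsUntilEnabled() {
        long seconds = 2147483647L * 10L;
        return seconds / (60L * 60L * 24L * 365L);
    }

    public static void main(String[] args) {
        System.out.println(yearsUntilEnabled() + " years"); // 680 years
    }
}
```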
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127416#comment-17127416 ] Danil Lipovoy edited comment on HBASE-23887 at 6/7/20, 5:19 AM: [~bharathv] I have found that there is no difference in performance while we are scanning: !scan.png! The cause looks like low GC during the scan, so it doesn't matter which kind of BC we check. I think that when we scan, the bottleneck is somewhere else (it is not obvious where), and that's why the results are the same.

Another thing: I slightly changed the logic that calculates which percentage of blocks we have to skip (cache.cacheDataBlockPercent). I added a new param that helps to control it:

{color:#067d17}hbase.lru.cache.heavy.eviction.overhead.coefficient = 0.01{color} {color:#172b4d}(heavyEvictionOverheadCoefficient){color}

{code:java}
freedDataOverheadPercent = (int) (mbFreedSum * 100 / cache.heavyEvictionMbSizeLimit) - 100;
...
if (heavyEvictionCount > cache.heavyEvictionCountLimit) {
  // Reduce the caching percent proportionally to the overhead,
  // but never by more than 15 points per cycle and never below 1%.
  int ch = (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient);
  ch = ch > 15 ? 15 : ch;
  ch = ch < 0 ? 0 : ch;
  cache.cacheDataBlockPercent -= ch;
  cache.cacheDataBlockPercent = cache.cacheDataBlockPercent < 1 ? 1 : cache.cacheDataBlockPercent;
}
{code}

And when we go below *hbase.lru.cache.heavy.eviction.mb.size.limit* {color:#172b4d}we use backward pressure:{color}

{code:java}
if (mbFreedSum >= cache.heavyEvictionMbSizeLimit * 0.1) {
  // It helps avoid exit during a short-term fluctuation
  int ch = (int) (-freedDataOverheadPercent * 0.1 + 1);
  cache.cacheDataBlockPercent += ch;
  cache.cacheDataBlockPercent = cache.cacheDataBlockPercent > 100 ? 100 : cache.cacheDataBlockPercent;
} else {
  heavyEvictionCount = 0;
  cache.cacheDataBlockPercent = 100;
}
{code}

(full PR here: [https://github.com/apache/hbase/pull/1257/files])

{color:#172b4d}How it looks:{color} !image-2020-06-07-08-19-00-922.png!

So, when GC works hard, it reduces the percentage of cached blocks.
When we jump below the level, it helps us come back:

BlockCache evicted (MB): 4902, overhead (%): 2351, heavy eviction counter: 1, current caching DataBlock (%): 85 < too much, slow down fast
BlockCache evicted (MB): 5700, overhead (%): 2750, heavy eviction counter: 2, current caching DataBlock (%): 70
BlockCache evicted (MB): 5930, overhead (%): 2865, heavy eviction counter: 3, current caching DataBlock (%): 55
BlockCache evicted (MB): 4446, overhead (%): 2123, heavy eviction counter: 4, current caching DataBlock (%): 40
BlockCache evicted (MB): 3078, overhead (%): 1439, heavy eviction counter: 5, current caching DataBlock (%): 26
BlockCache evicted (MB): 1710, overhead (%): 755, heavy eviction counter: 6, current caching DataBlock (%): 19 < easy
BlockCache evicted (MB): 1026, overhead (%): 413, heavy eviction counter: 7, current caching DataBlock (%): 15
BlockCache evicted (MB): 570, overhead (%): 185, heavy eviction counter: 8, current caching DataBlock (%): 14 < easy
BlockCache evicted (MB): 342, overhead (%): 71, heavy eviction counter: 9, current caching DataBlock (%): 14
BlockCache evicted (MB): 228, overhead (%): 14, heavy eviction counter: 10, current caching DataBlock (%): 14
BlockCache evicted (MB): 114, overhead (%): -43, heavy eviction counter: 10, current caching DataBlock (%): 19 < back pressure
BlockCache evicted (MB): 570, overhead (%): 185, heavy eviction counter: 11, current caching DataBlock (%): 18
BlockCache evicted (MB): 570, overhead (%): 185, heavy eviction counter: 12, current caching DataBlock (%): 17
BlockCache evicted (MB): 456, overhead (%): 128, heavy eviction counter: 13, current caching DataBlock (%): 16
BlockCache evicted (MB): 342, overhead (%): 71, heavy eviction counter: 14, current caching DataBlock (%): 16
BlockCache evicted (MB): 342, overhead (%): 71, heavy eviction counter: 15, current caching DataBlock (%): 16
BlockCache evicted (MB): 228, overhead (%): 14, heavy eviction counter: 16, current caching DataBlock (%): 16
BlockCache evicted (MB): 228, overhead (%): 14, heavy eviction counter: 17, current caching DataBlock (%): 16
BlockCache evicted (MB): 228, overhead (%): 14, heavy eviction counter: 18, current caching DataBlock (%): 16
BlockCache evicted (MB): 228, overhead (%): 14, heavy eviction counter: 19, current caching DataBlock (%): 16
BlockCache evicted (MB): 228, overhead (%): 14, heavy eviction counter: 20, current caching DataBlock (%): 16
BlockCache evicted (MB): 228, overhead (%): 14, heavy eviction counter: 21, current caching DataBlock (%): 16
BlockCache evicted (MB): 228, overhead (%): 14, heavy eviction counter: 22, current caching DataBlock (%): 16
BlockCache evicted (MB): 114, overhead (%): -43, heavy eviction counter: 22, current caching DataBlock (%): 21 < back pressure
BlockCache evicted (MB): 798, overhead (%): 299, h
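The trace above can be replayed with a standalone simulation of the two snippets. Assuming *hbase.lru.cache.heavy.eviction.mb.size.limit* = 200 (my inference from the logged overhead values, not stated in the comment) and coefficient = 0.01; the class and method names are mine:

```java
public class CoefficientSim {
    // Replays the caching-percent adjustment for a sequence of ~10 s eviction
    // measurements and returns the final caching percent.
    static int replay(long[] evictedMb, long limitMb, double coefficient) {
        int percent = 100;
        for (long mbFreedSum : evictedMb) {
            int overhead = (int) (mbFreedSum * 100 / limitMb) - 100;
            if (mbFreedSum > limitMb) {
                // Heavy eviction: cut the caching percent, at most 15 points per cycle.
                int ch = (int) (overhead * coefficient);
                ch = Math.min(15, Math.max(0, ch));
                percent = Math.max(1, percent - ch);
            } else if (mbFreedSum >= limitMb * 0.1) {
                // Back pressure: a short-term dip below the limit eases the percent back up.
                int ch = (int) (-overhead * 0.1 + 1);
                percent = Math.min(100, percent + ch);
            } else {
                percent = 100; // eviction has stopped, cache everything again
            }
            System.out.println("evicted (MB): " + mbFreedSum + ", overhead (%): " + overhead
                + ", caching DataBlock (%): " + percent);
        }
        return percent;
    }

    public static void main(String[] args) {
        // The first eleven measurements from the log above.
        long[] evictedMb = {4902, 5700, 5930, 4446, 3078, 1710, 1026, 570, 342, 228, 114};
        replay(evictedMb, 200, 0.01);
    }
}
```

Running it prints the same percent sequence as the log: 85, 70, 55, 40, 26, 19, 15, 14, 14, 14, and then back pressure lifts 14 to 19 on the 114 MB measurement.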
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123362#comment-17123362 ] Danil Lipovoy edited comment on HBASE-23887 at 6/2/20, 5:54 AM: Did more tests with the same tables, but this time with _recordcount_ = the number of records in the table and

*hbase.lru.cache.heavy.eviction.count.limit* = 0
*hbase.lru.cache.heavy.eviction.mb.size.limit* = 200

The results: !requests_new_100p.png!

And the YCSB stats:

| |*original*|*feature*|*%*|
|tbl1-u (ops/sec)|29,601|39,088|132|
|tbl2-u (ops/sec)|38,793|61,692|159|
|tbl3-u (ops/sec)|38,216|60,415|158|
|tbl4-u (ops/sec)|325|657|202|
|tbl1-z (ops/sec)|46,990|58,252|124|
|tbl2-z (ops/sec)|54,401|72,484|133|
|tbl3-z (ops/sec)|57,100|69,984|123|
|tbl4-z (ops/sec)|452|763|169|
|tbl1-l (ops/sec)|56,001|63,804|114|
|tbl2-l (ops/sec)|68,700|76,074|111|
|tbl3-l (ops/sec)|64,189|72,229|113|
|tbl4-l (ops/sec)|619|897|145|

| |*original*|*feature*|*%*|
|tbl1-u AverageLatency(us)|1,686|1,276|76|
|tbl2-u AverageLatency(us)|1,287|808|63|
|tbl3-u AverageLatency(us)|1,306|825|63|
|tbl4-u AverageLatency(us)|76,810|38,007|49|
|tbl1-z AverageLatency(us)|1,061|856|81|
|tbl2-z AverageLatency(us)|917|688|75|
|tbl3-z AverageLatency(us)|873|712|82|
|tbl4-z AverageLatency(us)|55,114|32,670|59|
|tbl1-l AverageLatency(us)|890|781|88|
|tbl2-l AverageLatency(us)|726|655|90|
|tbl3-l AverageLatency(us)|777|690|89|
|tbl4-l AverageLatency(us)|40,235|27,774|69|

| |*original*|*feature*|*%*|
|tbl1-u 95thPercentileLatency(us)|2,831|2,569|91|
|tbl2-u 95thPercentileLatency(us)|1,266|1,073|85|
|tbl3-u 95thPercentileLatency(us)|1,497|1,194|80|
|tbl4-u 95thPercentileLatency(us)|370,943|49,471|13|
|tbl1-z 95thPercentileLatency(us)|1,784|1,669|94|
|tbl2-z 95thPercentileLatency(us)|918|871|95|
|tbl3-z 95thPercentileLatency(us)|978|933|95|
|tbl4-z 95thPercentileLatency(us)|336,639|48,863|15|
|tbl1-l 95thPercentileLatency(us)|1,523|1,441|95|
|tbl2-l 95thPercentileLatency(us)|820|825|101|
|tbl3-l 95thPercentileLatency(us)|918|907|99|
|tbl4-l 95thPercentileLatency(us)|77,951|48,575|62|
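The *%* columns in the tables above are simply the feature value as a rounded percentage of the original value; a tiny sketch of that arithmetic (class and method names are mine):

```java
public class RatioSketch {
    // Feature value as a rounded percentage of the original value.
    static long pct(double feature, double original) {
        return Math.round(feature / original * 100);
    }

    public static void main(String[] args) {
        System.out.println(pct(39088, 29601)); // 132 (tbl1-u ops/sec)
        System.out.println(pct(1276, 1686));   // 76  (tbl1-u AverageLatency)
    }
}
```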
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120590#comment-17120590 ] Danil Lipovoy edited comment on HBASE-23887 at 5/31/20, 9:18 PM:

All tests below were run on my home PC: _AMD Ryzen 7 2700X Eight-Core Processor (3150 MHz, 16 threads)._

The logic of auto-scaling (described here):
{code:java}
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
  if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
    if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
      return;
    }
  }
  ...{code}
And here is how cacheDataBlockPercent is calculated:
{code:java}
public void run() {
  ...
  LruBlockCache cache = this.cache.get();
  if (cache == null) {
    break;
  }
  bytesFreed = cache.evict();
  long stopTime = System.currentTimeMillis();
  // Control how long cache.evict() has been working.
  // If the BlockCache is under heavy eviction, this helps avoid
  // putting too many blocks into it while evict() is very active.
  if (stopTime - startTime <= 1000 * 10 - 1) {
    // Less than 10 seconds have passed; just accumulate the freed size.
    mbFreedSum += bytesFreed / 1024 / 1024;
  } else {
    freedDataOverheadPercent =
        (int) (mbFreedSum * 100 / cache.heavyEvictionBytesSizeLimit) - 100;
    if (mbFreedSum > cache.heavyEvictionBytesSizeLimit) {
      heavyEvictionCount++;
      if (heavyEvictionCount > cache.heavyEvictionCountLimit) {
        if (freedDataOverheadPercent > 100) {
          cache.cacheDataBlockPercent -= 3;
        } else if (freedDataOverheadPercent > 50) {
          cache.cacheDataBlockPercent -= 1;
        } else if (freedDataOverheadPercent < 30) {
          cache.cacheDataBlockPercent += 1;
        }
      }
    } else {
      if (mbFreedSum > cache.heavyEvictionBytesSizeLimit * 0.5
          && cache.cacheDataBlockPercent < 50) {
        // Helps prevent a premature escape caused by an accidental fluctuation.
        // It would be fine to add more logic here.
        cache.cacheDataBlockPercent += 5;
      } else {
        heavyEvictionCount = 0;
        cache.cacheDataBlockPercent = 100;
      }
    }
    LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, "
        + "heavy eviction counter: {}, "
        + "current caching DataBlock (%): {}",
        mbFreedSum, freedDataOverheadPercent, heavyEvictionCount,
        cache.cacheDataBlockPercent);
    mbFreedSum = 0;
    startTime = stopTime;
  }
{code}
I prepared 4 tables (32 regions each):
tbl1 - 200 mln records, 100 bytes each. Total size 30 Gb.
tbl2 - 20 mln records, 500 bytes each. Total size 10.4 Gb.
tbl3 - 100 mln records, 100 bytes each. Total size 15.4 Gb.
tbl4 - the same as tbl3, but I use it for testing work with batches (batchSize=100)

Workload scenario "u": _requestdistribution=uniform_
Workload scenario "z": _requestdistribution=zipfian_
Workload scenario "l": _requestdistribution=latest_

Other parameters:
_operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_
_readproportion=1_
_recordcount=100 (I just noticed this value is too small, I will provide new tests with a bigger value later)_

Then I ran all tables with all scenarios on the original version (4*3 = 12 tests in total) and 12 more with the feature enabled:
*hbase.lru.cache.heavy.eviction.count.limit* = 3
*hbase.lru.cache.heavy.eviction.mb.size.limit* = 200

Performance results:
!requests_100p.png!

We can see that on the second graph the lines have a step at the beginning - that is the auto-scaling at work. Let's look at the RegionServer log:
LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 | no load, do nothing
LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 1, current caching DataBlock (%): 100 | reading has started but *count.limit* hasn't been reached
LruBlockCache: BlockCache evicted (MB): 6958, overhead (%): 3379, heavy eviction counter: 2, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 8117, overhead (%): 3958, heavy eviction counter: 3, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 8713, overhead (%): 4256, heavy eviction counter: 4, current caching DataBlock (%): 97 | *count.limit* has been reached, decrease by 3%
LruBlockCache: BlockCache evicted (MB): 8723, overhead (%): 4261, heavy eviction counter: 5, current caching DataBlock (%): 94
LruBlockCache: BlockCache evicted (MB): 8318, overhead (%): 4059, heavy eviction counter: 6, current caching DataBlock (%): 91
LruBlockCache: BlockCache evicted (MB): 7722
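The caching filter and the overhead formula above can be checked against the numbers in this log with a small standalone sketch (the method names `shouldCache` and `overheadPercent` are illustrative helpers, not names from the patch):

```java
// Sketch of the two decisions above: the offset-modulo caching filter and the
// eviction-overhead formula that drives the auto-scaling. Helper names are
// illustrative only; the real logic lives inline in LruBlockCache.
public class BlockCachePercentDemo {

    // A DATA block is cached only if the last two digits of its offset fall
    // below cacheDataBlockPercent (offsets are roughly uniform modulo 100).
    static boolean shouldCache(long offset, int cacheDataBlockPercent) {
        return cacheDataBlockPercent == 100 || offset % 100 < cacheDataBlockPercent;
    }

    // Overhead of the freed volume over the configured limit, in percent:
    // 0 means "freed exactly the limit", negative means "freed less".
    static int overheadPercent(long mbFreedSum, long heavyEvictionBytesSizeLimit) {
        return (int) (mbFreedSum * 100 / heavyEvictionBytesSizeLimit) - 100;
    }

    public static void main(String[] args) {
        // Offsets from the example in the ticket description, percent = 50:
        System.out.println(shouldCache(124, 50)); // true  (24 < 50)
        System.out.println(shouldCache(198, 50)); // false (98 >= 50)
        System.out.println(shouldCache(223, 50)); // true  (23 < 50)

        // Values from the RegionServer log above, mb.size.limit = 200:
        System.out.println(overheadPercent(229, 200));  // 14
        System.out.println(overheadPercent(8117, 200)); // 3958
        System.out.println(overheadPercent(0, 200));    // -100
    }
}
```

Running it reproduces the overhead percentages printed by the RegionServer in the log entries above.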
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120565#comment-17120565 ] Danil Lipovoy edited comment on HBASE-23887 at 5/31/20, 8:44 PM:

Seems like our trouble with the servers will last for a long time, so I've decided to install HBase on my home PC. Another important point: I have implemented the algorithm that I posted above (will add the changes to the PR quite soon). It works well when the number of read requests changes over time. The new approach seems to cope well with a wide variety of situations (a lot of tests in the next messages, after the answers).

1. I'm not sure, but maybe it is because during the first few seconds, while the BlockCache is empty, my old implementation prevented the BC from being populated effectively. I mean, it was skipping blocks even when eviction was not running, so a lot of blocks could have been cached but were lost. With the new approach the problem is gone. For example, this is with 100% of data caching (uniform distribution):
[OVERALL], RunTime(ms), 1506417
[OVERALL], Throughput(ops/sec), 33191.34077748724
[TOTAL_GCS_PS_Scavenge], Count, 8388
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 12146
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.8062840501667201
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 22
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.0014604189942094387
[TOTAL_GCs], Count, 8389
[TOTAL_GC_TIME], Time(ms), 12168
[TOTAL_GC_TIME_%], Time(%), 0.8077444691609296
[READ], Operations, 5000
[READ], AverageLatency(us), 1503.45024378
[READ], MinLatency(us), 137
[READ], MaxLatency(us), 383999
[READ], 95thPercentileLatency(us), 2231
[READ], 99thPercentileLatency(us), 13503
[READ], Return=OK, 5000

The same table with the patch:
[OVERALL], RunTime(ms), 1073257
[OVERALL], Throughput(ops/sec), 46587.1641181935
[TOTAL_GCS_PS_Scavenge], Count, 7201
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 9799
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.9130152423883563
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 23
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.002143009549436901
[TOTAL_GCs], Count, 7202
[TOTAL_GC_TIME], Time(ms), 9822
[TOTAL_GC_TIME_%], Time(%), 0.9151582519377931
[READ], Operations, 5000
[READ], AverageLatency(us), 1070.52889804
[READ], MinLatency(us), 142
[READ], MaxLatency(us), 327167
[READ], 95thPercentileLatency(us), 2071
[READ], 99thPercentileLatency(us), 6539
[READ], Return=OK, 5000

All the other tests show the same picture - you can see the details below.

2. It looks like the feature could have a negative effect if we set *hbase.lru.cache.heavy.eviction.count.limit*=0 and *hbase.lru.cache.heavy.eviction.mb.size.limit*=1 while sporadically doing short reads of the same data. I mean, when the BC size is 3 blocks and we read blocks 1,2,3,4,3,4 ... 4,3,2,1,2,1 ... 1,2,3,4,3,4... In this scenario it is better to keep all the blocks, but such parameter values will skip blocks that we will need again quite soon. In my opinion the feature is extremely good for massive long-term reading on powerful servers; for short reads of a small amount of data, parameter values that are too small could be pathological.

3. If I understand you correctly, you mean that after compaction the real block offsets change. But when HFiles are compacted, all their blocks are removed from the BC anyway.

4. Now we have two parameters for tuning:

*hbase.lru.cache.heavy.eviction.count.limit* - controls how soon we want the eviction rate to be reduced. If we know that our load pattern is long-term reading only, we can set it to 0: if we are reading, it is for a long time. But if we sometimes do short reads of the same data and sometimes long-term reads, we have to separate the two cases with this parameter. For example, if we know that our short reads usually take about 1 minute, we can set the parameter to about 10, and the feature will kick in only for long, massive reads.

*hbase.lru.cache.heavy.eviction.mb.size.limit* - controls the point at which we are sure that GC will suffer. For a weak PC it could be about 50-100 MB. For powerful servers, 300-500 MB.

I added some useful information into the logging:
{code:java}
LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, "
    + "heavy eviction counter: {}, "
    + "current caching DataBlock (%): {}",
    mbFreedSum, freedDataOverheadPercent, heavyEvictionCount,
    cache.cacheDataBlockPercent);
{code}
It will help to understand what kind of values we have and how to tune them.

5. I think it is a pretty good idea. Please give me time to do the tests and check what happens. Well, I will post information about the tests in the next message.
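For reference, a minimal hbase-site.xml fragment enabling the feature (the property names come from the patch; the values are simply the ones used in the tests in this thread, not general recommendations):

```xml
<property>
  <name>hbase.lru.cache.heavy.eviction.count.limit</name>
  <!-- Consecutive ~10-second heavy-eviction periods that must pass before
       the cache starts reducing the cached-DataBlock percent. -->
  <value>3</value>
</property>
<property>
  <name>hbase.lru.cache.heavy.eviction.mb.size.limit</name>
  <!-- Evicted MB per ~10-second period above which eviction counts as
       "heavy": roughly 50-100 MB for a weak PC, 300-500 MB for servers. -->
  <value>200</value>
</property>
```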
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120590#comment-17120590 ] Danil Lipovoy edited comment on HBASE-23887 at 5/31/20, 8:43 PM: - All tests below have done on my home PC: _AMD Ryzen 7 2700X Eight-Core Processor (3150 MHz, 16 threads)._ Logic of auto-scaling (see describe here): {code:java} public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) { if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) { if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) { return; } } ...{code} And how to calculate cacheDataBlockPercent is here: {code:java} public void run() { ... LruBlockCache cache = this.cache.get(); if (cache == null) break; bytesFreed = cache.evict(); long stopTime = System.currentTimeMillis(); // We need of control the time of working cache.evict() // If heavy cleaning BlockCache control. // It helps avoid put too many blocks into BlockCache // when evict() works very active. if (stopTime - startTime <= 1000 * 10 - 1) { mbFreedSum += bytesFreed/1024/1024; // Now went less then 10 sec, just sum up and thats all } else { freedDataOverheadPercent = (int) (mbFreedSum * 100 / cache.heavyEvictionBytesSizeLimit) - 100; if (mbFreedSum > cache.heavyEvictionBytesSizeLimit) { heavyEvictionCount++; if (heavyEvictionCount > cache.heavyEvictionCountLimit) { if (freedDataOverheadPercent > 100) { cache.cacheDataBlockPercent -= 3; } else { if (freedDataOverheadPercent > 50) { cache.cacheDataBlockPercent -= 1; } else { if (freedDataOverheadPercent < 30) { cache.cacheDataBlockPercent += 1; } } } } } else { if (mbFreedSum > cache.heavyEvictionBytesSizeLimit * 0.5 && cache.cacheDataBlockPercent < 50) { cache.cacheDataBlockPercent += 5; // It help prevent some premature escape from accidental fluctuation. Will be fine add more logic here. 
} else { heavyEvictionCount = 0; cache.cacheDataBlockPercent = 100; } } LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, " + "heavy eviction counter: {}, " + "current caching DataBlock (%): {}", mbFreedSum, freedDataOverheadPercent, heavyEvictionCount, cache.cacheDataBlockPercent); mbFreedSum = 0; startTime = stopTime; } {code} I prepared 4 tables (32 regions each): tbl1 - 200 mln records, 100 bytes each. Total size 30 Gb. tbl2 - 20 mln records, 500 bytes each. Total size 10.4 Gb. tbl3 - 100 mln records, 100 bytes each. Total size 15.4 Gb. tbl4 - the same like tbl3 but I use it for testing work with batches (batchSize=100) Workload scenario "u": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=uniform_ Workload scenario "z": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=zipfian_ Workload scenario "l": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=latest_ Then I run all tables with all scenarios on original version (total 4*3=12 tests) and 12 with the feature: *hbase.lru.cache.heavy.eviction.count.limit* = 3 *hbase.lru.cache.heavy.eviction.mb.size.limit* = 200 Performance results: !requests_100p.png! We could see that on the second graph lines have some a step at the begin. It is because works auto scaling. 
Let see the log of RegionServer: LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 | no load, do nothing LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 1, current caching DataBlock (%): 100 | start reading but *count.limit* haven't reached. LruBlockCache: BlockCache evicted (MB): 6958, overhead (%): 3379, heavy eviction counter: 2, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 8117, overhead (%): 3958, heavy eviction counter: 3, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 8713, overhead (%): 4256, heavy eviction counter: 4, current caching DataBlock (%): 97 | *count.limit* have reached, decrease on 3% LruBlockCache: BlockCache evicted (MB): 8723, overhead (%): 4261, heavy eviction counter: 5, current caching DataBlock (%): 94 LruBlockCache: BlockCache evicted (MB): 8318, overhead (%): 4059, heavy eviction counter: 6, current caching Dat
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120565#comment-17120565 ] Danil Lipovoy edited comment on HBASE-23887 at 5/31/20, 8:06 PM: - Seems like our trouble with the servers for a long time and I've decided install HBase on my home PC. Another important point - I have done the algorithm, that I posted above (will add changes to PR quite soon). It is good when numbers of reading requests are changing. Looks like the new approach copes well with wide variety kind of situation (a lot of tests in the next messages after answers). 1. I'm nor sure, but maybe it is because first few seconds, while BlockCache is empty, my old version of realization prevented effective populating the BC. I mean it was skipping blocks when eviction is not running - and a lot of blocks could be cached but were lost. With the new approach the problems has gone. For example: This is when 100% of data caching (uniform distribution): [OVERALL], RunTime(ms), 1506417 [OVERALL], Throughput(ops/sec), 33191.34077748724 [TOTAL_GCS_PS_Scavenge], Count, 8388 [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 12146 [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.8062840501667201 [TOTAL_GCS_PS_MarkSweep], Count, 1 [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 22 [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.0014604189942094387 [TOTAL_GCs], Count, 8389 [TOTAL_GC_TIME], Time(ms), 12168 [TOTAL_GC_TIME_%], Time(%), 0.8077444691609296 [READ], Operations, 5000 [READ], AverageLatency(us), 1503.45024378 [READ], MinLatency(us), 137 [READ], MaxLatency(us), 383999 [READ], 95thPercentileLatency(us), 2231 [READ], 99thPercentileLatency(us), 13503 [READ], Return=OK, 5000 The same table with the patch: [OVERALL], RunTime(ms), 1073257 [OVERALL], Throughput(ops/sec), 46587.1641181935 [TOTAL_GCS_PS_Scavenge], Count, 7201 [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 9799 [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.9130152423883563 [TOTAL_GCS_PS_MarkSweep], 
Count, 1 [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 23 [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.002143009549436901 [TOTAL_GCs], Count, 7202 [TOTAL_GC_TIME], Time(ms), 9822 [TOTAL_GC_TIME_%], Time(%), 0.9151582519377931 [READ], Operations, 5000 [READ], AverageLatency(us), 1070.52889804 [READ], MinLatency(us), 142 [READ], MaxLatency(us), 327167 [READ], 95thPercentileLatency(us), 2071 [READ], 99thPercentileLatency(us), 6539 [READ], Return=OK, 5000 The same picture all other test - you could see details below. 2.Looks like it could make negative effect if we try to use the feature if we set *hbase.lru.cache.heavy.eviction.count.limit*=0 and *hbase.lru.cache.heavy.eviction.mb.size.limit*=1 and doing sporadly short reading the same data. I meant when size BC=3 and we read block 1,2,3,4,3,4 ... 4,3,2,1,2,1 ... 1,2,3,4,3,4... In this scenario better save all blocks. But this parameters will skip blocks which we will need quite soon. My opinion - it is extremely good for massive long-term reading on powerful servers. For short reading small amount of date too small values of the parameters could be pathological. 3. If I understand you correct - you meant that after compaction real blocks offset changed. But when HFiles compacted anyway all blocks removed from BC too. 4.Now we have two parameters for tuning: *hbase.lru.cache.heavy.eviction.count.limit* - it controls how soon we want to see eviction rate reduce. If we know that our load pattern is only long term reading, we can set it 0. It means if we are reading - it is for a long time. But if we have some times short reading the same data and some times long-term reading - we have to divide it by this parameter. For example we know - our short reading used to about 1 min, we have to set the param about 10 and it will enable the feature only for long time massive reading. *hbase.lru.cache.heavy.eviction.mb.size.limit* - it lets to control when we sure that GC will be suffer. For weak CPU it could be about 50-100 MB. 
For powerful servers 300-500 MB. I added some useful information into logging: {color:#871094}LOG{color}.info({color:#067d17}"BlockCache evicted (MB): {}, overhead (%) {}, " {color}+ {color:#067d17}"heavy eviction counter {}, " {color}+ {color:#067d17}"current caching DataBlock (%): {}"{color}, mbFreedSum, freedDataOverheadPercent, heavyEvictionCount, {color:#00}cache{color}.{color:#871094}cacheDataBlockPercent{color}); It will help to understand what kind of values we have and how to tune it. 4. I think it is pretty good idea. Give me time, please, to do tests and check what will be. Well, I will post information about the tests in the next message. was (Author: pustota): Seems like our trouble with the servers for a long time and I've decided install HBase on my home PC. Another important point - I have done the a
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120590#comment-17120590 ] Danil Lipovoy edited comment on HBASE-23887 at 5/31/20, 5:27 PM: - All tests below have done on my home PC: _AMD Ryzen 7 2700X Eight-Core Processor (3150 MHz, 16 threads)._ Logic of autoscaling (see describe here): {code:java} public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) { if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) { if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) { return; } } ...{code} And how to calculate cacheDataBlockPercent is here: {code:java} public void run() { ... LruBlockCache cache = this.cache.get(); if (cache == null) break; bytesFreed = cache.evict(); long stopTime = System.currentTimeMillis(); // We need of control the time of working cache.evict() // If heavy cleaning BlockCache control. // It helps avoid put too many blocks into BlockCache // when evict() works very active. if (stopTime - startTime <= 1000 * 10 - 1) { mbFreedSum += bytesFreed/1024/1024; // Now went less then 10 sec, just sum up and thats all } else { freedDataOverheadPercent = (int) (mbFreedSum * 100 / cache.heavyEvictionBytesSizeLimit) - 100; if (mbFreedSum > cache.heavyEvictionBytesSizeLimit) { heavyEvictionCount++; if (heavyEvictionCount > cache.heavyEvictionCountLimit) { if (freedDataOverheadPercent > 100) { cache.cacheDataBlockPercent -= 3; } else { if (freedDataOverheadPercent > 50) { cache.cacheDataBlockPercent -= 1; } else { if (freedDataOverheadPercent < 30) { cache.cacheDataBlockPercent += 1; } } } } } else { if (bytesFreedSum > cache.heavyEvictionBytesSizeLimit * 0.5 && cache.cacheDataBlockPercent < 50) { cache.cacheDataBlockPercent += 5; // It help prevent some premature escape from accidental fluctuation. Will be fine add more logic here. 
} else { heavyEvictionCount = 0; cache.cacheDataBlockPercent = 100; } } LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, " + "heavy eviction counter: {}, " + "current caching DataBlock (%): {}", mbFreedSum, freedDataOverheadPercent, heavyEvictionCount, cache.cacheDataBlockPercent); mbFreedSum = 0; startTime = stopTime; } {code} I prepared 4 tables (32 regions each): tbl1 - 200 mln records, 100 bytes each. Total size 30 Gb. tbl2 - 20 mln records, 500 bytes each. Total size 10.4 Gb. tbl3 - 100 mln records, 100 bytes each. Total size 15.4 Gb. tbl4 - the same like tbl3 but I use it for testing work with batches (batchSize=100) Workload scenario "u": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=uniform_ Workload scenario "z": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=zipfian_ Workload scenario "l": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=latest_ Then I run all tables with all scenarios on original version (total 4*3=12 tests) and 12 with the feature. *hbase.lru.cache.heavy.eviction.count.limit* = 3 *hbase.lru.cache.heavy.eviction.mb.size.limit* = 200 Performance results: !requests_100p.png! We could see that on the second graph lines have some a step at the begin. It is because works auto scaling. 
Let see the log of RegionServer: LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 | no load, do nothing LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 1, current caching DataBlock (%): 100 | start reading but *count.limit* haven't reach. LruBlockCache: BlockCache evicted (MB): 6958, overhead (%): 3379, heavy eviction counter: 2, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 8117, overhead (%): 3958, heavy eviction counter: 3, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 8713, overhead (%): 4256, heavy eviction counter: 4, current caching DataBlock (%): 97 | *count.limit* have reached, decrease on 3% LruBlockCache: BlockCache evicted (MB): 8723, overhead (%): 4261, heavy eviction counter: 5, current caching DataBlock (%): 94 LruBlockCache: BlockCache evicted (MB): 8318, overhead (%): 4059, heavy eviction counter: 6, current caching Dat
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120590#comment-17120590 ] Danil Lipovoy edited comment on HBASE-23887 at 5/31/20, 5:05 PM: - All tests below have done on my home PC: _AMD Ryzen 7 2700X Eight-Core Processor (3150 MHz, 16 threads)._ Logic of autoscaling (see describe here): {code:java} public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) { if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) { if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) { return; } } ...{code} And how to calculate cacheDataBlockPercent is here: {code:java} public void run() { ... LruBlockCache cache = this.cache.get(); if (cache == null) break; bytesFreed = cache.evict(); long stopTime = System.currentTimeMillis(); // We need of control the time of working cache.evict() // If heavy cleaning BlockCache control. // It helps avoid put too many blocks into BlockCache // when evict() works very active. 
if (stopTime - startTime <= 1000 * 10 - 1) { mbFreedSum += bytesFreed/1024/1024; // Now went less then 10 sec, just sum up and thats all } else { freedDataOverheadPercent = (int) (mbFreedSum * 100 / cache.heavyEvictionBytesSizeLimit) - 100; if (mbFreedSum > cache.heavyEvictionBytesSizeLimit) { heavyEvictionCount++; if (heavyEvictionCount > cache.heavyEvictionCountLimit) { if (freedDataOverheadPercent > 100) { cache.cacheDataBlockPercent -= 3; } else { if (freedDataOverheadPercent > 50) { cache.cacheDataBlockPercent -= 1; } else { if (freedDataOverheadPercent < 30) { cache.cacheDataBlockPercent += 1; } } } } } else { if (bytesFreedSum > cache.heavyEvictionBytesSizeLimit * 0.5 && cache.cacheDataBlockPercent < 50) { cache.cacheDataBlockPercent += 5; // It help prevent some premature escape from accidental fluctuation } else { heavyEvictionCount = 0; cache.cacheDataBlockPercent = 100; } } LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, " + "heavy eviction counter: {}, " + "current caching DataBlock (%): {}", mbFreedSum, freedDataOverheadPercent, heavyEvictionCount, cache.cacheDataBlockPercent); mbFreedSum = 0; startTime = stopTime; } {code} I prepared 4 tables (32 regions each): tbl1 - 200 mln records, 100 bytes each. Total size 30 Gb. tbl2 - 20 mln records, 500 bytes each. Total size 10.4 Gb. tbl3 - 100 mln records, 100 bytes each. Total size 15.4 Gb. 
tbl4 - the same like tbl3 but I use it for testing work with batches (batchSize=100) Workload scenario "u": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=uniform_ Workload scenario "z": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=zipfian_ Workload scenario "l": _operationcount=50 000 000 (for tbl4 just 500 000 because there is batch 100)_ _readproportion=1_ _requestdistribution=latest_ Then I run all tables with all scenarios on original version (total 4*3=12 tests) and 12 with the feature. *hbase.lru.cache.heavy.eviction.count.limit* = 3 *hbase.lru.cache.heavy.eviction.mb.size.limit* = 200 Performance results: !requests_100p.png! We could see that on the second graph lines have some a step at the begin. It is because works auto scaling. Let see the log of RegionServer: LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 | no load, do nothing LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 1, current caching DataBlock (%): 100 | start reading but *count.limit* haven't reach. 
LruBlockCache: BlockCache evicted (MB): 6958, overhead (%): 3379, heavy eviction counter: 2, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 8117, overhead (%): 3958, heavy eviction counter: 3, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 8713, overhead (%): 4256, heavy eviction counter: 4, current caching DataBlock (%): 97 | *count.limit* have reached, decrease on 3% LruBlockCache: BlockCache evicted (MB): 8723, overhead (%): 4261, heavy eviction counter: 5, current caching DataBlock (%): 94 LruBlockCache: BlockCache evicted (MB): 8318, overhead (%): 4059, heavy eviction counter: 6, current caching DataBlock (%): 91 LruBlockCache: Blo
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120590#comment-17120590 ] Danil Lipovoy edited comment on HBASE-23887 at 5/31/20, 5:01 PM: - All tests below have done on: _AMD Ryzen 7 2700X Eight-Core Processor (3150 MHz, 16 threads)._ Logic of autoscaling (see describe here): {code:java} public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) { if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) { if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) { return; } } ...{code} And how to calculate cacheDataBlockPercent is here: {code:java} public void run() { ... LruBlockCache cache = this.cache.get(); if (cache == null) break; bytesFreed = cache.evict(); long stopTime = System.currentTimeMillis(); // We need of control the time of working cache.evict() // If heavy cleaning BlockCache control. // It helps avoid put too many blocks into BlockCache // when evict() works very active. 
if (stopTime - startTime <= 1000 * 10 - 1) { mbFreedSum += bytesFreed/1024/1024; // Now went less then 10 sec, just sum up and thats all } else { freedDataOverheadPercent = (int) (mbFreedSum * 100 / cache.heavyEvictionBytesSizeLimit) - 100; if (mbFreedSum > cache.heavyEvictionBytesSizeLimit) { heavyEvictionCount++; if (heavyEvictionCount > cache.heavyEvictionCountLimit) { if (freedDataOverheadPercent > 100) { cache.cacheDataBlockPercent -= 3; } else { if (freedDataOverheadPercent > 50) { cache.cacheDataBlockPercent -= 1; } else { if (freedDataOverheadPercent < 30) { cache.cacheDataBlockPercent += 1; } } } } } else { if (bytesFreedSum > cache.heavyEvictionBytesSizeLimit * 0.5 && cache.cacheDataBlockPercent < 50) { cache.cacheDataBlockPercent += 5; // It help prevent some premature escape from accidental fluctuation } else { heavyEvictionCount = 0; cache.cacheDataBlockPercent = 100; } } LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, " + "heavy eviction counter: {}, " + "current caching DataBlock (%): {}", mbFreedSum, freedDataOverheadPercent, heavyEvictionCount, cache.cacheDataBlockPercent); mbFreedSum = 0; startTime = stopTime; } {code} I prepared 4 tables: tbl1 - 200 mln records, 100 bytes each. Total size 30 Gb. tbl2 - 20 mln records, 500 bytes each. Total size 10.4 Gb. tbl3 - 100 mln records, 100 bytes each. Total size 15.4 Gb. tbl4 - the same like tbl3 but I use it for testing work with batches (batchSize=100) Workload scenario "u": _operationcount=5000 (for tbl4 just 50 because there is batch 100)_ _readproportion=1_ _requestdistribution=uniform_ Workload scenario "z": _operationcount=5000 (for tbl4 just 50 because there is batch 100)_ _readproportion=1_ _requestdistribution=zipfian_ Workload scenario "l": _operationcount=5000 (for tbl4 just 50 because there is batch 100)_ _readproportion=1_ _requestdistribution=latest_ Then I run all tables with all scenarios on original version (total 4*3=12 tests) and 12 with the feature. 
*hbase.lru.cache.heavy.eviction.count.limit* = 3 *hbase.lru.cache.heavy.eviction.mb.size.limit* = 200 Performance results: !requests_100p.png! We could see that on the second graph lines have some a step at the begin. It is because works auto scaling. Let see the log of RegionServer: LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 | no load, do nothing LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 1, current caching DataBlock (%): 100 | start reading but *count.limit* haven't reach. LruBlockCache: BlockCache evicted (MB): 6958, overhead (%): 3379, heavy eviction counter: 2, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 8117, overhead (%): 3958, heavy eviction counter: 3, current caching DataBlock (%): 100 LruBlockCache: BlockCache evicted (MB): 8713, overhead (%): 4256, heavy eviction counter: 4, current caching DataBlock (%): 97 | *count.limit* have reached, decrease on 3% LruBlockCache: BlockCache evicted (MB): 8723, overhead (%): 4261, heavy eviction counter: 5, current caching DataBlock (%): 94 LruBlockCache: BlockCache evicted (MB): 8318, overhead (%): 4059, heavy eviction counter: 6, current caching DataBlock (%): 91 LruBlockCache: BlockCache evicted (MB): 7722, overhead (
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120590#comment-17120590 ] Danil Lipovoy edited comment on HBASE-23887 at 5/31/20, 4:56 PM: - All tests below have done on: _AMD Ryzen 7 2700X Eight-Core Processor (3150 MHz, 16 threads)._ Logic of autoscaling (see describe here): {code:java} public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) { if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) { if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) { return; } } ...{code} And how to calculate cacheDataBlockPercent is here: {code:java} public void run() { ... LruBlockCache cache = this.cache.get(); if (cache == null) break; bytesFreed = cache.evict(); long stopTime = System.currentTimeMillis(); // We need of control the time of working cache.evict() // If heavy cleaning BlockCache control. // It helps avoid put too many blocks into BlockCache // when evict() works very active. 
if (stopTime - startTime <= 1000 * 10 - 1) { mbFreedSum += bytesFreed/1024/1024; // Now went less then 10 sec, just sum up and thats all } else { freedDataOverheadPercent = (int) (mbFreedSum * 100 / cache.heavyEvictionBytesSizeLimit) - 100; if (mbFreedSum > cache.heavyEvictionBytesSizeLimit) { heavyEvictionCount++; if (heavyEvictionCount > cache.heavyEvictionCountLimit) { if (freedDataOverheadPercent > 100) { cache.cacheDataBlockPercent -= 3; } else { if (freedDataOverheadPercent > 50) { cache.cacheDataBlockPercent -= 1; } else { if (freedDataOverheadPercent < 30) { cache.cacheDataBlockPercent += 1; } } } } } else { if (bytesFreedSum > cache.heavyEvictionBytesSizeLimit * 0.5 && cache.cacheDataBlockPercent < 50) { cache.cacheDataBlockPercent += 5; // It help prevent some premature escape from accidental fluctuation } else { heavyEvictionCount = 0; cache.cacheDataBlockPercent = 100; } } LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, " + "heavy eviction counter: {}, " + "current caching DataBlock (%): {}", mbFreedSum, freedDataOverheadPercent, heavyEvictionCount, cache.cacheDataBlockPercent); mbFreedSum = 0; startTime = stopTime; } {code} I prepared 4 tables: tbl1 - 200 mln records, 100 bytes each. Total size 30 Gb. tbl2 - 20 mln records, 500 bytes each. Total size 10.4 Gb. tbl3 - 100 mln records, 100 bytes each. Total size 15.4 Gb. tbl4 - the same like tbl3 but I use it for testing work with batches (batchSize=100) Workload scenario "u": _operationcount=5000 (for tbl4 just 50 because there is batch 100)_ _readproportion=1_ _requestdistribution=uniform_ Workload scenario "z": _operationcount=5000 (for tbl4 just 50 because there is batch 100)_ _readproportion=1_ _requestdistribution=zipfian_ Workload scenario "l": _operationcount=5000 (for tbl4 just 50 because there is batch 100)_ _readproportion=1_ _requestdistribution=latest_ Then I run all tables with all scenarios on original version (total 4*3=12 tests) and 12 with the feature. 
*hbase.lru.cache.heavy.eviction.count.limit* = 3
*hbase.lru.cache.heavy.eviction.mb.size.limit* = 200

Performance results: !requests_100p.png!

On the second graph the lines show a step at the beginning; that is the auto scaling at work. Let's look at the RegionServer log:

LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 | no load, do nothing
LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 229, overhead (%): 14, heavy eviction counter: 1, current caching DataBlock (%): 100 | reading has started but *count.limit* has not been reached yet
LruBlockCache: BlockCache evicted (MB): 6958, overhead (%): 3379, heavy eviction counter: 2, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 8117, overhead (%): 3958, heavy eviction counter: 3, current caching DataBlock (%): 100
LruBlockCache: BlockCache evicted (MB): 8713, overhead (%): 4256, heavy eviction counter: 4, current caching DataBlock (%): 97 | *count.limit* has been reached, decrease by 3%
LruBlockCache: BlockCache evicted (MB): 8723, overhead (%): 4261, heavy eviction counter: 5, current caching DataBlock (%): 94
LruBlockCache: BlockCache evicted (MB): 8318, overhead (%): 4059, heavy eviction counter: 6, current caching DataBlock (%): 91
LruBlockCache: BlockCache evicted (MB): 7722, overhead (
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114609#comment-17114609 ] Danil Lipovoy edited comment on HBASE-23887 at 5/23/20, 8:52 AM: -

[~bharathv] Thank you for your interest!) I would prefer to answer with some real tests, but unfortunately we are having big trouble with our servers right now. I will get back to this when we fix it (I have no idea how long that will take).

> BlockCache performance improve by reduce eviction rate
> --
>
> Key: HBASE-23887
> URL: https://issues.apache.org/jira/browse/HBASE-23887
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache, Performance
>Reporter: Danil Lipovoy
>Priority: Minor
> Attachments: 1582787018434_rs_metrics.jpg, 1582801838065_rs_metrics_new.png, BC_LongRun.png, BlockCacheEvictionProcess.gif, cmp.png, evict_BC100_vs_BC23.png, read_requests_100pBC_vs_23pBC.png
>
>
> Hi!
> I am here for the first time, please correct me if something is wrong.
> I want to propose how to improve performance when the data in HFiles is much larger than the BlockCache (a usual story in BigData). The idea: cache only part of the DATA blocks. This is good because LruBlockCache starts to work and saves a huge amount of GC.
> Sometimes we have more data than can fit into the BlockCache, and that causes a high rate of evictions. In this case we can skip caching block N and instead cache the (N+1)th block. We would evict block N quite soon anyway, which is why skipping it is good for performance.
> Example:
> Imagine we have a little cache that can fit only 1 block, and we are trying to read 3 blocks with offsets:
> 124
> 198
> 223
> Current way: we put block 124, then put 198, evict 124, put 223, evict 198. A lot of work (5 actions).
> With the feature: the last few digits are evenly distributed from 0 to 99. When we take the offsets modulo 100 we get:
> 124 -> 24
> 198 -> 98
> 223 -> 23
> This helps to sort them. Some part, for example below 50 (if we set *hbase.lru.cache.data.block.percent* = 50), goes into the cache, and we skip the others. It means we will not try to handle block 198 and save CPU for other jobs. As a result, we put block 124, then put 223, evict 124 (3 actions).
> See the picture in the attachment with the test below. Requests per second are higher, GC is lower.
>
> The key point of the code:
> Added the parameter *hbase.lru.cache.data.block.percent*, which by default = 100.
>
> But if we set it to 1-99, the following logic will work:
>
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData())
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent)
>       return;
>   ...
>   // the same code as usual
> }
> {code}
>
> Other parameters help to control when this logic is enabled, so it works only while heavy reading is going on.
> hbase.lru.cache.heavy.eviction.count.limit - sets how many times the eviction process has to run before we start to avoid putting data into the BlockCache
> hbase.lru.cache.heavy.eviction.bytes.size.limit - sets how many bytes have to be evicted each time before we start to avoid putting data into the BlockCache
> By default: if 10 times (100 seconds) more than 10 MB was evicted each time, then we start to skip 50% of data blocks.
> When the heavy eviction process ends, the new logic switches off and all blocks are put into the BlockCache again.
>
> Description of the test:
> 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 4 RegionServers
> 4 tables by 64 regions by 1.88 Gb data each = 600 Gb total (only FAST_DIFF)
> Total BlockCache Size = 48 Gb (8% of data in HFiles)
> Random read in 20 threads
>
> I am going to make a Pull Request, hope it is the right way to make some contribution to this cool product.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110503#comment-17110503 ] Danil Lipovoy edited comment on HBASE-23887 at 5/18/20, 6:05 PM: -

Hi guys!

I was thinking about a weak point: we have to set *hbase.lru.cache.data.block.percent* and we cannot be sure whether it is enough (it is only approximate). What do you think about the following approach: use just one parameter, *hbase.lru.cache.heavy.eviction.bytes.size.limit*, calculate online how far the evicted bytes are above it, and use that information to decide how aggressive the reduction should be.

For example: we set *hbase.lru.cache.heavy.eviction.bytes.size.limit* = 200 Mb. When heavy reading starts, the real eviction volume can be 500 Mb per 10 seconds (total). Then we calculate 500 * 100 / 200 = 250%. If the value is more than 100, we start to skip 5% of blocks. In the next 10 seconds the real eviction volume should be ~475 Mb (500 * 0.95). We calculate 475 * 100 / 200 = 238%. Still too much, so we skip 5% more -> 10% of blocks. In the next 10 seconds the real eviction volume should be ~450 Mb (500 * 0.9). And so on. After 9 iterations we get down to 300 Mb, which is 150%. Then we can use a less aggressive reduction: just 1% instead of 5%. It means we will skip 41% and get 295 Mb (500 * 0.59). We calculate 295 * 100 / 200 = 148%, and keep reducing until we reach 130%. There we can stop reducing, because if we reached 99% skipped then *hbase.lru.cache.heavy.eviction.count.limit* would be set to 0 and everything would reset to the initial state (as if there were no skipping at all).
|Time (sec)|Evicted (Mb)|Skip (%)|Above limit (%)| |
|0|500|0|250| |
|10|475|5|238|< 5% reduction|
|20|450|10|225| |
|30|425|15|213| |
|40|400|20|200| |
|50|375|25|188| |
|60|350|30|175| |
|70|325|35|163| |
|80|300|40|150| |
|90|295|41|148|< start by 1%|
|100|290|42|145| |
|110|285|43|143| |
|120|280|44|140| |
|130|275|45|138| |
|140|270|46|135| |
|150|265|47|133| |
|160|260|48|130|< enough|
|170|260|48|130| |

What do you think?
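The iterative reduction in the table can be sketched as a small simulation (hypothetical code, not part of the patch; it assumes, as the example does, that skipping N% of blocks reduces the raw 500 Mb eviction pressure by N%):

```java
// Sketch of the proposed feedback loop: reduce the caching percent in
// 5% steps while eviction is far above the limit, then in 1% steps,
// and stop at ~130% of the limit.
public class EvictionFeedbackSketch {

  // Returns the skip percent at which the loop settles.
  public static int finalSkipPercent(double rawEvictedMb, double limitMb) {
    int skip = 0;
    while (true) {
      // Assumption: eviction shrinks proportionally to skipped blocks.
      double evicted = rawEvictedMb * (100 - skip) / 100.0;
      double ratio = evicted * 100 / limitMb; // the "Above limit (%)" column
      if (ratio > 150) {
        skip += 5;       // aggressive reduction while far above the limit
      } else if (ratio > 130) {
        skip += 1;       // gentle reduction when close
      } else {
        return skip;     // ~130% -> "enough", stop reducing
      }
    }
  }

  public static void main(String[] args) {
    // 500 Mb raw eviction per 10 s against a 200 Mb limit,
    // as in the worked example above.
    System.out.println(finalSkipPercent(500, 200)); // settles at 48, as in the table
  }
}
```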
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102303#comment-17102303 ] Danil Lipovoy edited comment on HBASE-23887 at 5/8/20, 6:51 AM: -

[~elserj]

>>Can you please update the title/description so your improvement is a little more clear?
Of course, done.

>>As I understand it: you choose to not cache a block which we would have a high rate of evictions
Yes, that is what I meant.

>>At least, your solution would have a higher benefit if all data is accessed "equally"
That was the case before. Now I have added 2 parameters which help to control when the new logic is enabled, so it works only while heavy reading is going on. For example, suppose the reading of blocks is not equally distributed:

1 5 1 1 7 < after 10 seconds the eviction process starts and block #5 is evicted
1 1 7 1 1 < after 10 seconds the eviction process starts and there is nothing to evict

In this case eviction works rarely, and we can control this with the params:

hbase.lru.cache.heavy.eviction.count.limit - sets how many times the eviction process has to run before we start to avoid putting data into the BlockCache
hbase.lru.cache.heavy.eviction.bytes.size.limit - sets how many bytes have to be evicted each time before we start to avoid putting data into the BlockCache

By default: if there were evictions 10 times in a row (=100 seconds) and more than 10 MB was evicted each time, then we start to skip 50% of data blocks. That is why this feature kicks in when the data really cannot fit into the BlockCache: the eviction process is working hard, which usually means the read blocks are evenly distributed. When the heavy eviction process ends, the new logic switches off and all blocks are put into the BlockCache again.

I am not sure I explained the idea very clearly. Please let me know if I need to provide more information.
>>What YCSB workload did you run for which these results you've shared
It was Workload C: Read only.

>>Also, how much data did you generate and how does that relate to the total blockcache size?
It was 600 Gb in HFiles. Total BlockCache Size: 48 Gb.
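The two limits described in this comment gate the feature roughly as follows. This is a hypothetical sketch with my own class, method, and constant names (only the default values come from the comment), not the patch's actual code:

```java
// Sketch of the gating described above: skipping turns on only after the
// eviction thread has freed more than the size limit on more than
// count-limit consecutive runs, and turns off once the streak breaks.
public class HeavyEvictionGate {
  static final int COUNT_LIMIT = 10;                       // default: 10 runs (~100 s)
  static final long BYTES_SIZE_LIMIT = 10L * 1024 * 1024;  // default: 10 MB per run

  private int heavyEvictionCount = 0;

  /** Called once per eviction run; returns true while skipping should be active. */
  boolean onEvictionRun(long bytesFreed) {
    if (bytesFreed > BYTES_SIZE_LIMIT) {
      heavyEvictionCount++;
    } else {
      heavyEvictionCount = 0; // streak broken -> cache all blocks again
    }
    return heavyEvictionCount > COUNT_LIMIT;
  }
}
```

With these defaults the gate opens on the 11th consecutive heavy run and closes as soon as one run frees less than the limit.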
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102303#comment-17102303 ] Danil Lipovoy edited comment on HBASE-23887 at 5/8/20, 6:50 AM: [~elserj] >>Can you please update the title/description so your improvement is a little >>more clear? Of course, done >>As I understand it: you choose to not cache a block which we would have a >>high rate of evictions You are right >>At least, your solution would have a higher benefit if all data is accessed >>"equally" It was before) Now I added 2 parameters which help to control when new logic will be enabled. It means it will work only while heavy reading is going on. For example, we have not equaly distibution of reading blocks: 1 5 1 1 7 < after 10 seconds start eviction process and evicted block #5 1 1 7 1 1 < after 10 seconds start eviction process and here is no eviction In this case eviction will work rarely. And we can use it to control by params: hbase.lru.cache.heavy.eviction.count.limit - set how many times have to run eviction process that start to avoid of putting data to BlockCache hbase.lru.cache.heavy.eviction.bytes.size.limit - set how many bytes have to evicted each time that start to avoid of putting data to BlockCache By default: if 10 times in a row was eviction (=100 seconds) evicted more than 10 MB (each time) then we start to skip 50% of data blocks. Thats why this feature will work when the data really can't fit into BlockCache -> eviction rate really work hard and it usually means reading blocks evenly distributed. When heavy evitions process end then new logic off and will put into BlockCache all blocks again. I am not sure that explain the idea very clean. Please get me know if I need provide more information. >>What YCSB workload did you run for which these results you've shared It was Workload C: Read only >>Also, how much data did you generate and how does that relate to the total >>blockcache size? 
It was 600 Gb in HFiles Total BlockCache Size 48 Gb was (Author: pustota): [~elserj] >>Can you please update the title/description so your improvement is a little >>more clear? Of course, done >>As I understand it: you choose to not cache a block which we would have That is correct >>At least, your solution would have a higher benefit if all data is accessed >>"equally" It was before) Now I added 2 parameters which help to control when new logic will be enabled. It means it will work only while heavy reading is going on. For example, we have not equaly distibution of reading blocks: 1 5 1 1 7 < after 10 seconds start eviction process and evicted block #5 1 1 7 1 1 < after 10 seconds start eviction process and here is no eviction In this case eviction will work rarely. And we can use it to control by params: hbase.lru.cache.heavy.eviction.count.limit - set how many times have to run eviction process that start to avoid of putting data to BlockCache hbase.lru.cache.heavy.eviction.bytes.size.limit - set how many bytes have to evicted each time that start to avoid of putting data to BlockCache By default: if 10 times in a row was eviction (=100 seconds) evicted more than 10 MB (each time) then we start to skip 50% of data blocks. Thats why this feature will work when the data really can't fit into BlockCache -> eviction rate really work hard and it usually means reading blocks evenly distributed. When heavy evitions process end then new logic off and will put into BlockCache all blocks again. I am not sure that explain the idea very clean. Please get me know if I need provide more information. >>What YCSB workload did you run for which these results you've shared It was Workload C: Read only >>Also, how much data did you generate and how does that relate to the total >>blockcache size? 
It was 600 Gb in HFiles Total BlockCache Size 48 Gb > BlockCache performance improve by reduce eviction rate > -- > > Key: HBASE-23887 > URL: https://issues.apache.org/jira/browse/HBASE-23887 > Project: HBase > Issue Type: Improvement > Components: BlockCache, Performance >Reporter: Danil Lipovoy >Priority: Minor > Attachments: 1582787018434_rs_metrics.jpg, > 1582801838065_rs_metrics_new.png, BC_LongRun.png, cmp.png, > evict_BC100_vs_BC23.png, read_requests_100pBC_vs_23pBC.png > > > Hi! > I first time here, correct me please if something wrong. > I want propose how to improve performance when data in HFiles much more than > BlockChache (usual story in BigData). The idea - caching only part of DATA > blocks. It is good becouse LruBlockCache starts to work and save huge amount > of GC. > Sometimes we have more data than can fit into BlockCache and it is cause a > high rate of evictions. In this c
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102303#comment-17102303 ] Danil Lipovoy edited comment on HBASE-23887 at 5/8/20, 6:50 AM: [~elserj] >>Can you please update the title/description so your improvement is a little >>more clear? Of course, done >>As I understand it: you choose to not cache a block which we would have a >>high rate of evictions Yes, that is what I mean >>At least, your solution would have a higher benefit if all data is accessed >>"equally" It was before) Now I added 2 parameters which help to control when new logic will be enabled. It means it will work only while heavy reading is going on. For example, we have not equaly distibution of reading blocks: 1 5 1 1 7 < after 10 seconds start eviction process and evicted block #5 1 1 7 1 1 < after 10 seconds start eviction process and here is no eviction In this case eviction will work rarely. And we can use it to control by params: hbase.lru.cache.heavy.eviction.count.limit - set how many times have to run eviction process that start to avoid of putting data to BlockCache hbase.lru.cache.heavy.eviction.bytes.size.limit - set how many bytes have to evicted each time that start to avoid of putting data to BlockCache By default: if 10 times in a row was eviction (=100 seconds) evicted more than 10 MB (each time) then we start to skip 50% of data blocks. Thats why this feature will work when the data really can't fit into BlockCache -> eviction rate really work hard and it usually means reading blocks evenly distributed. When heavy evitions process end then new logic off and will put into BlockCache all blocks again. I am not sure that explain the idea very clean. Please get me know if I need provide more information. 
>>What YCSB workload did you run for which these results you've shared It was Workload C: Read only >>Also, how much data did you generate and how does that relate to the total >>blockcache size? It was 600 Gb in HFiles Total BlockCache Size 48 Gb was (Author: pustota): [~elserj] >>Can you please update the title/description so your improvement is a little >>more clear? Of course, done >>As I understand it: you choose to not cache a block which we would have a >>high rate of evictions You are right >>At least, your solution would have a higher benefit if all data is accessed >>"equally" It was before) Now I added 2 parameters which help to control when new logic will be enabled. It means it will work only while heavy reading is going on. For example, we have not equaly distibution of reading blocks: 1 5 1 1 7 < after 10 seconds start eviction process and evicted block #5 1 1 7 1 1 < after 10 seconds start eviction process and here is no eviction In this case eviction will work rarely. And we can use it to control by params: hbase.lru.cache.heavy.eviction.count.limit - set how many times have to run eviction process that start to avoid of putting data to BlockCache hbase.lru.cache.heavy.eviction.bytes.size.limit - set how many bytes have to evicted each time that start to avoid of putting data to BlockCache By default: if 10 times in a row was eviction (=100 seconds) evicted more than 10 MB (each time) then we start to skip 50% of data blocks. Thats why this feature will work when the data really can't fit into BlockCache -> eviction rate really work hard and it usually means reading blocks evenly distributed. When heavy evitions process end then new logic off and will put into BlockCache all blocks again. I am not sure that explain the idea very clean. Please get me know if I need provide more information. 
>>What YCSB workload did you run for which these results you've shared It was Workload C: Read only >>Also, how much data did you generate and how does that relate to the total >>blockcache size? It was 600 Gb in HFiles Total BlockCache Size 48 Gb > BlockCache performance improve by reduce eviction rate > -- > > Key: HBASE-23887 > URL: https://issues.apache.org/jira/browse/HBASE-23887 > Project: HBase > Issue Type: Improvement > Components: BlockCache, Performance >Reporter: Danil Lipovoy >Priority: Minor > Attachments: 1582787018434_rs_metrics.jpg, > 1582801838065_rs_metrics_new.png, BC_LongRun.png, cmp.png, > evict_BC100_vs_BC23.png, read_requests_100pBC_vs_23pBC.png > > > Hi! > I first time here, correct me please if something wrong. > I want propose how to improve performance when data in HFiles much more than > BlockChache (usual story in BigData). The idea - caching only part of DATA > blocks. It is good becouse LruBlockCache starts to work and save huge amount > of GC. > Sometimes we have more data than can fit into BlockCache and it is cause a
[jira] [Comment Edited] (HBASE-23887) BlockCache performance improve by reduce eviction rate
[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102303#comment-17102303 ] Danil Lipovoy edited comment on HBASE-23887 at 5/8/20, 6:39 AM: [~elserj] >>Can you please update the title/description so your improvement is a little >>more clear? Of course, done >>As I understand it: you choose to not cache a block which we would have That is correct >>At least, your solution would have a higher benefit if all data is accessed >>"equally" It was before) Now I added 2 parameters which help to control when new logic will be enabled. It means it will work only while heavy reading is going on. For example, we have not equaly distibution of reading blocks: 1 5 1 1 7 < after 10 seconds start eviction process and evicted block #5 1 1 7 1 1 < after 10 seconds start eviction process and here is no eviction In this case eviction will work rarely. And we can use it to control by params: hbase.lru.cache.heavy.eviction.count.limit - set how many times have to run eviction process that start to avoid of putting data to BlockCache hbase.lru.cache.heavy.eviction.bytes.size.limit - set how many bytes have to evicted each time that start to avoid of putting data to BlockCache By default: if 10 in a row was eviction (=100 seconds) evicted more than 10 MB (each time) then we start to skip 50% of data blocks. Thats why this feature will work when the data really can't fit into BlockCache -> eviction rate really work hard and it usually means reading blocks evenly distributed. When heavy evitions process end then new logic off and will put into BlockCache all blocks again. I am not sure that explain the idea very clean. Please get me know if I need provide more information. >>What YCSB workload did you run for which these results you've shared It was Workload C: Read only >>Also, how much data did you generate and how does that relate to the total >>blockcache size? 
It was 600 Gb in HFiles Total BlockCache Size 48 Gb was (Author: pustota): [~elserj] >>Can you please update the title/description so your improvement is a little >>more clear? Of course, done >>As I understand it: you choose to not cache a block which we would have That is correct >>At least, your solution would have a higher benefit if all data is accessed >>"equally" It was before) Now I added 2 parameters which help to control when new logic will be enabled. It means it will work only while heavy reading is going on. For example, we have not equaly distibution of reading blocks: 1 5 1 1 7 < after 10 seconds start eviction process and evicted block #5 1 1 7 1 1 < after 10 seconds start eviction process and here is no eviction In this case eviction will work rarely. And we can use it to control by params: hbase.lru.cache.heavy.eviction.count.limit - set how many times have to run eviction process that start to avoid of putting data to BlockCache hbase.lru.cache.heavy.eviction.bytes.size.limit - set how many bytes have to evicted each time that start to avoid of putting data to BlockCache By default: if 10 times (=100 seconds) evicted more than 10 MB (each time) then we start to skip 50% of data blocks. Thats why this feature will work when the data really can't fit into BlockCache -> eviction rate really work hard and it usually means reading blocks evenly distributed. When heavy evitions process end then new logic off and will put into BlockCache all blocks again. I am not sure that explain the idea very clean. Please get me know if I need provide more information. >>What YCSB workload did you run for which these results you've shared It was Workload C: Read only >>Also, how much data did you generate and how does that relate to the total >>blockcache size? 
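To make the two parameters concrete, here is a minimal sketch in Java of how such a gate could behave. This is not the actual HBase patch; the class and method names (HeavyEvictionGate, onEvictionRun, skipCaching) are hypothetical, and only the parameter semantics and defaults (10 runs in a row, each freeing more than 10 MB) come from the comment above.

```java
// Hypothetical sketch, not the HBASE-23887 patch itself: models how
// hbase.lru.cache.heavy.eviction.count.limit and
// hbase.lru.cache.heavy.eviction.bytes.size.limit could gate caching.
public class HeavyEvictionGate {
    private final int countLimit;    // heavy runs in a row before skipping starts
    private final long bytesLimit;   // bytes a run must free to count as heavy
    private int heavyRunsInARow = 0;

    public HeavyEvictionGate(int countLimit, long bytesLimit) {
        this.countLimit = countLimit;
        this.bytesLimit = bytesLimit;
    }

    /** Called once per eviction run with the number of bytes that run freed. */
    public void onEvictionRun(long bytesFreed) {
        if (bytesFreed > bytesLimit) {
            heavyRunsInARow++;       // another heavy run extends the streak
        } else {
            heavyRunsInARow = 0;     // a light run ends it: cache all blocks again
        }
    }

    /** True while the streak of heavy runs has reached the limit. */
    public boolean skipCaching() {
        return heavyRunsInARow >= countLimit;
    }

    public static void main(String[] args) {
        // Defaults from the comment: 10 runs in a row, each freeing > 10 MB.
        HeavyEvictionGate gate = new HeavyEvictionGate(10, 10L * 1024 * 1024);
        for (int i = 0; i < 10; i++) {
            gate.onEvictionRun(20L * 1024 * 1024);   // ten heavy runs
        }
        System.out.println(gate.skipCaching());      // prints: true
        gate.onEvictionRun(1024);                    // one light run resets the streak
        System.out.println(gate.skipCaching());      // prints: false
    }
}
```

Counting consecutive heavy runs, rather than a single spike, is what keeps the skipping logic off during occasional eviction bursts and on only under sustained pressure.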