[
https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044740#comment-17044740
]
Danil Lipovoy edited comment on HBASE-23887 at 2/25/20 6:58 PM:
----------------------------------------------------------------
[~reidchan]
>>Then the naming hbase.lru.cache.data.block.percent is not good, it is
>>confusing.
Ok, what would be better?
>>One case: if client happens to read 198 many times, then 198 will never get
>>cached because of the rule.
That is a good point! We can easily provide a special condition for such extreme cases
- just skip the new logic for IN_MEMORY tables:
{code:java}
if (cacheDataBlockPercent != 100 && buf.getBlockType().isData() && !inMemory)
{code}
If someone reads a table really often (in our case blocks are evicted within seconds because
of heavy reads) and needs the data fully cached, they can use this option.
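To make the idea concrete, here is a minimal sketch of how the combined check could look (the surrounding method is the cacheBlock shown in the description below; only the condition itself, with the added !inMemory check, is the actual change, the rest of the placement is an illustration):
{code:java}
// Sketch only: the patch condition plus the IN_MEMORY exception discussed above.
// Placement inside LruBlockCache#cacheBlock is illustrative, not taken verbatim from the patch.
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
  if (cacheDataBlockPercent != 100 && buf.getBlockType().isData() && !inMemory) {
    // Skip a deterministic share of data blocks, keyed by the last two digits of the offset.
    if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
      return;
    }
  }
  // ... the same caching code as usual
}
{code}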
>>Another one extreme case: all offset last 2 digits just between [00, 85]
>>(assuming 85 set), then BC will cache them all anyway...
Please clarify: do you mean that block offsets might not be evenly distributed?
If that is the point, I could collect statistics over millions of requests and we
would know for sure.
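For instance, a hypothetical helper like the one below (not part of the patch) could bucket the observed block offsets by their last two digits to show whether they are evenly spread:
{code:java}
import java.util.Arrays;

// Hypothetical helper, not part of the patch: counts how observed block offsets fall into
// the 100 buckets used by the "offset % 100" rule, to check whether they are evenly spread.
public class OffsetDistribution {
  public static long[] bucketize(long[] observedBlockOffsets) {
    long[] buckets = new long[100];
    for (long offset : observedBlockOffsets) {
      buckets[(int) (offset % 100)]++;
    }
    return buckets;
  }

  public static void main(String[] args) {
    // Synthetic offsets for illustration; real values would be collected from cacheBlock calls.
    long[] offsets = {0, 65536, 131072, 196608, 262144, 327680};
    System.out.println(Arrays.toString(bucketize(offsets)));
  }
}
{code}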
>>YCSB is a good benchmark tool,
Agreed, I use it for my tests. This time I loaded 1 billion records into 1 table with 64 regions,
all on a single RS, and read them with YCSB. The results are below.
1. Current version:
[OVERALL], RunTime(ms), 45470
[OVERALL], Throughput(ops/sec), 21992.522542335606
[TOTAL_GCS_PS_Scavenge], Count, 8
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 740
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 1.627446668132835
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 31
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.06817681988124037
[TOTAL_GCs], Count, 9
[TOTAL_GC_TIME], Time(ms), 771
[TOTAL_GC_TIME_%], Time(%), 1.6956234880140753
[READ], Operations, 1000000
[READ], AverageLatency(us), 882.532253
[READ], MinLatency(us), 137
[READ], MaxLatency(us), 599039
[READ], 95thPercentileLatency(us), 765
[READ], 99thPercentileLatency(us), 949
[READ], Return=OK, 1000000
[CLEANUP], Operations, 40
[CLEANUP], AverageLatency(us), 1560.15
[CLEANUP], MinLatency(us), 1
[CLEANUP], MaxLatency(us), 61599
[CLEANUP], 95thPercentileLatency(us), 32
[CLEANUP], 99thPercentileLatency(us), 61599
2. With the feature:
[OVERALL], RunTime(ms), 34467
[OVERALL], Throughput(ops/sec), 29013.25905939014
[TOTAL_GCS_PS_Scavenge], Count, 9
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 104
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.3017378942176575
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 32
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.09284242899004845
[TOTAL_GCs], Count, 10
[TOTAL_GC_TIME], Time(ms), 136
[TOTAL_GC_TIME_%], Time(%), 0.3945803232077059
[READ], Operations, 1000000
[READ], AverageLatency(us), 661.87457
[READ], MinLatency(us), 145
[READ], MaxLatency(us), 261759
[READ], 95thPercentileLatency(us), 808
[READ], 99thPercentileLatency(us), 955
[READ], Return=OK, 1000000
[CLEANUP], Operations, 40
[CLEANUP], AverageLatency(us), 2626.25
[CLEANUP], MinLatency(us), 1
[CLEANUP], MaxLatency(us), 104511
[CLEANUP], 95thPercentileLatency(us), 26
[CLEANUP], 99thPercentileLatency(us), 104511
> BlockCache performance improve
> ------------------------------
>
> Key: HBASE-23887
> URL: https://issues.apache.org/jira/browse/HBASE-23887
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache, Performance
> Reporter: Danil Lipovoy
> Priority: Minor
> Attachments: cmp.png
>
>
> Hi!
> It is my first time here, so please correct me if something is wrong.
> I want to propose how to improve performance when the data in HFiles is much larger
> than the BlockCache (a usual story in Big Data). The idea is to cache only part of the
> DATA blocks. It helps because the LruBlockCache starts to work properly and a huge
> amount of GC is saved. See the picture in the attachment with the test below. Requests
> per second are higher, GC is lower.
>
> The key point of the code:
> I added the parameter *hbase.lru.cache.data.block.percent*, which is 100 by default.
>
> But if we set it to 0-99, then the following logic applies:
>
>
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData())
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent)
>       return;
>   ...
>   // the same code as usual
> }
> {code}
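>
> As a usage sketch, the parameter could be set programmatically on an HBase Configuration
> (normally it would go into hbase-site.xml on the RegionServers; the value 85 is just an
> illustration, 100 keeps the current behaviour):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
>
> // Sketch only: sets the proposed parameter programmatically.
> // With 85, roughly 15% of data blocks (offset % 100 >= 85) are never cached.
> Configuration conf = HBaseConfiguration.create();
> conf.setInt("hbase.lru.cache.data.block.percent", 85);
> {code}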
>
>
> Description of the test:
> 4 nodes, E5-2698 v4 @ 2.20GHz, 700 GB memory.
> 4 RegionServers
> 4 tables of 64 regions with 1.88 GB of data each = 600 GB total (FAST_DIFF only)
> Total BlockCache size = 48 GB (8% of the data in HFiles)
> Random reads in 20 threads
>
> I am going to make a Pull Request; I hope it is the right way to make a
> contribution to this cool product.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)