[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044740#comment-17044740 ]

Danil Lipovoy edited comment on HBASE-23887 at 2/25/20 8:48 PM:
----------------------------------------------------------------

[~reidchan]

>>Then the naming hbase.lru.cache.data.block.percent is not good, it is 
>>confusing.

Ok, what would be better?

>>One case: if client happens to read 198 many times, then 198 will never get 
>>cached because of the rule.

That is a good point! We can easily provide a special condition for such extreme 
cases - just skip the new logic for IN_MEMORY tables:
{code:java}
if (cacheDataBlockPercent != 100 && buf.getBlockType().isData() && !inMemory)
{code}
If someone reads really often (in our case a block is evicted within a few seconds 
because of the heavy read load) and needs the data fully cached, they can use this option.
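
For reference, a minimal sketch of how the adjusted check could sit at the top of cacheBlock (mirroring the snippet from the issue description below, plus the extra !inMemory guard; the rest of the method is unchanged):
{code:java}
// Sketch only: skip the probabilistic filter for IN_MEMORY column families,
// so heavily read IN_MEMORY tables keep the current cache-everything behaviour.
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
  if (cacheDataBlockPercent != 100 && buf.getBlockType().isData() && !inMemory) {
    if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
      return; // do not cache this data block
    }
  }
  // ... the rest of cacheBlock stays the same as today
}
{code}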

>>Another one extreme case: all offset last 2 digits just between [00, 85] 
>>(assuming 85 set), then BC will cache them all anyway...

Please clarify: do you mean that the block offsets might not be evenly distributed?
If that is the point, I could collect statistics over millions of requests and we 
would know for sure.
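
For example, here is a small sketch (not part of the patch) of how such statistics could be gathered: record the offsets of the requested blocks and build a histogram of offset % 100, the same expression the patch uses to decide whether to cache. The source of the observed offsets is hypothetical; in practice they could be logged on the read path.
{code:java}
import java.util.List;

public final class OffsetDistribution {
  /** Histogram of the last two decimal digits of the observed block offsets. */
  public static long[] histogram(List<Long> observedOffsets) {
    long[] buckets = new long[100];
    for (long offset : observedOffsets) {
      buckets[(int) (offset % 100)]++; // bucket by the value used in the caching check
    }
    return buckets;
  }
}
{code}
If the buckets come out roughly equal, the offsets are effectively uniform modulo 100 and the percent-based filter caches about the configured share of blocks.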

>>YCSB is a good benchmark tool,

Agreed, we use this excellent tool for our tests. You are right about random access; 
that is exactly our case in the production environment (about 300 TB of HFiles), 
and that is why we started thinking about how to optimize it.

Now I have loaded 1 billion records into 1 table with 64 regions, all on one RS, 
and read them with a uniform distribution via YCSB. The results are below.

1. Current version:
 [OVERALL], RunTime(ms), 45470
 [OVERALL], Throughput(ops/sec), 21992.522542335606
 [TOTAL_GCS_PS_Scavenge], Count, 8
 [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 740
 [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 1.627446668132835
 [TOTAL_GCS_PS_MarkSweep], Count, 1
 [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 31
 [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.06817681988124037
 [TOTAL_GCs], Count, 9
 [TOTAL_GC_TIME], Time(ms), 771
 [TOTAL_GC_TIME_%], Time(%), 1.6956234880140753
 [READ], Operations, 1000000
 [READ], AverageLatency(us), 882.532253
 [READ], MinLatency(us), 137
 [READ], MaxLatency(us), 599039
 [READ], 95thPercentileLatency(us), 765
 [READ], 99thPercentileLatency(us), 949
 [READ], Return=OK, 1000000
 [CLEANUP], Operations, 40
 [CLEANUP], AverageLatency(us), 1560.15
 [CLEANUP], MinLatency(us), 1
 [CLEANUP], MaxLatency(us), 61599
 [CLEANUP], 95thPercentileLatency(us), 32
 [CLEANUP], 99thPercentileLatency(us), 61599

2. With the feature:
 [OVERALL], RunTime(ms), 34467
 [OVERALL], Throughput(ops/sec), 29013.25905939014
 [TOTAL_GCS_PS_Scavenge], Count, 9
 [TOTAL_GC_TIME_PS_Scavenge], Time(ms), 104
 [TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.3017378942176575
 [TOTAL_GCS_PS_MarkSweep], Count, 1
 [TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 32
 [TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.09284242899004845
 [TOTAL_GCs], Count, 10
 [TOTAL_GC_TIME], Time(ms), 136
 [TOTAL_GC_TIME_%], Time(%), 0.3945803232077059
 [READ], Operations, 1000000
 [READ], AverageLatency(us), 661.87457
 [READ], MinLatency(us), 145
 [READ], MaxLatency(us), 261759
 [READ], 95thPercentileLatency(us), 808
 [READ], 99thPercentileLatency(us), 955
 [READ], Return=OK, 1000000
 [CLEANUP], Operations, 40
 [CLEANUP], AverageLatency(us), 2626.25
 [CLEANUP], MinLatency(us), 1
 [CLEANUP], MaxLatency(us), 104511
 [CLEANUP], 95thPercentileLatency(us), 26
 [CLEANUP], 99thPercentileLatency(us), 104511
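
To summarize the comparison: throughput goes from ~21993 to ~29013 ops/sec (about 32% higher), average read latency drops from ~883 us to ~662 us, and total GC time falls from 771 ms to 136 ms.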



> BlockCache performance improve
> ------------------------------
>
>                 Key: HBASE-23887
>                 URL: https://issues.apache.org/jira/browse/HBASE-23887
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache, Performance
>            Reporter: Danil Lipovoy
>            Priority: Minor
>         Attachments: cmp.png
>
>
> Hi!
> This is my first time here, so please correct me if something is wrong.
> I want to propose a way to improve performance when the data in HFiles is much 
> larger than the BlockCache (a usual story in BigData). The idea is to cache only 
> part of the DATA blocks. This helps because the LruBlockCache starts to work and 
> saves a huge amount of GC. See the picture in the attachment with the test below: 
> requests per second are higher and GC is lower.
>  
> The key point of the code:
> A new parameter, *hbase.lru.cache.data.block.percent*, is added; by default it 
> is 100.
>  
> But if we set it to 0-99, the following logic kicks in:
>  
>  
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData())
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent)
>       return;
>   ...
>   // the same code as usual
> }
> {code}
>  
>  
> Description of the test:
> 4 nodes, E5-2698 v4 @ 2.20GHz, 700 GB of memory.
> 4 RegionServers
> 4 tables x 64 regions x 1.88 GB of data each = 600 GB total (only FAST_DIFF)
> Total BlockCache size = 48 GB (8% of the data in HFiles)
> Random reads in 20 threads
>  
> I am going to open a Pull Request; I hope this is the right way to make a 
> contribution to this cool product.
>  


