[ 
https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942225#comment-15942225
 ] 

Eshcar Hillel commented on HBASE-17339:
---------------------------------------

I am attaching the results of an experiment with a mixed workload, along with 
the most up-to-date patch in case anyone else wants to run their own experiments.
For the lower percentiles the optimization improves read latency by 8-9%; for 
the high percentiles the change ranges from -5% to +5%. 
The experiment ran 100M get operations. Without the optimization this 
translates into 100M (full) scans and ~400M cache accesses, of which ~30M are 
misses.
With the optimization there are only 62M (full) scans (the rest scan only the 
memory for results) and only ~300M cache accesses, but the same number of 
misses, ~30M. 
In another experiment I ran, the hit ratio dropped from 90% with no 
optimization to 80% with the optimization.
If we can reduce the number of misses, we can reduce the read latency in the 
high percentiles as well.
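
As a quick check on the counts reported for this experiment, the sketch below 
works out the hit ratios they imply. The class and method names are made up for 
illustration, and the inputs are the approximate figures quoted above, so the 
results are approximate too: roughly 92.5% without the optimization, roughly 
90% with it, and about 38% of gets served from memory alone. These are for this 
run only and are separate from the 90% to 80% drop seen in the other experiment.

```java
// Illustrative arithmetic only; names are hypothetical, inputs are the
// approximate counts quoted in this comment.
final class CacheHitRatio {
    // Fraction of cache accesses that were hits.
    static double hitRatio(long accesses, long misses) {
        return (double) (accesses - misses) / accesses;
    }

    // Fraction of gets that did not need a full (memory + disk) scan.
    static double memoryOnlyShare(long gets, long fullScans) {
        return (double) (gets - fullScans) / gets;
    }

    public static void main(String[] args) {
        double baseline = hitRatio(400_000_000L, 30_000_000L);
        double optimized = hitRatio(300_000_000L, 30_000_000L);
        double memoryOnly = memoryOnlyShare(100_000_000L, 62_000_000L);
        System.out.println("baseline hit ratio:   " + baseline);
        System.out.println("optimized hit ratio:  " + optimized);
        System.out.println("memory-only share:    " + memoryOnly);
    }
}
```

The miss count stays flat while the access count shrinks, which is why the 
ratio drops even though the cache is doing no more work on misses than before.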

Can we have a different caching policy that reduces misses when reading less 
from the cache? Perhaps TinyLFU (HBASE-15560) can help here [~ben.manes]?

> Scan-Memory-First Optimization for Get Operations
> -------------------------------------------------
>
>                 Key: HBASE-17339
>                 URL: https://issues.apache.org/jira/browse/HBASE-17339
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-17339-V01.patch, HBASE-17339-V02.patch, 
> HBASE-17339-V03.patch, HBASE-17339-V03.patch, HBASE-17339-V04.patch, 
> HBASE-17339-V05.patch, HBASE-17339-V06.patch, read-latency-mixed-workload.jpg
>
>
> The current implementation of a get operation (to retrieve values for a 
> specific key) scans through all relevant stores of the region; for each store 
> both memory components (memstore segments) and disk components (hfiles) are 
> scanned in parallel.
> We suggest applying an optimization that speculatively scans memory-only 
> components first and, only if the result is incomplete, scans both memory and 
> disk.
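
The control flow described in the issue can be sketched roughly as follows. 
This is not the patch and does not use HBase classes: Component, scanAll, get, 
and the isComplete predicate are all hypothetical stand-ins, and the "non-empty 
result" completeness check in main is only a placeholder, since the actual 
completeness criterion is not spelled out here. The sketch also concatenates 
results instead of merging components in parallel as the real scanners do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of a memory-first get; not the HBase implementation.
final class MemoryFirstGetSketch {
    // Stand-in for a store component (a memstore segment or an hfile).
    interface Component {
        List<String> scan(String rowKey);
    }

    // Collects the cells every given component returns for the row.
    static List<String> scanAll(List<Component> components, String rowKey) {
        List<String> cells = new ArrayList<>();
        for (Component c : components) {
            cells.addAll(c.scan(rowKey));
        }
        return cells;
    }

    // Scans memory first; falls back to memory plus disk if incomplete.
    static List<String> get(String rowKey,
                            List<Component> memory,
                            List<Component> disk,
                            Predicate<List<String>> isComplete) {
        List<String> fromMemory = scanAll(memory, rowKey);
        if (isComplete.test(fromMemory)) {
            return fromMemory; // speculation paid off, no disk/cache access
        }
        List<Component> all = new ArrayList<>(memory);
        all.addAll(disk);
        return scanAll(all, rowKey); // full scan, as in the current code path
    }

    public static void main(String[] args) {
        Component memstore = row -> row.equals("r1") ? List.of("r1:v2") : List.of();
        Component hfile = row -> List.of(row + ":v1");
        // Placeholder completeness check (assumption): any result is enough.
        Predicate<List<String>> nonEmpty = cells -> !cells.isEmpty();
        System.out.println(get("r1", List.of(memstore), List.of(hfile), nonEmpty));
        System.out.println(get("r2", List.of(memstore), List.of(hfile), nonEmpty));
    }
}
```

In this toy setup the get for r1 is answered from memory alone, while the get 
for r2 finds nothing in memory and falls back to the full scan.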



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
