[
https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942225#comment-15942225
]
Eshcar Hillel commented on HBASE-17339:
---------------------------------------
I am attaching the results of an experiment with a mixed workload, along with
the latest patch in case anyone else wants to run their own experiments.
For the lower percentiles the optimization improves read latency by 8-9%; for
the high percentiles the change ranges between -5% and +5%.
The experiment ran 100M get operations. With no optimization this translates
into 100M (full) scans and ~400M cache accesses, of which ~30M are misses.
With the optimization we have only 62M (full) scans (the rest scan only the
memory for results) and only ~300M cache accesses, but the same number of
misses, ~30M.
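
To make the arithmetic behind these figures explicit, here is a small sketch
that derives the cache hit ratio and the share of gets served from memory only.
The class and method names are made up for illustration, and the inputs are the
approximate counts quoted above, so the outputs are approximate too.

```java
import java.util.Locale;

// Hypothetical helper, not part of the patch: it only redoes the arithmetic
// on the approximate counts reported in this comment.
final class CacheStats {

    /** Hit ratio = (accesses - misses) / accesses. */
    static double hitRatio(long accesses, long misses) {
        return (double) (accesses - misses) / accesses;
    }

    /** Fraction of gets that never needed a full (memory + disk) scan. */
    static double memoryOnlyFraction(long gets, long fullScans) {
        return (double) (gets - fullScans) / gets;
    }

    public static void main(String[] args) {
        long gets = 100_000_000L;

        // ~400M accesses, ~30M misses without the optimization
        double baseline = hitRatio(400_000_000L, 30_000_000L);
        // ~300M accesses, ~30M misses with the optimization
        double optimized = hitRatio(300_000_000L, 30_000_000L);
        // 62M of the 100M gets still required a full scan
        double memOnly = memoryOnlyFraction(gets, 62_000_000L);

        System.out.println(String.format(Locale.ROOT,
            "hit ratio: %.3f -> %.3f, memory-only gets: %.2f",
            baseline, optimized, memOnly));
    }
}
```

With these inputs the hit ratio goes from about 92.5% to about 90%: the number
of misses is unchanged while the number of accesses shrinks, so the ratio drops
even though no additional blocks are read from disk.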
In another experiment I ran, the hit ratio dropped from 90% without the
optimization to 80% with it.
If we can reduce the number of misses, we can also reduce the read latency in
the high percentiles.
Can we have a different caching policy that reduces misses when reading less
from the cache? Perhaps TinyLFU (HBASE-15560) can help here [~ben.manes]?
> Scan-Memory-First Optimization for Get Operations
> -------------------------------------------------
>
> Key: HBASE-17339
> URL: https://issues.apache.org/jira/browse/HBASE-17339
> Project: HBase
> Issue Type: Improvement
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-17339-V01.patch, HBASE-17339-V02.patch,
> HBASE-17339-V03.patch, HBASE-17339-V03.patch, HBASE-17339-V04.patch,
> HBASE-17339-V05.patch, HBASE-17339-V06.patch, read-latency-mixed-workload.jpg
>
>
> The current implementation of a get operation (to retrieve values for a
> specific key) scans through all relevant stores of the region; for each store
> both memory components (memstore segments) and disk components (hfiles) are
> scanned in parallel.
> We suggest applying an optimization that speculatively scans the memory-only
> components first and, only if the result is incomplete, scans both memory and
> disk.
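
The speculative strategy in the issue description quoted above can be sketched
roughly as follows. This is a toy model under stated assumptions, not the code
in the attached patches: the class MemoryFirstGet and all of its members are
hypothetical, a store is reduced to two column-to-value maps for a single row,
and versions, timestamps and delete markers are ignored entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical single-row store: one map stands in for the memstore segments,
// the other for the hfiles. Real HBase scanners are far more involved.
final class MemoryFirstGet {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> disk = new HashMap<>();
    int fullScans = 0; // counts gets that had to fall back to memory + disk

    void putInMemory(String column, String value) { memory.put(column, value); }
    void putOnDisk(String column, String value) { disk.put(column, value); }

    /** Try memory only; fall back to a full scan if any column is missing. */
    Map<String, String> get(Set<String> columns) {
        Map<String, String> result = scan(columns, false);
        if (result.size() == columns.size()) {
            return result; // complete result, the disk was never touched
        }
        fullScans++;
        return scan(columns, true); // incomplete: rescan memory and disk
    }

    private Map<String, String> scan(Set<String> columns, boolean includeDisk) {
        Map<String, String> out = new HashMap<>();
        for (String column : columns) {
            String value = memory.get(column); // memory takes precedence
            if (value == null && includeDisk) {
                value = disk.get(column);
            }
            if (value != null) {
                out.put(column, value);
            }
        }
        return out;
    }
}
```

The fallback path does strictly more work than a plain full scan, since memory
is scanned twice, so in this model the speculation only pays off when a large
enough share of gets completes from memory alone.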
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)