[ 
https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15761344#comment-15761344
 ] 

Eshcar Hillel commented on HBASE-17339:
---------------------------------------

Benchmark results presented in HBASE-16417 show that this optimization improves 
the average latency of read operations by roughly 10% in a mixed workload of 
95% puts and 5% gets.
They also report the ratio of memory-only scans to full scans when running 
with the optimization: only 31%-35% of the operations require a full scan. 
The latency improvement follows from a reduction in cache accesses (70% fewer 
accesses) and in cache misses (40%-45% fewer misses). 

This is the patch we used in our experiments. It is intended only to 
demonstrate the potential gain. 
A full solution requires more work. Specifically, we need to make sure it is 
correct. 
(1) The optimization can be applied only when the get operation specifies 
column qualifiers. The condition for skipping the full scan should then be 
replaced with a test verifying that every requested column has a non-empty 
result.
(2) The optimization can be on by default, with the application able to turn 
it off per operation through an optional flag passed in the Scan object.
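
As an illustration only, the check in (1) and the per-operation switch in (2) 
could be sketched as below. This is a self-contained Java toy, not the attached 
patch and not HBase code: the class MemoryFirstGet, its methods, and the two 
maps standing in for the memstore segments and the hfiles are all hypothetical, 
and the model ignores versions, timestamps and deletes.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical toy model of a single row in a single store; not HBase code.
class MemoryFirstGet {
    // Stand-in for the memstore segments: qualifier -> newest value.
    private final Map<String, String> memory = new TreeMap<>();
    // Stand-in for the hfiles: qualifier -> older, already flushed value.
    private final Map<String, String> disk = new TreeMap<>();
    private int fullScans = 0;

    void put(String qualifier, String value) { memory.put(qualifier, value); }

    void flushedToDisk(String qualifier, String value) { disk.put(qualifier, value); }

    int fullScans() { return fullScans; }

    // Memory-first is on by default, mirroring point (2).
    Map<String, String> get(Set<String> qualifiers) { return get(qualifiers, true); }

    // An empty qualifier set means "all columns" in this toy model.
    Map<String, String> get(Set<String> qualifiers, boolean memoryFirst) {
        // Point (1): speculate only when specific qualifiers were requested.
        if (memoryFirst && !qualifiers.isEmpty()) {
            Map<String, String> fromMemory = scan(qualifiers, false);
            // Skip the full scan only if every requested column has a value.
            if (fromMemory.keySet().containsAll(qualifiers)) {
                return fromMemory;
            }
        }
        fullScans++;
        return scan(qualifiers, true);
    }

    private Map<String, String> scan(Set<String> qualifiers, boolean includeDisk) {
        Map<String, String> result = new TreeMap<>();
        if (includeDisk) {
            copyWanted(disk, qualifiers, result);
        }
        // Memory is copied last so that newer values override flushed ones.
        copyWanted(memory, qualifiers, result);
        return result;
    }

    private static void copyWanted(Map<String, String> from, Set<String> qualifiers,
                                   Map<String, String> to) {
        for (Map.Entry<String, String> e : from.entrySet()) {
            if (qualifiers.isEmpty() || qualifiers.contains(e.getKey())) {
                to.put(e.getKey(), e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        MemoryFirstGet store = new MemoryFirstGet();
        store.flushedToDisk("a", "old-a");
        store.flushedToDisk("b", "old-b");
        store.put("a", "new-a");

        System.out.println(store.get(Set.of("a")));       // answered from memory alone
        System.out.println(store.get(Set.of("a", "b")));  // "b" is not in memory: full scan
        System.out.println("full scans: " + store.fullScans());
    }
}
```

In this sketch a get without explicit qualifiers always takes the full-scan 
path, since there is no way to tell from memory alone that the result is 
complete.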

> Scan-Memory-First Optimization
> ------------------------------
>
>                 Key: HBASE-17339
>                 URL: https://issues.apache.org/jira/browse/HBASE-17339
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Eshcar Hillel
>         Attachments: HBASE-17339-V01.patch
>
>
> The current implementation of a get operation (to retrieve values for a 
> specific key) scans through all relevant stores of the region; for each store 
> both memory components (memstore segments) and disk components (hfiles) are 
> scanned in parallel.
> We suggest applying an optimization that speculatively scans memory-only 
> components first and, only if the result is incomplete, scans both memory and 
> disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
