[ 
https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15763596#comment-15763596
 ] 

Eshcar Hillel commented on HBASE-17339:
---------------------------------------

Thanks all for commenting.
Indeed this optimization would not work in the general case; this was briefly 
discussed in HBASE-16417.
However, we believe that quite often this optimization yields a correct answer 
and therefore should be applied. 
In this Jira we would like to identify the use cases where the optimization 
can *not* be applied and the user should be advised against it (for example, 
when the application is manipulating versions), as well as the complete set of 
conditions under which the optimization can safely be applied.
Hopefully this way we can let applications benefit from reduced latency when 
the results are known to be correct, and let them bypass the optimization when 
correctness cannot be ensured.

@ted_yu: there are multiple options for setting the mixed workload. We wanted 
to balance between the amount of data written in the experiment and the time it 
takes to run it. 95-5 was the optimal point for this. We can try different 
numbers as well.
The full details of the experiments can be found in the report in HBASE-16417. 
I will make a clean report for the current Jira which includes only the 
relevant sections. 



> Scan-Memory-First Optimization
> ------------------------------
>
>                 Key: HBASE-17339
>                 URL: https://issues.apache.org/jira/browse/HBASE-17339
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Eshcar Hillel
>         Attachments: HBASE-17339-V01.patch
>
>
> The current implementation of a get operation (to retrieve values for a 
> specific key) scans through all relevant stores of the region; for each store 
> both memory components (memstore segments) and disk components (hfiles) are 
> scanned in parallel.
> We suggest applying an optimization that speculatively scans memory-only 
> components first and, only if the result is incomplete, scans both memory and 
> disk.
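
To make the description above concrete, here is a minimal sketch of the 
memory-first idea. It is a toy model and not the code in the attached patch: 
the class name, the in-memory maps standing in for memstore segments and 
hfiles, the diskScans counter, and the "enough versions found" completeness 
test are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model; none of these names come from HBase.
class ScanMemoryFirstSketch {
    // key -> (timestamp -> value), ordered newest timestamp first.
    private final Map<String, TreeMap<Long, String>> memory = new HashMap<>();
    private final Map<String, TreeMap<Long, String>> disk = new HashMap<>();

    // Counts how often the fallback had to touch the "disk" component.
    int diskScans = 0;

    private static TreeMap<Long, String> newestFirst() {
        return new TreeMap<>(Comparator.<Long>reverseOrder());
    }

    private static void put(Map<String, TreeMap<Long, String>> store,
                            String key, long timestamp, String value) {
        store.computeIfAbsent(key, k -> newestFirst()).put(timestamp, value);
    }

    void putMemory(String key, long timestamp, String value) {
        put(memory, key, timestamp, value);
    }

    void putDisk(String key, long timestamp, String value) {
        put(disk, key, timestamp, value);
    }

    // Returns up to maxVersions values for key, newest first.
    List<String> get(String key, int maxVersions) {
        // Speculative pass: look at the memory component only.
        TreeMap<Long, String> fromMemory = memory.getOrDefault(key, newestFirst());
        // Assumed completeness test: memory alone holds enough versions.
        // This is only sound if nothing on disk is newer than memory.
        if (fromMemory.size() >= maxVersions) {
            return newest(fromMemory, maxVersions);
        }
        // Fallback: result incomplete, so scan memory and disk together.
        diskScans++;
        TreeMap<Long, String> merged = newestFirst();
        merged.putAll(disk.getOrDefault(key, newestFirst()));
        merged.putAll(fromMemory); // memory wins on equal timestamps
        return newest(merged, maxVersions);
    }

    private static List<String> newest(TreeMap<Long, String> versions, int n) {
        List<String> out = new ArrayList<>();
        for (String value : versions.values()) {
            if (out.size() == n) {
                break;
            }
            out.add(value);
        }
        return out;
    }
}
```

In this model, a get for one version of a key whose newest cell is in memory 
never touches disk, while a get for more versions than memory holds falls back 
to the combined scan. If disk held a cell with a newer timestamp than anything 
in memory (for instance because the application sets timestamps itself), the 
speculative pass would return a stale value, which is the kind of case the 
comment above proposes to exclude.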



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
