[ 
https://issues.apache.org/jira/browse/HBASE-16225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381937#comment-15381937
 ] 

Lars Hofhansl commented on HBASE-16225:
---------------------------------------

Here is another possible change for performance:
The observation is that in the vast majority of cases consecutive cells 
come from the same StoreFileScanner or StoreScanner, so comparing every cell 
to the next is wasteful. So in theory, in KeyValueHeap we could read a set of 
cells from each of the underlying scanners and then compare only the first and 
last keys. If we can sort the sets that way, we are done (only two compares 
per set); if the sets overlap, we could halve the set and try again, or simply 
fall back to comparing each cell.
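
A minimal sketch of the idea, not actual HBase code: the class
BatchMergeSketch, its merge method, and the compares counter are all
hypothetical names, and plain sorted lists stand in for the sets of cells read
from two scanners. It tries the boundary compares first and falls back to a
cell-by-cell merge on overlap (the "halve and try again" variant is left out):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class BatchMergeSketch {
    // Counts comparator calls so the saving on the fast path is visible.
    static int compares = 0;

    // Merges two batches that are each already sorted according to cmp.
    static <T> List<T> merge(List<T> a, List<T> b, Comparator<? super T> cmp) {
        if (a.isEmpty()) return new ArrayList<>(b);
        if (b.isEmpty()) return new ArrayList<>(a);
        List<T> out = new ArrayList<>(a.size() + b.size());

        // Fast path 1: all of a sorts before (or equal to) all of b.
        compares++;
        if (cmp.compare(a.get(a.size() - 1), b.get(0)) <= 0) {
            out.addAll(a);
            out.addAll(b);
            return out;
        }
        // Fast path 2: all of b sorts strictly before all of a.
        compares++;
        if (cmp.compare(b.get(b.size() - 1), a.get(0)) < 0) {
            out.addAll(b);
            out.addAll(a);
            return out;
        }

        // The batches overlap: fall back to comparing cell by cell.
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            compares++;
            if (cmp.compare(a.get(i), b.get(j)) <= 0) {
                out.add(a.get(i++));
            } else {
                out.add(b.get(j++));
            }
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }
}
```

With non-overlapping batches this costs at most two compares regardless of
batch size; with overlapping batches it pays those two compares on top of the
usual merge, so the gain depends on how rarely the sets overlap.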


> Refactor ScanQueryMatcher
> -------------------------
>
>                 Key: HBASE-16225
>                 URL: https://issues.apache.org/jira/browse/HBASE-16225
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> As said in HBASE-16223, the code of {{ScanQueryMatcher}} is too complicated. 
> I suggest that we extract an interface and implement several subclasses 
> that separate the different logic into different implementations. For example, 
> the requirements of compaction and user scan are different, yet today we need 
> to consider the user-scan logic even if we only want to add logic for 
> compaction. And at the least, the raw scan does not need a query matcher... we 
> can implement a dummy query matcher for it.
> Suggestions are welcome. Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
