[ 
https://issues.apache.org/jira/browse/HBASE-9769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796417#comment-13796417
 ] 

Lars Hofhansl commented on HBASE-9769:
--------------------------------------

bq. Lars, HTable can have small number of versions and large number of column 
qualifiers or large values (say 100K). 

That is true. Seeking to the next column is not a good idea, though, if we know 
there are not going to be many versions to skip. So the suggested patch here 
will not be slower than before, and it will improve performance in many cases.

As the size of a KV approaches the HFile block size (64 KB by default), SKIP and 
SEEK_NEXT_COL should become equivalent in performance (in both cases we'll need 
to find the KV in the next block).
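
To make the block-size argument concrete, here is a rough back-of-the-envelope 
sketch. It is toy arithmetic, not HBase code: the class and method names are 
hypothetical, KVs are assumed to have a fixed size, and block headers, encoding 
and compression are ignored.

```java
// Toy arithmetic only; not HBase code. All names here are hypothetical.
class BlockFit {
    // Default HFile block size mentioned above (64 KB).
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024;

    // Approximate number of KVs that share one block. Assumes fixed-size KVs
    // and ignores block headers, encoding and compression. A block is assumed
    // to hold at least one KV, even when that KV exceeds the block size.
    static long kvsPerBlock(long kvSizeBytes, long blockSizeBytes) {
        if (kvSizeBytes <= 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.max(1L, blockSizeBytes / kvSizeBytes);
    }

    public static void main(String[] args) {
        long[] kvSizes = {100L, 1024L, 100L * 1024};
        for (long size : kvSizes) {
            System.out.println(size + " bytes/KV -> ~"
                + kvsPerBlock(size, DEFAULT_BLOCK_SIZE) + " KVs per block");
        }
    }
}
```

With 100-byte KVs roughly 655 of them share a block, so SKIP can step over a 
few versions without leaving the block. With 100K values each block holds a 
single KV, so SKIP and SEEK_NEXT_COL both end up loading the next block.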

As I said, this does not eliminate the NEXT_ROW seeking.

I fear the filter approach will lead to issues when there are already filters 
configured on the scan. You'd have to wrap the existing filter and yours in a 
FilterList while keeping all the semantics and performance characteristics.
I think it might be best to ship your Filter and document its use.

I'll file a separate issue for my patch.


> Improve performance of a Scanner with explicit column list when rows are 
> small/medium size
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9769
>                 URL: https://issues.apache.org/jira/browse/HBASE-9769
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>    Affects Versions: 0.98.0, 0.94.12, 0.96.0
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>         Attachments: 9769-0.94-sample1.txt, 9769-0.94-sample2.txt, 
> 9769-0.94-sample.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)
