[jira] [Commented] (HBASE-9769) Improve performance of a Scanner with explicit column list when rows are small/medium size

Lars Hofhansl (JIRA) Thu, 17 Oct 2013 00:19:20 -0700

    [ https://issues.apache.org/jira/browse/HBASE-9769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797680#comment-13797680 ]


Lars Hofhansl commented on HBASE-9769:
--------------------------------------

I did some profiling to see why reseek() is so much slower than next(), even 
when reseek only has to seek to the next key. The reason is all the compares 
we're doing. For each reseek:
* 2 KV compares in KeyValueHeap.generalizedSeek to find the top scanner
* 2 key compares in HFileReaderV2.ScannerV2.reseekTo (one to check for reseek, 
one to check against the index key)
* 2 key compares in HFileReaderV2.ScannerV2.blockSeek to find the right key

After all that (six compares in total) we finally read the KV we found, while 
next() just reads the next KV from the current HFile block.

Nothing jumps out here as to how we could simplify this.
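
To make the cost difference concrete, here is a minimal toy model in Java. It is not HBase code: ReseekCostSketch, ToyBlockScanner, CountingComparator and nextBlockFirstKey are hypothetical names invented for illustration, keys are plain Strings rather than KVs, and the KeyValueHeap layer is left out entirely. It only mimics the per-file part described above: one compare to check whether a reseek is needed, one against a stand-in index key, and a forward scan inside the block.

```java
import java.util.Comparator;

// Toy model only; none of these classes are HBase code.
class ReseekCostSketch {

    /** Wraps String ordering and tallies every comparison made. */
    static final class CountingComparator implements Comparator<String> {
        long count = 0;

        @Override
        public int compare(String a, String b) {
            count++;
            return a.compareTo(b);
        }
    }

    /** A sorted array of keys standing in for a single HFile block. */
    static final class ToyBlockScanner {
        private final String[] keys;
        private final String nextBlockFirstKey; // hypothetical stand-in for the index key
        private final CountingComparator cmp;
        private int pos = 0;

        ToyBlockScanner(String[] keys, String nextBlockFirstKey, CountingComparator cmp) {
            this.keys = keys;
            this.nextBlockFirstKey = nextBlockFirstKey;
            this.cmp = cmp;
        }

        String current() {
            return pos < keys.length ? keys[pos] : null;
        }

        /** Advance by one key: no comparisons at all. */
        String next() {
            if (pos < keys.length) {
                pos++;
            }
            return current();
        }

        /** Move forward to the first key >= target, comparing at every layer. */
        String reseek(String target) {
            if (pos >= keys.length) {
                return null;
            }
            // Compare 1: are we already at or past the target?
            if (cmp.compare(keys[pos], target) >= 0) {
                return keys[pos];
            }
            // Compare 2: does the target still fall inside this block?
            if (cmp.compare(target, nextBlockFirstKey) >= 0) {
                pos = keys.length; // a real scanner would load another block here
                return null;
            }
            return blockSeek(target);
        }

        /** Scans from the current position; it does not reuse the earlier compare. */
        private String blockSeek(String target) {
            while (pos < keys.length && cmp.compare(keys[pos], target) < 0) {
                pos++;
            }
            return current();
        }
    }

    static String[] makeKeys(int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = String.format("k%06d", i); // zero-padded so keys sort correctly
        }
        return keys;
    }

    /** Walks the whole block with next() and returns the number of compares. */
    static long comparesUsingNext(int n) {
        CountingComparator cmp = new CountingComparator();
        ToyBlockScanner s = new ToyBlockScanner(makeKeys(n), "l", cmp);
        while (s.next() != null) {
            // just advancing
        }
        return cmp.count;
    }

    /** Walks the same block by reseeking to each following key in turn. */
    static long comparesUsingReseek(int n) {
        CountingComparator cmp = new CountingComparator();
        String[] keys = makeKeys(n);
        ToyBlockScanner s = new ToyBlockScanner(keys, "l", cmp);
        for (int i = 1; i < n; i++) {
            s.reseek(keys[i]);
        }
        return cmp.count;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("next():   " + comparesUsingNext(n) + " compares");
        System.out.println("reseek(): " + comparesUsingReseek(n) + " compares");
    }
}
```

In this toy, walking a block with next() costs no compares, while reseeking to each following key costs four compares per call (two in reseek, two in blockSeek). The two heap compares mentioned above would come on top of that in a model that also included multiple scanners.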

> Improve performance of a Scanner with explicit column list when rows are 
> small/medium size
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9769
>                 URL: https://issues.apache.org/jira/browse/HBASE-9769
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>    Affects Versions: 0.98.0, 0.94.12, 0.96.0
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>         Attachments: 9769-0.94-sample1.txt, 9769-0.94-sample2.txt, 
> 9769-0.94-sample.txt, 9769-94.txt, 9769-94-v2.txt, 9769-trunk-v1.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)
