[jira] [Commented] (HBASE-10625) Remove unnecessary key compare from AbstractScannerV2.reseekTo

Lars Hofhansl (JIRA) Thu, 27 Feb 2014 23:42:42 -0800

    [ https://issues.apache.org/jira/browse/HBASE-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915531#comment-13915531 ]


Lars Hofhansl commented on HBASE-10625:
---------------------------------------

Final numbers for reference. (I doubled the number of rows and reset 
everything after each run.)

20m rows, 5 columns, 8 byte keys, 10 byte values, no encoding:
||columns||None||C0||C1||C4||C1,C3||C2,C3||C2,C3,C4||
|w/o patch|14.88|14.74|23.57|17.70|34.48|26.06|21.76|
|w/ patch|14.24|14.02|22.09|16.95|32.39|24.86|21.63|

20m rows, 5 columns, 8 byte keys, 10 byte values, FAST_DIFF:
||columns||None||C0||C1||C4||C1,C3||C2,C3||C2,C3,C4||
|w/o patch|22.07|20.59|30.65|23.53|43.25|33.62|28.90|
|w/ patch|22.59|20.22|29.84|22.91|42.43|33.31|28.87|
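
To make the two tables easier to compare, here is a small sketch that computes the relative change per column selection. The numbers are copied from the tables above; the class and method names are made up for illustration, and it assumes that lower numbers are better (the unit is not stated in the comment).

```java
// Illustration only: names are hypothetical, values are copied from the tables above.
class PatchDelta {
    // Relative change in percent; positive means the patched run has the lower number.
    static double improvementPercent(double without, double with) {
        return (without - with) / without * 100.0;
    }

    public static void main(String[] args) {
        String[] cols = {"None", "C0", "C1", "C4", "C1,C3", "C2,C3", "C2,C3,C4"};
        double[] plainWithout = {14.88, 14.74, 23.57, 17.70, 34.48, 26.06, 21.76};
        double[] plainWith    = {14.24, 14.02, 22.09, 16.95, 32.39, 24.86, 21.63};
        double[] diffWithout  = {22.07, 20.59, 30.65, 23.53, 43.25, 33.62, 28.90};
        double[] diffWith     = {22.59, 20.22, 29.84, 22.91, 42.43, 33.31, 28.87};

        System.out.printf("%-10s %12s %12s%n", "columns", "no encoding", "FAST_DIFF");
        for (int i = 0; i < cols.length; i++) {
            System.out.printf("%-10s %11.2f%% %11.2f%%%n",
                cols[i],
                improvementPercent(plainWithout[i], plainWith[i]),
                improvementPercent(diffWithout[i], diffWith[i]));
        }
    }
}
```

A negative value means the patched run came out with the higher number for that column selection.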


> Remove unnecessary key compare from AbstractScannerV2.reseekTo
> --------------------------------------------------------------
>
>                 Key: HBASE-10625
>                 URL: https://issues.apache.org/jira/browse/HBASE-10625
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 10625-0.94.txt, 10625-trunk.txt
>
>
> In reseekTo we find this
> {code}
> ...
>         compared = compareKey(reader.getComparator(), key, offset, length);
>         if (compared < 1) {
>           // If the required key is less than or equal to current key, then
>           // don't do anything.
>           return compared;
>         } else {
>            ...
>            return loadBlockAndSeekToKey(this.block, this.nextIndexedKey,
>               false, key, offset, length, false);
> ...
> {code}
> loadBlockAndSeekToKey already does the right thing when we pass a key that 
> sorts before the current key. It's less efficient than this early check, but 
> in the vast majority of (all?) cases we pass forward keys (as required by the 
> reseek contract). We're optimizing the wrong thing.
> Without the check, scanning with the ExplicitColumnTracker is 20-30% faster.
> (I tested with rows of 5 short KVs, selecting the 2nd and/or 4th column.)
> I propose simply removing that check.
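
The trade-off described above can be sketched with a toy model. This is not the HBase code: a sorted long[] stands in for the keys of a block, the forward seek is a plain linear scan, and every name (ReseekSketch, ToyScanner, reseekWithCheck, reseekNoCheck) is hypothetical. It only illustrates that, when callers pass forward keys, the early compare is extra work, while a backward key is still handled correctly without it.

```java
// Toy model only: a sorted long[] stands in for the keys of a block.
// None of these names are HBase APIs.
class ReseekSketch {
    static final class ToyScanner {
        private final long[] keys; // sorted ascending
        private int pos = 0;       // index of the current key
        int compares = 0;          // number of key comparisons performed

        ToyScanner(long[] sortedKeys) {
            this.keys = sortedKeys;
        }

        long currentKey() {
            return keys[pos];
        }

        // Variant with the early check: compare against the current key first.
        void reseekWithCheck(long key) {
            compares++;
            if (Long.compare(key, keys[pos]) < 1) {
                return; // requested key <= current key: stay put
            }
            seekForward(key);
        }

        // Variant without the early check: go straight to the forward seek.
        void reseekNoCheck(long key) {
            seekForward(key);
        }

        // Advance to the first key >= the requested key, or stop at the last key.
        private void seekForward(long key) {
            while (pos < keys.length - 1) {
                compares++;
                if (keys[pos] >= key) {
                    break;
                }
                pos++;
            }
        }
    }

    public static void main(String[] args) {
        long[] keys = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
        long[] requests = {10, 30, 30, 70, 20}; // the last request goes backward
        ToyScanner checked = new ToyScanner(keys);
        ToyScanner unchecked = new ToyScanner(keys);
        for (long k : requests) {
            checked.reseekWithCheck(k);
            unchecked.reseekNoCheck(k);
        }
        System.out.println("with check:    key=" + checked.currentKey()
            + " compares=" + checked.compares);
        System.out.println("without check: key=" + unchecked.currentKey()
            + " compares=" + unchecked.compares);
    }
}
```

Both variants end on the same key for this request sequence. The variant with the check pays one additional comparison on every forward request and saves nothing on the backward one, since the forward seek stops at its first comparison there.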



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
