[
https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816911#comment-13816911
]
Lars Hofhansl commented on HBASE-9915:
--------------------------------------
Some numbers with Phoenix. 5m rows, 5 long columns, 8-byte rowkeys, FAST_DIFF
encoding, table fully flushed and major compacted, everything in the blockcache.
(Some of the columns are weirdly named; this was a preexisting table that I
mapped into Phoenix with CREATE TABLE.)
||Query||Without Patch||With Patch||
|select count\(*) from "my5"|12.8s|9.7s|
|select count\(*) from "my5" where "3" = 1|23.5s|11.8s|
|select count\(*) from "my5" where "3" > 1|34.8s|15.6s|
|select avg("3") from "my5"|35.6s|17.4s|
|select avg("0"), avg("3") from "my5"|36.5s|20.2s|
|select avg("0"), avg("3") from "my5" where "4" = 1|31.8s|15.4s|
|select avg("0"), avg("3") from "my5" where "4" > 1|46.4s|25.1s|
Note that Phoenix adds a "fake" column to each row (so each row has a known KV
for things like COUNT) and (almost) always uses the ExplicitColumnTracker.
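For a rough sense of the relative gains, the speedup per query can be derived from the timings in the table above. This is only a small sketch; the class and method names are made up for illustration, and the numbers are copied from the table as-is.

```java
import java.util.Locale;

class SpeedupSketch {
    // Timings in seconds, copied from the table above, in row order.
    static final double[] WITHOUT_PATCH = {12.8, 23.5, 34.8, 35.6, 36.5, 31.8, 46.4};
    static final double[] WITH_PATCH    = {9.7, 11.8, 15.6, 17.4, 20.2, 15.4, 25.1};

    // Speedup factor for one table row (0-based): time without patch / time with patch.
    static double speedup(int row) {
        return WITHOUT_PATCH[row] / WITH_PATCH[row];
    }

    public static void main(String[] args) {
        for (int i = 0; i < WITHOUT_PATCH.length; i++) {
            System.out.printf(Locale.ROOT, "row %d: %.2fx%n", i + 1, speedup(i));
        }
    }
}
```

The factors range from about 1.3x for the plain count to about 2.2x for the `"3" > 1` filter.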
> Severe performance bug: isSeeked() in EncodedScannerV2 is always false
> ----------------------------------------------------------------------
>
> Key: HBASE-9915
> URL: https://issues.apache.org/jira/browse/HBASE-9915
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 0.98.0, 0.96.1, 0.94.14
>
> Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk.txt,
> profile.png
>
>
> While debugging why reseek is so slow I found that it is quite broken for
> encoded scanners.
> The problem is this:
> AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner
> was seeked or not. If it was, it checks whether the KV we want to seek to is in
> the current block; if not, it always consults the index blocks again.
> isSeeked() checks the blockBuffer member, which is not used by EncodedScannerV2,
> and thus always returns false, which in turn causes an index lookup for each
> reseek.
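The mechanism described above can be illustrated with a toy model. This is a sketch, not HBase code: SketchScanner, BrokenEncodedScanner, FixedEncodedScanner, the encodedBlock field and the long[] "blocks" are all hypothetical stand-ins, and overriding isSeeked() is shown only as one plausible remedy, not necessarily what the attached patches do.

```java
// Toy model of the reseek path; all class and field names here are hypothetical.
class ReseekSketch {
    // Reseeks to keys 0..keys-1 in order and reports how often the index was consulted.
    static int countLookups(SketchScanner s, int keys) {
        for (long k = 0; k < keys; k++) {
            s.reseekTo(k);
        }
        return s.indexLookups;
    }

    // Three "blocks" of four sorted keys each.
    static long[][] sampleBlocks() {
        return new long[][] { {0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11} };
    }

    public static void main(String[] args) {
        System.out.println("broken: " + countLookups(new BrokenEncodedScanner(sampleBlocks()), 12));
        System.out.println("fixed:  " + countLookups(new FixedEncodedScanner(sampleBlocks()), 12));
    }
}

abstract class SketchScanner {
    protected final long[][] blocks; // sorted keys, split into blocks
    protected long[] blockBuffer;    // the only field the base isSeeked() inspects
    int indexLookups = 0;            // number of times the block index was consulted

    SketchScanner(long[][] blocks) { this.blocks = blocks; }

    boolean isSeeked() { return blockBuffer != null; }

    abstract long[] currentBlock();
    abstract void setCurrentBlock(long[] block);

    void reseekTo(long key) {
        if (isSeeked()) {
            long[] cur = currentBlock();
            if (key >= cur[0] && key <= cur[cur.length - 1]) {
                return; // key is in the current block: no index lookup needed
            }
        }
        indexLookups++; // fall back to the block index
        for (long[] b : blocks) {
            if (key <= b[b.length - 1]) { setCurrentBlock(b); return; }
        }
    }
}

// Models the reported behavior: state lives in encodedBlock, blockBuffer stays null,
// so the inherited isSeeked() never returns true.
class BrokenEncodedScanner extends SketchScanner {
    protected long[] encodedBlock;
    BrokenEncodedScanner(long[][] blocks) { super(blocks); }
    @Override long[] currentBlock() { return encodedBlock; }
    @Override void setCurrentBlock(long[] block) { encodedBlock = block; }
}

// One possible remedy (an assumption): report seeked state from the field actually used.
class FixedEncodedScanner extends BrokenEncodedScanner {
    FixedEncodedScanner(long[][] blocks) { super(blocks); }
    @Override boolean isSeeked() { return encodedBlock != null; }
}
```

In this model the broken scanner consults the index on every one of the 12 reseeks, while the fixed one does so only when it crosses into a new block.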