[ 
https://issues.apache.org/jira/browse/HBASE-9751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793532#comment-13793532
 ] 

Lars Hofhansl commented on HBASE-9751:
--------------------------------------

We're not finding the startRow, per se. We just start scanning at or after the 
row specified; the real startRow just happens to be the row key of the next KV.
In order to find the real stopRow you'd have to seek to the stopRow and then 
scan forward once, and hence incur an extra IO/cache lookup. Might be worth it; 
need to try.
We would also have to think about how we'd handle filter seeks that seek past 
the real stop row; that might need a special check.
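
To make the startRow/stopRow distinction above concrete, here is a toy sketch 
in which a sorted map stands in for the KVs of a store file. RowBounds and its 
methods are hypothetical names, not HBase APIs, and the sketch assumes the 
"real" stopRow means the first row actually present at or after the requested 
stopRow, which is only one reading of the comment.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: a sorted map stands in for the KVs of a store file.
// RowBounds and its methods are hypothetical, not part of HBase.
final class RowBounds {

    // Builds a small sorted "store" whose keys are the given rows.
    static NavigableMap<String, String> storeOf(String... rowKeys) {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (String rowKey : rowKeys) {
            rows.put(rowKey, "value-of-" + rowKey);
        }
        return rows;
    }

    // Scanning starts at or after the requested row, so the real start row
    // is simply the key of the first entry found there (null if none).
    static String realStartRow(NavigableMap<String, String> rows, String startRow) {
        return rows.ceilingKey(startRow);
    }

    // Assumed reading: the real stop row is the first row present at or
    // after the requested stopRow. Finding it is a second, separate lookup,
    // which is the extra IO/cache lookup the comment worries about.
    static String realStopRow(NavigableMap<String, String> rows, String stopRow) {
        return rows.ceilingKey(stopRow);
    }
}
```

With rows b, d and f in the map, a scan requested as [c, e) would have d as 
its real start row and f as its real stop row under this reading.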

I also like Ted's suggestion about just generally passing the readpoint down 
the scanner stack. In an unrelated project we followed an approach of using 
ThreadLocals on not so hot paths and passing down the object on the hot paths 
instead.
We could question the use of ThreadLocals altogether. I think that's what you 
had in mind, Vladimir, right?
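
A minimal sketch of the two styles being compared, using hypothetical names 
(ReadPointDemo, THREAD_READ_POINT, countVisible) rather than HBase's actual 
scanner classes. It assumes a KV is visible when its sequence id is not newer 
than the read point.

```java
// Hypothetical illustration only; this is not HBase's scanner code.
final class ReadPointDemo {

    // Style 1: the read point lives in a ThreadLocal and is fetched on
    // every single check, so each KV costs one ThreadLocal.get().
    static final ThreadLocal<Long> THREAD_READ_POINT =
            ThreadLocal.withInitial(() -> Long.MAX_VALUE);

    static int countVisibleViaThreadLocal(long[] seqIds) {
        int visible = 0;
        for (long seqId : seqIds) {
            if (seqId <= THREAD_READ_POINT.get()) {
                visible++;
            }
        }
        return visible;
    }

    // Style 2: the caller resolves the read point once and passes it down,
    // so the hot loop only compares two longs.
    static int countVisible(long[] seqIds, long readPoint) {
        int visible = 0;
        for (long seqId : seqIds) {
            if (seqId <= readPoint) {
                visible++;
            }
        }
        return visible;
    }
}
```

A caller on a hot path could read THREAD_READ_POINT.get() once when the 
scanner is created and use the second style from then on; both methods return 
the same count for the same read point.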


> Excessive  readpoints checks in StoreFileScanner
> ------------------------------------------------
>
>                 Key: HBASE-9751
>                 URL: https://issues.apache.org/jira/browse/HBASE-9751
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0, 0.94.12, 0.96.0
>            Reporter: Vladimir Rodionov
>            Assignee: Lars Hofhansl
>             Fix For: 0.98.0, 0.94.13, 0.96.1
>
>         Attachments: 9751-0.94.txt, 9751-trunk.txt
>
>
> It seems that usage of skipKVsNewerThanReadpoint in StoreFileScanner can be 
> greatly reduced or even eliminated altogether (HFiles are immutable and no 
> new KVs can be inserted after a scanner instance is created). The same is 
> true for MemStoreScanner, which checks the readpoint on every next() and 
> seek(). Each readpoint check is a ThreadLocal.get() call, and it is quite 
> expensive.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
