[ 
https://issues.apache.org/jira/browse/PHOENIX-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166128#comment-14166128
 ] 

James Taylor commented on PHOENIX-1267:
---------------------------------------

Our guideposts aren't that granular - they're more in the range of 1/10 of the 
region size, so perhaps 500MB - 1GB by default (am I doing my math right?). So 
I guess we shouldn't decide based on that.

Our "point lookup" is a skip scan. We essentially hop from row to row using 
SEEK_NEXT_HINT to get there. Would this be a good candidate for using small 
scan? Would it depend on how many seeks we're doing? We could figure that out 
in advance.

So that leaves us with the scan case. We can only know we'll scan < N rows if 
we have a LIMIT or if we'll go through our ChunkedResultIterator. So those two 
cases seem good to use a small scan. We can turn small scan off for the 
ChunkedResultIterator case after we hit the end of the first batch (3000 rows by 
default).
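
To make the scan-case rule above concrete, here is a minimal sketch in plain Java. It is not Phoenix or HBase code: the class, method, and parameter names are all hypothetical, and the threshold N (maxSmallRows) is an assumed tunable that this comment does not pin down. Point lookups are left out because the question of whether they should use a small scan is still open.

```java
// Hypothetical sketch only: none of these names exist in Phoenix or HBase.
final class SmallScanHeuristic {
    // Sentinel meaning "the query has no LIMIT clause" (an assumption of this sketch).
    static final long NO_LIMIT = -1L;

    private SmallScanHeuristic() {}

    /**
     * Decide whether a range scan looks like a candidate for a small scan.
     *
     * @param limit            the query's LIMIT, or NO_LIMIT if there is none
     * @param chunked          true if results go through a chunked iterator
     * @param chunkSize        rows per chunk (3000 is the default mentioned above)
     * @param batchesCompleted number of chunks already fully returned
     * @param maxSmallRows     assumed threshold N below which a scan counts as small
     */
    static boolean shouldUseSmallScan(long limit, boolean chunked, long chunkSize,
                                      int batchesCompleted, long maxSmallRows) {
        // Case 1: a LIMIT bounds the row count, so we know it in advance.
        if (limit != NO_LIMIT && limit < maxSmallRows) {
            return true;
        }
        // Case 2: a chunked scan is bounded by the chunk size, but only for the
        // first batch; once that batch ends, the small-scan setting is turned off.
        if (chunked && batchesCompleted == 0 && chunkSize < maxSmallRows) {
            return true;
        }
        // Otherwise the row count is unknown, so fall back to a regular scan.
        return false;
    }
}
```

With an assumed N of 5000, a LIMIT 10 query and the first 3000-row chunk would both qualify, while an unbounded scan or any later chunk would not.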

> Set scan.setSmall(true) when appropriate
> ----------------------------------------
>
>                 Key: PHOENIX-1267
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1267
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: jay wong
>         Attachments: smallscan.patch, smallscan2.patch, smallscan3.patch
>
>
> There's a nice optimization that has been in HBase for a while now to set a 
> scan as "small". This prevents extra RPC calls, I believe. We should add a 
> hint for queries that forces it to be set/not set, and make our best guess on 
> when it should default to true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
