[ 
https://issues.apache.org/jira/browse/PHOENIX-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16630611#comment-16630611
 ] 

Bin Shi commented on PHOENIX-4594:
----------------------------------

-->So in the code above you binary search within the windows and linearly move 
through the windows?

Yes. As long as the guide posts are prefix-encoded, we have to decode and load 
them sequentially. With a moving window, we can at least reduce the memory 
footprint and avoid the cost of decoding/loading guide posts that lie beyond 
the scan ranges. This is a small optimization that we can put on hold or even 
abandon; once we have measured the benefit of using prefix encoding for guide 
posts, we can decide on the right solution and implement it directly.
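
For illustration only, the moving-window idea can be sketched roughly as 
follows. This is not the code from the attached patches. The class name, the 
Encoded type, the use of String keys instead of byte[] row keys, and the 
(sharedLen, suffix) layout are all assumptions made to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;

class GuidePostWindowSketch {
    /** Hypothetical prefix-encoded entry: length shared with the previous key, plus the rest. */
    static final class Encoded {
        final int sharedLen;
        final String suffix;
        Encoded(int sharedLen, String suffix) { this.sharedLen = sharedLen; this.suffix = suffix; }
    }

    /** Prefix-encodes a sorted list of keys (stand-in for the real encoder). */
    static List<Encoded> encode(List<String> sortedKeys) {
        List<Encoded> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int n = 0;
            int max = Math.min(prev.length(), key.length());
            while (n < max && prev.charAt(n) == key.charAt(n)) n++;
            out.add(new Encoded(n, key.substring(n)));
            prev = key;
        }
        return out;
    }

    /** Returns guide posts g with start <= g < stop, decoding one window at a time. */
    static List<String> guidePostsInRange(List<Encoded> encoded, int windowSize,
                                          String start, String stop) {
        List<String> result = new ArrayList<>();
        List<String> window = new ArrayList<>(windowSize);
        String prev = "";
        int i = 0;
        while (i < encoded.size()) {
            window.clear();
            // Decoding is sequential: each key is rebuilt from the previous one.
            while (i < encoded.size() && window.size() < windowSize) {
                Encoded e = encoded.get(i++);
                prev = prev.substring(0, e.sharedLen) + e.suffix;
                window.add(prev);
            }
            // Linear step across windows: skip a window that ends before the range starts.
            if (window.get(window.size() - 1).compareTo(start) < 0) continue;
            // Binary search inside the window for the first key >= start.
            int lo = 0, hi = window.size();
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (window.get(mid).compareTo(start) < 0) lo = mid + 1; else hi = mid;
            }
            for (int j = lo; j < window.size(); j++) {
                // Past the end of the range: stop without decoding the remaining entries.
                if (window.get(j).compareTo(stop) >= 0) return result;
                result.add(window.get(j));
            }
        }
        return result;
    }
}
```

Only one window of decoded keys is held in memory at a time, and decoding 
stops at the first key at or beyond the end of the range.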

I had the same question about "how much the guide posts prefix encoding 
actually helps" when I saw that the guide post info also contains the 
estimated number of rows, the estimated size, and the last update time. What 
is the overall contribution of prefix encoding on the guide posts (keys) to 
reducing the footprint of the guide post info? I'll collect data to measure 
it. Although I think one of the main benefits of a tree-like structure for the 
guide post info is that it can reduce time complexity in ways an array can't, 
we can hold the discussion about the data structure until I provide the 
measurement results :).
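
As a rough idea of what such a measurement compares, here is a toy sketch. It 
is not a measurement of Phoenix itself. It assumes three fixed 8-byte fields 
per guide post (rows, size, timestamp) and one byte to store each shared 
prefix length; the real serialized format may well differ, and all names are 
hypothetical.

```java
import java.util.List;

class PrefixEncodingSavingsSketch {
    // Assumption: rows, size and last update time are each stored as a fixed 8-byte long.
    static final int METADATA_BYTES = 3 * Long.BYTES;

    static int commonPrefixLength(byte[] a, byte[] b) {
        int n = 0;
        int max = Math.min(a.length, b.length);
        while (n < max && a[n] == b[n]) n++;
        return n;
    }

    /** Footprint when every key is stored in full next to its metadata. */
    static long rawSize(List<byte[]> keys) {
        long total = 0;
        for (byte[] key : keys) total += key.length + METADATA_BYTES;
        return total;
    }

    /** Footprint when each key stores only the suffix that differs from the previous key. */
    static long prefixEncodedSize(List<byte[]> sortedKeys) {
        long total = 0;
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int shared = commonPrefixLength(prev, key);
            // Assumption: 1 byte for the shared length, so shared prefixes stay under 256 bytes.
            total += 1 + (key.length - shared) + METADATA_BYTES;
            prev = key;
        }
        return total;
    }

    /** Fraction of the raw footprint saved by prefix-encoding the keys. */
    static double savedFraction(List<byte[]> sortedKeys) {
        long raw = rawSize(sortedKeys);
        return raw == 0 ? 0.0 : 1.0 - (double) prefixEncodedSize(sortedKeys) / raw;
    }
}
```

The per-guide-post metadata is unaffected by the encoding, so the shorter the 
keys are relative to that metadata, the smaller the overall saving.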

 

> Perform binary search on guideposts during query compilation
> ------------------------------------------------------------
>
>                 Key: PHOENIX-4594
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4594
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Bin Shi
>            Priority: Major
>         Attachments: PHOENIX-4594-0913.patch, PHOENIX-4594_0917.patch, 
> PHOENIX-4594_0918.patch
>
>
> If there are many guideposts, performance will suffer during query 
> compilation because we do a linear search of the guideposts to find the 
> intersection with the scan ranges. Instead, in 
> BaseResultIterators.getParallelScans() we should populate an array of 
> guideposts and perform a binary search. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
