[ https://issues.apache.org/jira/browse/PHOENIX-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16630611#comment-16630611 ]
Bin Shi commented on PHOENIX-4594:
----------------------------------
-->So in the code above you binary search within the windows and linearly move
through the windows?
Yes. As long as the guide posts are prefix encoded, we have to decode and load
them sequentially. With a moving window we can at least reduce the memory
footprint and avoid the cost of decoding/loading guide posts that lie beyond
the scan ranges. This is a small optimization that we can postpone or even
abandon; we can decide on the right solution after measuring the benefit of
prefix encoding for guide posts, and then implement that solution directly.
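To make the moving-window idea concrete, here is a minimal sketch, not the
code from the patch. It assumes guide post keys are plain Strings rather than
byte[] row keys, and it models sequential prefix decoding as an Iterator that
can only hand out keys in order. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical sketch of a moving-window search; not Phoenix code. */
final class GuidePostWindowSearch {

    /**
     * Returns the global index of the first guide post >= startKey, or the
     * total number of guide posts if every key is smaller than startKey.
     *
     * decoder stands in for sequential prefix decoding (an assumption):
     * keys arrive in sorted order and cannot be accessed randomly.
     * windowSize must be positive.
     */
    static int firstAtOrAfter(Iterator<String> decoder, int windowSize, String startKey) {
        String[] window = new String[windowSize]; // only one window is held in memory
        int base = 0;                             // global index of window[0]
        while (decoder.hasNext()) {
            // Decode the next window sequentially.
            int n = 0;
            while (n < windowSize && decoder.hasNext()) {
                window[n++] = decoder.next();
            }
            // Linear step across windows: skip a window whose last key is too small.
            if (window[n - 1].compareTo(startKey) >= 0) {
                // Binary search inside the window that must contain the answer.
                int pos = Arrays.binarySearch(window, 0, n, startKey);
                return base + (pos >= 0 ? pos : -pos - 1);
            }
            base += n;
        }
        return base; // ran out of guide posts
    }
}
```

Because the method returns as soon as the matching window is found, guide
posts after that window are never decoded. A real implementation would run
the same search for the end of each scan range as well.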
I had the same question about how much prefix encoding of the guide posts
actually helps when I saw that the guide post info also contains the estimated
number of rows, the estimated size and the last update time. How much does
prefix encoding of the guide posts (keys) contribute to reducing the overall
footprint of the guide post info? I'll collect data to measure it. Although I
think one of the main benefits of a tree-like structure for the guide post
info is that it can reduce the time complexity in a way an array can't, we can
hold the discussion about the data structure until I provide the measurement
results :).
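As a rough illustration of the kind of measurement meant here, the sketch
below compares raw and prefix-encoded key sizes and then adds the per-guide-post
metadata to both sides. The layout is an assumption for illustration only:
one byte per length field, and three 8-byte longs (row count, byte size,
update time) per guide post. It is not Phoenix's actual on-disk format, and
all names are hypothetical.

```java
import java.util.List;

/** Hypothetical footprint estimate; the byte layout is assumed, not Phoenix's. */
final class PrefixEncodingEstimate {
    static final int LENGTH_FIELD_BYTES = 1;          // assumed size of a length field
    static final int METADATA_BYTES = 3 * Long.BYTES; // assumed: rows, bytes, timestamp

    /** Number of leading bytes that a and b have in common. */
    static int commonPrefixLength(byte[] a, byte[] b) {
        int max = Math.min(a.length, b.length);
        int i = 0;
        while (i < max && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Size of the keys stored as-is: one length field plus the full key. */
    static long rawKeyBytes(List<byte[]> keys) {
        long total = 0;
        for (byte[] k : keys) {
            total += LENGTH_FIELD_BYTES + k.length;
        }
        return total;
    }

    /** Size with prefix encoding: shared length, suffix length, then the suffix. */
    static long prefixEncodedKeyBytes(List<byte[]> keys) {
        long total = 0;
        byte[] prev = new byte[0];
        for (byte[] k : keys) {
            int shared = commonPrefixLength(prev, k);
            total += 2 * LENGTH_FIELD_BYTES + (k.length - shared);
            prev = k;
        }
        return total;
    }

    /** Fraction of the whole guide post info (keys plus metadata) that is saved. */
    static double overallSavings(List<byte[]> keys) {
        long meta = (long) keys.size() * METADATA_BYTES;
        long raw = rawKeyBytes(keys) + meta;
        long encoded = prefixEncodedKeyBytes(keys) + meta;
        return raw == 0 ? 0.0 : 1.0 - (double) encoded / raw;
    }
}
```

Under these assumptions the fixed metadata dilutes whatever the keys save, so
the overall figure depends heavily on how long the keys are and how much
prefix neighbouring keys share. Real numbers have to come from real tables.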
> Perform binary search on guideposts during query compilation
> ------------------------------------------------------------
>
> Key: PHOENIX-4594
> URL: https://issues.apache.org/jira/browse/PHOENIX-4594
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: Bin Shi
> Priority: Major
> Attachments: PHOENIX-4594-0913.patch, PHOENIX-4594_0917.patch,
> PHOENIX-4594_0918.patch
>
>
> If there are many guideposts, performance will suffer during query
> compilation because we do a linear search of the guideposts to find the
> intersection with the scan ranges. Instead, in
> BaseResultIterators.getParallelScans() we should populate an array of
> guideposts and perform a binary search.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)