[
https://issues.apache.org/jira/browse/PHOENIX-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629817#comment-16629817
]
Lars Hofhansl commented on PHOENIX-4594:
----------------------------------------
Thanks [~Bin Shi], I completely agree that we should tackle these issues in
separate jiras.
So in the code above you binary search within the windows and linearly move
through the windows?
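To make that question concrete, here is a minimal sketch of the scheme as I read it. This is not the code from the patch: the class name `WindowedGuidepostSearch`, the `locate` method, and the representation of each window as an already-decoded `byte[][]` are all assumptions made for illustration. It also uses `Arrays.compareUnsigned` (Java 9+) in place of whatever comparator Phoenix actually uses.

```java
import java.util.Arrays;

public class WindowedGuidepostSearch {
    /**
     * Hypothetical helper: returns {windowIndex, indexInWindow} of the first
     * guidepost >= key, or null if every guidepost is smaller than key.
     * Assumes windows are in key order and each window is sorted.
     */
    public static int[] locate(byte[][][] windows, byte[] key) {
        // Linear pass over the windows.
        for (int w = 0; w < windows.length; w++) {
            byte[][] win = windows[w];
            if (win.length == 0) {
                continue;
            }
            // Skip the whole window if its last guidepost is still below key.
            if (Arrays.compareUnsigned(win[win.length - 1], key) < 0) {
                continue;
            }
            // Binary search inside the window; win[hi] >= key holds throughout.
            int lo = 0, hi = win.length - 1;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (Arrays.compareUnsigned(win[mid], key) < 0) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            return new int[] { w, lo };
        }
        return null;
    }
}
```

With W windows of at most K guideposts each, this costs roughly O(W + log K) comparisons per lookup, so the linear pass over the windows dominates once there are many of them.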
As an aside, do we have an idea about how much the guidepost prefix encoding
actually helps?! The relative effectiveness would increase as the guidepost
width shrinks - as guideposts are more likely to actually have a prefix in
common. (But after all we're recording a guidepost every 300MB right now. Even
if we did 10MB, how likely would consecutive guideposts have a similar prefix?)
I know, I know, it depends on the data :), but I still wonder how useful this
is with practical data, or whether that was a bit of a premature optimization.
If we didn't have that, you could binary search the entire array and be done. (And
think about PHOENIX-4927 :) )
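For illustration, a minimal sketch of what "binary search the entire array" could look like if the guideposts were held as a flat, sorted, undecoded `byte[][]`. The class `GuidepostSearch` and the method `firstAtOrAfter` are hypothetical names, not Phoenix APIs, and `Arrays.compareUnsigned` (Java 9+) stands in for the row-key comparator Phoenix would really use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class GuidepostSearch {
    /**
     * Hypothetical helper: index of the first guidepost >= key,
     * or guideposts.length if all guideposts are smaller than key.
     * Assumes guideposts is sorted in unsigned lexicographic order.
     */
    public static int firstAtOrAfter(byte[][] guideposts, byte[] key) {
        int lo = 0, hi = guideposts.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (Arrays.compareUnsigned(guideposts[mid], key) < 0) {
                lo = mid + 1;   // guideposts[mid] is before key, discard left half
            } else {
                hi = mid;       // guideposts[mid] is a candidate, keep it
            }
        }
        return lo;
    }

    private static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[][] gps = { b("b"), b("d"), b("f"), b("h") };
        // Guideposts intersecting a scan over [c, g) are gps[from..to-1].
        int from = firstAtOrAfter(gps, b("c"));
        int to = firstAtOrAfter(gps, b("g"));
        System.out.println(from + " " + to); // prints "1 3"
    }
}
```

Two such lookups per scan range give the intersecting guideposts in O(log N) comparisons, with no per-window bookkeeping.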
> Perform binary search on guideposts during query compilation
> ------------------------------------------------------------
>
> Key: PHOENIX-4594
> URL: https://issues.apache.org/jira/browse/PHOENIX-4594
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: Bin Shi
> Priority: Major
> Attachments: PHOENIX-4594-0913.patch, PHOENIX-4594_0917.patch,
> PHOENIX-4594_0918.patch
>
>
> If there are many guideposts, performance will suffer during query
> compilation because we do a linear search of the guideposts to find the
> intersection with the scan ranges. Instead, in
> BaseResultIterators.getParallelScans() we should populate an array of
> guideposts and perform a binary search.