[ https://issues.apache.org/jira/browse/PHOENIX-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150413#comment-14150413 ]
Lars Hofhansl commented on PHOENIX-1278:
----------------------------------------
Another question: why do we need to merge sort the results of salt bucket
queries? The row ranges of each bucket (and indeed of each region) are always
strictly disjoint from those of all other regions. Even if we want to guarantee
sorted rows to the client (which is another question we should discuss), we
should be able to do that by ordering entire salted chunks with respect to each
other, just by looking at the first row in each chunk. Maybe there are other
scenarios where this is possible.
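
To make the idea concrete, here is a minimal sketch in Python (not Phoenix
code). It assumes each chunk is already sorted internally and that the key
ranges of the chunks really are disjoint in the order being compared. The
function name and the sample keys are made up for illustration.

```python
import heapq
from itertools import chain


def concat_by_first_row(chunks):
    """Order non-empty chunks by their first row, then concatenate them.

    Only valid under the assumption that every chunk is sorted and that
    no two chunks overlap in key range.
    """
    non_empty = [chunk for chunk in chunks if chunk]
    non_empty.sort(key=lambda chunk: chunk[0])  # compares one row per chunk
    return list(chain.from_iterable(non_empty))


# Hypothetical salted row keys: the leading byte stands in for the salt.
chunks = [
    [b"\x01a", b"\x01c"],
    [b"\x00b", b"\x00d"],
    [b"\x02a"],
]

ordered = concat_by_first_row(chunks)
merged = list(heapq.merge(*chunks))  # full k-way merge, for comparison
```

With disjoint ranges both approaches give the same order, but the
concatenation compares only one row per chunk instead of every row. If the
ranges can overlap in the order the client expects, this shortcut does not
apply and a real merge is still needed.
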
> Performance degradation for salted tables with guideposts
> ---------------------------------------------------------
>
> Key: PHOENIX-1278
> URL: https://issues.apache.org/jira/browse/PHOENIX-1278
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: Anoop Sam John
>
> When a table is salted, we're seeing a degradation in performance using our
> new guidepost-based parallelization. With salted tables, we do a merge sort
> with the results from all the parallel scans. I suspect the cause here is
> that we're doing a merge sort now between more chunks than before (since we
> chunk everything up more now than we used to). We should group the scans
> we're doing for the same bucket together and do a concat with those results
> and then do a merge sort only with the concatenated batches.
> Please revert PHOENIX-1279 when we implement this.
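
The grouping proposed in the description could look roughly like the
following Python sketch (not Phoenix code). It assumes each parallel scan
reports its bucket, its start key, and an already sorted list of rows, and
that scans within one bucket do not overlap. The tuple layout and function
name are hypothetical.

```python
import heapq
from collections import defaultdict
from itertools import chain


def merge_grouped_by_bucket(scans):
    """Concatenate scans of the same bucket, then merge sort across buckets.

    scans: iterable of (bucket, start_key, rows). Rows are sorted, and
    scans that share a bucket are assumed to cover non-overlapping ranges.
    """
    by_bucket = defaultdict(list)
    for bucket, start_key, rows in scans:
        by_bucket[bucket].append((start_key, rows))

    batches = []
    for bucket in sorted(by_bucket):
        # Within a bucket, ordering by start key and concatenating is enough.
        parts = sorted(by_bucket[bucket], key=lambda part: part[0])
        batches.append(chain.from_iterable(rows for _, rows in parts))

    # The merge sort now runs over one batch per bucket, not one per scan.
    return list(heapq.merge(*batches))


# Four guidepost-sized scans over two buckets (made-up keys).
scans = [
    (0, "a", ["a", "d"]),
    (1, "b", ["b", "c"]),
    (0, "g", ["g", "k"]),
    (1, "e", ["e", "h"]),
]
result = merge_grouped_by_bucket(scans)
```

In this example the merge is 2-way (one input per bucket) rather than 4-way
(one input per scan), which is the reduction the description is after.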