[
https://issues.apache.org/jira/browse/CALCITE-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082393#comment-17082393
]
Haisheng Yuan commented on CALCITE-3920:
----------------------------------------
Maybe we don't have to implement them together. Simply changing the sort
algorithm may suffice. In Postgres database, when there is a limit in the plan,
it will choose top-N heapsort when there is enough memory, or external disk
merge sort when not enough memory.
> Improve ORDER BY computation in Enumerable convention by exploiting LIMIT
> -------------------------------------------------------------------------
>
> Key: CALCITE-3920
> URL: https://issues.apache.org/jira/browse/CALCITE-3920
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.22.0
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Fix For: 1.23.0
>
>
> There are many use-cases (pagination, top-k) relying on queries with an ORDER
> BY clause followed by a LIMIT.
> At the moment, the two operations are implemented independently the one from
> the other
> in the Enumerable convention. Even when we know that consumer needs only the
> top-10 results the sort operation will try to maintain its entire input
> sorted. The complexity of the sorting operation is O( n ) space and O( nlogn
> ) time, where n is the size of the input.
> By implementing ORDER BY and LIMIT together there are various optimizations
> that can be applied to reduce the space and time complexity of the sorting
> algorithm.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)