[
https://issues.apache.org/jira/browse/CALCITE-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alessandro Solimando updated CALCITE-7144:
------------------------------------------
Component/s: core
> LIMIT should not be pushed through projections containing window functions
> --------------------------------------------------------------------------
>
> Key: CALCITE-7144
> URL: https://issues.apache.org/jira/browse/CALCITE-7144
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.40.0
> Reporter: Arnaud Jegou
> Priority: Major
> Fix For: 1.41.0
>
>
> _This is an issue with the RelBuilder, we have not noticed this behaviour
> with rules_
> When adding a limit/offset with no collation, the RelBuilder will try to push
> it down to a lower sort with no limit or offset, even if there is a project
> in between
> Basically instead of generating the following:
> {code:java}
> LogicalSort(fetch=[10])
> LogicalProject(xxx)
> LogicalSort(sort0=[$1], dir0=[ASC]) {code}
> it will generate:
> {code:java}
> LogicalProject(xxx)
> LogicalSort(fetch=[10], sort0=[$1], dir0=[ASC]) {code}
> This is valid for "regular" projects, but not if the project contains a
> window function as in this case the limit would be applied before the
> aggregation and produce invalid results
> What actually happens in the code is, in
> [RelBuilder.sortLimit:|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3662]
> # if there is [no
> collation|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3687]
> # and the current top node [is a
> project|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3706]
> # and the input of that project [is a
> Sort|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3708]
> # and that sort has [no limit or
> offset|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3710]
> # then the "new" limit and offset are [pushed down into that
> sort|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3711]
> {code:java}
> if (fieldCollations.isEmpty()) { // 1
> // ...
> if (top instanceof Project) { // 2
> final Project project = (Project) top;
> if (project.getInput() instanceof Sort) { // 3
> final Sort sort2 = (Sort) project.getInput();
> if (sort2.offset == null && sort2.fetch == null) { // 4
> final RelNode sort =
> struct.sortFactory.createSort(sort2.getInput(),
> sort2.collation, offsetNode, fetchNode); // 5
> replaceTop(
> struct.projectFactory.createProject(sort,
> project.getHints(),
> project.getProjects(),
> Pair.right(project.getNamedProjects()),
> project.getVariablesSet()));
> return this; {code}
> One possible fix would simply be to check that the project has no window
> function in 3) by replacing the condition with
> {code:java}
> if (!project.containsOver() && project.getInput() instanceof Sort) { {code}
> I am not 100% sure that this is the right fix but it looks correct, and the
> tests do run against a version containing this change without errors
>
> Here is a test case that covers this scenario:
> {code:java}
> /**
> * Test case for
> * <a href="https://issues.apache.org/jira/browse/CALCITE-XXX">[CALCITE-XXX]
> * RelBuilder pushes limits through window functions which is invalid</a>.
> * The test checks that the RelBuilder does not merge a limit through a
> Project
> * that contains a windowed aggregate function into a lower sort. */
> @Test void testLimitOverProjectWithWindowFunctions() {
> final Function<RelBuilder, RelNode> f = b -> b.scan("EMP")
> .sort(1)
> .project(b.field("DEPTNO"),
> b.aggregateCall(SqlStdOperatorTable.ROW_NUMBER)
> .over()
> .partitionBy()
> .orderBy(b.field("EMPNO"))
> .rowsUnbounded()
> .as("x"))
> .limit(0, 10)
> .build();
> final String expected = ""
> + "LogicalSort(fetch=[10])\n"
> + " LogicalProject(DEPTNO=[$7], x=[ROW_NUMBER() OVER (ORDER BY
> $0)])\n"
> + " LogicalSort(sort0=[$1], dir0=[ASC])\n"
> + " LogicalTableScan(table=[[scott, EMP]])\n";
> assertThat(f.apply(createBuilder()), hasTree(expected));
> }{code}
> And I opened a pr to show what the fix would look like, if the proposed fix
> is correct: [https://github.com/apache/calcite/pull/4503] (but feel free to
> discard it and create a new one if need be)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)