[ https://issues.apache.org/jira/browse/FLINK-39720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-39720:
-----------------------------------
Labels: pull-request-available (was: )
> SubQueryDecorrelator produces incorrect plans for correlated EXISTS with HAVING on aggregate outputs
> ----------------------------------------------------------------------------------------------------
>
> Key: FLINK-39720
> URL: https://issues.apache.org/jira/browse/FLINK-39720
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.20.4, 2.3.0, 2.2.1
> Reporter: lincoln lee
> Assignee: lincoln lee
> Priority: Critical
> Labels: pull-request-available
>
> SubQueryDecorrelator.decorrelateRel(LogicalFilter) reattaches the
> non-correlated remainder of a Filter condition to the rewritten input without
> remapping its RexInputRefs through frame.oldToNewOutputs. When the child
> LogicalAggregate has had correlated columns injected into its group key (which
> shifts the position of aggregate-output fields), surviving HAVING / Filter
> predicates silently point at the wrong column. The resulting plan is
> structurally valid but semantically wrong.
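The missing remap step can be illustrated outside the planner. The following is a minimal sketch in Python, not the Flink or Calcite code: the tuple-based expression tree and the remap_input_refs helper are hypothetical stand-ins for RexNode and a shuttle over RexInputRef, and the old_to_new mapping is assumed from the reproduction in this ticket (SUM(r.e) moving from $1 to $2 once r.d is injected into the group key).

```python
# Hypothetical stand-in for a Rex expression tree (not Calcite's API):
#   ("ref", index)           input field reference, like $1
#   ("lit", value)           literal
#   ("call", op, *operands)  operator call, like >=($1, 3)


def remap_input_refs(expr, old_to_new):
    """Return a copy of expr with every field reference rewritten
    through the old-index -> new-index mapping."""
    kind = expr[0]
    if kind == "ref":
        # A missing key raises KeyError, which surfaces an incomplete
        # mapping instead of silently keeping a stale index.
        return ("ref", old_to_new[expr[1]])
    if kind == "call":
        operands = tuple(remap_input_refs(arg, old_to_new) for arg in expr[2:])
        return ("call", expr[1]) + operands
    return expr  # literals carry no field references


# HAVING SUM(r.e) >= 3 as written against the original aggregate output,
# where $0 = r.f and $1 = SUM(r.e).
having = ("call", ">=", ("ref", 1), ("lit", 3))

# Assumed mapping after r.d is injected into the group key:
# r.f stays at $0, SUM(r.e) moves from $1 to $2.
old_to_new = {0: 0, 1: 2}

remapped = remap_input_refs(having, old_to_new)
print(remapped)  # ('call', '>=', ('ref', 2), ('lit', 3))
```

Reattaching `having` unchanged corresponds to the reported behavior; reattaching `remapped` corresponds to the expected one.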
>
> Reproduction
>
> Schema (matches SubQuerySemiJoinTest): l(a INT, b BIGINT, c VARCHAR),
> r(d INT, e BIGINT, f VARCHAR).
> SELECT * FROM l
> WHERE EXISTS (
>   SELECT 1 FROM r
>   WHERE l.a = r.d        -- correlated WHERE
>   GROUP BY r.f
>   HAVING SUM(r.e) >= 3   -- non-correlated HAVING on aggregate output
> );
> Expected: HAVING applies to the SUM(r.e) column.
> Actual (before fix): HAVING applies to the injected r.d group-key column.
> The predicate is still >=($1, 3), but $1 now refers to r.d, not SUM(r.e),
> so the plan is silently wrong.
> Other shapes that trigger the same drift:
> - Compound HAVING: HAVING SUM(r.e) >= 3 AND MAX(r.e) < 100
> - Mixed agg + COUNT: HAVING SUM(r.e) >= 3 AND COUNT(*) > 1
> - Multiple correlated cols: WHERE l.a = r.d AND l.b = r.e ... HAVING COUNT(r.d) >= 2
>
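To see why every shape listed above drifts the same way, the index shift can be computed directly. This is a rough Python sketch under stated assumptions, not planner code: shifted_outputs is a hypothetical helper, and it assumes that injected correlated columns are placed after the original group keys and before the aggregate calls, which is consistent with the ticket's observation that $1 becomes r.d. The compound HAVING shape is assumed to keep the reproduction's GROUP BY r.f.

```python
def shifted_outputs(group_keys, injected_keys, agg_calls):
    """Model the aggregate's output row before and after correlated
    columns are injected into the group key, and derive the
    old-index -> new-index mapping for the surviving fields."""
    old_fields = list(group_keys) + list(agg_calls)
    # Assumption: injected keys land right after the original group keys.
    new_fields = list(group_keys) + list(injected_keys) + list(agg_calls)
    n_groups = len(group_keys)
    shift = len(injected_keys)
    old_to_new = {
        i: (i if i < n_groups else i + shift)
        for i in range(len(old_fields))
    }
    return old_fields, new_fields, old_to_new


# Compound HAVING shape: HAVING SUM(r.e) >= 3 AND MAX(r.e) < 100
old_fields, new_fields, old_to_new = shifted_outputs(
    group_keys=["r.f"],
    injected_keys=["r.d"],
    agg_calls=["SUM(r.e)", "MAX(r.e)"],
)

# With the mapping applied, each old index still names the same field.
for old_index, new_index in old_to_new.items():
    assert new_fields[new_index] == old_fields[old_index]

# Without it, the stale index $1 lands on the injected group key.
stale_target = new_fields[1]
print(old_to_new)    # {0: 0, 1: 2, 2: 3}
print(stale_target)  # r.d
```

Under this assumed layout, every aggregate output moves right by the number of injected columns, so any predicate on an aggregate output that is reattached without remapping reads a different field than the one it was written against.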
--
This message was sent by Atlassian Jira
(v8.20.10#820010)