[
https://issues.apache.org/jira/browse/CALCITE-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012088#comment-18012088
]
Zhen Chen commented on CALCITE-7125:
------------------------------------
I think it's OK to use the old version as the default rule. If not, the issue
described in this Jira ticket won't be resolved. The affected rules can be
adapted to the implementation in CALCITE-6893 later. If possible, please list
the rules that cannot be combined with CALCITE-6893 and create a new Jira
ticket so that someone can fix them.
> Impossible to get a plan with partial aggregate push-down via
> IntersectToDistinctRule
> -------------------------------------------------------------------------------------
>
> Key: CALCITE-7125
> URL: https://issues.apache.org/jira/browse/CALCITE-7125
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.40.0
> Reporter: Alessandro Solimando
> Assignee: Alessandro Solimando
> Priority: Major
> Labels: breaking_change
> Fix For: 1.41.0
>
>
> CALCITE-6893 changed _IntersectToDistinctRule_ to rewrite "INTERSECT
> DISTINCT" queries using a tagging plus filtering strategy rather than pushing
> aggregates into the "UNION" inputs.
> The rationale was to let other rules (_AGGREGATE_UNION_TRANSPOSE_ is mentioned
> in the ticket description) produce the plan with partial aggregation
> push-down, and let the CBO make the choice.
> Consider the following query:
> {noformat}
> SELECT empno FROM emp
> INTERSECT
> SELECT empno FROM emp
> {noformat}
> _IntersectToDistinctRule_, as per CALCITE-6893, would rewrite it as follows:
> {noformat}
> LogicalProject(EMPNO=[$0])
>   LogicalFilter(condition=[AND(>($1, 0), >($2, 0))])
>     LogicalAggregate(group=[{0}], count_i0=[COUNT() FILTER (WHERE i = 0)], count_i1=[COUNT() FILTER (WHERE i = 1)])
>       LogicalUnion(all=[true])
>         LogicalProject(EMPNO=[$0], i=[0]) // "tag" i=0
>           TableScan(emp)
>         LogicalProject(EMPNO=[$0], i=[1]) // "tag" i=1
>           TableScan(emp)
> {noformat}
> This structure introduces:
> - A tagging column "i" added via projection
> - A Project wrapping each input (two in this example)
> - A single Aggregate with COUNT() FILTER expressions over those tags
> - A final filter requiring count_iN > 0, for N in "0..#operands-1"
> With this rewrite, the planner:
> - Can no longer push aggregates into the union inputs
> - Cannot apply rules like
> _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_ or
> _AGGREGATE_PROJECT_MERGE_/_AggregateProjectMergeRule_
> - Fails to recover the earlier, sometimes more efficient, plan shape:
> {noformat}
> LogicalAggregate(group=[{0}])
>   LogicalUnion(all=[true])
>     LogicalAggregate(group=[{0}])
>       ...
>     LogicalAggregate(group=[{0}])
>       ...
> {noformat}
> This prevents early row reduction and downstream rule composition, even when
> all relevant transformation rules are enabled.
> Prior to CALCITE-6893, the same query with the same planner configuration
> produced the optimized form above. That plan is no longer reachable with the
> current _IntersectToDistinctRule_, and no combination of core rules
> (including _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_,
> _PROJECT_SET_OP_TRANSPOSE_/_ProjectSetOpTransposeRule_, etc.) can recover it.
> Unless I am mistaken and someone can suggest further rules that can lead to
> the same plan as before, this seems like a regression.
> My understanding is that the original rewrite proposed in CALCITE-6893 was
> supposed to be rewritable (by means of the
> _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_ rule). That rewrite
> was incorrect and the currently implemented form was suggested instead, but
> it was never discussed whether the "new" plan shape (post CALCITE-6893) could
> be transformed into the "old" plan shape (pre CALCITE-6893).
> The proposal for this ticket is to keep both behaviors, controllable via a
> rule configuration flag, with the pre-CALCITE-6893 one as the default.
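
To make the difference between the two plan shapes quoted above concrete, here is a minimal Python sketch. It is not Calcite code: the function names are hypothetical, rows are plain integers standing in for EMPNO, and each function simply mimics one plan shape on in-memory lists. For the pre-CALCITE-6893 shape, the quoted plan elides the details, so the sketch assumes the outer aggregate counts how many inputs contributed each row and keeps rows seen in every input.

```python
from collections import Counter


def intersect_via_tagging(inputs):
    """Mimics the post-CALCITE-6893 shape (tag, UNION ALL, filtered counts)."""
    # Project + Union(all=true): tag every row with its input index i.
    union_all = [(row, i) for i, rows in enumerate(inputs) for row in rows]
    # Aggregate(group by row): one counter per tag, like COUNT() FILTER (WHERE i = k).
    counts = {}
    for row, i in union_all:
        counts.setdefault(row, [0] * len(inputs))[i] += 1
    # Filter: every per-tag count must be > 0; Project: keep only the row.
    result = sorted(row for row, c in counts.items() if all(n > 0 for n in c))
    return result, len(union_all)


def intersect_via_partial_aggregates(inputs):
    """Mimics the pre-CALCITE-6893 shape (Aggregate below and above the Union).

    Assumption: the outer aggregate keeps rows contributed by every input.
    """
    # Inner Aggregate(group by row) per input: duplicates are removed early.
    partials = [set(rows) for rows in inputs]
    # Union(all=true) over the already-reduced inputs.
    union_all = [row for part in partials for row in part]
    # Outer Aggregate: count the inputs that contributed each row.
    hits = Counter(union_all)
    result = sorted(row for row, n in hits.items() if n == len(inputs))
    return result, len(union_all)


if __name__ == "__main__":
    emp_a = [1, 1, 2, 3]
    emp_b = [1, 3, 3, 4]
    print(intersect_via_tagging([emp_a, emp_b]))
    print(intersect_via_partial_aggregates([emp_a, emp_b]))
```

Both functions return the same intersection, but the second value differs: on this toy data the tagging shape sends all 8 input rows through the union, while the partial-aggregate shape sends only the 6 deduplicated ones. That is the "early row reduction" the description refers to; whether it pays off on real data is a cost-based decision.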