[
https://issues.apache.org/jira/browse/CALCITE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18035281#comment-18035281
]
Ruben Q L edited comment on CALCITE-7266 at 11/4/25 10:04 AM:
--------------------------------------------------------------
[~suibianwanwan33], since you handled the original bug fix, you probably know
this topic better than I do; I'd like your opinion on this one, please.
I created a draft here: https://github.com/apache/calcite/pull/4614
A few test plans would need adjustment.
> Optimize the "well-known count bug" fix
> ---------------------------------------
>
> Key: CALCITE-7266
> URL: https://issues.apache.org/jira/browse/CALCITE-7266
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: Ruben Q L
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.42.0
>
>
> CALCITE-7010 fixed the "well-known count bug" on the RelDecorrelator.
> As shown
> [here|https://github.com/apache/calcite/blob/8b5c17e51e0c9c3f8e3db17c8d449e67e4e2974a/core/src/main/java/org/apache/calcite/sql2rel/RelDecorrelator.java#L819],
> the root cause of this bug is a misalignment when no match is found: the
> original (correlated) plan returns NULL (or 0 for COUNT) when no match is
> found, whereas the (bugged) decorrelated plan returned an empty result set
> when no match is found.
> This has been extensively explained in the original Jira issue, the linked
> papers, the PR and the comments in the new code introduced by CALCITE-7010.
> The fix for this issue relied on introducing an extra join in order to avoid
> "missing" any result in the decorrelated plan.
> I'd like to explore the possibility of optimizing this fix.
> Specifically, I'd like to discuss the situation where the original Correlate
> is of type LEFT. The examples introduced in CALCITE-7010 were all about INNER
> Correlates (which become INNER Joins); however, I wonder whether LEFT
> Correlates could be handled differently. I'd argue that when we have a LEFT
> Correlate (and no COUNT in the Aggregate) we do not need the extra Join
> introduced by rewriteScalarAggregate, and the pre-bugfix decorrelated plan
> was just fine. The reason is that, in this scenario, returning NULL on a
> mismatch and returning an empty set on a mismatch are effectively the same:
> due to the nature of the LEFT type, an empty set results in NULL values
> being populated on the RHS, which is precisely what the original plan was
> doing.
> Maybe I'm missing something... but I wanted to open the discussion to see if
> we can optimize this fix and avoid (if possible in certain scenarios) the
> extra join, which can be quite expensive depending on the data that the query
> is handling.
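
The argument in the description can be illustrated with a toy model. The
sketch below is not Calcite code and does not use any Calcite API: the tables
(DEPTS, EMPS), the helper names and the join simulation are all hypothetical,
chosen only to mimic a scalar aggregate under a Correlate and its pre-bugfix
decorrelated form (aggregate grouped by the correlation key, then joined,
without the extra join). Under those assumptions it suggests that the LEFT
case with SUM matches the correlated result, while INNER and COUNT do not.

```python
from collections import defaultdict

# Hypothetical toy data (assumption, not from the issue):
# left input = department numbers, right input = (deptno, sal) rows.
DEPTS = [10, 20, 30]
EMPS = [(10, 100), (10, 200), (20, 50)]  # deptno 30 has no matching row


def agg_sum(values):
    """SQL-style SUM: returns NULL (None) on empty input."""
    return sum(values) if values else None


def agg_count(values):
    """SQL-style COUNT: returns 0 on empty input."""
    return len(values)


def correlated(agg):
    """Reference semantics: evaluate the scalar aggregate once per left row.

    A scalar aggregate always yields exactly one row, so an INNER and a LEFT
    Correlate produce the same output here.
    """
    return [(d, agg([sal for dept, sal in EMPS if dept == d])) for d in DEPTS]


def decorrelated_without_extra_join(agg, join_type):
    """Pre-bugfix shape: GROUP BY the correlation key, then join on it."""
    groups = defaultdict(list)
    for dept, sal in EMPS:
        groups[dept].append(sal)
    aggregated = {dept: agg(vals) for dept, vals in groups.items()}

    out = []
    for d in DEPTS:
        if d in aggregated:
            out.append((d, aggregated[d]))
        elif join_type == "LEFT":
            out.append((d, None))  # LEFT join pads the RHS with NULL
        # INNER join: the unmatched left row is dropped
    return out


if __name__ == "__main__":
    print("correlated SUM:   ", correlated(agg_sum))
    print("INNER, no fix SUM:", decorrelated_without_extra_join(agg_sum, "INNER"))
    print("LEFT, no fix SUM: ", decorrelated_without_extra_join(agg_sum, "LEFT"))
    print("correlated COUNT: ", correlated(agg_count))
    print("LEFT, no fix COUNT:", decorrelated_without_extra_join(agg_count, "LEFT"))
```

In this model the INNER variant loses the row for deptno 30 (the count bug),
the LEFT variant with SUM reproduces the correlated result, and the LEFT
variant with COUNT yields NULL where the correlated plan yields 0, which is
why the description excludes COUNT. Real plans may contain further operators
on top of the Aggregate that this sketch does not cover.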