[ 
https://issues.apache.org/jira/browse/CALCITE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18035281#comment-18035281
 ] 

Ruben Q L edited comment on CALCITE-7266 at 11/4/25 10:04 AM:
--------------------------------------------------------------

[~suibianwanwan33], since you handled the original bug fix, you are probably 
more of an expert on this topic than I am. I'd like your opinion on this one, 
please.

I created a draft here: https://github.com/apache/calcite/pull/4614
A few test plans would need adjustment.


was (Author: rubenql):
[~suibianwanwan33] , since you handled the original bug fix, probably you're 
more expert than myself on this topic, I'd like your opinion in this one, 
please.

> Optimize the "well-known count bug" fix
> ---------------------------------------
>
>                 Key: CALCITE-7266
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7266
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Ruben Q L
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.42.0
>
>
> CALCITE-7010 fixed the "well-known count bug" on the RelDecorrelator.
> As shown 
> [here|https://github.com/apache/calcite/blob/8b5c17e51e0c9c3f8e3db17c8d449e67e4e2974a/core/src/main/java/org/apache/calcite/sql2rel/RelDecorrelator.java#L819],
>  the root cause of this bug is a mismatch when no match is found: the 
> original (correlated) plan returns NULL (or 0 for COUNT) when no match is 
> found, whereas the (bugged) decorrelated plan returned an empty result set.
> This has been extensively explained in the original Jira, the linked papers, 
> the PR and the comments in the new code introduced by CALCITE-7010.
> The fix for this issue relied on introducing an extra join in order to avoid 
> "missing" any result in the decorrelated plan.
> I'd like to explore the possibility of optimizing this fix.
> Specifically, I'd like to discuss the situation where the original Correlate 
> is of type LEFT. The examples introduced in CALCITE-7010 were all about INNER 
> Correlates (which become INNER Joins); however, I wonder whether LEFT 
> Correlates could be handled differently. I'd argue that when we have a LEFT 
> Correlate (and no COUNT on the Aggregate), we do not need the extra Join 
> introduced by rewriteScalarAggregate, and the pre-bugfix decorrelated plan 
> was just fine. The reason is that, in this scenario, returning NULL on a 
> mismatch and returning an empty set on a mismatch are effectively the same: 
> due to the nature of the LEFT type, an empty set results in NULL values 
> being populated on the RHS, which is precisely what the original plan was 
> doing.
> Maybe I'm missing something... but I wanted to open the discussion to see if 
> we can optimize this fix and avoid (if possible in certain scenarios) the 
> extra join, which can be quite expensive depending on the data that the query 
> is handling.
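
The argument in the description can be checked on toy data. The sketch below is not Calcite code and does not touch RelDecorrelator; the class name, the DEPTS/EMPS data and all method names are hypothetical, made up only for illustration. It evaluates a scalar aggregate per outer row (the correlated form) and compares that with a LEFT join against a grouped aggregate (the simpler decorrelated form, without the extra join). Under these assumptions the two agree for MAX and differ for COUNT on the outer row that has no match.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LeftCorrelateSketch {
    // Hypothetical outer rows (deptno) and inner rows {deptno, sal}.
    // Dept 30 has no inner rows, so it is the "no match" case.
    static final List<Integer> DEPTS = List.of(10, 20, 30);
    static final int[][] EMPS = {{10, 100}, {10, 300}, {20, 200}};

    /** Correlated form: evaluate the scalar aggregate once per outer row. */
    static Map<Integer, Integer> correlated(boolean count) {
        Map<Integer, Integer> out = new LinkedHashMap<>();
        for (int d : DEPTS) {
            // Over an empty input, COUNT yields 0 while MAX yields NULL.
            Integer agg = count ? Integer.valueOf(0) : null;
            for (int[] e : EMPS) {
                if (e[0] != d) {
                    continue;
                }
                if (count) {
                    agg = agg + 1;
                } else {
                    agg = (agg == null) ? e[1] : Math.max(agg, e[1]);
                }
            }
            out.put(d, agg);
        }
        return out;
    }

    /** Grouped aggregate over the inner side: keys without rows are absent. */
    static Map<Integer, Integer> grouped(boolean count) {
        Map<Integer, Integer> out = new LinkedHashMap<>();
        for (int[] e : EMPS) {
            if (count) {
                out.merge(e[0], 1, Integer::sum);
            } else {
                out.merge(e[0], e[1], Integer::max);
            }
        }
        return out;
    }

    /** LEFT join of the outer rows with the grouped aggregate: a miss becomes NULL. */
    static Map<Integer, Integer> leftJoin(Map<Integer, Integer> agg) {
        Map<Integer, Integer> out = new LinkedHashMap<>();
        for (int d : DEPTS) {
            out.put(d, agg.get(d)); // get() returns null when the key is absent
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("MAX   correlated: " + correlated(false));
        System.out.println("MAX   left join : " + leftJoin(grouped(false)));
        System.out.println("COUNT correlated: " + correlated(true));
        System.out.println("COUNT left join : " + leftJoin(grouped(true)));
    }
}
```

In this toy model the LEFT join already supplies the NULL that MAX would return for an empty group, which is the case the description argues does not need the extra join. COUNT is the exception because the correlated form returns 0 where the LEFT join yields NULL.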



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
