[
https://issues.apache.org/jira/browse/CALCITE-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730328#comment-17730328
]
Rong Rong edited comment on CALCITE-5740 at 6/8/23 1:06 AM:
------------------------------------------------------------
I see, yeah, that was a poorly chosen example. You are correct: COUNT(*) results
can be different if b.key is not unique. However, and this might've been my own
configuration issue, running the following query through the planner
{code:sql}
SELECT
col, COUNT(*)
FROM
a
WHERE
a.key IN (SELECT key FROM b WHERE val BETWEEN 0 AND 10)
GROUP BY
col
{code}
with the JOIN_TO_SEMI_JOIN rule still results in an inner JOIN, with the RHS
becoming
{code:sql}
SELECT DISTINCT key FROM b WHERE val BETWEEN 0 AND 10
{code}
i.e. it is still not generating a SEMI-JOIN.
Is there
# some other rule I can use to configure the planner to generate SEMI-JOIN?
# some default configuration that will directly generate a SEMI-JOIN when
going through the SqlToRelConverter?
Thanks!
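For reference, the COUNT(*) point above can be checked outside Calcite with a small sketch. It uses Python's built-in sqlite3 module rather than the Calcite planner, so it only illustrates the SQL semantics, not what any planner rule produces. The table and column names follow the query above; the rows are made up, and GROUP BY col is included so the aggregate is well-defined.

```python
import sqlite3

# Toy data (hypothetical rows): key 1 appears twice in b, so b.key is not unique.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (key INTEGER, col TEXT);
    CREATE TABLE b (key INTEGER, val INTEGER);
    INSERT INTO a VALUES (1, 'x'), (2, 'x'), (3, 'y');
    INSERT INTO b VALUES (1, 5), (1, 7), (2, 3), (3, 50);
""")

def rows(sql):
    """Run a query and return its rows in a stable order."""
    return sorted(conn.execute(sql).fetchall())

# Semi-join semantics: each row of a is counted at most once.
in_subquery = rows("""
    SELECT col, COUNT(*) FROM a
    WHERE a.key IN (SELECT key FROM b WHERE val BETWEEN 0 AND 10)
    GROUP BY col
""")

# Plain inner join: duplicate keys on the RHS inflate the count.
plain_inner_join = rows("""
    SELECT a.col, COUNT(*) FROM a
    JOIN (SELECT key FROM b WHERE val BETWEEN 0 AND 10) AS r
      ON a.key = r.key
    GROUP BY a.col
""")

# Inner join against a DISTINCT RHS: same result as the IN form.
distinct_inner_join = rows("""
    SELECT a.col, COUNT(*) FROM a
    JOIN (SELECT DISTINCT key FROM b WHERE val BETWEEN 0 AND 10) AS r
      ON a.key = r.key
    GROUP BY a.col
""")

print(in_subquery, plain_inner_join, distinct_inner_join)
```

On this data the IN form and the DISTINCT inner join agree, while the plain inner join counts the row with key 1 twice. That is why the inner JOIN with a DISTINCT RHS is a valid plan for the query, even though it is not the SEMI-JOIN being asked about.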
was (Author: rongr):
I see, yeah, that was a poorly chosen example. You are correct: COUNT(*) results
can be different if b.key is not unique. However, and this might've been my own
configuration issue, running the following query through the planner
SELECT col, COUNT(*) FROM a WHERE a.key IN (SELECT key FROM b WHERE val BETWEEN 0
AND 10)
with JOIN_TO_SEMI_JOIN rule will still result in an inner JOIN, with the RHS
table as the result of
SELECT DISTINCT key FROM b WHERE val BETWEEN 0 AND 10
i.e. it is still not generating a SEMI-JOIN.
Is there
# some other rule I can use to configure the planner to generate SEMI-JOIN?
# some default configuration that will directly generate a SEMI-JOIN when
going through the SqlToRelConverter?
Thanks!
> Support for AggToSemiJoinRule
> -----------------------------
>
> Key: CALCITE-5740
> URL: https://issues.apache.org/jira/browse/CALCITE-5740
> Project: Calcite
> Issue Type: New Feature
> Reporter: Rong Rong
> Priority: Major
>
> **Description**
> Currently we only have the JoinToSemiJoin and ProjectToSemiJoin rules. The
> project rule itself performs a check to see whether the project accesses
> columns from the RHS result.
> This can be extended to Aggregate as well, experimental code:
> https://github.com/walterddr/calcite/pull/1/files
> **Alternative**
> An alternative is to add a project/calc between the join and the aggregate to
> activate the project-to-semi-join rule. Please share any other alternative I
> haven't considered.
> Thanks