[ 
https://issues.apache.org/jira/browse/CALCITE-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730328#comment-17730328
 ] 

Rong Rong edited comment on CALCITE-5740 at 6/8/23 1:14 AM:
------------------------------------------------------------

I see, yeah, that was a poorly chosen example. You are correct that COUNT(*) 
results can differ if b.key is not unique. However (and this might be my own 
configuration issue), running the following query through the planner
{code:sql}
SELECT  
  col, COUNT(*)
FROM
  a
WHERE
  a.key IN (SELECT key FROM b WHERE val BETWEEN 0 AND 10)
GROUP BY
  col
{code}
with the JOIN_TO_SEMI_JOIN rule still results in an inner JOIN. The plan looks 
like this:
{code}
LogicalAggregate(group=[{1}], EXPR$1=[COUNT()])
  LogicalJoin(condition=[=($0, $2)], joinType=[inner])
    LogicalProject(key=[$0], col=[$1])
      LogicalTableScan(table=[[a]])
    LogicalAggregate(group=[{0}])
      LogicalProject(key=[$1], val=[$4])
        LogicalFilter(condition=[AND(>=($4, 0), <=($4, 10))])
          LogicalTableScan(table=[[b]])
{code}
i.e., it still does not generate a SEMI-JOIN.
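
For reference, here is a minimal sketch of why the de-duplicating step on the RHS matters for COUNT(*). It uses Python's built-in sqlite3 rather than Calcite, and the table contents and the extra `id` column on b are made up for illustration. It compares the IN query against a plain inner join and against an inner join whose RHS is de-duplicated on key, which is roughly what the RHS `LogicalAggregate(group=[{0}])` in the plan above does.

```python
import sqlite3

# In-memory database; the data below is hypothetical and only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (key INTEGER, col TEXT);
    CREATE TABLE b (id INTEGER, key INTEGER, val INTEGER);
    INSERT INTO a VALUES (1, 'x'), (2, 'x'), (3, 'y');
    -- key 1 appears twice in b, so b.key is not unique
    INSERT INTO b VALUES (10, 1, 5), (11, 1, 7), (12, 3, 9), (13, 2, 50);
""")


def counts(sql):
    """Run a two-column query and return {col: count}."""
    return dict(conn.execute(sql).fetchall())


# Original query shape: IN sub-query (semi-join semantics).
in_subquery = counts("""
    SELECT col, COUNT(*) FROM a
    WHERE a.key IN (SELECT key FROM b WHERE val BETWEEN 0 AND 10)
    GROUP BY col
""")

# Plain inner join: duplicate keys in b inflate the count.
plain_join = counts("""
    SELECT a.col, COUNT(*) FROM a
    JOIN b ON a.key = b.key
    WHERE b.val BETWEEN 0 AND 10
    GROUP BY a.col
""")

# Inner join against a de-duplicated RHS: matches the IN query again.
distinct_join = counts("""
    SELECT a.col, COUNT(*) FROM a
    JOIN (SELECT DISTINCT key FROM b WHERE val BETWEEN 0 AND 10) AS d
      ON a.key = d.key
    GROUP BY a.col
""")

print(in_subquery, plain_join, distinct_join)
```

With this data the plain join counts the row with key 1 twice, while the IN form and the DISTINCT form agree. So the inner-join plan above looks semantically fine; the open question is only whether it can be turned into a SEMI-JOIN.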

Is there:
 # some other rule I can use to configure the planner to generate a SEMI-JOIN?
 # some default configuration that will directly generate a SEMI-JOIN when 
going through the SqlToRelConverter?

Thanks!


was (Author: rongr):
I see, yeah, that was a poorly chosen example. You are correct that COUNT(*) 
results can differ if b.key is not unique. However (and this might be my own 
configuration issue), running the following query through the planner
{code:sql}
SELECT  
  col, COUNT(*)
FROM
  a
WHERE
  a.key IN (SELECT key FROM b WHERE val BETWEEN 0 AND 10)
GROUP BY
  col
{code}
with the JOIN_TO_SEMI_JOIN rule still results in an inner JOIN, with the RHS 
becoming
{code:sql}
SELECT DISTINCT key FROM b WHERE val BETWEEN 0 AND 10
{code}
i.e., it still does not generate a SEMI-JOIN.

Is there:
 # some other rule I can use to configure the planner to generate a SEMI-JOIN?
 # some default configuration that will directly generate a SEMI-JOIN when 
going through the SqlToRelConverter?

Thanks!

> Support for AggToSemiJoinRule
> -----------------------------
>
>                 Key: CALCITE-5740
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5740
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Rong Rong
>            Priority: Major
>
> **Description**
> Currently we only have the JoinToSemiJoin and ProjectToSemiJoin rules. The 
> ProjectToSemiJoin rule itself checks whether the project accesses columns 
> from the RHS result.
> This can be extended to Aggregate as well; experimental code: 
> https://github.com/walterddr/calcite/pull/1/files
> **Alternative**
> An alternative is to add a project/calc between the join and the aggregate to 
> activate the project-to-semi-join rule. Please share any other alternative 
> that I haven't considered. 
> Thanks
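
A rough sketch of the kind of check the description proposes extending to Aggregate. This is not Calcite's implementation; the function name and its parameters are hypothetical, and field indices refer to the join's output row (left fields first, then right fields). The idea is that the aggregate can sit on top of a semi-join only if its group keys and aggregate-call arguments never read a field coming from the RHS.

```python
def aggregate_uses_only_left(group_set, agg_call_args, left_field_count):
    """Return True if every join-output field the aggregate reads is a left-input field.

    group_set: indices of the grouping columns.
    agg_call_args: one list of argument indices per aggregate call
                   (COUNT(*) has an empty argument list).
    left_field_count: number of fields contributed by the join's left input.
    """
    referenced = set(group_set)
    for args in agg_call_args:
        referenced.update(args)
    return all(index < left_field_count for index in referenced)


# Plan from the comment: the join output is [a.key, a.col, b.key], so the left
# input contributes 2 fields. The top aggregate groups on {1} and calls COUNT().
eligible = aggregate_uses_only_left({1}, [[]], 2)

# Hypothetical counter-example: grouping on index 2 would read b.key from the RHS.
not_eligible = aggregate_uses_only_left({1, 2}, [[]], 2)

print(eligible, not_eligible)
```

A real rule would presumably also need to confirm that the RHS is unique on the join keys (as the RHS aggregate in the plan guarantees); otherwise replacing the inner join would change COUNT(*) results.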



--
This message was sent by Atlassian Jira
(v8.20.10#820010)