mihaibudiu commented on code in PR #3641:
URL: https://github.com/apache/calcite/pull/3641#discussion_r1480422181
##########
core/src/main/java/org/apache/calcite/tools/RelBuilder.java:
##########
@@ -2525,6 +2529,29 @@ private RelBuilder aggregate_(GroupKeyImpl groupKey,
return project(projects.transform((i, name) -> aliasMaybe(field(i),
name)));
}
+ /**
+ * Removed redundant distinct if an input is already unique.
Review Comment:
I would document that this is specifically for aggregates.
And perhaps a better function name would be removeRedundantAggregateDistinct.
##########
core/src/test/resources/org/apache/calcite/test/SqlToRelConverterTest.xml:
##########
@@ -6343,6 +6343,161 @@ LogicalAggregate(group=[{0}], CNT=[COUNT()])
LogicalFilter(condition=[>(ITEM($0, 'N_NATIONKEY'), 5)])
LogicalProject(**=[$0])
LogicalTableScan(table=[[CATALOG, SALES, NATION]])
+]]>
+ </Resource>
+ </TestCase>
+ <TestCase name="testRemoveDistinctIfUnique">
Review Comment:
Where is the corresponding test?
##########
core/src/main/java/org/apache/calcite/tools/RelBuilder.java:
##########
@@ -4902,6 +4929,18 @@ public interface Config {
/** Sets {@link #convertCorrelateToJoin()}. */
Config withConvertCorrelateToJoin(boolean convertCorrelateToJoin);
+
+ /** Whether to save the distinct if we know that the input is
+ * already unique; default true. */
+ @Value.Default
+ default boolean redundantDistinct() {
Review Comment:
This flag is a little unintuitive, since it *inhibits* the optimization
rather than enabling it.
All the other similar flags work the opposite way.
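To make the polarity point concrete, here is a minimal, self-contained sketch. The names `FlagPolarityDemo`, `removeRedundantDistinct` and `keepDistinct` are hypothetical and used only for illustration; this is not the actual `RelBuilder.Config` interface nor the code in this PR. It only shows a flag whose `true` value enables the optimization.

```java
/** Hypothetical sketch of a positively named flag; not Calcite's RelBuilder.Config. */
public class FlagPolarityDemo {
  /** Illustrative config: true means the optimization is enabled. */
  interface Config {
    default boolean removeRedundantDistinct() {
      return true;
    }
  }

  /**
   * Returns whether DISTINCT must be kept on an aggregate call.
   * DISTINCT is dropped only if the flag is on and the input is known to be unique.
   */
  static boolean keepDistinct(Config config, boolean callIsDistinct,
      boolean inputIsUnique) {
    if (!callIsDistinct) {
      return false; // nothing to keep
    }
    return !(config.removeRedundantDistinct() && inputIsUnique);
  }

  public static void main(String[] args) {
    Config enabled = new Config() {};
    Config disabled = new Config() {
      @Override public boolean removeRedundantDistinct() {
        return false;
      }
    };
    System.out.println(keepDistinct(enabled, true, true));  // false: DISTINCT dropped
    System.out.println(keepDistinct(disabled, true, true)); // true: optimization off
    System.out.println(keepDistinct(enabled, true, false)); // true: input not unique
  }
}
```

With this naming, reading `removeRedundantDistinct() == true` tells you the rewrite happens, which matches how I read the other flags in `Config`.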
##########
core/src/test/resources/org/apache/calcite/test/SqlToRelConverterTest.xml:
##########
@@ -6343,6 +6343,161 @@ LogicalAggregate(group=[{0}], CNT=[COUNT()])
LogicalFilter(condition=[>(ITEM($0, 'N_NATIONKEY'), 5)])
LogicalProject(**=[$0])
LogicalTableScan(table=[[CATALOG, SALES, NATION]])
+]]>
+ </Resource>
+ </TestCase>
+ <TestCase name="testRemoveDistinctIfUnique">
+ <Resource name="sql">
+ <![CDATA[SELECT
+ deptno,
+ COUNT(DISTINCT sal) as cds,
+ COUNT(sal) as cs,
+ SUM(DISTINCT sal) AS sds,
+ SUM(sal) AS ss
+FROM (
+ SELECT DISTINCT deptno, sal
+ FROM emp)
+GROUP BY deptno]]>
+ </Resource>
+ <Resource name="plan">
+ <![CDATA[
+LogicalProject(DEPTNO=[$0], CDS=[$1], CS=[$2], SDS=[$3], SS=[$3])
+ LogicalAggregate(group=[{0}], CDS=[COUNT($1)], CS=[COUNT()], SDS=[SUM($1)])
+ LogicalAggregate(group=[{0, 1}])
+ LogicalProject(DEPTNO=[$7], SAL=[$5])
+ LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+]]>
+ </Resource>
+ </TestCase>
+ <TestCase name="testRemoveDistinctIfUnique1">
+ <Resource name="sql">
+ <![CDATA[SELECT
+ deptno,
+ COUNT(DISTINCT sal) as cds,
+ COUNT(sal) as cs,
+ SUM(DISTINCT sal) AS sds,
+ SUM(sal) AS ss
+FROM (
+ SELECT DISTINCT deptno, sal
+ FROM emp)
+GROUP BY deptno]]>
+ </Resource>
+ <Resource name="plan">
+ <![CDATA[
+LogicalProject(DEPTNO=[$0], CDS=[$1], CS=[$2], SDS=[$3], SS=[$3])
+ LogicalAggregate(group=[{0}], CDS=[COUNT($1)], CS=[COUNT()], SDS=[SUM($1)])
+ LogicalAggregate(group=[{0, 1}])
+ LogicalProject(DEPTNO=[$7], SAL=[$5])
+ LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+]]>
+ </Resource>
+ </TestCase>
+ <TestCase name="testRemoveDistinctIfUnique2">
+ <Resource name="sql">
+ <![CDATA[SELECT
+ COUNT(DISTINCT sal) as cds,
+ COUNT(sal) as cs,
+ SUM(DISTINCT sal) AS sds,
+ SUM(sal) AS ss
+FROM (
+ SELECT deptno, 1 as sal
+ FROM emp GROUP BY deptno)GROUP BY deptno
+]]>
+ </Resource>
+ <Resource name="plan">
+ <![CDATA[
+LogicalProject(CDS=[$1], CS=[$2], SDS=[$3], SS=[$3])
+ LogicalAggregate(group=[{0}], CDS=[COUNT($1)], CS=[COUNT()], SDS=[SUM($1)])
+ LogicalProject(DEPTNO=[$0], SAL=[1])
+ LogicalAggregate(group=[{0}])
+ LogicalProject(DEPTNO=[$7])
+ LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+]]>
+ </Resource>
+ </TestCase>
+ <TestCase name="testRemoveDistinctIfUnique3">
+ <Resource name="sql">
+ <![CDATA[SELECT
+ COUNT(DISTINCT sal) as cds,
+ COUNT(sal) as cs,
+ SUM(DISTINCT sal) AS sds,
+ SUM(sal) AS ss
+FROM (
+ SELECT DISTINCT deptno, sal
+ FROM emp)
+]]>
+ </Resource>
+ <Resource name="plan">
+ <![CDATA[
+LogicalAggregate(group=[{}], CDS=[COUNT(DISTINCT $0)], CS=[COUNT()], SDS=[SUM(DISTINCT $0)], SS=[SUM($0)])
+ LogicalProject(SAL=[$1])
+ LogicalAggregate(group=[{0, 1}])
+ LogicalProject(DEPTNO=[$7], SAL=[$5])
+ LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+]]>
+ </Resource>
+ </TestCase>
+ <TestCase name="testRemoveDistinctIfUnique4">
+ <Resource name="sql">
+ <![CDATA[SELECT
+ COUNT(DISTINCT sal) as cds,
+ COUNT(sal) as cs,
+ SUM(DISTINCT sal) AS sds,
+ SUM(sal) AS ss
+FROM (
+ SELECT deptno, sal
+ FROM emp GROUP BY deptno, sal)GROUP BY deptno
+]]>
+ </Resource>
+ <Resource name="plan">
+ <![CDATA[
+LogicalProject(CDS=[$1], CS=[$2], SDS=[$3], SS=[$4])
+ LogicalAggregate(group=[{0}], CDS=[COUNT(DISTINCT $1)], CS=[COUNT()], SDS=[SUM(DISTINCT $1)], SS=[SUM($1)])
+ LogicalAggregate(group=[{0, 1}])
+ LogicalProject(DEPTNO=[$7], SAL=[$5])
+ LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+]]>
+ </Resource>
+ </TestCase>
+ <TestCase name="testRemoveDistinctIfUnique5">
+ <Resource name="sql">
+ <![CDATA[SELECT COUNT(DISTINCT empno)
+FROM emp
+]]>
+ </Resource>
+ <Resource name="plan">
+ <![CDATA[
+LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
Review Comment:
Why is the DISTINCT removed from this case?