[ https://issues.apache.org/jira/browse/CALCITE-5516 ]
fugang deleted comment on CALCITE-5516:
---------------------------------
was (Author: JIRAUSER305412):
For the first example, the main issue lies in the function
{{{}COALESCE(SUM(SAL), 0){}}}. By converting it to
sql复制代码
| |CAST(CASE WHEN COUNT(*) = 0 THEN NULL ELSE SUM(SAL) END / COUNT(*) AS
INTEGER)|
we can avoid such inconsistencies, and this problem is not only present in the
{{{}PostgresqlSqlDialect{}}}, but also in the {{{}MysqlSqlDialect{}}}.
By modifying the {{reduceSum}} method and the {{matches}} method in the
{{{}AggregateReduceFunctionsRule{}}}, it is possible to correct this issue.
However, I am not sure if eliminating the conversion of {{SUM}} to {{SUM0}}
would have any impact on other dialects.
Although Postgresql itself does not support syntax like {{{}t WITHIN
DISTINCT{}}}, Calcite itself is not tied to a specific SQL version or standard,
and I have not found any distinction based on dialects in the validator module,
which I believe is reasonable.
> RelOptRulesTest AggregateReduceFunctionsRule optimized plan not semantically
> equivalent to the original one after conversion
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-5516
> URL: https://issues.apache.org/jira/browse/CALCITE-5516
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.32.0
> Reporter: Pinhan Zhao
> Priority: Blocker
>
> After converting the "plan before" and "plan after" of
> testReduceWithNonTypePredicate
> (https://github.com/apache/calcite/blob/fb340ece8c5e60ecfd6371950f8dcb665c85a712/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java#L6820)
> to PostgreSQL queries, we obtained two queries:
> {code:sql}
> SELECT AVG(SAL), AVG(SAL) FROM EMP{code}
> and
> {code:sql}
> SELECT AVG(SAL), CAST(CASE WHEN COUNT(*) = 0 THEN NULL ELSE
> COALESCE(SUM(SAL), 0) END / COUNT(*) AS INTEGER) FROM EMP{code}
>
> The code we used to convert the plans to PostgreSQL queries is:
> {code:java}
> converter.visitRoot(rel).asStatement().toSqlString(PostgresqlSqlDialect.DEFAULT).getSql().replace('\n',
> ''){code}
>
> However, these queries are not semantically equivalent. With a
> counterexample below:
> {code:sql}
> --------DEPT----------
> CREATE TABLE DEPT (
> DEPTNO INTEGER PRIMARY KEY,
> NAME VARCHAR(20)
> );
> INSERT INTO DEPT VALUES (0,'0');
> INSERT INTO DEPT VALUES (1,'0');
> --------EMP----------
> CREATE TABLE EMP (
> EMPNO INTEGER PRIMARY KEY,
> DEPTNO INTEGER REFERENCES DEPT(DEPTNO),
> ENAME VARCHAR(20),
> JOB VARCHAR(20),
> MGR INTEGER,
> HIREDATE DATE,
> SAL INTEGER,
> COMM INTEGER,
> SLACKER BOOLEAN
> );
> INSERT INTO EMP VALUES (0,0,'1','0',0,'2000-01-01',NULL,0,True);
> INSERT INTO EMP VALUES (-1,1,'0','0',0,'2000-01-01',NULL,0,True);{code}
>
> we can have two queries produce different output:
> {code:java}
> avg | avg
> -----+-----
> |
> (1 row) {code}
> (output of the plan before optimization, with converted to PostgreSQL query)
>
> {code:java}
> avg | int4
> -----+------
> | 0
> (1 row)
> {code}
> (output of the plan after optimization, with converted to PostgreSQL query)
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)