Hi community

I found some different behaviors in the SQL's verification phase using
`SqlConformance#isGroupByAlias=true`.
This case is that the alias in the grouping is equal to the name of the
table's column.


For example:
CREATE TABLE test_tbl (
  id integer,
  col1 varchar
);
 id | col1
----+------
  1 | abc
  2 | abb
  3 | abcd
  4 | a
  5 | abc


-- origin sql
SELECT LENGTH(`col1`) AS `col1`, COUNT(*)
FROM `test_tbl`
GROUP BY `col1`

-- converted rel
LogicalAggregate(group=[{0}], EXPR$1=[COUNT()])
  LogicalProject(col1=[LENGTH($1)])
    LogicalTableScan(table=[[default, test_tbl]])

According to rel, we will get results as below:
col1| EXPR$1
----+-------
  3 | 3
  1 | 1
  4 | 1


But, It's different from mysql.

MySQL-Result(MySQL5.7 or MySQL8.0):
+------+----------+
| col1 | COUNT(*) |
+------+----------+
|    1 |        1 |
|    3 |        1 |
|    3 |        2 |
|    4 |        1 |
+------+----------+

The behavior of MySQL is grouping by table[test_tbl]'s column[col1]
firstly, not grouping by LENGTH(`col1`).
And it calc expression of LENGTH(`col1`) after executing aggregate.


I tried this case in PostgreSQL and SQLite, and found that the two
sql-engine's behavior is the same as MySQL.

PostgreSQL-Result(PostgreSQL 13):
 col1 | count
------+-------
    1 |     1
    3 |     1
    4 |     1
    3 |     2

SQLite-Result:(SQLite 3)
 col1 | count
------+-------
    1 |     1
    3 |     1
    4 |     1
    3 |     2


PS:
-- Plan of PostgreSQL:
EXPLAIN SELECT LENGTH("col1") AS "col1", COUNT(*)
FROM "test_tbl"
GROUP BY "col1";
                            QUERY PLAN
-------------------------------------------------------------------
 HashAggregate  (cost=12.10..13.85 rows=140 width=528)
   Group Key: col1
   ->  Seq Scan on test_tbl  (cost=0.00..11.40 rows=140 width=516)
(3 rows)


Best
Xurenhe

Reply via email to