Thanks for Yanjing Wang. It's relevant to the CALCITE-4512: https://issues.apache.org/jira/browse/CALCITE-4512 I do some investigation on some sql-engines, and I hope to help the discussion of CALCITE-4512.
Yanjing Wang <[email protected]> 于2022年3月11日周五 10:30写道: > Hi Xuren, > This jira CALCITE-4512 <https://issues.apache.org/jira/browse/CALCITE-4512 > > > may be similar to your problem, hope helpful for you. > > 徐仁和 <[email protected]> 于2022年3月10日周四 18:14写道: > > > Hi community > > > > I found some different behaviors in the SQL's verification phase using > > `SqlConformance#isGroupByAlias=true`. > > This case is that the alias in the grouping is equal to the name of the > > table's column. > > > > > > For example: > > CREATE TABLE test_tbl ( > > id integer, > > col1 varchar > > ); > > id | col1 > > ----+------ > > 1 | abc > > 2 | abb > > 3 | abcd > > 4 | a > > 5 | abc > > > > > > -- origin sql > > SELECT LENGTH(`col1`) AS `col1`, COUNT(*) > > FROM `test_tbl` > > GROUP BY `col1` > > > > -- converted rel > > LogicalAggregate(group=[{0}], EXPR$1=[COUNT()]) > > LogicalProject(col1=[LENGTH($1)]) > > LogicalTableScan(table=[[default, test_tbl]]) > > > > According to rel, we will get results as below: > > col1| EXPR$1 > > ----+------- > > 3 | 3 > > 1 | 1 > > 4 | 1 > > > > > > But, It's different from mysql. > > > > MySQL-Result(MySQL5.7 or MySQL8.0): > > +------+----------+ > > | col1 | COUNT(*) | > > +------+----------+ > > | 1 | 1 | > > | 3 | 1 | > > | 3 | 2 | > > | 4 | 1 | > > +------+----------+ > > > > The behavior of MySQL is grouping by table[test_tbl]'s column[col1] > > firstly, not grouping by LENGTH(`col1`). > > And it calc expression of LENGTH(`col1`) after executing aggregate. > > > > > > I tried this case in PostgreSQL and SQLite, and found that the two > > sql-engine's behavior is the same as MySQL. > > > > PostgreSQL-Result(PostgreSQL 13): > > col1 | count > > ------+------- > > 1 | 1 > > 3 | 1 > > 4 | 1 > > 3 | 2 > > > > SQLite-Result:(SQLite 3) > > col1 | count > > ------+------- > > 1 | 1 > > 3 | 1 > > 4 | 1 > > 3 | 2 > > > > > > PS: > > -- Plan of PostgreSQL: > > EXPLAIN SELECT LENGTH("col1") AS "col1", COUNT(*) > > FROM "test_tbl" > > GROUP BY "col1"; > > QUERY PLAN > > ------------------------------------------------------------------- > > HashAggregate (cost=12.10..13.85 rows=140 width=528) > > Group Key: col1 > > -> Seq Scan on test_tbl (cost=0.00..11.40 rows=140 width=516) > > (3 rows) > > > > > > Best > > Xurenhe > > >
