[ https://issues.apache.org/jira/browse/KYLIN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832155#comment-17832155 ]

ASF subversion and git services commented on KYLIN-5742:
--------------------------------------------------------

Commit c396134127b17e741e6ead1197589afe7bb773d7 in kylin's branch refs/heads/kylin5 from fengguangyuan
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=c396134127 ]

KYLIN-5742 Make the query result of duplicate group sets same as Spark

Co-authored-by: Guangyuan Feng <guangyuan.f...@kyligence.io>

> When the "Group by" group has duplicate values, the result of Grouping Set
> query is inconsistent with that in SparkSQL
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5742
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5742
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: 5.0-beta
>            Reporter: zhong.zhu
>            Assignee: zhong.zhu
>            Priority: Major
>             Fix For: 5.0.0
>
>         Attachments: image-2023-12-11-14-54-38-652.png,
> image-2023-12-11-14-55-46-222.png, image-2023-12-11-14-57-32-037.png,
> image-2023-12-11-14-57-56-771.png
>
>
> {code:sql}
> -- sql1
> select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
> FROM SSB.LINEORDER as LINEORDER
> INNER JOIN SSB.CUSTOMER as CUSTOMER
> ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
> where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
> group by
> GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
> order by C_NAME;
> -- sql2
> select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
> FROM SSB.LINEORDER as LINEORDER
> INNER JOIN SSB.CUSTOMER as CUSTOMER
> ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
> where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
> group by
> C_NAME,C_CITY,C_NATION,C_REGION,
> GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
> order by C_NAME;
> -- sql3
> select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
> FROM SSB.LINEORDER as LINEORDER
> INNER JOIN SSB.CUSTOMER as CUSTOMER
> ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
> where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
> group by
> C_NAME,C_CITY,C_NATION,C_REGION
> GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
> order by C_NAME
> {code}
> In spark-sql, the query results of sql1 and sql3 are consistent, as follows:
> !image-2023-12-11-14-54-38-652.png!
> In spark-sql, the query result of sql2 is as follows:
> !image-2023-12-11-14-55-46-222.png!
> In KYLIN, the query result of sql1 is as follows, which is consistent with
> the spark-sql result of sql1:
> !image-2023-12-11-14-57-32-037.png!
> In KYLIN, the query result of sql2 is as follows, which is inconsistent with
> the spark-sql result of sql2:
> !image-2023-12-11-14-57-56-771.png!
> The syntax of sql3 is not supported in KYLIN.
> Hive does not support a comma before GROUPING SETS, so sql2 is not
> supported there; the Hive query results of sql1 and sql3 are consistent
> with spark-sql.
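
The difference between sql1 and sql2 can be illustrated with a small sketch. It assumes the common SQL reading in which comma-separated GROUP BY elements combine as a cross product, so every plain column in sql2 is merged into each grouping set. This is an assumption about the semantics, not something taken from the attached screenshots or from the Kylin or Spark source, and the helper names `expand_group_by` and `column` are hypothetical.

```python
from itertools import product


def column(name):
    """A plain GROUP BY column behaves like a single one-column grouping set."""
    return [(name,)]


def expand_group_by(elements):
    """Expand comma-separated GROUP BY elements into a flat list of grouping sets.

    Each element is a list of grouping sets. Elements are combined as a cross
    product, and repeated columns inside one combined set are dropped while
    keeping first-seen order. Duplicate *sets* are deliberately kept.
    """
    expanded = []
    for combo in product(*elements):
        cols = []
        for grouping_set in combo:
            for col in grouping_set:
                if col not in cols:
                    cols.append(col)
        expanded.append(tuple(cols))
    return expanded


GROUPING_SETS = [(), ("C_NAME", "C_CITY"), ("C_NATION", "C_REGION")]

# sql1: GROUP BY GROUPING SETS (...)
sql1_sets = expand_group_by([GROUPING_SETS])

# sql2: GROUP BY C_NAME, C_CITY, C_NATION, C_REGION, GROUPING SETS (...)
sql2_sets = expand_group_by(
    [column("C_NAME"), column("C_CITY"), column("C_NATION"), column("C_REGION"),
     GROUPING_SETS]
)

print(sql1_sets)
print(sql2_sets)
print(len(sql2_sets), "sets,", len(set(sql2_sets)), "distinct")
```

Under this reading, sql2 expands to three grouping sets that all cover the same four columns. An engine that keeps the duplicates would return each group three times, while one that collapses them would return each group once. The commit title suggests this is where the two results diverged, though the message itself does not spell it out.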