[
https://issues.apache.org/jira/browse/KYLIN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
pengfei.zhan resolved KYLIN-5742.
---------------------------------
Fix Version/s: 5.0-beta
(was: 5.0.0)
Resolution: Fixed
> When the "Group by" group has duplicate values, the result of Grouping Set
> query is inconsistent with that in SparkSQL
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: KYLIN-5742
> URL: https://issues.apache.org/jira/browse/KYLIN-5742
> Project: Kylin
> Issue Type: Bug
> Affects Versions: 5.0-beta
> Reporter: zhong.zhu
> Assignee: zhong.zhu
> Priority: Major
> Fix For: 5.0-beta
>
> Attachments: image-2023-12-11-14-54-38-652.png,
> image-2023-12-11-14-55-46-222.png, image-2023-12-11-14-57-32-037.png,
> image-2023-12-11-14-57-56-771.png
>
>
> {code:sql}
> -- sql1
> select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
> FROM SSB.LINEORDER as LINEORDER
> INNER JOIN SSB.CUSTOMER as CUSTOMER
> ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
> where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
> group by
> GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
> order by C_NAME;
> -- sql2
> select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
> FROM SSB.LINEORDER as LINEORDER
> INNER JOIN SSB.CUSTOMER as CUSTOMER
> ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
> where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
> group by
> C_NAME,C_CITY,C_NATION,C_REGION,
> GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
> order by C_NAME;
> -- sql3
> select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
> FROM SSB.LINEORDER as LINEORDER
> INNER JOIN SSB.CUSTOMER as CUSTOMER
> ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
> where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
> group by
> C_NAME,C_CITY,C_NATION,C_REGION
> GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
> order by C_NAME
> {code}
> In spark-sql, the query results of sql1 and sql3 are consistent, as follows:
> !image-2023-12-11-14-54-38-652.png!
> In spark-sql, the query result of sql2 is as follows:
> !image-2023-12-11-14-55-46-222.png!
> In KYLIN, the query result of sql1 is as follows, which is consistent with
> the spark-sql result of sql1:
> !image-2023-12-11-14-57-32-037.png!
> The query result of sql2 is as follows, which is inconsistent with the
> spark-sql sql2 result:
> !image-2023-12-11-14-57-56-771.png!
> KYLIN does not support the syntax of sql3.
> Hive does not support a comma before GROUPING SETS, that is, sql2 is not
> supported; its query results for sql1 and sql3 are consistent with
> spark-sql.
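
A minimal sketch of one likely reason sql2 returns different rows from sql1 in spark-sql. It assumes the engine follows the SQL-standard rule that multiple GROUP BY items combine as a cross product of their grouping sets; the ticket itself does not state this. `expand_group_by` is a hypothetical helper written for illustration, not a Kylin or Spark API.

```python
from itertools import product


def expand_group_by(items):
    """Expand GROUP BY items into one flat list of grouping sets.

    Each item is a list of grouping sets (tuples of column names).
    The items are combined as a cross product, and the columns of each
    combination are merged in order without repeating a column.
    Duplicate grouping sets are kept in the returned list.
    """
    expanded = []
    for combo in product(*items):
        merged = []
        for grouping_set in combo:
            for col in grouping_set:
                if col not in merged:
                    merged.append(col)
        expanded.append(tuple(merged))
    return expanded


# The GROUPING SETS item shared by sql1 and sql2.
sets = [(), ("C_NAME", "C_CITY"), ("C_NATION", "C_REGION")]

# Assumption: a plain column in GROUP BY acts as a single one-column grouping set.
plain = [[("C_NAME",)], [("C_CITY",)], [("C_NATION",)], [("C_REGION",)]]

sql1_sets = expand_group_by([sets])          # GROUP BY GROUPING SETS (...)
sql2_sets = expand_group_by(plain + [sets])  # GROUP BY cols, GROUPING SETS (...)

print(sql1_sets)
print(sql2_sets)
```

Under this assumption sql1 keeps its three distinct grouping sets, while every grouping set of sql2 becomes the full column list (C_NAME, C_CITY, C_NATION, C_REGION), repeated three times. That is the "duplicate values" situation named in the issue title. Whether an engine keeps or collapses the repeated sets is not something this sketch settles.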