[ 
https://issues.apache.org/jira/browse/HIVE-28254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28254:
----------------------------------
    Labels: pull-request-available  (was: )

> CBO (Calcite Return Path): Multiple DISTINCT leads to wrong results
> -------------------------------------------------------------------
>
>                 Key: HIVE-28254
>                 URL: https://issues.apache.org/jira/browse/HIVE-28254
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 4.0.0
>            Reporter: Shohei Okumiya
>            Assignee: Shohei Okumiya
>            Priority: Major
>              Labels: pull-request-available
>
> The CBO return path can build an incorrect GroupByOperator when a query 
> contains multiple aggregations with DISTINCT.
> The following query reproduces the problem.
> {code:sql}
> CREATE TABLE test (col1 INT, col2 INT);
> INSERT INTO test VALUES (1, 100), (2, 200), (2, 200), (3, 300);
> set hive.cbo.returnpath.hiveop=true;
> set hive.map.aggr=false;
> SELECT
>   SUM(DISTINCT col1),
>   COUNT(DISTINCT col1),
>   SUM(DISTINCT col2),
>   SUM(col2)
> FROM test;{code}
> The last column, SUM(col2), should be 800 (100 + 200 + 200 + 300). Instead, 
> the SUM is evaluated against col1, so the actual result is 8 (1 + 2 + 2 + 3).
> {code:java}
> +------+------+------+------+
> | _c0  | _c1  | _c2  | _c3  |
> +------+------+------+------+
> | 6    | 3    | 600  | 8    |
> +------+------+------+------+ {code}
>  
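
The expected values in the report can be checked by hand. The following is a minimal sketch in plain Python, independent of Hive, that recomputes the four aggregates from the sample rows. The helper names `expected_row` and `misbound_last_column` are hypothetical and used only for illustration; the sketch does not model how the return path builds the GroupByOperator, it only shows which column each reported number corresponds to.

```python
# Sample rows taken from the INSERT statement in the issue description.
rows = [(1, 100), (2, 200), (2, 200), (3, 300)]

col1 = [r[0] for r in rows]
col2 = [r[1] for r in rows]


def expected_row():
    """Return the correct values for
    SUM(DISTINCT col1), COUNT(DISTINCT col1), SUM(DISTINCT col2), SUM(col2)."""
    return (sum(set(col1)), len(set(col1)), sum(set(col2)), sum(col2))


def misbound_last_column():
    """Return what the last aggregate yields if SUM reads col1 instead of col2."""
    return sum(col1)


print(expected_row())          # (6, 3, 600, 800)
print(misbound_last_column())  # 8
```

The first three values match the reported output (6, 3, 600). The reported last value, 8, equals a non-distinct SUM over col1, which is consistent with the description that the SUM refers to the wrong column.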



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
