[
https://issues.apache.org/jira/browse/CALCITE-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alessandro Solimando resolved CALCITE-5158.
-------------------------------------------
Resolution: Invalid
> count(1) with subquery count(distinct) gives wrong results with
> hive.optimize.distinct.rewrite=true and cbo on
> --------------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-5158
> URL: https://issues.apache.org/jira/browse/CALCITE-5158
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.19.0
> Reporter: honghui.Liu
> Priority: Major
>
> {code:java}
> create table count_distinct(a int, b int);
> insert into table count_distinct values (1,2),(2,3);
> set hive.execution.engine=tez;
> set hive.cbo.enable=true;
> set hive.optimize.distinct.rewrite=true;
> select count(1) from (
> select count(distinct a) from count_distinct
> ) tmp; {code}
> This query returns the wrong result when hive.optimize.distinct.rewrite is
> true, which is the default for all 3.x versions. The query returns 2, but the
> expected result is 1.
> Before CBO optimization, the RelNode tree is:
> {code:java}
> HiveProject(_o__c0=[$0])
>   HiveAggregate(group=[{}], agg#0=[count($0)])
>     HiveProject($f0=[1])
>       HiveProject(_o__c0=[$0])
>         HiveAggregate(group=[{}], agg#0=[count(DISTINCT $0)])
>           HiveProject($f0=[$0])
>             HiveTableScan(table=[[default.count_distinct]], table:alias=[count_distinct])
> {code}
> After HiveExpandDistinctAggregatesRule is applied, the RelNode tree is:
> {code:java}
> HiveProject(_o__c0=[$0])
>   HiveAggregate(group=[{}], agg#0=[count($0)])
>     HiveProject($f0=[1])
>       HiveProject(_o__c0=[$0])
>         HiveAggregate(group=[{}], agg#0=[count($0)])
>           HiveAggregate(group=[{0}])
>             HiveProject($f0=[$0])
>               HiveProject($f0=[$0])
>                 HiveTableScan(table=[[default.count_distinct]], table:alias=[count_distinct])
> {code}
> That is, count(distinct xx) is converted to count(xx) from (select xx from
> table_name group by xx).
> After projection pruning, the RelNode tree is:
> {code:java}
> HiveAggregate(group=[{}], agg#0=[count()])
>   HiveProject(DUMMY=[0])
>     HiveAggregate(group=[{}])
>       HiveAggregate(group=[{0}])
>         HiveProject(a=[$0])
>           HiveTableScan(table=[[default.count_distinct]], table:alias=[count_distinct])
> {code}
> In this case, the resulting execution plan is incorrect: it returns 2 instead
> of the expected 1.
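
For readers following the plans above, here is a minimal row-level sketch in Python of the two rewritten plans. It is a toy model, not Calcite or Hive code, and every function name in it is hypothetical. It assumes standard SQL semantics, where an aggregate with an empty grouping returns exactly one row even when it has no aggregate calls. The report does not state why the result is 2; `passthrough_aggregate` models one possible explanation (an engine that treats `HiveAggregate(group=[{}])` as a no-op), and that is purely an assumption made for illustration.

```python
def scan():
    # Rows of count_distinct(a int, b int) from the report.
    return [(1, 2), (2, 3)]

def project(rows, fn):
    # Project: apply fn to every input row.
    return [fn(r) for r in rows]

def aggregate_by_key(rows, key_indexes):
    # Aggregate with a non-empty group key and no aggregate calls:
    # one output row per distinct key.
    seen = []
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        if key not in seen:
            seen.append(key)
    return seen

def global_aggregate(rows, agg=None):
    # Aggregate(group=[{}]): always exactly one output row.
    # With no aggregate call, the row has zero columns.
    return [(agg(rows),)] if agg is not None else [()]

def passthrough_aggregate(rows):
    # ASSUMPTION (not from the report): a faulty execution that
    # treats the aggregate without calls as a no-op.
    return rows

def expanded_plan():
    # Plan after HiveExpandDistinctAggregatesRule, read bottom-up.
    rows = project(scan(), lambda r: (r[0],))
    rows = aggregate_by_key(rows, [0])        # group=[{0}]
    rows = global_aggregate(rows, len)        # count($0); no NULLs here
    rows = project(rows, lambda r: (1,))      # $f0=[1]
    return global_aggregate(rows, len)[0][0]  # outer count

def pruned_plan(empty_aggregate):
    # Plan after projection pruning, read bottom-up.
    rows = project(scan(), lambda r: (r[0],))
    rows = aggregate_by_key(rows, [0])        # group=[{0}]
    rows = empty_aggregate(rows)              # HiveAggregate(group=[{}])
    rows = project(rows, lambda r: (0,))      # DUMMY=[0]
    return global_aggregate(rows, len)[0][0]  # count()

print(expanded_plan())                     # expected result
print(pruned_plan(global_aggregate))       # one-row semantics
print(pruned_plan(passthrough_aggregate))  # hypothetical faulty path
```

In this model the pruned plan still yields the expected count as long as the inner empty-group aggregate emits a single row. The hypothetical pass-through variant lets both grouped rows reach the outer count, which matches the reported value of 2.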