Biao Wu created HIVE-15848:
------------------------------
Summary: count or sum distinct incorrect when
hive.optimize.reducededuplication set to true
Key: HIVE-15848
URL: https://issues.apache.org/jira/browse/HIVE-15848
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Biao Wu
Priority: Critical
Test Table:
{code:sql}
create table count_distinct_test(id int,key int,name int);
{code}
Data:
||id||key||name||
|1|1|2|
|1|2|3|
|1|3|2|
|1|4|2|
|1|5|3|
Test SQL1:
{code:sql}
select id,count(Distinct key),count(Distinct name)
from (select id,key,name from count_distinct_test group by id,key,name)m
group by id;
{code}
result:
|1|5|4|
expected:
|1|5|2|
Test SQL2:
{code:sql}
select id,count(Distinct name),count(Distinct key)
from (select id,key,name from count_distinct_test group by id,name,key)m
group by id;
{code}
result:
|1|2|5|
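Possible check (a sketch, not verified in this report): going by the summary, the wrong count should appear only while hive.optimize.reducededuplication is true. Assuming that holds, turning the property off for the session and rerunning Test SQL1 should return the expected row:
{code:sql}
-- Assumption: the bad count comes from the reduce deduplication
-- optimization named in the summary, so it is disabled for this session.
set hive.optimize.reducededuplication=false;

-- Same query as Test SQL1, unchanged.
select id,count(Distinct key),count(Distinct name)
from (select id,key,name from count_distinct_test group by id,key,name)m
group by id;
{code}
If the second count comes back as 2 with the property off and 4 with it on, that would point at the optimization rather than the data.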