Remus Rusanu created HIVE-6873:
----------------------------------
Summary: DISTINCT clause in aggregates is handled incorrectly by
vectorized execution
Key: HIVE-6873
URL: https://issues.apache.org/jira/browse/HIVE-6873
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.13.0, 0.14.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
The vectorized aggregates ignore the DISTINCT clause. This causes incorrect
results. Because of how GroupByOperatorDesc adds the DISTINCT keys to the
overall aggregate keys, the vectorized aggregates do account for the extra key,
but they do not process the data correctly for that key. On the reduce side,
the aggregates then turn the input from the vectorized map side into results
that are only sometimes correct and mostly incorrect. HIVE-4607 tracks the
proper fix; in the meantime I'm filing this bug to disable vectorized execution
when DISTINCT is present. The fix is trivial.
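
To illustrate the kind of failure described above, here is a toy model in
Python. It is not Hive code: the function names and the two-stage map/reduce
shape are assumptions made for illustration only. It models
COUNT(DISTINCT x) ... GROUP BY g, where the DISTINCT column is appended to the
group key on the map side but the aggregate itself is evaluated as a plain
COUNT.

```python
from collections import defaultdict


def count_distinct(rows):
    """Reference result: number of distinct x values per group g."""
    seen = defaultdict(set)
    for g, x in rows:
        seen[g].add(x)
    return {g: len(values) for g, values in seen.items()}


def count_ignoring_distinct(rows):
    """Hypothetical model of the bug: the key is extended with the DISTINCT
    column, but the aggregate is computed as a plain COUNT."""
    # "Map side": aggregate per extended key (g, x), ignoring DISTINCT.
    partial = defaultdict(int)
    for g, x in rows:
        partial[(g, x)] += 1
    # "Reduce side": merge the partial counts back to the real group key g.
    merged = defaultdict(int)
    for (g, _x), count in partial.items():
        merged[g] += count
    return dict(merged)


with_duplicates = [("a", 1), ("a", 1), ("a", 2)]
without_duplicates = [("a", 1), ("a", 2)]

print(count_distinct(with_duplicates), count_ignoring_distinct(with_duplicates))
print(count_distinct(without_duplicates), count_ignoring_distinct(without_duplicates))
```

In this model the two functions agree only when the input has no duplicate
(g, x) pairs, which is one way a result can be "sometimes correct" while being
wrong in general.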