Remus Rusanu created HIVE-6873:
----------------------------------

             Summary: DISTINCT clause in aggregates is handled incorrectly by 
vectorized execution
                 Key: HIVE-6873
                 URL: https://issues.apache.org/jira/browse/HIVE-6873
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.13.0, 0.14.0
            Reporter: Remus Rusanu
            Assignee: Remus Rusanu


The vectorized aggregates ignore the DISTINCT clause, which causes incorrect
results. Because of how GroupByOperatorDesc adds the DISTINCT keys to the overall
aggregate keys, the vectorized aggregates do account for the extra key, but they
do not process the data correctly for that key. The reduce side then aggregates
the input from the vectorized map side into results that are only sometimes
correct, and mostly incorrect. HIVE-4607 tracks the proper fix, but in the
meantime I'm filing this bug to disable vectorized execution when DISTINCT is
present. The fix is trivial.
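
To make the symptom and the proposed workaround concrete, here is a minimal,
self-contained sketch. It is not Hive code: AggregateCall, canVectorize, and
the two counting helpers are hypothetical names invented for illustration, and
the real check would live wherever Hive decides whether a plan can be
vectorized. It only shows (a) why an aggregate that ignores DISTINCT is right
when the input happens to have no duplicates and wrong otherwise, and (b) the
kind of guard that falls back to row-mode execution when DISTINCT is present.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctSketch {
    // Hypothetical stand-in for an aggregate descriptor; not Hive's class.
    static final class AggregateCall {
        final String function;
        final boolean distinct;
        AggregateCall(String function, boolean distinct) {
            this.function = function;
            this.distinct = distinct;
        }
    }

    // COUNT(x) with the DISTINCT flag ignored: every non-null value counts.
    static long countIgnoringDistinct(List<Integer> values) {
        long n = 0;
        for (Integer v : values) {
            if (v != null) n++;
        }
        return n;
    }

    // COUNT(DISTINCT x): each non-null value counts once.
    static long countDistinct(List<Integer> values) {
        Set<Integer> seen = new HashSet<>();
        for (Integer v : values) {
            if (v != null) seen.add(v);
        }
        return seen.size();
    }

    // Assumed shape of the workaround: refuse to vectorize if any aggregate
    // carries DISTINCT, so the query runs in row mode instead.
    static boolean canVectorize(List<AggregateCall> aggregates) {
        for (AggregateCall a : aggregates) {
            if (a.distinct) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> withDuplicates = Arrays.asList(1, 1, 2, 3);
        List<Integer> allUnique = Arrays.asList(1, 2, 3);

        // Duplicates present: ignoring DISTINCT gives 4, the correct answer is 3.
        System.out.println(countIgnoringDistinct(withDuplicates) + " vs " + countDistinct(withDuplicates));
        // No duplicates: both give 3, so the bug is masked.
        System.out.println(countIgnoringDistinct(allUnique) + " vs " + countDistinct(allUnique));

        // The guard rejects vectorization for COUNT(DISTINCT x).
        System.out.println(canVectorize(Arrays.asList(new AggregateCall("count", true))));
    }
}
```

The guard deliberately gives up vectorization for the whole operator rather
than attempting a partial fix, which matches the intent of this issue: correct
results first, with the proper vectorized DISTINCT support left to HIVE-4607.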



--
This message was sent by Atlassian JIRA
(v6.2#6252)
