[ 
https://issues.apache.org/jira/browse/HIVE-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-4751:
---------------------------

    Description: 
Right now, the correlation optimizer cannot optimize the query shown below 
because, for distinct aggregations, the distinct columns are added to the 
sorting keys. 
{code:sql}
SELECT a.key AS key1, a.cnt AS cnt1, b.key AS key2, b.cnt AS cnt2
FROM (SELECT x.key AS key, count(DISTINCT x.value) AS cnt
      FROM src x GROUP BY x.key) a
JOIN (SELECT y.key AS key, count(DISTINCT y.value) AS cnt
      FROM src1 y GROUP BY y.key) b
ON (a.key = b.key)
{code}

  was:
{code:sql}
SELECT a.key AS key1, a.cnt AS cnt1, b.key AS key2, b.cnt AS cnt2
FROM (SELECT x.key AS key, count(x.value) AS cnt
      FROM src x GROUP BY x.key) a
JOIN (SELECT y.key AS key, count(y.value) AS cnt
      FROM src1 y GROUP BY y.key) b
ON (a.key = b.key)
{code}

    
> Support distinct aggregations
> -----------------------------
>
>                 Key: HIVE-4751
>                 URL: https://issues.apache.org/jira/browse/HIVE-4751
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>
> Right now, the correlation optimizer cannot optimize the query shown below 
> because, for distinct aggregations, the distinct columns are added to the 
> sorting keys. 
> {code:sql}
> SELECT a.key AS key1, a.cnt AS cnt1, b.key AS key2, b.cnt AS cnt2
> FROM (SELECT x.key AS key, count(DISTINCT x.value) AS cnt
>       FROM src x GROUP BY x.key) a
> JOIN (SELECT y.key AS key, count(DISTINCT y.value) AS cnt
>       FROM src1 y GROUP BY y.key) b
> ON (a.key = b.key)
> {code}
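
To make the point about sorting keys concrete, here is a minimal Python sketch 
of a simplified model, not Hive's actual implementation. It assumes that a 
reducer computes count(DISTINCT value) by reading rows sorted on (key, value) 
and counting value changes within each key, and that two operators can share 
a shuffle only when their sort keys are identical. The sample rows and the 
names count_distinct_sorted, GROUP_BY_SORT_KEYS, JOIN_SORT_KEYS and 
can_share_shuffle are hypothetical and used for illustration only.

```python
from itertools import groupby

# Hypothetical rows standing in for table src: (key, value) pairs.
rows = [("a", 1), ("b", 2), ("a", 1), ("a", 3), ("b", 2)]


def count_distinct_sorted(rows):
    """Model count(DISTINCT value) ... GROUP BY key over a sorted stream."""
    # The distinct column joins the sort key: rows are ordered by
    # (key, value), not by key alone, so duplicates become adjacent.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    result = {}
    for key, group in groupby(ordered, key=lambda r: r[0]):
        count = 0
        previous = object()  # sentinel that never equals a real value
        for _, value in group:
            if value != previous:  # a new distinct value starts here
                count += 1
                previous = value
        result[key] = count
    return result


# Assumed sort keys of the two shuffles in the query from this issue.
GROUP_BY_SORT_KEYS = ("key", "value")  # distinct aggregation
JOIN_SORT_KEYS = ("key",)              # join on a.key = b.key


def can_share_shuffle(first, second):
    """Simplified sharing test (an assumption): sort keys must match."""
    return first == second


print(count_distinct_sorted(rows))
print(can_share_shuffle(GROUP_BY_SORT_KEYS, JOIN_SORT_KEYS))
```

In this model the aggregation sorts on (key, value) while the join sorts on 
key only, so the sharing test fails even though both operators group rows by 
the same key. That mismatch is what this sub-task sets out to handle.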

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira