[
https://issues.apache.org/jira/browse/HIVE-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yin Huai updated HIVE-4751:
---------------------------
Description:
Currently, the correlation optimizer cannot optimize the query shown below
because, for distinct aggregations, the distinct columns are added to the
sorting keys, so each subquery's sorting keys no longer match the join key.
{code:sql}
SELECT a.key AS key1, a.cnt AS cnt1, b.key AS key2, b.cnt AS cnt2
FROM (SELECT x.key AS key, count(DISTINCT x.value) AS cnt
      FROM src x
      GROUP BY x.key) a
JOIN (SELECT y.key AS key, count(DISTINCT y.value) AS cnt
      FROM src1 y
      GROUP BY y.key) b
ON (a.key = b.key)
{code}
was:
{code:sql}
SELECT a.key AS key1, a.cnt AS cnt1, b.key AS key2, b.cnt AS cnt2
FROM (SELECT x.key AS key, count(x.value) AS cnt
      FROM src x
      GROUP BY x.key) a
JOIN (SELECT y.key AS key, count(y.value) AS cnt
      FROM src1 y
      GROUP BY y.key) b
ON (a.key = b.key)
{code}
> Support distinct aggregations
> -----------------------------
>
> Key: HIVE-4751
> URL: https://issues.apache.org/jira/browse/HIVE-4751
> Project: Hive
> Issue Type: Sub-task
> Components: Query Processor
> Reporter: Yin Huai
> Assignee: Yin Huai
>
> Currently, the correlation optimizer cannot optimize the query shown below
> because, for distinct aggregations, the distinct columns are added to the
> sorting keys, so each subquery's sorting keys no longer match the join key.
> {code:sql}
> SELECT a.key AS key1, a.cnt AS cnt1, b.key AS key2, b.cnt AS cnt2
> FROM (SELECT x.key AS key, count(DISTINCT x.value) AS cnt
>       FROM src x
>       GROUP BY x.key) a
> JOIN (SELECT y.key AS key, count(DISTINCT y.value) AS cnt
>       FROM src1 y
>       GROUP BY y.key) b
> ON (a.key = b.key)
> {code}
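
One way to observe the behavior described in the issue is to inspect the query
plan with the correlation optimizer switched on. The sketch below assumes a
Hive build that ships the optimizer and its hive.optimize.correlation setting,
and it reuses the src and src1 tables from the issue. The plan text varies
between Hive versions, so no expected output is shown.
{code:sql}
-- Assumption: this Hive build includes the correlation optimizer.
SET hive.optimize.correlation=true;

-- Print the plan for the query from the issue without running it.
EXPLAIN
SELECT a.key AS key1, a.cnt AS cnt1, b.key AS key2, b.cnt AS cnt2
FROM (SELECT x.key AS key, count(DISTINCT x.value) AS cnt
      FROM src x
      GROUP BY x.key) a
JOIN (SELECT y.key AS key, count(DISTINCT y.value) AS cnt
      FROM src1 y
      GROUP BY y.key) b
ON (a.key = b.key);
{code}
In the plan, the key expressions of the reduce output operators under each
aggregation are the place to look: if they list the distinct column next to
the grouping column, the sorting keys differ from those of the join.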