[ https://issues.apache.org/jira/browse/HIVE-24976?focusedWorklogId=577599&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-577599 ]
ASF GitHub Bot logged work on HIVE-24976:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 06/Apr/21 13:46
Start Date: 06/Apr/21 13:46
Worklog Time Spent: 10m
Work Description: kasakrisz opened a new pull request #2155:
URL: https://github.com/apache/hive/pull/2155
### What changes were proposed in this pull request?
Enable the use of `distinct` in windowing aggregate functions, such as `count(distinct ...) over (...)`, when CBO is enabled.
### Why are the changes needed?
The SQL standard allows it.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
```
mvn test -Dtest.output.overwrite -DskipSparkTests -Dtest=TestMiniLlapLocalCliDriver -Dqfile=windowing_count_distinct.q -pl itests/qtest -Pitests
```
Issue Time Tracking
-------------------
Worklog Id: (was: 577599)
Remaining Estimate: 0h
Time Spent: 10m
> CBO: count(distinct) in a window function fails CBO
> ---------------------------------------------------
>
> Key: HIVE-24976
> URL: https://issues.apache.org/jira/browse/HIVE-24976
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Reporter: Gopal Vijayaraghavan
> Assignee: Krisztian Kasa
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {code}
> create temporary table tmp_tbl(
> `rule_id` string,
> `severity` string,
> `alert_id` string,
> `alert_type` string);
> explain cbo
> select `k`.`rule_id`,
> count(distinct `k`.`alert_id`) over(partition by `k`.`rule_id`) `subj_cnt`
> from tmp_tbl k
> ;
> explain
> select `k`.`rule_id`,
> count(distinct `k`.`alert_id`) over(partition by `k`.`rule_id`) `subj_cnt`
> from tmp_tbl k
> ;
> {code}
> CBO fails on both queries because the count(distinct) is not recognized as
> belonging to a windowing operation, so the planner throws the following
> exception:
> {code}
> throw new CalciteSemanticException("Distinct without an aggregation.",
>     UnsupportedFeature.Distinct_without_an_aggreggation);
> {code}
> https://github.com/apache/hive/blob/73c3770d858b063c69dea6c64a759f8fdacad460/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L4914
> This prevents a query like this from using a materialized view which already
> exists in the system (the MV obviously does not contain this expression, but
> represents a complex transform from a JSON structure into a columnar layout).
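
For readers unfamiliar with the construct, the following is a minimal sketch of what `count(distinct alert_id) over(partition by rule_id)` is expected to compute. It is not Hive or Calcite code; the class `WindowCountDistinct` and the method `countDistinctOverPartition` are hypothetical names introduced only for illustration. Unlike a GROUP BY aggregate, the result keeps one output value per input row, which is why the expression has to be treated as a windowing operation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of Hive.
class WindowCountDistinct {

    /**
     * Each input row is {rule_id, alert_id}. Returns one count per input row,
     * in input order: the number of distinct non-null alert_id values
     * among all rows that share the row's rule_id.
     */
    static List<Long> countDistinctOverPartition(List<String[]> rows) {
        // Pass 1: collect the distinct alert_id values for each partition key.
        Map<String, Set<String>> distinctByRule = new HashMap<>();
        for (String[] row : rows) {
            Set<String> seen = distinctByRule.computeIfAbsent(row[0], k -> new HashSet<>());
            if (row[1] != null) { // count(distinct x) ignores NULL values
                seen.add(row[1]);
            }
        }
        // Pass 2: emit one value per input row; rows are not collapsed.
        List<Long> result = new ArrayList<>();
        for (String[] row : rows) {
            result.add((long) distinctByRule.get(row[0]).size());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
            new String[] {"r1", "a1"},
            new String[] {"r1", "a1"},
            new String[] {"r1", "a2"},
            new String[] {"r2", "a3"});
        System.out.println(countDistinctOverPartition(rows)); // prints [2, 2, 2, 1]
    }
}
```

All three `r1` rows receive the value 2 (distinct alerts `a1` and `a2`), and the single `r2` row receives 1, so four input rows yield four output values.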