[
https://issues.apache.org/jira/browse/HIVE-29339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stamatis Zampetakis resolved HIVE-29339.
----------------------------------------
Fix Version/s: 4.3.0
Resolution: Fixed
Fixed in
https://github.com/apache/hive/commit/844df7e1b6741d2700d5a692ebe16aa5fe23a292
> Remove DISTINCT indicator from SqlAggFunctions
> ----------------------------------------------
>
> Key: HIVE-29339
> URL: https://issues.apache.org/jira/browse/HIVE-29339
> Project: Hive
> Issue Type: Task
> Components: CBO
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.3.0
>
>
> The
> [CanAggregateDistinct|https://github.com/apache/hive/blob/d9ec04156d84bedbaa9f8dc40c27dbb88a3b9f49/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/functions/CanAggregateDistinct.java]
> interface gives aggregate functions an extra flag that records whether
> they use DISTINCT or not.
> However, this indicator is redundant at the operator level because the
> information is already present in the query plan (AggregateCall/RexOver).
> Having the indicator in multiple places is also problematic because the
> call and the operator may disagree, which can lead to bugs depending on
> which field the rules/planner check.
> Finally, the presence of the DISTINCT indicator at the operator level
> means that each aggregate function has two operators (one that supports
> DISTINCT and one that doesn't), which doubles the number of available
> operators.
> Removing the DISTINCT indicator from all aggregate functions leads to
> simpler and more generic code, increases code coverage, and facilitates
> maintenance since it removes Hive-specific interfaces.
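
For illustration only, the design argued for above might be sketched as follows. This is not the Hive or Calcite code: DistinctOnCallSketch, AggFunction, AggCall and hasDistinct are hypothetical stand-ins, with AggCall playing the role that AggregateCall/RexOver play in the description. The point is that the operator carries no DISTINCT flag, so the call is the only place the information can be read from.

```java
import java.util.List;

public class DistinctOnCallSketch {

    // Hypothetical stand-in for an aggregate operator. It only names the
    // function and has no DISTINCT flag, so there is one operator per function.
    public enum AggFunction { COUNT, SUM }

    // Hypothetical stand-in for a plan-level call. DISTINCT is recorded here
    // and nowhere else.
    public static final class AggCall {
        public final AggFunction function;
        public final boolean distinct;
        public final String argument;

        public AggCall(AggFunction function, boolean distinct, String argument) {
            this.function = function;
            this.distinct = distinct;
            this.argument = argument;
        }

        // Renders the call as SQL-like text, adding DISTINCT from the call's flag.
        public String toSql() {
            return function + "(" + (distinct ? "DISTINCT " : "") + argument + ")";
        }
    }

    // A rule-like check that consults only the calls. With no second flag on
    // the operator, the two sources cannot disagree.
    public static boolean hasDistinct(List<AggCall> calls) {
        for (AggCall call : calls) {
            if (call.distinct) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The same COUNT operator serves both the plain and the DISTINCT call.
        AggCall plain = new AggCall(AggFunction.COUNT, false, "x");
        AggCall distinct = new AggCall(AggFunction.COUNT, true, "x");
        System.out.println(plain.toSql());
        System.out.println(distinct.toSql());
        System.out.println(hasDistinct(List.of(plain, distinct)));
    }
}
```

Under these assumptions, both calls share the single COUNT operator, and a rule that needs to know about DISTINCT has exactly one field to check.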