kasakrisz commented on code in PR #3588:
URL: https://github.com/apache/hive/pull/3588#discussion_r977574485
##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRewriteToDataSketchesRules.java:
##########
@@ -336,16 +336,13 @@ protected VBuilderPAP(Aggregate aggregate, Project
project, String sketchClass)
@Override
boolean isApplicable(AggregateCall aggCall) {
- if ((aggInput instanceof Project)
- && !aggCall.isDistinct() && aggCall.getArgList().size() == 4
+ if ((aggInput != null)
+ && !aggCall.isDistinct() && aggCall.getArgList().size() == 1
Review Comment:
This rule already has a good comment
https://github.com/apache/hive/blob/a017e54c98c76ccf0c185b47533b336b0a398dc7/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRewriteToDataSketchesRules.java#L296-L303
Before this patch function calls like
```
percentile_disc(0.2) within group (order by id)
```
was transformed at AST level to a 4 parameter function:
```
percentile_disc(0.2, id, asc, nulls_last)
```
and the collation was not represented in the CBO plan at all.
With this patch collation is properly set in these aggregate calls in CBO
plan so this rule must expect that version.
An extra predicate was added to this condition which checks the number of
order by keys:
```
&& aggCall.collation.getFieldCollations().size() == 1) {
```
It must be 1 in case of `percentile_disc`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]