kasakrisz commented on code in PR #3588:
URL: https://github.com/apache/hive/pull/3588#discussion_r977574485


##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRewriteToDataSketchesRules.java:
##########
@@ -336,16 +336,13 @@ protected VBuilderPAP(Aggregate aggregate, Project 
project, String sketchClass)
 
       @Override
       boolean isApplicable(AggregateCall aggCall) {
-        if ((aggInput instanceof Project)
-            && !aggCall.isDistinct() && aggCall.getArgList().size() == 4
+        if ((aggInput != null)
+            && !aggCall.isDistinct() && aggCall.getArgList().size() == 1

Review Comment:
   This rule already has a good comment
   
https://github.com/apache/hive/blob/a017e54c98c76ccf0c185b47533b336b0a398dc7/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRewriteToDataSketchesRules.java#L296-L303
   
   Before this patch function calls like
   ```
   percentile_disc(0.2) within group (order by id)
   ```
   was transformed at AST level to a 4 parameter function:
   ```
   percentile_disc(0.2, id, asc, nulls_last)
   ```
   and the collation was not represented in the CBO plan at all.
   
   With this patch collation is properly set in these aggregate calls in CBO 
plan so this rule must expect that version.
   An extra predicate was added to this condition which checks the number of 
order by keys:
   ```
   && aggCall.collation.getFieldCollations().size() == 1) {
   ```
   It must be 1 in case of `percentile_disc`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to