Jefffrey commented on code in PR #20460:
URL: https://github.com/apache/datafusion/pull/20460#discussion_r3104911987


##########
datafusion/optimizer/src/single_distinct_to_groupby.rs:
##########
@@ -192,11 +190,15 @@ impl OptimizerRule for SingleDistinctToGroupBy {
                                 },
                         }) => {
                             if distinct {
-                                assert_eq_or_internal_err!(
-                                    args.len(),
-                                    1,
-                                    "DISTINCT aggregate should have exactly 
one argument"
-                                );
+                                // De-duplicate args so that e.g. 
count(distinct c, c)
+                                // is treated as count(distinct c).
+                                // is_single_distinct_agg already verified 
that all
+                                // unique distinct args across aggregates 
refer to the
+                                // same single field.
+                                let mut seen = HashSet::new();
+                                args.retain(|arg| {
+                                    seen.insert(arg.schema_name().to_string())
+                                });

Review Comment:
   This seems a bit odd to handle here; what happens in the case this rule 
doesn't fire (e.g. theres another aggregate which causes this rule to not do 
rewrite)



##########
datafusion/functions-aggregate/src/count.rs:
##########
@@ -293,20 +293,37 @@ impl AggregateUDFImpl for Count {
 
     fn state_fields(&self, args: StateFieldsArgs) -> Result<Vec<FieldRef>> {
         if args.is_distinct {
-            let dtype: DataType = match &args.input_fields[0].data_type() {
-                DataType::Dictionary(_, values_type) => 
(**values_type).clone(),
-                &dtype => dtype.clone(),
-            };
+            if args.input_fields.len() > 1 {
+                Ok(args
+                    .input_fields
+                    .iter()
+                    .map(|field| {
+                        Arc::new(Field::new(
+                            format_state_name(args.name, "count distinct"),

Review Comment:
   Does this comment still need to be addressed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to