Jefffrey commented on PR #19736: URL: https://github.com/apache/datafusion/pull/19736#issuecomment-3736572470
Looking into it some more, it looks like the two-phase aggregation related changes aren't really needed as they don't affect anything 🤔 The introduced tests don't fail on main, and it seems its because the only supported aggregates for the two-phase aggregation branch are min/max/sum: https://github.com/apache/datafusion/blob/f9697c14e29babc961c074eaec008e747495a636/datafusion/optimizer/src/single_distinct_to_groupby.rs#L85-L94 - See how we bail if we find a non-distinct function that isn't sum/min/max I guess it doesn't hurt to keep the fix but they won't actually affect anything (since `ignore nulls` doesn't affect sum/min/max), and we bail out of the rule if we have a filter or order_by in any aggregate: https://github.com/apache/datafusion/blob/f9697c14e29babc961c074eaec008e747495a636/datafusion/optimizer/src/single_distinct_to_groupby.rs#L81-L83 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
