mustafasrepo commented on code in PR #8353:
URL: https://github.com/apache/arrow-datafusion/pull/8353#discussion_r1410199833
##########
datafusion/physical-expr/src/aggregate/mod.rs:
##########
@@ -102,6 +102,21 @@ pub trait AggregateExpr: Send + Sync + Debug +
PartialEq<dyn Any> {
"AggregateExpr: default name"
}
+ /// Returns Aggregate Fucntion Name
+ fn func_name(&self) -> &str;
+
+ /// Human readable name such as `"MIN(c2)"` or `"RANK()"`. The default
+ /// implementation returns `"FUNCTION_NAME(args)"`
+ fn display_name(&self) -> String {
Review Comment:
I gave this a try. However, replacing name with current display name did not
work. It seems like during schema creation aggregate exprs uses `.name` for the
field (See
[link](https://github.com/apache/arrow-datafusion/blob/06bbe1298fa8aa042b6a6462e55b2890969d884a/datafusion/physical-expr/src/aggregate/mod.rs#L77)).
Hence plan depends on the `.name` string for successful execution. Otherwise
subsequent projection after (window or aggregation) may refer to invalid
fields.
Other option might be adding a `display_name` member to the each aggregate
expression (similar to `.name` method). This `display_name` member will not be
set from argument but will calculate at the initialization(Users will not have
control over it). Then we will use `.display_name(&self) -> &str` API to return
this member. With this way `func_name` argument will be removed. However, we
will still have parallel `.display_name` on top `.name` which might be
confusing. However, I think this approach is better than the current one.
What do you think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]