alamb commented on code in PR #8360:
URL: https://github.com/apache/arrow-datafusion/pull/8360#discussion_r1409756297
##########
datafusion/expr/src/udf.rs:
##########
@@ -89,6 +91,34 @@ impl ScalarUDF {
signature: signature.clone(),
return_type: return_type.clone(),
fun: fun.clone(),
+ aliases: vec![],
+ }
+ }
+
+ /// Create a new ScalarUDF with aliases
+ pub fn new_with_aliases(
Review Comment:
Unless there is a compelling reason, I think we should avoid adding this new
constructor and only have `with_aliases` to avoid a redundant API
##########
datafusion/expr/src/udf.rs:
##########
@@ -89,6 +91,34 @@ impl ScalarUDF {
signature: signature.clone(),
return_type: return_type.clone(),
fun: fun.clone(),
+ aliases: vec![],
+ }
+ }
+
+ /// Create a new ScalarUDF with aliases
+ pub fn new_with_aliases(
+ name: &str,
+ signature: &Signature,
+ return_type: &ReturnTypeFunction,
+ fun: &ScalarFunctionImplementation,
+ aliases: Vec<String>,
+ ) -> Self {
+ Self {
+ name: name.to_owned(),
+ signature: signature.clone(),
+ return_type: return_type.clone(),
+ fun: fun.clone(),
+ aliases,
+ }
+ }
+
+ pub fn with_aliases(self, aliases: Vec<&str>) -> Self {
+ Self {
+ name: self.name,
+ signature: self.signature,
+ return_type: self.return_type,
+ fun: self.fun,
+ aliases: aliases.iter().map(|s| s.to_string()).collect(),
}
Review Comment:
I think you can write this more concisely like this. It would also be good
to add comments:
```suggestion
/// Adds additional names that can be used to invoke this function, in
addition to `name`
pub fn with_aliases(mut self, aliases: impl IntoIterator<Item = &str>)
-> Self {
self.aliases.extend(aliases.into_iter().map(|s| s.to_string()));
self
}
```
##########
datafusion/expr/src/udf.rs:
##########
@@ -106,6 +136,10 @@ impl ScalarUDF {
&self.name
}
+ pub fn aliases(&self) -> &[String] {
Review Comment:
```suggestion
/// Returns the aliases for this function. See [`with_aliases`] for
more details
pub fn aliases(&self) -> &[String] {
```
##########
datafusion/expr/src/udf.rs:
##########
@@ -49,6 +49,8 @@ pub struct ScalarUDF {
/// the batch's row count (so that the generative zero-argument function
can know
/// the result array size).
fun: ScalarFunctionImplementation,
+ /// Optional aliases for the function
Review Comment:
```suggestion
/// Optional aliases for the function. This list should NOT include the
value of `name` as well
```
##########
datafusion/core/src/execution/context/mod.rs:
##########
@@ -811,8 +811,14 @@ impl SessionContext {
/// - `SELECT MY_FUNC(x)...` will look for a function named `"my_func"`
/// - `SELECT "my_FUNC"(x)` will look for a function named `"my_FUNC"`
pub fn register_udf(&self, f: ScalarUDF) {
- self.state
- .write()
+ let mut state = self.state.write();
+ let aliases = f.aliases();
+ for alias in aliases {
+ state
+ .scalar_functions
+ .insert(alias.to_string(), Arc::new(f.clone()));
Review Comment:
This is an excellent point.
At the moment, the code currently ignores any existing functions that were
registered with the same name so I don't think this PR makes that worse (though
it is not good)
For this PR perhaps we can just add some documentation that explains that
"any functions registered with the udf name or its aliases are overwritten with
this new function"
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]