kosiew commented on code in PR #18025:
URL: https://github.com/apache/datafusion/pull/18025#discussion_r2506760915
##########
datafusion/functions/src/macros.rs:
##########
@@ -59,6 +59,24 @@ macro_rules! export_functions {
}
};
+ // function that requires config and takes a vector argument
+ (single $FUNC:ident, $DOC:expr, @config $arg:ident,) => {
+ #[doc = $DOC]
+ pub fn $FUNC($arg: Vec<datafusion_expr::Expr>) ->
datafusion_expr::Expr {
+ use datafusion_common::config::ConfigOptions;
+ super::$FUNC(&ConfigOptions::default()).call($arg)
Review Comment:
Here’s the reasoning behind the current setup:
* **Why the constructor takes `ConfigOptions`:** It’s primarily for
**performance and identity**. Parsing timezones (or other config-dependent
data) can be expensive, and since the timezone impacts the UDF’s behavior, it’s
treated as part of its identity (`Hash`, `Eq`). This way, UDFs with different
configs (e.g. timezones) are considered distinct instances.
* **Why the macro uses defaults:** The macro path is meant as a
**convenience layer** for building expressions programmatically, not for actual
runtime execution. It doesn’t have access to the session context, so it
defaults to `ConfigOptions::default()` just to simplify expression building.
* **How runtime config is actually applied:** When queries are executed,
DataFusion calls `with_updated_config()` to rebuild the UDF with the *real*
session config. Example from `to_timestamp.rs`:
```rust
fn with_updated_config(&self, config: &ConfigOptions) -> Option<ScalarUDF>
{
Some(Self::new_with_config(config).into())
}
```
This ensures that during execution, the correct configuration (e.g.
timezone) is used.
So while it *looks* inconsistent, it’s actually by design — the
macro-created version isn’t the same one used at runtime. Still, I agree it’s a
bit non-obvious, so adding a comment or doc note clarifying this distinction
between the **macro (Expr-building)** and **runtime (session-config)** paths
would definitely help avoid confusion.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]