Re: [PR] Add PostgreSQL-style named arguments support for scalar functions [datafusion]

via GitHub Sun, 19 Oct 2025 07:40:12 -0700


Jefffrey commented on code in PR #18019:
URL: https://github.com/apache/datafusion/pull/18019#discussion_r2443343960



##########
datafusion/expr/src/udf.rs:
##########
@@ -957,3 +949,284 @@ mod tests {
         hasher.finish()
     }
 }
+
+/// Argument resolution logic for named function parameters
+pub mod arguments {
+    use crate::Expr;
+    use datafusion_common::{plan_err, Result};
+
+    /// Resolves function arguments, handling named and positional notation.
+    ///
+    /// This function validates and reorders arguments to match the function's 
parameter names
+    /// when named arguments are used.
+    ///
+    /// # Rules
+    /// - All positional arguments must come before named arguments
+    /// - Named arguments can be in any order after positional arguments
+    /// - All parameter names must match the provided parameter_names
+    /// - No duplicate parameter names allowed
+    ///
+    /// # Arguments
+    /// * `param_names` - The function's parameter names in order
+    /// * `args` - The argument expressions
+    /// * `arg_names` - Optional parameter name for each argument
+    ///
+    /// # Returns
+    /// A vector of expressions in the correct order matching the parameter 
names
+    ///
+    /// # Examples
+    /// ```text
+    /// Given parameters ["a", "b", "c"]
+    /// And call: func(10, c => 30, b => 20)
+    /// Returns: [Expr(10), Expr(20), Expr(30)]
+    /// ```
+    pub fn resolve_function_arguments(
+        param_names: &[String],
+        args: Vec<Expr>,
+        arg_names: Vec<Option<String>>,
+    ) -> Result<Vec<Expr>> {
+        // Validate that arg_names length matches args length
+        if args.len() != arg_names.len() {
+            return plan_err!(
+                "Internal error: args length ({}) != arg_names length ({})",
+                args.len(),
+                arg_names.len()
+            );
+        }
+
+        // Check if all arguments are positional (fast path)
+        if arg_names.iter().all(|name| name.is_none()) {
+            return Ok(args);
+        }
+
+        // Validate mixed positional and named arguments
+        validate_argument_order(&arg_names)?;
+
+        // Validate and reorder named arguments
+        reorder_named_arguments(param_names, args, arg_names)
+    }
+
+    /// Validates that positional arguments come before named arguments
+    fn validate_argument_order(arg_names: &[Option<String>]) -> Result<()> {
+        let mut seen_named = false;
+        for (i, arg_name) in arg_names.iter().enumerate() {
+            match arg_name {
+                Some(_) => seen_named = true,
+                None if seen_named => {
+                    return plan_err!(
+                        "Positional argument at position {} follows named 
argument. \
+                         All positional arguments must come before named 
arguments.",
+                        i
+                    );
+                }
+                None => {}
+            }
+        }
+        Ok(())
+    }
+
+    /// Reorders arguments based on named parameters to match signature order
+    fn reorder_named_arguments(
+        param_names: &[String],
+        args: Vec<Expr>,
+        arg_names: Vec<Option<String>>,
+    ) -> Result<Vec<Expr>> {
+        // Count positional vs named arguments
+        let positional_count = arg_names.iter().filter(|n| 
n.is_none()).count();
+
+        // Capture args length before consuming the vector
+        let args_len = args.len();
+
+        // Create a result vector with the expected size
+        let expected_arg_count = param_names.len();
+        let mut result: Vec<Option<Expr>> = vec![None; expected_arg_count];
+
+        // Track which parameters have been assigned
+        let mut assigned = vec![false; expected_arg_count];
+
+        // Process all arguments (both positional and named)

Review Comment:
   Can you please remove them. I'm seeing some functions that have a comment 
for every single line of code inside which is far too verbose considering the 
comments don't add any meaningful benefit; if you are using an LLM to generate 
this code please do a pass over it to remove these comments. If these are 
handwritten comments, please consider if the code already gives us the same 
information the comment is adding on; we shouldn't see cases where every single 
line of a function has a comment explaining exactly what the code is already 
doing.
   
   This extends to tests too, for example I see this:
   
   ```rust
       #[test]
       fn test_signature_numeric_with_parameter_names() {
           // Test numeric signature with parameter names
   ```
   
   The name of the test already tells us whats happening; adding a comment with 
the exact same information adds verbosity. It might seem like a small nitpick, 
but if we keep adding comments like this over time to the codebase, we'll 
eventually have so much comment bloat all over the place.
   
   I apologize if this comes off as harsh; I'm just often seeing PRs that seem 
LLM generated and have so much unnecessary comments that weren't pruned before 
the PR was submitted.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Add PostgreSQL-style named arguments support for scalar functions [datafusion]

Reply via email to