bubulalabu opened a new pull request, #18019:
URL: https://github.com/apache/datafusion/pull/18019

   ## Which issue does this PR close?
   
   Closes #17379.
   
   ## Rationale for this change
   
   PostgreSQL supports named arguments for function calls using the syntax 
`function_name(param => value)`, which improves code readability and allows 
arguments to be specified in any order. DataFusion should support this syntax 
to enhance the user experience, especially for functions with many optional 
parameters.
   
   ## What changes are included in this PR?
   
   This PR implements PostgreSQL-style named arguments for scalar functions.
   
   **Features:**
   - Parse named arguments from SQL (param => value syntax)
   - Resolve named arguments to positional order before execution
   - Support mixed positional and named arguments
   - Store parameter names in function signatures
   - Show parameter names in error messages
   
   **Limitations:**
   - Named arguments only work for functions with known arity (fixed number of 
parameters)
   - Variadic functions (like `concat`) cannot use named arguments as they 
accept variable numbers of arguments
   - Supported signature types: `Exact`, `Uniform`, `Any`, `Coercible`, 
`Comparable`, `Numeric`, `String`, `Nullary`, `ArraySignature`, and `OneOf` 
(combinations of these)
   - Not supported: `Variadic`, `VariadicAny`, `UserDefined`
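
The fixed-arity restriction above can be illustrated with a small sketch. `TypeSig` and `fixed_arities` are hypothetical, simplified stand-ins (type payloads are dropped and only a few variants are modeled); they are not DataFusion's actual `TypeSignature` API. Rejecting a `OneOf` that contains any variadic alternative is an assumption made here for simplicity.

```rust
/// Hypothetical, simplified model of a type signature: only the argument
/// count is kept, not the argument types.
#[derive(Debug, Clone)]
pub enum TypeSig {
    Exact(usize),
    Uniform(usize),
    Any(usize),
    Nullary,
    OneOf(Vec<TypeSig>),
    Variadic,
    VariadicAny,
    UserDefined,
}

/// Returns the sorted set of argument counts the signature accepts, or
/// `None` when the count is not fixed (so named arguments cannot apply).
pub fn fixed_arities(sig: &TypeSig) -> Option<Vec<usize>> {
    match sig {
        TypeSig::Exact(n) | TypeSig::Uniform(n) | TypeSig::Any(n) => Some(vec![*n]),
        TypeSig::Nullary => Some(vec![0]),
        TypeSig::OneOf(alternatives) => {
            let mut all = Vec::new();
            for alt in alternatives {
                // Assumption: one variadic alternative disqualifies the whole set.
                all.extend(fixed_arities(alt)?);
            }
            all.sort();
            all.dedup();
            Some(all)
        }
        TypeSig::Variadic | TypeSig::VariadicAny | TypeSig::UserDefined => None,
    }
}
```

Under this model, a `substr`-like signature `OneOf([Any(2), Any(3)])` yields the arities 2 and 3, while a `concat`-like `Variadic` signature yields `None`.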
   
   **Implementation:**
   - Added argument resolution logic with validation
   - Extended Signature with parameter_names field
   - Updated SQL parser to handle named argument syntax
   - Integrated into physical planning phase
   - Added comprehensive tests and documentation
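
As a rough illustration of the resolution step, the sketch below reorders named arguments into positional order and rejects the error cases covered by the tests (positional after named, unknown parameter, duplicate parameter). `Arg` and `resolve_args` are hypothetical names, values are plain strings rather than expressions, and the exact-match name comparison is an assumption; the PR's actual implementation may differ.

```rust
/// Hypothetical call-site argument: positional, or `name => value`.
#[derive(Debug, Clone, PartialEq)]
pub enum Arg {
    Positional(String),
    Named(String, String),
}

/// Reorders `args` to follow `param_names` and returns the values in
/// positional order. Trailing parameters may be omitted; gaps may not.
pub fn resolve_args(param_names: &[&str], args: &[Arg]) -> Result<Vec<String>, String> {
    let mut slots: Vec<Option<String>> = vec![None; param_names.len()];
    let mut seen_named = false;

    for (i, arg) in args.iter().enumerate() {
        match arg {
            Arg::Positional(value) => {
                if seen_named {
                    return Err("positional argument follows named argument".to_string());
                }
                if i >= slots.len() {
                    return Err(format!("expected at most {} arguments", slots.len()));
                }
                // Positional arguments all precede named ones, so the
                // call-site index is also the parameter index.
                slots[i] = Some(value.clone());
            }
            Arg::Named(name, value) => {
                seen_named = true;
                let idx = param_names
                    .iter()
                    .position(|p| *p == name.as_str())
                    .ok_or_else(|| format!("unknown parameter '{}'", name))?;
                if slots[idx].is_some() {
                    return Err(format!("parameter '{}' specified multiple times", name));
                }
                slots[idx] = Some(value.clone());
            }
        }
    }

    // Every filled slot must belong to a contiguous prefix.
    let filled = slots.iter().take_while(|s| s.is_some()).count();
    if slots[filled..].iter().any(|s| s.is_some()) {
        return Err(format!("missing value for parameter '{}'", param_names[filled]));
    }
    Ok(slots.into_iter().flatten().collect())
}
```

With parameter names `["str", "start_pos", "length"]`, the call `substr(length => 5, str => 'hello world', start_pos => 7)` resolves to the order `str`, `start_pos`, `length`, after which the usual signature matching can proceed on positional arguments.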
   
   **Example usage:**
   ```sql
   -- All named arguments
   SELECT substr(str => 'hello world', start_pos => 7, length => 5);
   
   -- Mixed positional and named arguments
   SELECT substr('hello world', start_pos => 7, length => 5);
   
   -- Named arguments in any order
   SELECT substr(length => 5, str => 'hello world', start_pos => 7);
   ```
   
   **Improved error messages:**
   
   Before this PR, error messages showed generic types:
   ```
   Candidate functions:
       substr(Any, Any)
       substr(Any, Any, Any)
   ```
   
   After this PR, error messages show parameter names:
   ```
   Candidate functions:
       substr(str, start_pos)
       substr(str, start_pos, length)
   ```
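
A minimal sketch of how such candidate lines could be rendered, preferring parameter names when they are available and falling back to type placeholders otherwise. The `candidate` helper and its parameters are hypothetical and only mirror the before/after output shown above.

```rust
/// Renders one "Candidate functions" line. When parameter names are known,
/// the first `arg_types.len()` names are shown; otherwise the type
/// placeholders are used, matching the pre-PR output.
pub fn candidate(func: &str, arg_types: &[&str], param_names: Option<&[&str]>) -> String {
    let parts: Vec<&str> = match param_names {
        Some(names) if names.len() >= arg_types.len() => names[..arg_types.len()].to_vec(),
        _ => arg_types.to_vec(),
    };
    format!("{}({})", func, parts.join(", "))
}
```

Calling `candidate("substr", &["Any", "Any"], None)` gives the old form, while passing the names `str`, `start_pos`, `length` gives the new one for the same two-argument variant.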
   
   Example error output:
   ```
   datafusion % target/debug/datafusion-cli
   DataFusion CLI v50.1.0
   > SELECT substr(str => 'hello world');
   Error during planning: Internal error: Function 'substr' failed to match any 
signature, errors: Error during planning: The function 'substr' expected 2 
arguments but received 1,Error during planning: The function 'substr' expected 
3 arguments but received 1.
   This issue was likely caused by a bug in DataFusion's code. Please help us 
to resolve this by filing a bug report in our issue tracker: 
https://github.com/apache/datafusion/issues No function matches the given name 
and argument types 'substr(Utf8)'. You might need to add explicit type casts.
           Candidate functions:
           substr(str, start_pos)
           substr(str, start_pos, length)
   ```
   
   ## Are these changes tested?
   
   Yes, comprehensive tests are included:
   
   1. **Unit tests** (20 tests total):
      - Argument validation and reordering logic (8 tests in `udf.rs`)
      - Error message formatting with parameter names (2 tests in `utils.rs`)
      - TypeSignature parameter name support for all fixed-arity variants 
including ArraySignature (10 tests in `signature.rs`)
   
   2. **Integration tests** (`named_arguments.slt`):
      - Positional arguments (baseline)
      - Named arguments in order
      - Named arguments out of order
      - Mixed positional and named arguments
      - Optional parameters
      - Function aliases
      - Error cases (positional after named, unknown parameter, duplicate 
parameter)
      - Error message format verification
   
   All tests pass successfully.
   
   ## Are there any user-facing changes?
   
   **Yes**, this PR adds new user-facing functionality:
   
   1. **New SQL syntax**: Users can now call functions with named arguments 
using `param => value` syntax (only for functions with fixed arity)
   2. **Improved error messages**: Signature mismatch errors now display 
parameter names instead of generic types
   3. **UDF API**: Function authors can add parameter names to their functions 
using:
      ```rust
      signature: Signature::uniform(2, vec![DataType::Float64], 
Volatility::Immutable)
          .with_parameter_names(vec!["base".to_string(), 
"exponent".to_string()])
          .expect("valid parameter names")
      ```
   
   **Potential breaking change** (unlikely to affect users): this PR adds a new 
public field `parameter_names: Option<Vec<String>>` to the `Signature` struct. 
Code that constructs `Signature` with struct literal syntax will no longer 
compile. This is rare in practice because:
   - `Signature` is almost always constructed using builder methods 
(`Signature::exact()`, `Signature::uniform()`, etc.)
   - The new field defaults to `None`, maintaining existing behavior
   - Existing code using builder methods continues to work without modification
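
To make the compatibility argument concrete, here is a sketch using a hypothetical stand-in struct `Sig` (not DataFusion's `Signature`). Constructor-style code is unaffected by the new field, while a struct literal that omits it is rejected by the compiler. The validation inside `with_parameter_names` (name count must match the arity, no duplicates) is an assumption about why the real method returns a `Result`.

```rust
/// Hypothetical stand-in for a signature with a newly added public field.
#[derive(Debug, Clone, PartialEq)]
pub struct Sig {
    pub arity: usize,
    /// The new field; constructors default it to `None`.
    pub parameter_names: Option<Vec<String>>,
}

impl Sig {
    /// Constructor-style API: callers never spell out the field list,
    /// so adding a field does not break them.
    pub fn uniform(arity: usize) -> Self {
        Sig { arity, parameter_names: None }
    }

    /// Attaches parameter names. The checks here are assumed, not taken
    /// from the PR.
    pub fn with_parameter_names(mut self, names: Vec<String>) -> Result<Self, String> {
        if names.len() != self.arity {
            return Err(format!(
                "expected {} parameter names, got {}",
                self.arity,
                names.len()
            ));
        }
        for (i, name) in names.iter().enumerate() {
            if names[..i].contains(name) {
                return Err(format!("duplicate parameter name '{}'", name));
            }
        }
        self.parameter_names = Some(names);
        Ok(self)
    }
}

// A struct literal written before the field existed no longer compiles:
//     let s = Sig { arity: 2 };
//     // error[E0063]: missing field `parameter_names` in initializer of `Sig`
```

Only the commented-out literal form is affected; `Sig::uniform(2)` keeps working and yields `parameter_names == None`.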
   
   **No other breaking changes**: The feature is purely additive - existing SQL 
queries and UDF implementations work without modification.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

