dejankrak-db opened a new pull request, #56239:
URL: https://github.com/apache/spark/pull/56239

   ### What changes were proposed in this pull request?
   
   A parameterless built-in function (`current_user`, `current_date`, 
`current_time`, `current_timestamp`, `user`, `session_user`, `grouping__id`) 
now takes precedence over a SQL UDF parameter that shares its name. Previously 
the UDF parameter alias shadowed the built-in, e.g. `CREATE FUNCTION 
f(current_user STRING) RETURNS STRING RETURN current_user; SELECT f('alice')` 
returned `'alice'` instead of the actual current user.
   
   SQL UDF input-parameter aliases are marked with a named `Metadata` key, 
`SessionCatalog.SQL_FUNCTION_PARAMETER_ALIAS_METADATA_KEY`. Producers are 
`SessionCatalog.makeSQLFunctionPlan` and `makeSQLTableFunctionPlan` (CALL time, 
scalar and table UDFs) and `CreateSQLFunctionCommand` (CREATE time). Storing 
the marker in `Metadata` (rather than a `TreeNodeTag`) lets it propagate 
automatically through `Alias.toAttribute` and `AttributeReference` copies.
   
   In `ColumnResolutionHelper`, when name-based resolution returns an attribute 
carrying this metadata, `LiteralFunctionResolution` is preferred over it. Real 
columns from relations (no marker) still win, preserving the overall precedence 
(column > parameterless function > UDF parameter).
   
   The new behavior is gated behind a legacy kill-switch, 
`spark.sql.legacy.allowUdfParameterToShadowParameterlessFunction` (default 
`false`); set to `true` to restore the previous behavior.
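
   The precedence described above (column > parameterless function > UDF 
parameter, with the legacy conf flipping the last two) can be sketched as a toy 
model. This is only an illustration, not Spark's implementation: `Scope`, 
`resolve`, `PARAMETERLESS_BUILTINS`, `legacy_shadowing`, and the value 
`"session_owner"` are hypothetical names invented for the sketch, and the real 
logic lives in `ColumnResolutionHelper`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical stand-in for Spark's parameterless built-ins (toy values only).
PARAMETERLESS_BUILTINS: Dict[str, Callable[[], str]] = {
    "current_user": lambda: "session_owner",
}


@dataclass
class Scope:
    # Real relation columns: attributes without the parameter-alias marker.
    columns: Dict[str, str] = field(default_factory=dict)
    # SQL UDF parameters: attributes that would carry the metadata marker.
    udf_params: Dict[str, str] = field(default_factory=dict)
    # Models spark.sql.legacy.allowUdfParameterToShadowParameterlessFunction.
    legacy_shadowing: bool = False


def resolve(name: str, scope: Scope) -> Optional[str]:
    """Resolve a bare identifier using the precedence from the PR text."""
    key = name.lower()
    # 1. A real column always wins.
    if key in scope.columns:
        return scope.columns[key]
    is_param = key in scope.udf_params
    # 2. Legacy mode: the UDF parameter shadows the built-in, as before.
    if is_param and scope.legacy_shadowing:
        return scope.udf_params[key]
    # 3. New default: a parameterless built-in beats a marked UDF parameter.
    if key in PARAMETERLESS_BUILTINS:
        return PARAMETERLESS_BUILTINS[key]()
    # 4. Otherwise fall back to the UDF parameter, if there is one.
    if is_param:
        return scope.udf_params[key]
    return None
```

   In this model, resolving `current_user` with a parameter bound to `'alice'` 
yields the built-in's value by default and `'alice'` only when 
`legacy_shadowing` is set, mirroring the `f('alice')` example above.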
   
   ### Why are the changes needed?
   
   To match the documented SQL name resolution rules: a parameterless built-in 
function should not be shadowed by a same-named UDF parameter.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. A SQL UDF parameter named like a parameterless built-in function no 
longer shadows that function in the function body. The legacy conf restores the 
old behavior.
   
   ### How was this patch tested?
   
   Added golden-file tests `sql-udf-name-precedence(.legacy)` and 
`parameterless-function-name-precedence(.legacy)` covering column > param, LCA 
> param, outer-ref > param, parameterless function > param (scalar and table 
UDF), param > session variable, nested-UDF inner-scope binding, and the legacy 
(flag-on) behavior. All four golden files (each test plus its `.legacy` 
variant) pass.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Yes, co-authored using Claude Code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

