Shubhambhusate opened a new pull request, #53542:
URL: https://github.com/apache/spark/pull/53542

   ### What changes were proposed in this pull request?
   
   **Changes made:**
   
   - Added a new error condition in `error-conditions.json`: `UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF`, a clear error message for SQL UDFs used in lambda functions.
   - Modified `SessionCatalog.scala`:
     - Added detection of `NamedLambdaVariable` and `UnresolvedNamedLambdaVariable` in SQL function inputs.
     - Added checks in both the `makeSQLFunctionBuilder` and `makeSQLFunctionPlan` methods to throw a descriptive error when lambda variables are detected.
   - Added a test case in `SQLFunctionSuite.scala` that verifies the new error is thrown when a SQL UDF is used inside a higher-order function.
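
The detection step described above can be sketched in miniature. This is a toy Python model, not the Scala code in `SessionCatalog.scala`: the `Expr` classes, `contains_lambda_variable`, `check_sql_udf_inputs`, and `UnsupportedFeatureError` are hypothetical names that stand in for Catalyst expressions and the added checks. Only the error condition name is taken from this PR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expr:
    """Toy stand-in for a Catalyst expression node (hypothetical)."""
    children: List["Expr"] = field(default_factory=list)


@dataclass
class Literal(Expr):
    value: object = None


@dataclass
class Cast(Expr):
    to_type: str = "string"


@dataclass
class NamedLambdaVariable(Expr):
    name: str = ""


@dataclass
class UnresolvedNamedLambdaVariable(Expr):
    name: str = ""


class UnsupportedFeatureError(Exception):
    """Hypothetical error type carrying an error condition name."""

    def __init__(self, condition: str, message: str):
        super().__init__(f"[{condition}] {message}")
        self.condition = condition


def contains_lambda_variable(expr: Expr) -> bool:
    """Return True if expr or any of its descendants is a lambda variable."""
    if isinstance(expr, (NamedLambdaVariable, UnresolvedNamedLambdaVariable)):
        return True
    return any(contains_lambda_variable(child) for child in expr.children)


def check_sql_udf_inputs(function_name: str, inputs: List[Expr]) -> None:
    """Raise a descriptive error if any UDF input references a lambda variable."""
    if any(contains_lambda_variable(e) for e in inputs):
        raise UnsupportedFeatureError(
            "UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF",
            f"Lambda function with SQL UDF {function_name} "
            "in a higher order function.",
        )
```

In this model the check walks the whole input tree, so a lambda variable wrapped in another expression (for example a cast) is still caught before the confusing missing-attribute error can surface later.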
   
   
   
   ### Why are the changes needed?
   
   Currently, when a SQL UDF is used inside a higher-order function like 
transform, the error message is confusing:
   ```sql
   CREATE FUNCTION lower_udf(s STRING) RETURNS STRING RETURN lower(s);
   SELECT transform(array('A', 'B'), x -> lower_udf(x));
   ```
   **Before (confusing error):**
   
   [MISSING_ATTRIBUTES.RESOLVED_ATTRIBUTE_MISSING_FROM_INPUT] Resolved attribute(s) "x" missing from in operator !Project [cast(lambda x#20395 as string) AS s#20397]. SQLSTATE: XX000
   
   <img width="1728" height="427" alt="Screenshot 2025-12-18 at 6 13 29 PM" src="https://github.com/user-attachments/assets/8d7e79dd-bd86-4199-8b16-fae0b9313d46" />
   
   
   This error doesn't explain why the attribute is missing or what the user 
should do.
   
   **After (clear error):**
   
   [UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF] The feature is not supported: Lambda function with SQL UDF "spark_catalog.default.lower_udf(lambda x)" in a higher order function. SQLSTATE: 0A000
   
   <img width="1728" height="314" alt="Screenshot 2025-12-18 at 6 14 11 PM" src="https://github.com/user-attachments/assets/76b30d2d-1c3a-4a8d-8feb-65a5295d6d35" />
   
   
   This is consistent with the existing error message for Python UDFs in the 
same scenario (UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_PYTHON_UDF).
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Users will now see a clearer, more actionable error message when 
attempting to use a SQL UDF inside a higher-order function's lambda expression.
   
   ### How was this patch tested?
   **Test 1:**
   
   Added a new test case, "SQL UDF in higher-order function should fail with clear error message", in `SQLFunctionSuite.scala` that:
   - Creates a SQL UDF
   - Attempts to use it in a `transform` higher-order function
   - Verifies the error condition is `UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF`
   - Verifies the error message contains the function name and `lambda x`
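
The assertions listed above can be illustrated with a plain-Python sketch that checks an error string of the shape shown in this description. It is not the Scala test in `SQLFunctionSuite.scala`; `parse_error` and `is_sql_udf_lambda_error` are hypothetical helpers, and the `[CONDITION] message SQLSTATE: code` layout is assumed from the two example messages quoted earlier.

```python
import re

# Assumed layout, based on the messages quoted in this description:
#   [ERROR_CONDITION] free-form message SQLSTATE: XXXXX
ERROR_PATTERN = re.compile(
    r"^\[(?P<condition>[A-Z0-9_.]+)\]\s*"
    r"(?P<message>.*?)\s*"
    r"SQLSTATE:\s*(?P<sqlstate>[0-9A-Z]{5})\s*$",
    re.DOTALL,
)


def parse_error(text: str) -> dict:
    """Split an error string into condition, message, and SQLSTATE."""
    match = ERROR_PATTERN.match(text.strip())
    if match is None:
        raise ValueError("text does not look like a Spark-style error message")
    return match.groupdict()


def is_sql_udf_lambda_error(text: str, function_name: str) -> bool:
    """Check the condition name, the function name, and the lambda variable."""
    parsed = parse_error(text)
    return (
        parsed["condition"] == "UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF"
        and function_name in parsed["message"]
        and "lambda x" in parsed["message"]
    )
```

A helper like this could be pointed at the output of the manual test to confirm that the new condition, rather than the old missing-attribute error, is reported.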
   
   **Test 2:**
   
   Manual testing, running these two statements in order:
   
   `spark.sql("CREATE OR REPLACE FUNCTION test_lower_udf(s STRING) RETURNS STRING RETURN lower(s)")`
   
   `spark.sql("SELECT transform(array('A', 'B'), x -> test_lower_udf(x))").show()`
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

