[PR] [SPARK-54598] Extract logic of `read_all_udfs` [spark]

via GitHub Thu, 04 Dec 2025 16:23:04 -0800


Yicong-Huang opened a new pull request, #53330:
URL: https://github.com/apache/spark/pull/53330


   ### What changes were proposed in this pull request?
   This PR refactors the UDF reading logic in `read_udfs()` to eliminate code 
duplication. Currently, the logic for reading UDFs (each function and its 
argument offsets) is repeated in multiple `eval_type` branches, and the 
single-UDF and multi-UDF cases each use a different pattern.
   
   ### Why are the changes needed?
   
   This duplication makes the code harder to maintain and increases the risk of 
inconsistencies. By centralizing the UDF reading logic at the beginning of 
`read_udfs()`, we can:
   - Reduce code duplication (41 lines of repeated code removed; the overall 
diff is net +9 lines once the new helper function is counted)
   - Improve maintainability
   - Ensure consistent UDF reading behavior across all eval types
   - Make it easier to add new eval types in the future
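
To illustrate the shape of the refactoring, here is a minimal sketch, not the code from this PR. It assumes a made-up wire format (a count, then per UDF its argument offsets and a function id) and uses hypothetical names (`read_all_udfs_sketch`, `build_mapper`, `FUNCS`, `pack_ints`). The point is only that every UDF is read once, up front, and each `eval_type` branch then consumes the same list instead of re-reading the stream in its own way.

```python
import io
import struct
from typing import BinaryIO, Callable, Dict, List, Tuple

# Hypothetical function registry; the real worker deserializes functions instead.
FUNCS: Dict[int, Callable] = {0: lambda a, b: a + b, 1: lambda a: a * 2}


def pack_ints(*values: int) -> bytes:
    """Encode ints as big-endian int32s (illustrative format, an assumption)."""
    return struct.pack("!%di" % len(values), *values)


def read_int(stream: BinaryIO) -> int:
    return struct.unpack("!i", stream.read(4))[0]


def read_all_udfs_sketch(stream: BinaryIO) -> List[Tuple[Callable, List[int]]]:
    """Read every UDF (function plus argument offsets) in one place."""
    udfs = []
    for _ in range(read_int(stream)):
        num_args = read_int(stream)
        offsets = [read_int(stream) for _ in range(num_args)]
        func = FUNCS[read_int(stream)]
        udfs.append((func, offsets))
    return udfs


def build_mapper(eval_type: str, stream: BinaryIO) -> Callable:
    # Centralized read: done once, before branching on eval_type.
    udfs = read_all_udfs_sketch(stream)
    if eval_type == "single":
        if len(udfs) != 1:
            raise ValueError("expected exactly one UDF")
        func, offsets = udfs[0]
        return lambda row: func(*[row[o] for o in offsets])
    if eval_type == "multi":
        return lambda row: tuple(f(*[row[o] for o in offs]) for f, offs in udfs)
    raise ValueError("unknown eval_type: %r" % eval_type)
```

With this layout, adding a new eval type in the sketch means adding one branch that works on `udfs`; no new stream-parsing code is needed.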
   
   
   ### Does this PR introduce _any_ user-facing change?
   No, this is an internal refactoring that maintains backward compatibility. 
The API behavior remains the same from the user's perspective.
   
   ### How was this patch tested?
   Existing tests.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


