harshitsaini17 opened a new pull request, #19184:
URL: https://github.com/apache/datafusion/pull/19184

   # fix: shuffle should report nullability correctly
   
   - Replace return_type with return_field_from_args to preserve input 
nullability
   - Add test to verify nullability is correctly reported
   - Addresses issue #19145
   
   ## Which issue does this PR close?
   
   Closes #19145
   
   ## Rationale for this change
   
   The `shuffle` UDF was using the default `is_nullable` implementation, which 
always returns `true` regardless of the input array's nullability. This causes:
   1. Incorrect schema inference - non-nullable inputs are incorrectly marked 
as nullable
   2. Missed optimization opportunities - the query optimizer cannot apply 
certain optimizations when nullability information is incorrect
   3. Potential runtime errors - incorrect metadata can lead to unexpected 
behavior in downstream operations
   
   The shuffle function simply reorders elements within an array without 
changing the array's structure or nullability, so the output should have the 
same nullability as the input.
   
   ## What changes are included in this PR?
   
   1. **Implemented `return_field_from_args`**: Returns the input field 
directly, preserving both data type and nullability
   2. **Updated `return_type`**: Now returns an error directing users to use 
`return_field_from_args` instead (following DataFusion best practices)
   3. **Added a test**: Verifies that both nullable and non-nullable inputs 
are handled correctly
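   
   The first two changes can be sketched roughly as follows. This is a minimal, 
standalone model, not the code in this PR: `Field` and `DataType` are simplified 
stand-ins for the Arrow types, and the three function names are hypothetical. 
It only illustrates the difference between always reporting a nullable output 
and passing the input field through unchanged.

```rust
// Simplified stand-ins for Arrow's `Field` and `DataType` (assumption:
// these are NOT the real Arrow/DataFusion types, only a model of them).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    List(Box<DataType>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Models the previous behavior: the output is reported as nullable
/// no matter what the input field says. (Hypothetical name.)
fn shuffle_field_always_nullable(args: &[Field]) -> Result<Field, String> {
    let input = args.first().ok_or("shuffle expects one argument")?;
    Ok(Field { nullable: true, ..input.clone() })
}

/// Models the fix: return the input field as-is, so both the data type
/// and the nullability flag are preserved. (Hypothetical name.)
fn shuffle_return_field(args: &[Field]) -> Result<Field, String> {
    match args {
        [input] => Ok(input.clone()),
        _ => Err(format!("shuffle expects exactly 1 argument, got {}", args.len())),
    }
}

/// Models the updated `return_type`: it always fails and points the
/// caller at the field-based method. (Hypothetical name.)
fn shuffle_return_type(_arg_types: &[DataType]) -> Result<DataType, String> {
    Err("return_field_from_args should be used instead".to_string())
}

fn main() {
    let list_of_i64 = DataType::List(Box::new(DataType::Int64));
    let non_nullable = Field {
        name: "arr".to_string(),
        data_type: list_of_i64.clone(),
        nullable: false,
    };
    let nullable = Field { nullable: true, ..non_nullable.clone() };

    // Previous behavior: a non-nullable input is reported as nullable.
    assert!(shuffle_field_always_nullable(&[non_nullable.clone()]).unwrap().nullable);

    // Fixed behavior: nullability and data type follow the input.
    assert_eq!(shuffle_return_field(&[non_nullable.clone()]).unwrap(), non_nullable);
    assert_eq!(shuffle_return_field(&[nullable.clone()]).unwrap(), nullable);

    // The type-only entry point is no longer usable on its own.
    assert!(shuffle_return_type(&[list_of_i64]).is_err());
}
```

   Returning the whole field, rather than rebuilding one from the data type, 
keeps the nullability flag from being dropped along the way.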
   
   ## Are these changes tested?
   
   Yes, this PR includes a new test `test_shuffle_nullability` that verifies:
   - Non-nullable array input produces non-nullable output
   - Nullable array input produces nullable output
   - Data types are preserved correctly in both cases
   
   Test results:
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

