kosiew opened a new pull request, #19955:
URL: https://github.com/apache/datafusion/pull/19955
## Which issue does this PR close?
* Closes #19841.
## Rationale for this change
Struct-to-struct casting previously fell back to **positional mapping** when
there was **no field-name overlap** and the number of fields matched. That
behavior is ambiguous and can silently produce incorrect results when
source/target schemas have different field naming conventions or ordering.
This PR makes struct casting **strictly name-based**: when there is no
overlap in field names between the source and target structs, the cast is
rejected with a clear planning error. This prevents accidental, hard-to-detect
data corruption and forces callers to provide explicit field names (or align
schemas) when casting.
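To make the hazard concrete, here is a minimal sketch of what a positional fallback does when names do not overlap. It is a toy model in plain Rust, not DataFusion code: `StructValue` and `cast_positional` are hypothetical names introduced only for illustration.

```rust
// Hypothetical, simplified model: a struct value is a list of
// (field name, value) pairs. Real DataFusion code works on Arrow arrays.
type StructValue = Vec<(String, i64)>;

/// Positional fallback (the behavior this PR removes, in toy form):
/// target names are paired with source values by index, and the
/// source field names are ignored entirely.
fn cast_positional(source: &StructValue, target_fields: &[&str]) -> StructValue {
    target_fields
        .iter()
        .zip(source.iter())
        .map(|(name, (_, value))| (name.to_string(), *value))
        .collect()
}

fn main() {
    let source: StructValue = vec![
        ("width".to_string(), 10),
        ("height".to_string(), 20),
    ];
    // The target uses different names, listed in the opposite order.
    let out = cast_positional(&source, &["h", "w"]);
    // The value that was `width` is now labeled `h`, with no error raised.
    println!("{:?}", out);
}
```

Because the field counts match, the cast succeeds and the values end up under the wrong labels, which is the silent mismatch described above.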
## What changes are included in this PR?
* Removed the positional fallback logic for struct casting in
`cast_struct_column`; child fields are now resolved **only by name**.
* Updated `validate_struct_compatibility` to **error out** when there is
**no field name overlap**, instead of allowing positional compatibility checks.
* Updated unit tests to reflect the new behavior (no-overlap casts now fail
with an appropriate error).
* Updated SQLLogicTest files to construct structs using **explicit field
names** (e.g. `{id: 1}` / `{a: 1, b: 'x'}` or `struct(… AS field)`), avoiding
reliance on positional behavior.
* Improved error messaging to explicitly mention the lack of field name
overlap.
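The name-only resolution described in the list above can be sketched roughly as follows. This is a simplified stand-in using plain string slices, not the actual `cast_struct_column` or `validate_struct_compatibility` code; the function name `resolve_by_name` and the exact error wording are assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical sketch of name-based resolution. For each target field,
/// returns the index of the source field with the same name, or `None`
/// when the source has no such field. Fails when no name is shared at all.
fn resolve_by_name(source: &[&str], target: &[&str]) -> Result<Vec<Option<usize>>, String> {
    let source_names: HashSet<&str> = source.iter().copied().collect();

    // Strict rule: at least one target name must exist in the source.
    let has_overlap = target.iter().any(|name| source_names.contains(name));
    if !has_overlap {
        // Illustrative message only; the real error text may differ.
        return Err(format!(
            "cannot cast struct: no field name overlap between source [{}] and target [{}]",
            source.join(", "),
            target.join(", ")
        ));
    }

    // Resolve every target field by name only, never by position.
    Ok(target
        .iter()
        .map(|t| source.iter().position(|s| s == t))
        .collect())
}

fn main() {
    // Partial overlap: `b` is found at source index 1, `c` has no match.
    println!("{:?}", resolve_by_name(&["a", "b"], &["b", "c"]));
    // No overlap: rejected even though the field counts match.
    println!("{:?}", resolve_by_name(&["a", "b"], &["x", "y"]));
}
```

How an unmatched target field (`None` here) is filled is outside the scope of this sketch.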
## Are these changes tested?
Yes.
* Updated existing Rust unit tests in `nested_struct.rs` to assert the new
failure mode and error message.
* Updated SQLLogicTest coverage (`struct.slt`, `joins.slt`) to use named
struct literals so tests continue to validate struct behavior without
positional casting.
## Are there any user-facing changes?
Yes. This is a behavioral change and a potential breaking change.
* Casting between two `STRUCT`s with **no overlapping field names** now
fails (previously it could succeed via positional mapping if field counts
matched).
* Users must provide explicit field names (e.g. `{a: 1, b: 'x'}` or
`struct(expr AS a, expr AS b)`) or ensure schemas share field names.
* Error messages are more explicit: casts are rejected when there is “no
field name overlap”.
## LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content
has been manually reviewed and tested.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]