neilconway opened a new pull request, #22105:
URL: https://github.com/apache/datafusion/pull/22105
## Which issue does this PR close?
- Closes #12727.
## Rationale for this change
The Substrait logical-plan consumer was discarding field nullability when
reconstructing DataFusion schemas from Substrait struct types. Nullability
matters because a Substrait plan may have been produced or optimized using
non-null guarantees.
This also improves DataFusion <-> Substrait round-trip fidelity: required
fields encoded by the producer are preserved when the plan is consumed again,
instead of being widened to nullable.
## What changes are included in this PR?
- Preserve per-field nullability when converting Substrait struct types /
`NamedStruct` schemas into DataFusion schemas.
- Treat Substrait `Required` as non-nullable, and `Nullable`,
`Unspecified`, or unknown nullability values as nullable.
- Keep deprecated `UserDefinedTypeReference` non-nullable because it does
not carry nullability metadata.
- Enforce named-table `ReadRel` schema compatibility when the Substrait
schema requires a field to be non-null but the resolved DataFusion table schema
marks it nullable.
- Extend compatibility checking recursively through nested `Struct` fields.
- Leave `List` and `Map` child nullability compatibility as future work,
since their child nullability is not faithfully reconstructed today.
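
The nullability mapping and the recursive compatibility check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `Nullability`, `Type`, `Field`, `field`, `is_nullable`, and `check_nullability` are hypothetical stand-ins for the Substrait and Arrow/DataFusion types, and matching fields by position is an assumption made for brevity.

```rust
// Hypothetical stand-ins for the Substrait / Arrow types; not the real APIs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Nullability {
    Unspecified,
    Nullable,
    Required,
    // Models a value outside the known set (assumption for illustration).
    Unknown(i32),
}

/// Only `Required` maps to non-nullable; every other value is treated as nullable.
fn is_nullable(n: Nullability) -> bool {
    !matches!(n, Nullability::Required)
}

#[derive(Debug, Clone)]
enum Type {
    Int64,
    Utf8,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone)]
struct Field {
    name: String,
    ty: Type,
    nullable: bool,
}

/// Small constructor helper (hypothetical).
fn field(name: &str, ty: Type, nullable: bool) -> Field {
    Field { name: name.to_string(), ty, nullable }
}

/// Fails when the plan schema requires a non-null field that the table schema
/// marks nullable. Recurses into nested structs; list and map children are
/// deliberately not modeled here. Fields are paired by position (assumption).
fn check_nullability(plan: &[Field], table: &[Field]) -> Result<(), String> {
    for (p, t) in plan.iter().zip(table.iter()) {
        if !p.nullable && t.nullable {
            return Err(format!(
                "field '{}' is required by the plan but nullable in the table",
                p.name
            ));
        }
        if let (Type::Struct(pc), Type::Struct(tc)) = (&p.ty, &t.ty) {
            check_nullability(pc, tc)?;
        }
    }
    Ok(())
}
```

In this sketch the check is one-directional: a plan field that is nullable is compatible with a non-nullable table field, while the reverse is rejected.
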
## Are these changes tested?
Yes; new tests added.
## Are there any user-facing changes?
Yes. Consuming Substrait plans is now somewhat stricter, which can prevent
problems: for example, if a Substrait plan was produced under the assumption
that a field `x` is non-nullable but the local DataFusion schema allows null
values in `x`, executing the plan might produce unexpected results.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]