nathaniel-d-ef opened a new pull request, #8546: URL: https://github.com/apache/arrow-rs/pull/8546
# Which issue does this PR close?

Related to: https://github.com/apache/arrow-rs/issues/4886 (“Add Avro Support”)

# Rationale for this change

This PR fixes the Avro reader and writer so that complex union types round-trip correctly and robustly. The core issue was that the previous implementation did not distinguish between logically distinct types within a union when they shared the same physical representation. As a result, valid union schemas and data were flagged as invalid, and the specific names of named type branches were lost (e.g., "Fx8" becoming "fixed").

With this PR, name and namespace data is registered in and retrieved from metadata, so that complex Avro unions can be read, converted to Arrow, and written back to Avro without validation errors or loss of type information. This behavior will be further validated in a follow-up PR that adds support for writing dense unions.

# What changes are included in this PR?

- A solution to the issue described above: name and namespace metadata is registered and retrieved to support correct naming.
- A test refactor to validate the behavior, including some reworking of TDD patterns to support the additional assertion data.
- A new test file and round-trip test to validate the Avro paradigm of reusing named types elsewhere in a schema (existing logic that lacked code coverage until now).

# Are these changes tested?

- Yes. Some existing tests had to be changed so that their assertions match, because metadata is now included where it previously was not.
- An additional test, as mentioned above, covers schema name references.

# Are there any user-facing changes?

Crate not yet public
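
To illustrate the rationale, here is a minimal, std-only sketch of the idea. It is not the code in this PR. `Branch`, `Physical`, `named_fixed`, `avro_full_name`, and `has_unique_branch_names` are hypothetical names, and the metadata keys `avro.name` and `avro.namespace` are assumptions made for this example. It shows how two fixed branches with the same physical layout collapse to the generic name "fixed" without metadata, and stay distinct once a name and namespace are recorded.

```rust
use std::collections::{HashMap, HashSet};

// Assumed metadata keys for this sketch; the keys used by the crate may differ.
const NAME_KEY: &str = "avro.name";
const NAMESPACE_KEY: &str = "avro.namespace";

/// Physical representation of a union branch (simplified stand-in for an Arrow type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Physical {
    FixedSizeBinary(usize),
    Int32,
}

/// Simplified stand-in for an Arrow field: a physical type plus string metadata.
#[derive(Debug, Clone, PartialEq)]
struct Branch {
    physical: Physical,
    metadata: HashMap<String, String>,
}

/// Build a fixed branch and register its Avro name (and optional namespace) in metadata.
fn named_fixed(name: &str, namespace: Option<&str>, size: usize) -> Branch {
    let mut metadata = HashMap::new();
    metadata.insert(NAME_KEY.to_string(), name.to_string());
    if let Some(ns) = namespace {
        metadata.insert(NAMESPACE_KEY.to_string(), ns.to_string());
    }
    Branch { physical: Physical::FixedSizeBinary(size), metadata }
}

/// Resolve the Avro name of a branch: prefer the registered name and namespace,
/// and fall back to the generic type name when no metadata is present.
fn avro_full_name(branch: &Branch) -> String {
    match branch.metadata.get(NAME_KEY) {
        Some(name) => match branch.metadata.get(NAMESPACE_KEY) {
            Some(ns) if !ns.is_empty() => format!("{}.{}", ns, name),
            _ => name.clone(),
        },
        None => match branch.physical {
            Physical::FixedSizeBinary(_) => "fixed".to_string(),
            Physical::Int32 => "int".to_string(),
        },
    }
}

/// A union is treated as valid here when no two branches resolve to the same name.
fn has_unique_branch_names(branches: &[Branch]) -> bool {
    let mut seen = HashSet::new();
    branches.iter().all(|b| seen.insert(avro_full_name(b)))
}
```

In this sketch, a union of `named_fixed("Fx8", Some("com.example"), 8)` and `named_fixed("Fx8Alt", None, 8)` passes the uniqueness check, while two metadata-free `FixedSizeBinary(8)` branches both resolve to "fixed" and are rejected.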
