nathaniel-d-ef opened a new pull request, #8546:
URL: https://github.com/apache/arrow-rs/pull/8546

   # Which issue does this PR close?
   
   Related to: https://github.com/apache/arrow-rs/issues/4886 (“Add Avro 
Support”)
   
   # Rationale for this change
   
   This PR fixes the Avro reader and writer so that complex union types 
round-trip correctly. The previous implementation could not distinguish between 
logically distinct types within a union when they shared the same physical 
representation. As a result, valid union schemas and data were flagged as 
invalid, and the specific names of named type branches were lost (e.g., "Fx8" 
becoming "fixed").
   
   This PR registers name and namespace data in metadata and retrieves it from 
there, so that complex Avro unions can be read, converted to Arrow, and written 
back to Avro without validation errors or loss of type information.
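   
   The idea can be illustrated with a minimal, standalone sketch. It is not the 
code in this PR: the metadata keys, `branch_name`, and `has_unique_branches` are 
hypothetical names chosen for illustration, and plain `HashMap`s stand in for 
Arrow field metadata. It shows how a name stored in metadata lets two `fixed` 
branches stay distinct, whereas falling back to the physical type name makes 
them collide.

   ```rust
   use std::collections::{HashMap, HashSet};

   // Hypothetical metadata keys, assumed here for illustration only.
   const NAME_KEY: &str = "avro.name";
   const NAMESPACE_KEY: &str = "avro.namespace";

   /// Returns the name used to identify a union branch: the full name
   /// recorded in metadata if present, otherwise the physical type name.
   fn branch_name(physical: &str, metadata: &HashMap<String, String>) -> String {
       match (metadata.get(NAME_KEY), metadata.get(NAMESPACE_KEY)) {
           (Some(name), Some(ns)) if !ns.is_empty() => format!("{ns}.{name}"),
           (Some(name), _) => name.clone(),
           (None, _) => physical.to_string(),
       }
   }

   /// A union is accepted in this sketch only if every branch resolves
   /// to a different name.
   fn has_unique_branches(branches: &[(&str, HashMap<String, String>)]) -> bool {
       let mut seen = HashSet::new();
       branches
           .iter()
           .all(|(physical, metadata)| seen.insert(branch_name(physical, metadata)))
   }
   ```

   With metadata present, branches named "Fx8" and "Fx16" both resolve to their 
own names and the union passes the uniqueness check; with empty metadata both 
resolve to "fixed" and the check fails.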
   
   This behavior will be further validated in a follow-up PR to add support for 
writing dense unions.
   
   # What changes are included in this PR?
   
   - A fix for the issue described above: name and namespace metadata is 
registered and retrieved so that named types keep their correct names.
   - Test refactor to validate the behavior, including some reworking of TDD 
patterns to support the additional assertion data.
   - A new test file and round-trip test covering Avro's reuse of named types 
elsewhere in a schema (the logic already existed but lacked test coverage until 
now).
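   
   For readers unfamiliar with named-type reuse in Avro, here is a rough sketch 
of the concept: a named type is defined once and later referred to by name. The 
`Schema` enum and `resolve` function are hypothetical simplifications, not the 
arrow-avro types, and the sketch only handles a flat list of field schemas.

   ```rust
   use std::collections::HashMap;

   // Hypothetical, simplified schema model for illustration only.
   #[derive(Debug, Clone, PartialEq)]
   enum Schema {
       Long,
       // A named type defined inline, e.g. a fixed type called "Fx8".
       Fixed { name: String, size: usize },
       // A later reference to a previously defined named type.
       Ref(String),
   }

   /// Replaces each name reference with the definition seen earlier.
   /// A reference to a name that has not been defined yet is an error.
   fn resolve(fields: &[Schema]) -> Result<Vec<Schema>, String> {
       let mut seen: HashMap<String, Schema> = HashMap::new();
       let mut out = Vec::with_capacity(fields.len());
       for field in fields {
           match field {
               Schema::Fixed { name, .. } => {
                   seen.insert(name.clone(), field.clone());
                   out.push(field.clone());
               }
               Schema::Ref(name) => {
                   let def = seen
                       .get(name)
                       .ok_or_else(|| format!("unknown named type: {name}"))?;
                   out.push(def.clone());
               }
               other => out.push(other.clone()),
           }
       }
       Ok(out)
   }
   ```

   In this sketch, a field list that defines "Fx8" and then references it 
resolves both entries to the same definition, while a reference to an undefined 
name returns an error.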
   
   # Are these changes tested?
   
   - Yes. Some existing tests had to be updated so that their assertions match, 
since metadata is now included where it previously wasn't.
   - An additional test covers schema name references, as mentioned above.
   
   # Are there any user-facing changes?
   
   Crate not yet public
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

