jecsand838 opened a new pull request, #8075:
URL: https://github.com/apache/arrow-rs/pull/8075

   # Which issue does this PR close?
   
   - Part of https://github.com/apache/arrow-rs/issues/4886
   
   # Rationale for this change
   
   This change introduces functionality to convert an `ArrowSchema` into an 
`AvroSchema`. This is a crucial step toward better interoperability between 
Arrow and Avro. Direct schema conversion streamlines data pipelines that write 
Arrow data to Avro-compatible formats. Furthermore, it lets us create reader 
`AvroSchema` instances directly from an arrow-rs `Schema`, which makes schema 
evolution easier to support.
   
   These updates are also foundational to the upcoming `arrow-avro` `Writer`.
   
   # What changes are included in this PR?
   
   - **`TryFrom<&ArrowSchema> for AvroSchema`**: The core of this PR is the 
implementation of the `TryFrom` trait to allow a fallible conversion from an 
`ArrowSchema` reference to a new `AvroSchema`.
   - **Type Mapping Logic**: Added comprehensive logic to map Arrow `DataType` 
variants to their corresponding Avro type representations. This includes:
       - Primitive types (`Boolean`, `Int`, `Float`, `Binary`, `Utf8`).
       - Logical types (e.g., `Timestamp`, `Date`, `Decimal`, `UUID`).
       - Complex types (`Struct`, `List`, `Map`, `Dictionary`). Dictionaries 
are converted to Avro `enum` types.
   - **Name Sanitization**: Implemented a `NameGenerator` to ensure that field 
names derived from the `ArrowSchema` are valid according to Avro naming 
conventions and are unique within their scope.
   - **Metadata Handling**: The conversion preserves relevant metadata from the 
Arrow schema.
   - **Metadata Constants**: Added `arrow-avro` metadata constants to simplify 
working with Avro metadata in Arrow `DataType`s.
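
To make the type-mapping bullet above concrete, here is a minimal sketch of how such a fallible mapping could be structured. It is a toy model, not the code in this PR: `ToyArrowType` and `to_avro_json` are hypothetical names, the enum covers only a handful of types, and the output is a hand-built JSON string rather than an `AvroSchema`. The Avro type names (`int`, `long`, `bytes`, `date`, `timestamp-micros`, `enum`) come from the Avro specification; which Arrow type the PR actually maps to each of them is an assumption here.

```rust
// Toy model only: these are hypothetical stand-ins, not the arrow-rs
// `DataType` or the arrow-avro `AvroSchema` types.
#[derive(Debug, Clone, PartialEq)]
enum ToyArrowType {
    Boolean,
    Int32,
    Int64,
    Float32,
    Float64,
    Binary,
    Utf8,
    Date32,
    TimestampMicros,
    List(Box<ToyArrowType>),
    // Dictionary values become the symbols of an Avro enum.
    Dictionary { name: String, symbols: Vec<String> },
    // Stand-in for an Arrow type that has no mapping in this sketch.
    Unsupported(&'static str),
}

/// Render Avro schema JSON for a toy Arrow type, or return an error message.
/// The `Result` mirrors the fallible nature of a `TryFrom` conversion.
fn to_avro_json(dt: &ToyArrowType) -> Result<String, String> {
    use ToyArrowType::*;
    Ok(match dt {
        // Primitive types map to bare Avro type names.
        Boolean => r#""boolean""#.to_string(),
        Int32 => r#""int""#.to_string(),
        Int64 => r#""long""#.to_string(),
        Float32 => r#""float""#.to_string(),
        Float64 => r#""double""#.to_string(),
        Binary => r#""bytes""#.to_string(),
        Utf8 => r#""string""#.to_string(),
        // Logical types annotate an underlying primitive.
        Date32 => r#"{"type":"int","logicalType":"date"}"#.to_string(),
        TimestampMicros => {
            r#"{"type":"long","logicalType":"timestamp-micros"}"#.to_string()
        }
        // Complex types recurse into their children.
        List(item) => format!(r#"{{"type":"array","items":{}}}"#, to_avro_json(item)?),
        Dictionary { name, symbols } => {
            let quoted: Vec<String> =
                symbols.iter().map(|s| format!("\"{}\"", s)).collect();
            format!(
                r#"{{"type":"enum","name":"{}","symbols":[{}]}}"#,
                name,
                quoted.join(",")
            )
        }
        Unsupported(what) => return Err(format!("no Avro mapping for {}", what)),
    })
}
```

In this sketch, `to_avro_json(&ToyArrowType::List(Box::new(ToyArrowType::Utf8)))` yields `{"type":"array","items":"string"}`, while an `Unsupported` variant surfaces as an `Err`, which is the reason a fallible conversion such as `TryFrom` fits better than `From`.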
   
   # Are these changes tested?
   
   Yes, this change is accompanied by new tests in `schema.rs`. The tests cover:
   - Correct mapping of all supported primitive, temporal, and logical types.
   - Conversion of complex and nested structures like `Struct`, `List`, and 
`Map`.
   - Proper handling of dictionary-encoded fields to Avro enums.
   - Validation of name sanitization logic.
   - Round-trip conversion tests for various data types to ensure correctness.
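
As an illustration of what the name sanitization checks above have to establish, the following sketch shows one possible approach. It is not the PR's `NameGenerator`: `SketchNameGenerator`, `sanitize`, and `unique` are hypothetical names, and the replacement and suffixing strategy is an assumption. The only rule taken as given is the Avro specification's requirement that names start with `[A-Za-z_]` and continue with `[A-Za-z0-9_]`.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the PR's `NameGenerator`; the real one may differ.
#[derive(Default)]
struct SketchNameGenerator {
    // Names already handed out in the current scope.
    used: HashSet<String>,
}

impl SketchNameGenerator {
    /// Replace characters Avro does not allow in names with `_`, and prefix
    /// `_` when the first character is not an ASCII letter or underscore.
    fn sanitize(raw: &str) -> String {
        let mut out: String = raw
            .chars()
            .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
            .collect();
        let starts_ok = out
            .chars()
            .next()
            .map_or(false, |c| c.is_ascii_alphabetic() || c == '_');
        if !starts_ok {
            // Covers both an empty name and a leading digit.
            out.insert(0, '_');
        }
        out
    }

    /// Return a sanitized name that has not been handed out before,
    /// appending `_1`, `_2`, ... on collisions.
    fn unique(&mut self, raw: &str) -> String {
        let base = Self::sanitize(raw);
        let mut candidate = base.clone();
        let mut n = 1;
        // `insert` returns false when the name is already taken.
        while !self.used.insert(candidate.clone()) {
            candidate = format!("{}_{}", base, n);
            n += 1;
        }
        candidate
    }
}
```

With this sketch, `sanitize("1st")` gives `_1st`, and two Arrow fields named `a b` and `a-b` in the same scope come out as `a_b` and `a_b_1`, so both invariants (validity and uniqueness) can be asserted directly in a test.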
   
   # Are there any user-facing changes?
   
   N/A
   

