jecsand838 opened a new pull request, #8075: URL: https://github.com/apache/arrow-rs/pull/8075
# Which issue does this PR close?

- Part of https://github.com/apache/arrow-rs/issues/4886

# Rationale for this change

This change introduces functionality to convert an `ArrowSchema` into an `AvroSchema`, which improves interoperability between Arrow and Avro. Direct schema conversion streamlines data pipelines that write Arrow data to Avro-compatible formats. It also lets us create reader `AvroSchema` instances directly from an arrow-rs `Schema`, which makes schema evolution easier to support. These updates are foundational to the upcoming `arrow-avro` `Writer`.

# What changes are included in this PR?

- **`TryFrom<&ArrowSchema> for AvroSchema`**: The core of this PR is an implementation of the `TryFrom` trait that provides a fallible conversion from an `ArrowSchema` reference to a new `AvroSchema`.
- **Type Mapping Logic**: Added logic to map Arrow `DataType` variants to their corresponding Avro type representations. This includes:
  - Primitive types (`Boolean`, `Int`, `Float`, `Binary`, `Utf8`).
  - Logical types (e.g., `Timestamp`, `Date`, `Decimal`, `UUID`).
  - Complex types (`Struct`, `List`, `Map`, `Dictionary`). Dictionaries are converted to Avro `enum` types.
- **Name Sanitization**: Implemented a `NameGenerator` to ensure that field names derived from the `ArrowSchema` are valid according to Avro naming conventions and are unique within their scope.
- **Metadata Handling**: The conversion preserves relevant metadata from the Arrow schema.
- **Metadata Constants**: Added `arrow-avro` metadata constants to simplify working with Avro metadata in Arrow `DataType`s.

# Are these changes tested?

Yes, this change is accompanied by new tests in `schema.rs`. The tests cover:

- Correct mapping of all supported primitive, temporal, and logical types.
- Conversion of complex and nested structures like `Struct`, `List`, and `Map`.
- Conversion of dictionary-encoded fields to Avro enums.
- Validation of name sanitization logic.
- Round-trip conversion tests for various data types to ensure correctness.

# Are there any user-facing changes?

N/A
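
To make the name sanitization item above more concrete, here is a minimal, std-only sketch of how such a step could work. It is not the PR's `NameGenerator`: `sanitize_name` and `UniqueNames` are hypothetical names, and the replacement and suffixing policy shown here is an assumption. The only fixed point is the Avro rule that a name starts with `[A-Za-z_]` and continues with `[A-Za-z0-9_]`.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not the PR's API): make `raw` a valid Avro name.
/// Avro names must match `[A-Za-z_][A-Za-z0-9_]*`.
fn sanitize_name(raw: &str) -> String {
    // Replace every character Avro does not allow with an underscore.
    let mut out: String = raw
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect();
    // An empty name or a leading digit is still invalid, so prefix `_`.
    if out.chars().next().map_or(true, |c| c.is_ascii_digit()) {
        out.insert(0, '_');
    }
    out
}

/// Hypothetical helper: hands out names that are unique within one scope
/// (for example, the fields of a single record).
#[derive(Default)]
struct UniqueNames {
    used: HashSet<String>,
}

impl UniqueNames {
    /// Sanitize `raw`, then append `_1`, `_2`, ... until the name is unused.
    /// The suffix scheme is an assumption made for this sketch.
    fn make_unique(&mut self, raw: &str) -> String {
        let base = sanitize_name(raw);
        let mut candidate = base.clone();
        let mut suffix = 1;
        // `insert` returns false when the candidate is already taken.
        while !self.used.insert(candidate.clone()) {
            candidate = format!("{base}_{suffix}");
            suffix += 1;
        }
        candidate
    }
}
```

With this sketch, passing the Arrow field names `a b` and `a-b` through one `UniqueNames` instance would yield `a_b` and `a_b_1`. Tracking the names already handed out in a set, rather than keeping a counter per base name, avoids a collision when a later field is literally named `a_b_1`.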