chenghuichen commented on PR #5974: URL: https://github.com/apache/paimon/pull/5974#issuecomment-3131300512
Hi team,

This PR provides the initial implementation for schema handling in Python. In the process, I've identified two important follow-up items that require discussion and further work.

**1. `Schema` vs. `TableSchema` in `rest_catalog`**

There's a design conflict in how schemas are persisted. My understanding is that `paimon.schema.Schema` is the user-facing API, while `paimon.table.TableSchema` is the internal, persistable format. However, the `rest_catalog` currently persists the user-facing `Schema` object. This is problematic because `Schema` contains a non-serializable `pyarrow.schema` attribute. To avoid breaking existing tests, I have omitted this attribute in my implementation, but this is a temporary workaround.

**Proposal:** The `rest_catalog` should be updated to use `TableSchema` for persistence to resolve this serialization issue and align with the intended design.

**2. Incomplete PyArrow Schema Conversion**

The conversion logic between Paimon's types and PyArrow's types is not yet complete. I have implemented support for atomic types, but **nested types (`ROW`, `MAP`, `ARRAY`) are not yet supported**. This means that code paths relying on `parse_data_fields_from_pyarrow_schema` will fail if they encounter a schema with nested structures.

**Proposal:** We should create follow-up tasks to implement the missing nested type conversions. I'm happy to help, and we could divide this work among contributors.

In summary, this PR is a foundational step. To make it complete, we need to coordinate on the `rest_catalog` refactoring and plan the completion of the type conversion logic.

Thanks for your review and feedback.
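
As a rough illustration of the first proposal (not code from this PR), the sketch below separates a user-facing schema that carries a non-serializable handle from a plain, JSON-friendly form that is the only thing persisted. Every class and field name here (`FieldSketch`, `UserSchemaSketch`, `TableSchemaSketch`, `native_schema`) is hypothetical and stands in for the real `Schema`, `TableSchema`, and PyArrow schema attribute; the actual Paimon classes and JSON layout may differ.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class FieldSketch:
    """Hypothetical field description made of plain values only."""
    id: int
    name: str
    type: str  # e.g. "INT" or "STRING"; atomic types only in this sketch


@dataclass
class UserSchemaSketch:
    """Stand-in for the user-facing schema object."""
    fields: List[FieldSketch]
    options: Dict[str, str]
    # Stand-in for the PyArrow schema attribute: an arbitrary object
    # that json.dumps cannot serialize.
    native_schema: Any = None


@dataclass
class TableSchemaSketch:
    """Stand-in for the persistable form: JSON-friendly data only."""
    id: int
    fields: List[FieldSketch]
    options: Dict[str, str]

    @classmethod
    def from_user_schema(cls, schema_id: int, schema: UserSchemaSketch) -> "TableSchemaSketch":
        # Copy only the serializable parts; the native handle is left behind.
        return cls(id=schema_id, fields=list(schema.fields), options=dict(schema.options))

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "TableSchemaSketch":
        raw = json.loads(text)
        return cls(
            id=raw["id"],
            fields=[FieldSketch(**f) for f in raw["fields"]],
            options=raw["options"],
        )


user_schema = UserSchemaSketch(
    fields=[FieldSketch(0, "id", "INT"), FieldSketch(1, "name", "STRING")],
    options={"bucket": "4"},
    native_schema=object(),  # not JSON-serializable
)

# Persist the internal form, then read it back.
table_schema = TableSchemaSketch.from_user_schema(0, user_schema)
restored = TableSchemaSketch.from_json(table_schema.to_json())
```

With this split, the catalog would only ever write the output of `to_json()`, so the non-serializable attribute never has to be dropped from the user-facing object as a workaround.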