nazq commented on PR #2451: URL: https://github.com/apache/iceberg-rust/pull/2451#issuecomment-4604061489
Stopping by from #2563 — I read this PR end-to-end and wanted to leave a downstream perspective. **I think this is the right architectural direction for `UpdateSchemaAction`.** The operations-buffer-then-`schema_update` shape is the only model that can correctly handle dependent operations like rename-then-update or add-then-make-required, because validation needs to see the in-flight state of the operations vector — not just the base schema. The `Rename` handler in `crates/iceberg/src/spec/schema/update.rs` already does this correctly when it consults the `updates: HashMap<i32, NestedFieldRef>` to merge a same-target update into a rename; that's exactly the kind of cross-operation reasoning the imperative #2120 design can't express. It's also what #697 explicitly asks for: parity with PyIceberg's `UpdateSchema` and Java's `SchemaUpdate`, both of which use this same operations-buffer model. To put my money where my mouth is: I'd opened #2563 to add `rename_column` in the #2120 style, but after reading this PR I'm convinced that's the wrong long-term path. I've **realigned #2563's API to use your `RenameColumn` typed-builder shape** so callers adopting it today can migrate to this PR with no source change, and explicitly framed it as a stepping stone rather than a competing direction. We're a downstream consumer (Rust data-processing service writing into Iceberg via the REST catalog over both MinIO and GCS) that ran into the missing-rename gap in production. Until this lands we'll continue carrying it on a fork — happy to be a guinea pig for API ergonomics if useful, and we have real-world data we could plug into the `#[ignore = "not yet implemented"]` cases. Specifically: - `add_required_column_with_default` / `add_required_column_with_update_column_default` - `add_required_column_case_insensitive` / `add_multiple_required_column_case_insensitive` / `require_column_case_insensitive` — we have a column-name corpus with mixed casing from upstream sources - `add_column_with_update_column_default_to_required_column` Let me know which of those (or others) would unblock review and I'll send a PR against your branch. Re: CTTY's earlier "not sure this is the right direction" — I'd weigh that against #697, which Fokko opened specifically asking for the Python/Java `UpdateSchema` model in Rust. This PR is the answer to that brief, and the API churn from #2120 is a one-time cost that gets cheaper the sooner it lands. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
