rdblue opened a new pull request #1475: URL: https://github.com/apache/iceberg/pull/1475
`REPLACE TABLE` replaces the table schema and partition spec in addition to the table data. When the new schema is passed to Iceberg, its field IDs are reassigned to ensure they are consistent (for example, no duplicate IDs), just like when a table is created. This PR updates how the IDs are reassigned to fix time travel and partition specs.

Iceberg doesn't currently track what the table schema was at the time a snapshot was written; the current table schema is always used to read older snapshots. When a table is replaced and the IDs are reassigned, this can lead to ID conflicts between an older schema and the current schema.

To fix incompatible IDs, this updates reassignment to use the current table schema's IDs if possible. If the current schema is `1: id bigint, 2: data string` and the new schema is `data string, id int`, then IDs are reused by name: `2: data string, 1: id int`. Any field that is not in the current table schema is assigned a new unique ID using `lastColumnId`.

This also fixes a bug where incompatible ID reassignment breaks old partition specs, resulting in errors like "Cannot create identity partition sourced from different field in schema: partititon_col".
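
The name-based ID reuse described above can be illustrated with a minimal sketch. This is not Iceberg's implementation or API: the `reassign_ids` function, the tuple-based field model, and the extra `ts` column are all hypothetical, and the sketch handles only top-level fields (no nested structs, lists, or maps).

```python
from typing import List, Tuple

# Hypothetical, simplified model of a schema field: (id, name, type).
Field = Tuple[int, str, str]


def reassign_ids(
    current: List[Field],
    new_fields: List[Tuple[str, str]],
    last_column_id: int,
) -> Tuple[List[Field], int]:
    """Reuse IDs by name from the current schema; allocate fresh IDs otherwise."""
    ids_by_name = {name: field_id for field_id, name, _ in current}
    result = []
    for name, type_ in new_fields:
        if name in ids_by_name:
            # Field exists in the current schema: keep its ID, even if the type changed.
            field_id = ids_by_name[name]
        else:
            # Unknown field: take the next unused ID after last_column_id.
            last_column_id += 1
            field_id = last_column_id
        result.append((field_id, name, type_))
    return result, last_column_id


# Current schema from the description: 1: id bigint, 2: data string
current = [(1, "id", "bigint"), (2, "data", "string")]

# New schema from the description, plus a hypothetical new column "ts".
replaced, last_id = reassign_ids(
    current,
    [("data", "string"), ("id", "int"), ("ts", "timestamp")],
    last_column_id=2,
)
print(replaced)
```

Because `data` and `id` keep IDs 2 and 1, older snapshots and partition specs that refer to those IDs still resolve to the same columns; only the assumed new column `ts` draws a fresh ID.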
