rdblue opened a new pull request #1475:
URL: https://github.com/apache/iceberg/pull/1475


   `REPLACE TABLE` replaces the table schema and partition spec in addition to 
the table data. When the new schema is passed to Iceberg, its field IDs are 
reassigned to ensure they are consistent (for example, no duplicate IDs), just 
like when a table is created. This updates how the IDs are reassigned to fix 
time travel and partition specs.
   
   Iceberg doesn't currently track what the table schema was at the time a 
snapshot was written, so the current table schema is always used to read older 
snapshots. When a table is replaced and the IDs are reassigned, the same ID can 
end up referring to different fields in an older schema and the current schema.
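
   As a rough illustration of that conflict, the sketch below assigns fresh IDs 
in declaration order, which is an assumption about how reassignment behaves 
when nothing is reused. `FreshIdConflictSketch` and `assignFreshIds` are 
hypothetical names, not Iceberg classes or methods.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FreshIdConflictSketch {
  // Hypothetical helper: assigns IDs 1..n in declaration order.
  // This is an assumed model of "fresh" assignment, not Iceberg's code.
  static Map<Integer, String> assignFreshIds(List<String> fieldNames) {
    Map<Integer, String> namesById = new LinkedHashMap<>();
    int nextId = 1;
    for (String name : fieldNames) {
      namesById.put(nextId++, name);
    }
    return namesById;
  }

  public static void main(String[] args) {
    // Schema in effect when the older snapshot was written: 1: id, 2: data
    Map<Integer, String> older = assignFreshIds(List.of("id", "data"));
    // Replacement schema with fresh IDs: 1: data, 2: id
    Map<Integer, String> replaced = assignFreshIds(List.of("data", "id"));

    // Older data files store "id" values under field ID 1, but the current
    // schema resolves field ID 1 to a different column.
    System.out.println(older.get(1));    // prints "id"
    System.out.println(replaced.get(1)); // prints "data"
  }
}
```

   Because older snapshots are read with the current schema, field ID 1 in this 
toy model would resolve to `data` even though the old files stored `id` there.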
   
   To fix incompatible IDs, this updates reassignment to reuse the current 
table schema's IDs where possible. If the current schema is `1: id bigint, 2: 
data string` and the new schema is `data string, id int`, then IDs are reused 
by name: `2: data string, 1: id int`. Any field that is not in the current 
table schema is assigned a new unique ID using `lastColumnId`.
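
   The reuse-by-name rule can be sketched as follows. This is a minimal 
illustration for flat, top-level fields only, not the code in this PR: 
`ReassignIdsSketch`, `reassignIds`, and the extra `ts` field are hypothetical, 
and it assumes a new field takes the next ID after `lastColumnId`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReassignIdsSketch {
  // Hypothetical helper: reuse the current schema's ID when a field name
  // matches, otherwise hand out the next ID after lastColumnId (assumed).
  static Map<String, Integer> reassignIds(
      Map<String, Integer> currentIdsByName,
      List<String> newFieldNames,
      int lastColumnId) {
    Map<String, Integer> result = new LinkedHashMap<>();
    int nextId = lastColumnId;
    for (String name : newFieldNames) {
      Integer existing = currentIdsByName.get(name);
      if (existing != null) {
        result.put(name, existing);   // field exists: keep its ID
      } else {
        nextId += 1;                  // new field: allocate a unique ID
        result.put(name, nextId);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Current schema: 1: id, 2: data (so lastColumnId is 2)
    Map<String, Integer> current = new LinkedHashMap<>();
    current.put("id", 1);
    current.put("data", 2);

    // New schema lists data first and adds a hypothetical field "ts".
    Map<String, Integer> reassigned =
        reassignIds(current, List.of("data", "id", "ts"), 2);
    System.out.println(reassigned); // prints {data=2, id=1, ts=3}
  }
}
```

   Matching by name keeps `data` at 2 and `id` at 1 regardless of declaration 
order, so IDs recorded in older snapshots still point at the same columns.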
   
   This also fixes a bug where incompatible ID reassignment breaks old 
partition specs, resulting in errors like "Cannot create identity partition 
sourced from different field in schema: partititon_col".




