dramaticlly opened a new pull request, #14504: URL: https://github.com/apache/iceberg/pull/14504
This pull request introduces two changes:

1. Add a new table property to control whether the table metadata's last-updated timestamp follows snapshot timestamps.
2. Change the default behavior so that it does not follow them.

## Background

Per the Iceberg spec (https://iceberg.apache.org/spec/#table-metadata-fields), `last-updated-ms` represents:

> Timestamp in milliseconds from the unix epoch when the table was last updated. Each table metadata file should update this field just before writing.

The existing implementation has an implicit behavior: when a snapshot is added or set as the reference on the main branch, the last-updated timestamp follows the snapshot creation timestamp.

- In addSnapshot: https://github.com/apache/iceberg/blob/f65558ec30a27fb26ed85ba0e3dc4e498f2c046b/core/src/main/java/org/apache/iceberg/TableMetadata.java#L1258
- In set(snapshot)Ref: https://github.com/apache/iceberg/blob/f65558ec30a27fb26ed85ba0e3dc4e498f2c046b/core/src/main/java/org/apache/iceberg/TableMetadata.java#L1317

There are also requirements that are validated when building TableMetadata:

- [Chronological Ordering of Metadata Log Entries](https://github.com/apache/iceberg/blob/499fc8ab09eba0f29b3f70dc8d14fa6206b0ba06/core/src/main/java/org/apache/iceberg/TableMetadata.java#L376-L396): All metadata and data changes that affect the Iceberg state and result in a metadata.json file are recorded in MetadataLogEntry, which stores both the JSON file name and a timestamp. These entries are required to be sorted chronologically by their lastUpdatedMillis, with a one-minute clock skew tolerance.
- [Chronological Ordering of Snapshot History Log](https://github.com/apache/iceberg/blob/499fc8ab09eba0f29b3f70dc8d14fa6206b0ba06/core/src/main/java/org/apache/iceberg/TableMetadata.java#L358-L374): Only data or manifest changes generate a new snapshot. The snapshot history log maintains both the snapshot-id and a timestamp, and requires chronological sorting by lastUpdatedMillis, with a one-minute clock skew tolerance.
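To make the ordering requirement concrete, here is a minimal, standalone sketch of a chronological check with a one-minute tolerance. It is not the code in TableMetadata: the class name `LogOrderingSketch`, both method names, and the use of a plain list of timestamps are assumptions made for illustration only.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not part of Iceberg.
class LogOrderingSketch {
  // One-minute tolerance for clock skew between writers.
  static final long ONE_MINUTE_MS = TimeUnit.MINUTES.toMillis(1);

  /** True if no entry is more than one minute older than the entry before it. */
  static boolean isSortedWithTolerance(List<Long> timestampsMs) {
    for (int i = 1; i < timestampsMs.size(); i++) {
      if (timestampsMs.get(i) - timestampsMs.get(i - 1) < -ONE_MINUTE_MS) {
        return false;
      }
    }
    return true;
  }

  /** True if the new last-updated value is at most one minute before the newest log entry. */
  static boolean isValidLastUpdated(List<Long> timestampsMs, long lastUpdatedMs) {
    if (timestampsMs.isEmpty()) {
      return true;
    }
    long latest = timestampsMs.get(timestampsMs.size() - 1);
    return lastUpdatedMs - latest >= -ONE_MINUTE_MS;
  }

  public static void main(String[] args) {
    // The third entry is 10 seconds older than the second: within tolerance.
    List<Long> log = List.of(1_000_000L, 1_030_000L, 1_020_000L);
    System.out.println(isSortedWithTolerance(log)); // prints "true"

    // A last-updated value backdated by two minutes is rejected.
    System.out.println(isValidLastUpdated(log, 1_020_000L - 120_000L)); // prints "false"
  }
}
```

The second call shows why a backdated timestamp is a problem under this kind of check: once the last-updated value trails the newest log entry by more than the tolerance, validation fails regardless of what the commit actually changed.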
## Problem

This model works well when a single table is the sole source of truth. However, when table replication comes into play, or when Iceberg metadata is generated or translated from another data source or format, it is difficult to achieve consistent time travel behavior.

Because Iceberg time travel can be configured with either a snapshot-id or a timestamp, a consistent time travel experience sometimes calls for a backdated snapshot, where a slightly older snapshot is committed at the current time, or for the table metadata timestamp and the snapshot timestamp to diverge. It is generally easy to ensure the chronological ordering of snapshots, but other concurrent changes to the table, such as property updates, will unnecessarily fail the commit because they violate the chronological ordering of metadata log entries.

## How it helps

- Fewer commit failures on REST-based catalogs due to timing, with a separation of concerns between the engine's role in writing snapshots and the catalog's role in writing `metadata.json`.
- Flexibility to replicate Iceberg snapshots with a consistent time travel experience: keep the same snapshot timestamp and allow the table metadata timestamp to diverge.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
