Hello everyone, I’d like to ask for feedback on whether adding a `table_properties_log` metadata table is a direction worth pursuing.
PR: https://github.com/apache/iceberg/pull/16859

This PR adds a read-only metadata table that exposes the history of table properties from retained Iceberg metadata files. In the current version of Apache Iceberg, users who want to understand when a table property changed need to follow the metadata log and previous metadata files, and inspect the `metadata.json` files manually. The PR exposes the properties recorded in each retained metadata version through the existing metadata table mechanism.

The proposed table returns one row per retained metadata version with: `timestamp`, `file`, `latest_snapshot_id` and `properties`.

*Example use cases*:
- Audit/RCA: check whether properties like `gc.enabled` or metadata cleanup settings were enabled before a maintenance operation.
- Debugging regressions: correlate behavior changes with updates to properties like `write.update.mode`, `write.delete.mode`, `write.merge.mode`, `write.target-file-size-bytes` or `write.distribution-mode`.

Note that the PR does NOT change the table spec or the write path. It only exposes information that is already retained in metadata files, and makes it available through the Spark/Flink metadata table syntax.

*The primary questions* I’d like feedback on are below, but any other feedback or concerns are also welcome:
1. Is this metadata table useful enough to add?
2. Is `table_properties_log` the right user-facing name?
3. Is the proposed schema reasonable?
4. Is reading retained previous metadata files acceptable for this read-only metadata table?

If this direction makes sense, I’d also appreciate review on the PR. If the community thinks this is too narrow or not worth adding, I’m happy to close it or rework the proposal.

Best regards,
Tom
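
P.S. For context, here is a rough sketch of the manual approach described above, i.e. walking the metadata log and reading each `metadata.json` by hand. It is only an illustration, not code from the PR. The JSON field names (`metadata-log`, `timestamp-ms`, `metadata-file`, `last-updated-ms`, `current-snapshot-id`, `properties`) come from the Iceberg table spec; the helper names `load_metadata`, `table_properties_log` and `property_changes` are hypothetical. It assumes the metadata files are readable as local paths (real tables usually point at object storage URIs) and that `latest_snapshot_id` can be taken from each file's own `current-snapshot-id`, which may not match how the PR derives it.

```python
import json


def load_metadata(path):
    """Read one Iceberg metadata.json file (assumes a local, readable path)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def table_properties_log(current_metadata_path, load=load_metadata):
    """Return one row per retained metadata version, oldest first.

    Rows mirror the proposed columns: timestamp, file,
    latest_snapshot_id, properties.
    """
    current_path = str(current_metadata_path)
    current = load(current_path)

    # `metadata-log` lists the previous metadata files; the current file
    # is not part of it, so append it with its own last-updated timestamp.
    entries = [
        (entry["timestamp-ms"], entry["metadata-file"])
        for entry in current.get("metadata-log", [])
    ]
    entries.append((current["last-updated-ms"], current_path))

    rows = []
    for timestamp_ms, file in entries:
        meta = current if file == current_path else load(file)
        rows.append({
            "timestamp": timestamp_ms,
            "file": file,
            # Assumption: use the snapshot id recorded in that metadata file.
            "latest_snapshot_id": meta.get("current-snapshot-id"),
            "properties": meta.get("properties", {}),
        })
    return rows


def property_changes(rows, key):
    """Yield (timestamp, file, old, new) each time `key` changes between versions."""
    previous = None
    first = True
    for row in rows:
        value = row["properties"].get(key)
        if not first and value != previous:
            yield row["timestamp"], row["file"], previous, value
        previous, first = value, False
```

With something like this, `list(property_changes(table_properties_log(path), "gc.enabled"))` would list the metadata versions where `gc.enabled` changed. The proposed metadata table is meant to make this kind of lookup available through a plain query instead of custom file handling.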
