findepi commented on code in PR #4945: URL: https://github.com/apache/iceberg/pull/4945#discussion_r894207600
##########
format/spec.md:
##########
@@ -496,6 +496,7 @@ A snapshot consists of the following fields:
 | _optional_ | | **`manifests`** | A list of manifest file locations. Must be omitted if `manifest-list` is present |
 | _optional_ | _required_ | **`summary`** | A string map that summarizes the snapshot changes, including `operation` (see below) |
 | _optional_ | _optional_ | **`schema-id`** | ID of the table's current schema when the snapshot was created |
+| | _optional_ | **`statistics`** | A list of [statistics files' metadata](#statistics-file). The field should be retained by writers, unless writer updates the statistics, or knows they became obsolete. |

Review Comment:
That means the transaction adding new statistics will need to rewrite the existing ones in the commit phase.
That would also imply a different API: it would no longer make sense for the stats-writing application to write the stats file itself (and then commit its registration). Instead, the API would have to take the statistics in memory, because their file destination is still to be determined at that point.

I am not yet convinced these are fair tradeoffs.
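
To make the two API shapes concrete, here is a minimal sketch. It assumes the alternative under discussion requires the committing transaction to fold existing statistics into a newly written file. Every name in it (`Snapshot`, `Storage`, `register_written_file`, `commit_in_memory_stats`, the `stats/...` paths) is hypothetical and is not part of Iceberg's API; storage is modelled as a plain dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical stand-in for file storage: location -> {statistic name: value}.
Storage = Dict[str, Dict[str, str]]


@dataclass(frozen=True)
class Snapshot:
    """Toy snapshot holding only an id and its statistics file locations."""
    snapshot_id: int
    statistics: Tuple[str, ...] = ()


def register_written_file(snapshot: Snapshot, location: str) -> Snapshot:
    """Shape A: the application already wrote the file; commit only registers it."""
    return Snapshot(snapshot.snapshot_id, snapshot.statistics + (location,))


def commit_in_memory_stats(
    storage: Storage,
    snapshot: Snapshot,
    new_stats: Dict[str, str],
    attempt: int,
) -> Snapshot:
    """Shape B: commit chooses the destination and rewrites existing statistics."""
    merged: Dict[str, str] = {}
    for location in snapshot.statistics:
        merged.update(storage[location])  # re-read every existing statistics file
    merged.update(new_stats)
    # The destination is only known here, so the caller could not have written it.
    location = f"stats/{snapshot.snapshot_id}-{attempt}.stats"
    storage[location] = merged
    return Snapshot(snapshot.snapshot_id, (location,))


storage: Storage = {"stats/a.stats": {"ndv(col1)": "10"}}
base = Snapshot(1, ("stats/a.stats",))

# Shape A: write first, then register the known location.
storage["stats/b.stats"] = {"ndv(col2)": "7"}
shape_a = register_written_file(base, "stats/b.stats")

# Shape B: hand over in-memory statistics and let commit write the file.
shape_b = commit_in_memory_stats(storage, base, {"ndv(col2)": "7"}, attempt=0)
```

In this sketch, shape A leaves existing files untouched and only appends a location, while shape B does extra reads and a write inside the commit path, which is the cost I am questioning.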
