rdblue commented on PR #4660: URL: https://github.com/apache/iceberg/pull/4660#issuecomment-1118020014
Thanks for the update, @homar. I'm debating whether this is a good idea. This adds quite a bit more content to each snapshot in metadata, especially when collecting partition summaries. The rationale for this is:

> This is needed in trino in order to reliably provide number of deleted rows to the user, also when delete happens to delete no files.

I'm not sure if that's a good reason to add this. If you want to rely on these summary properties, why not default to 0 if the property isn't there? Seems like the argument against that is the narrow case where the property is modified on the Iceberg side without anyone noticing, but I don't see a reason to modify these properties so the problem is very unlikely to occur.

In addition, I wouldn't expect summary properties to be used to report back. Typically, I'd recommend checking the metadata tree for things like number of deleted records. Although if you're happy with these for user reporting then it should be fine. It is just a summary and not going to the source of truth (the metadata tree).
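As a rough illustration of the "default to 0 if the property isn't there" suggestion, here is a minimal sketch. It works on a plain `Map<String, String>` standing in for a snapshot summary; the class name `SummaryDefaults` and the helper `deletedRecords` are hypothetical, and the `deleted-records` key is assumed to be the summary property the reader cares about. It is not taken from the PR or from Trino.

```java
import java.util.Map;

public class SummaryDefaults {
    // Assumed summary key for the deleted-record count (illustrative).
    public static final String DELETED_RECORDS = "deleted-records";

    // Hypothetical helper: read the count from a summary map and
    // treat a missing property as zero deleted records.
    public static long deletedRecords(Map<String, String> summary) {
        String value = summary.get(DELETED_RECORDS);
        return value == null ? 0L : Long.parseLong(value);
    }

    public static void main(String[] args) {
        // Property present: the stored value is parsed and returned.
        System.out.println(deletedRecords(Map.of("deleted-records", "42"))); // prints 42
        // Property absent: the reader falls back to 0.
        System.out.println(deletedRecords(Map.of("added-records", "7")));    // prints 0
    }
}
```

With this approach the writer does not need to emit zero-valued properties; the reader supplies the default. The value is still only a summary, so anything that needs the source of truth would have to go to the metadata tree instead.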
