JanKaul commented on code in PR #11041:
URL: https://github.com/apache/iceberg/pull/11041#discussion_r2786374935

##########
format/view-spec.md:
##########
@@ -160,6 +176,109 @@ Each entry in `version-log` is a struct with the following fields:
 | _required_ | `timestamp-ms` | Timestamp when the view's `current-version-id` was updated (ms from epoch) |
 | _required_ | `version-id` | ID that `current-version-id` was set to |
+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name | Description |
+|-------------|----------------|-------------|
+| _required_ | `namespace` | A list of strings for namespace levels |
+| _required_ | `name` | A string specifying the name of the table |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot summary](https://iceberg.apache.org/spec/#snapshots) property of every storage table snapshot to determine the freshness of the precomputed data of the storage table.
+
+| Requirement | Field name | Description |
+|-------------|-----------------|-------------|
+| _required_ | `refresh-state` | A [refresh state](#refresh-state) record stored as a JSON-encoded string |
+
+#### Freshness
+
+Consumers should only read from the storage table if the materialized view is "fresh" and therefore adequately represents the logical query definition of the view.
+Different systems define freshness differently based on time-based and logical factors.
+
+**Time-based freshness (consumer-defined):**
+
+Consumers may apply time-based freshness policies, such as allowing a certain staleness window based on `refresh-start-timestamp-ms`.
+When evaluating freshness, consumers:
+- Must first evaluate their own time-based freshness policy.
+- May additionally compare the `source-states` list against the states loaded from the catalog to verify the producers logical freshness policy.
+- May parse the view definition to implement a more sophisticated policy.
+- When a materialized view is considered stale, can fail, refresh inline, or treat the materialized view as a logical view.
+- Must not read from the storage table when the materialized view doesn't meet freshness criteria.
+
+**Logical freshness (producer-defined):**

Review Comment:
   I struggled a bit as part of the responsibility lies with the producer and part lies with the consumer. That's why I wanted to separate the concerns to make it more clear. But I get that it's actually more confusing. I've updated the text such that the main responsibility lies with the consumer.
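
For illustration only, the consumer-side evaluation order listed in the hunk above could be sketched roughly as follows. This is one possible reading, not spec text: the function name `evaluate_freshness`, the `max_staleness_ms` and `current_source_states` parameters, treating `source-states` entries as opaque comparable values, and treating a missing `refresh-state` as stale are all assumptions made for the example.

```python
import json
import time
from typing import Any, Optional


def evaluate_freshness(
    snapshot_summary: dict,
    max_staleness_ms: int,
    current_source_states: Optional[list] = None,
    now_ms: Optional[int] = None,
) -> bool:
    """Return True if this (hypothetical) consumer may read the storage table."""
    raw = snapshot_summary.get("refresh-state")
    if raw is None:
        # Assumption: without a refresh state there is nothing to judge by,
        # so the materialized view is treated as stale.
        return False

    # The refresh state is stored as a JSON-encoded string in the summary.
    refresh_state: dict[str, Any] = json.loads(raw)

    if now_ms is None:
        now_ms = int(time.time() * 1000)

    # Must: evaluate the consumer's own time-based policy first.
    age_ms = now_ms - refresh_state["refresh-start-timestamp-ms"]
    if age_ms > max_staleness_ms:
        return False

    # May: additionally compare the recorded source states with the states
    # loaded from the catalog (skipped when the caller passes None).
    if current_source_states is not None:
        if refresh_state.get("source-states") != current_source_states:
            return False

    return True
```

A caller that gets `False` back would then fail, refresh inline, or fall back to evaluating the view as a logical view, and would not read from the storage table.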
