findinpath commented on code in PR #11041:
URL: https://github.com/apache/iceberg/pull/11041#discussion_r3239074944
##########
format/view-spec.md:
##########
@@ -160,7 +178,120 @@ Each entry in `version-log` is a struct with the following fields:
 | _required_ | `timestamp-ms` | Timestamp when the view's `current-version-id` was updated (ms from epoch) |
 | _required_ | `version-id` | ID that `current-version-id` was set to |
-## Appendix A: An Example
+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name | Description |
+|-------------|----------------|-------------|
+| _required_ | `namespace` | A list of strings for namespace levels |
+| _required_ | `name` | A string specifying the name of the table |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that supplements the regular table metadata and is required for materialized views.
+The `refresh-state` property is set on the [snapshot summary](https://iceberg.apache.org/spec/#snapshots) property of a storage table snapshot to provide information about the state of the precomputed data.
+
+| Requirement | Field name | Description |
+|-------------|-----------------|-------------|
+| _optional_ | `refresh-state` | A [refresh state](#refresh-state) record stored as a JSON-encoded string |
+
+#### Freshness
+
+A materialized view is **fresh** when the storage table represents the result of the current view query (at the materialized view's current `view-version-id`) over the current state of its dependencies. Dependencies are determined by parsing the SQL: base Iceberg tables, Iceberg views (whose own dependencies are transitively dependencies of the materialized view), and intermediate materialized views (treated as their storage tables, with their own freshness established recursively from their `refresh-state`).
+
+A change to the materialized view's definition produces a new `view-version-id`; any storage-table snapshot recorded at a prior `view-version-id` is not fresh under the current definition.
+
+The `refresh-state` summary on each storage-table snapshot records dependency state observed at refresh time. Producers populate it; consumers use it to assess freshness without re-executing the query. The spec does not mandate what producers record or how consumers assess. See [Appendix B](#appendix-b-what-counts-as-a-dependency) for what counts as a dependency.

Review Comment:
   If `refresh-state` is to be filled in arbitrarily by producers, what is the rationale for mentioning it in the spec?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
