stevenzwu commented on code in PR #11041:
URL: https://github.com/apache/iceberg/pull/11041#discussion_r2590206054

##########
format/view-spec.md:
##########
@@ -160,6 +181,57 @@ Each entry in `version-log` is a struct with the following fields:
 | _required_ | `timestamp-ms` | Timestamp when the view's `current-version-id` was updated (ms from epoch) |
 | _required_ | `version-id` | ID that `current-version-id` was set to |

+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name | Description |
+|-------------|----------------|-------------|
+| _required_ | `namespace` | A list of strings for namespace levels |
+| _required_ | `name` | A string specifying the name of the table/view |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot summary](https://iceberg.apache.org/spec/#snapshots) property of every storage table snapshot to determine the freshness of the precomputed data of the storage table.
+
+| Requirement | Field name | Description |
+|-------------|-----------------|-------------|
+| _required_ | `refresh-state` | A [refresh state](#refresh-state) record stored as a JSON-encoded string |
+
+#### Refresh state
+
+The refresh state record captures the state of all source tables, views, and materialized views in the materialized view's fully expanded query tree at refresh time. Source table states are stored in `source-table-states` and source view states in `source-view-states`. For source views, `source-view-states` includes indirect references — tables or views nested within other views (exluding MVs) but not directly referenced in the query.
+For source materialized views, both the source view and its storage table are included in the refresh state. Indirect references are excluded for materialized view sources; during read time, query engines may recursively expand the query tree to determine freshness. The refresh state has the following fields:

Review Comment:
   We need a blank line before this line so that it becomes a separate paragraph. The wording probably also needs some tweaks.

   -----
   For directly referenced source materialized views, both the source view and its storage table are included in the refresh state. Indirect references (views or tables) from source materialized views are excluded from the refresh state. At read time, a query engine recursively expands the query tree to determine freshness if it chooses to enforce recursive evaluation semantics. The refresh state has the following fields:


##########
format/view-spec.md:
##########
@@ -160,6 +181,57 @@ Each entry in `version-log` is a struct with the following fields:
 | _required_ | `timestamp-ms` | Timestamp when the view's `current-version-id` was updated (ms from epoch) |
 | _required_ | `version-id` | ID that `current-version-id` was set to |

+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name | Description |
+|-------------|----------------|-------------|
+| _required_ | `namespace` | A list of strings for namespace levels |
+| _required_ | `name` | A string specifying the name of the table/view |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot summary](https://iceberg.apache.org/spec/#snapshots) property of every storage table snapshot to determine the freshness of the precomputed data of the storage table.
+
+| Requirement | Field name | Description |
+|-------------|-----------------|-------------|
+| _required_ | `refresh-state` | A [refresh state](#refresh-state) record stored as a JSON-encoded string |
+
+#### Refresh state
+
+The refresh state record captures the state of all source tables, views, and materialized views in the materialized view's fully expanded query tree at refresh time. Source table states are stored in `source-table-states` and source view states in `source-view-states`. For source views, `source-view-states` includes indirect references — tables or views nested within other views (exluding MVs) but not directly referenced in the query.

Review Comment:
   > in the materialized view's fully expanded query tree at refresh time.

   I am wondering whether the wording `fully expanded query tree` is still accurate now that sources can be MVs. Maybe something like this:

   ---------
   The refresh state record captures the state of source tables, views, and materialized views at refresh time.
   * Source view states are stored in `source-view-states`. They include indirect references: views nested within other views (excluding MVs).
   * Source table states are stored in `source-table-states`. They include indirect references: tables nested within other views (excluding MVs).
