stevenzwu commented on code in PR #11041:
URL: https://github.com/apache/iceberg/pull/11041#discussion_r2457420431


##########
format/view-spec.md:
##########
@@ -160,6 +179,56 @@ Each entry in `version-log` is a struct with the following 
fields:
 | _required_  | `timestamp-ms` | Timestamp when the view's 
`current-version-id` was updated (ms from epoch) |
 | _required_  | `version-id`   | ID that `current-version-id` was set to |
 
+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `namespace`    | A list of strings for namespace levels |
+| _required_  | `name`         | A string specifying the name of the 
table/view |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that 
supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot 
summary](https://iceberg.apache.org/spec/#snapshots) property of every storage 
table snapshot to determine the freshness of the precomputed data of the 
storage table.
+
+| Requirement | Field name      | Description |
+|-------------|-----------------|-------------|
+| _required_  | `refresh-state` | A [refresh state](#refresh-state) record 
stored as a JSON-encoded string | 
+
+#### Refresh state
+
+The refresh state record captures the state of all source tables, views, and 
materialized views in the materialized view's fully expanded query tree at 
refresh time. Source table states are stored in `source-table-states` and 
source view states in `source-view-states`. For source views, 
`source-view-states` includes indirect references — tables or views nested 
within other views but not directly referenced in the query. For source 
materialized views, both the source view and its storage table are included in 
the refresh state. Indirect references are excluded for materialized view 
sources; query engines may recursively expand the query tree to determine 
freshness. The refresh state has the following fields:
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `view-version-id`         | The `version-id` of the 
materialized view when the refresh operation was performed  | 
+| _required_  | `source-table-states`        | A list of [source 
table](#source-table) records for all tables that are directly or indirectly 
referenced in the materialized view query |
+| _required_  | `source-view-states`         | A list of [source 
view](#source-view) records for all views that are directly or indirectly 
referenced in the materialized view query |
+| _required_  | `refresh-start-timestamp-ms` | A timestamp of when the refresh 
operation was started |
+
+#### Source table
+
+A source table record captures the state of a source table at the time of the 
last refresh operation.

Review Comment:
   Should this read "the state of a source table (including a source MV's storage table)"?



##########
format/view-spec.md:
##########
@@ -160,6 +179,56 @@ Each entry in `version-log` is a struct with the following 
fields:
 | _required_  | `timestamp-ms` | Timestamp when the view's 
`current-version-id` was updated (ms from epoch) |
 | _required_  | `version-id`   | ID that `current-version-id` was set to |
 
+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `namespace`    | A list of strings for namespace levels |
+| _required_  | `name`         | A string specifying the name of the 
table/view |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that 
supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot 
summary](https://iceberg.apache.org/spec/#snapshots) property of every storage 
table snapshot to determine the freshness of the precomputed data of the 
storage table.
+
+| Requirement | Field name      | Description |
+|-------------|-----------------|-------------|
+| _required_  | `refresh-state` | A [refresh state](#refresh-state) record 
stored as a JSON-encoded string | 
+
+#### Refresh state
+
+The refresh state record captures the state of all source tables, views, and 
materialized views in the materialized view's fully expanded query tree at 
refresh time. Source table states are stored in `source-table-states` and 
source view states in `source-view-states`. For source views, 
`source-view-states` includes indirect references — tables or views nested 
within other views but not directly referenced in the query. For source 
materialized views, both the source view and its storage table are included in 
the refresh state. Indirect references are excluded for materialized view 
sources; query engines may recursively expand the query tree to determine 
freshness. The refresh state has the following fields:

Review Comment:
   > For source views, `source-view-states` includes indirect references — 
tables or views nested within other views but not directly referenced in the 
query.
   
   Should we clarify this as follows?
   ```
   tables or views nested within other views but not directly referenced in the 
query.
   -->
   views nested within other views (excluding MVs) that are not directly 
referenced in the query
   ```
   -------------------------------
   
   nit: move the `For source materialized views, ...` sentence to a separate 
paragraph to highlight the special handling of source MVs?
   
   -------------------------------
   
   Suggested wording change:
   "query engines may recursively expand the query tree to determine freshness"
   ->
   "during read time, query engines may recursively expand the query tree to 
determine freshness"
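
For reference while reviewing, here is a minimal sketch of how the `refresh-state` property described in this hunk might be assembled and JSON-encoded for the snapshot summary. The function names `build_refresh_state` and `to_summary_property` are hypothetical and not part of Iceberg. The two state lists are left empty because the fields of the source table and source view records are not shown in this hunk, and the version ID and timestamp are made-up values.

```python
import json


def build_refresh_state(view_version_id, source_table_states,
                        source_view_states, refresh_start_timestamp_ms):
    """Assemble a refresh-state record with the four required fields
    listed in the proposed spec text (hypothetical helper)."""
    return {
        "view-version-id": view_version_id,
        "source-table-states": list(source_table_states),
        "source-view-states": list(source_view_states),
        "refresh-start-timestamp-ms": refresh_start_timestamp_ms,
    }


def to_summary_property(refresh_state):
    """Encode the record as a JSON string under the `refresh-state` key,
    mirroring how the hunk says it is stored in the snapshot summary."""
    return {"refresh-state": json.dumps(refresh_state, sort_keys=True)}


# Illustrative values only; list entries are omitted because their
# record fields are not defined in the quoted hunk.
state = build_refresh_state(3, [], [], 1730000000000)
summary = to_summary_property(state)
```

A reader of the storage table would decode the string with `json.loads(summary["refresh-state"])` and compare the recorded states against the current source states to judge freshness.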
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

