talatuyarer commented on code in PR #11041:
URL: https://github.com/apache/iceberg/pull/11041#discussion_r2607689461


##########
format/view-spec.md:
##########
@@ -160,6 +177,71 @@ Each entry in `version-log` is a struct with the following 
fields:
 | _required_  | `timestamp-ms` | Timestamp when the view's 
`current-version-id` was updated (ms from epoch) |
 | _required_  | `version-id`   | ID that `current-version-id` was set to |
 
+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `namespace`    | A list of strings for namespace levels |
+| _required_  | `name`         | A string specifying the name of the table |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that 
supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot 
summary](https://iceberg.apache.org/spec/#snapshots) property of every storage 
table snapshot to determine the freshness of the precomputed data of the 
storage table.
+
+| Requirement | Field name      | Description |
+|-------------|-----------------|-------------|
+| _required_  | `refresh-state` | A [refresh state](#refresh-state) record 
stored as a JSON-encoded string |
+
+#### Refresh state
+
+The refresh state record captures the state of source tables, views, and 
materialized views at refresh time.
+
+* Source view states are stored in `source-view-states`. It includes indirect 
references — views nested within other views (excluding MVs).
+* Source table states are stored in `source-table-states`. It includes 
indirect references - tables nested within other views (excluding MVs).
+
+For directly referenced source materialized views, both the source view and 
its storage table are included in the refresh state. Indirect references (views 
or tables) from source materialized views are excluded in the refresh-state. 
During read time, a query engine recursively expands the query tree to 
determine freshness if it chooses to enforce recursive evaluation semantic.
+

Review Comment:
   If the producer does not provide `source-table-states` (i.e., it writes an 
empty list), the consumer has to determine freshness by other mechanisms.



##########
format/view-spec.md:
##########
@@ -160,6 +177,71 @@ Each entry in `version-log` is a struct with the following 
fields:
 | _required_  | `timestamp-ms` | Timestamp when the view's 
`current-version-id` was updated (ms from epoch) |
 | _required_  | `version-id`   | ID that `current-version-id` was set to |
 
+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `namespace`    | A list of strings for namespace levels |
+| _required_  | `name`         | A string specifying the name of the table |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that 
supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot 
summary](https://iceberg.apache.org/spec/#snapshots) property of every storage 
table snapshot to determine the freshness of the precomputed data of the 
storage table.
+
+| Requirement | Field name      | Description |
+|-------------|-----------------|-------------|
+| _required_  | `refresh-state` | A [refresh state](#refresh-state) record 
stored as a JSON-encoded string |
+
+#### Refresh state
+
+The refresh state record captures the state of source tables, views, and 
materialized views at refresh time.
+
+* Source view states are stored in `source-view-states`. It includes indirect 
references — views nested within other views (excluding MVs).
+* Source table states are stored in `source-table-states`. It includes 
indirect references - tables nested within other views (excluding MVs).
+
+For directly referenced source materialized views, both the source view and 
its storage table are included in the refresh state. Indirect references (views 
or tables) from source materialized views are excluded in the refresh-state. 
During read time, a query engine recursively expands the query tree to 
determine freshness if it chooses to enforce recursive evaluation semantic.
+
+The refresh state has the following fields:
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `view-version-id`         | The `version-id` of the 
materialized view when the refresh operation was performed  |
+| _required_  | `source-table-states`        | A list of [source 
table](#source-table) records for tables directly or indirectly referenced 
through common views, plus storage tables of directly referenced source 
materialized views |
+| _required_  | `source-view-states`         | A list of [source 
view](#source-view) records for all views (including materialized views) that 
are directly referenced, plus common views indirectly referenced through other 
common views |
+| _required_  | `refresh-start-timestamp-ms` | A timestamp of when the refresh 
operation was started |
+
+#### Source table
+
+A source table record captures the state of a source table (including source 
MV's storage table) at the time of the last refresh operation.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `uuid`         | The uuid of the source table |
+| _required_  | `snapshot-id`  | Snapshot-id of when the last refresh 
operation was performed |
+| _optional_  | `ref`          | Branch name of the source table being 
referenced in the view query |
+
+When `ref` is `null` or not set, it defaults to "main".
+
+#### Source view
+
+A source view record captures the state of a source view at the time of the 
last refresh operation.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `uuid`         | The uuid of the source view |
+| _required_  | `version-id`   | Version-id of when the last refresh operation 
was performed |
+

Review Comment:
   We need to include the source view name in the `source-view` record. This 
would enable dummy clients to load and process data more effectively.
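
   A rough sketch of what this could look like for a client, under the assumption that the record gains `namespace` and `name` fields mirroring the storage table identifier. Those two fields are hypothetical and not part of the spec text in this PR, and `load_identifier` is an illustrative helper, not an existing API.

```python
def load_identifier(record):
    """Return a dotted identifier for a source view record, or None.

    Assumes hypothetical `namespace` (list of strings) and `name` fields
    alongside the `uuid` and `version-id` fields from the proposed spec.
    """
    name = record.get("name")
    if name is None:
        # With only `uuid` available, the client needs some other
        # uuid-to-identifier lookup before it can load the view.
        return None
    return ".".join([*record.get("namespace", []), name])
```

   With the name present, a simple client can load the view directly by identifier; with only the `uuid`, it has to resolve the identifier some other way first.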



##########
format/view-spec.md:
##########
@@ -160,6 +177,71 @@ Each entry in `version-log` is a struct with the following 
fields:
 | _required_  | `timestamp-ms` | Timestamp when the view's 
`current-version-id` was updated (ms from epoch) |
 | _required_  | `version-id`   | ID that `current-version-id` was set to |
 
+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `namespace`    | A list of strings for namespace levels |
+| _required_  | `name`         | A string specifying the name of the table |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that 
supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot 
summary](https://iceberg.apache.org/spec/#snapshots) property of every storage 
table snapshot to determine the freshness of the precomputed data of the 
storage table.
+
+| Requirement | Field name      | Description |
+|-------------|-----------------|-------------|
+| _required_  | `refresh-state` | A [refresh state](#refresh-state) record 
stored as a JSON-encoded string |
+
+#### Refresh state
+
+The refresh state record captures the state of source tables, views, and 
materialized views at refresh time.
+
+* Source view states are stored in `source-view-states`. It includes indirect 
references — views nested within other views (excluding MVs).
+* Source table states are stored in `source-table-states`. It includes 
indirect references - tables nested within other views (excluding MVs).
+
+For directly referenced source materialized views, both the source view and 
its storage table are included in the refresh state. Indirect references (views 
or tables) from source materialized views are excluded in the refresh-state. 
During read time, a query engine recursively expands the query tree to 
determine freshness if it chooses to enforce recursive evaluation semantic.
+
+The refresh state has the following fields:
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `view-version-id`         | The `version-id` of the 
materialized view when the refresh operation was performed  |
+| _required_  | `source-table-states`        | A list of [source 
table](#source-table) records for tables directly or indirectly referenced 
through common views, plus storage tables of directly referenced source 
materialized views |
+| _required_  | `source-view-states`         | A list of [source 
view](#source-view) records for all views (including materialized views) that 
are directly referenced, plus common views indirectly referenced through other 
common views |
+| _required_  | `refresh-start-timestamp-ms` | A timestamp of when the refresh 
operation was started |
+
+#### Source table
+
+A source table record captures the state of a source table (including source 
MV's storage table) at the time of the last refresh operation.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `uuid`         | The uuid of the source table |
+| _required_  | `snapshot-id`  | Snapshot-id of when the last refresh 
operation was performed |
+| _optional_  | `ref`          | Branch name of the source table being 
referenced in the view query |

Review Comment:
   We need to include the source table name in the `source-table` record. This 
would enable dummy clients to load and process data more effectively.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

