stevenzwu commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r3397278429
##########
format/view-spec.md:
##########
@@ -42,12 +42,27 @@ An atomic swap of one view metadata file for another provides the basis for maki
 Writers create view metadata files optimistically, assuming that the current metadata location will not be changed before the writer's commit. Once a writer has created an update, it commits by swapping the view's metadata file pointer from the base location to the new location.

+### Materialized Views
+
+Materialized views are a type of view with precomputed results from the view query stored as a table.
+When queried, engines may return the precomputed data for the materialized views, shifting the cost of query execution to the precomputation step.
+
+Iceberg materialized views are implemented as a combination of an Iceberg view and an underlying Iceberg table, the "storage-table", which stores the precomputed data.
+Materialized View metadata is a superset of View metadata with an additional pointer to the storage table. The storage table is an Iceberg table with additional materialized view refresh state metadata.
+Refresh metadata contains information about the "source tables", "source views", and/or "source materialized views", which are the tables/views/materialized views referenced in the query definition of the materialized view.
+
 ## Specification

 ### Terms

 * **Schema** -- Names and types of fields in a view.
 * **Version** -- The state of a view at some point in time.
+* **Storage table** -- Iceberg table that stores the precomputed data of a materialized view.
+* **Refresh state** -- A record stored in the storage table's snapshot summary that captures the state of source tables and views at the time of the last refresh operation.
+* **Dependency graph** -- The graph of all source tables, views, and materialized views that a materialized view depends on, including nested dependencies.
+* **Source table** -- A table reference that occurs in the query definition of a materialized view.
Review Comment:
We define the sources as direct children here. In the refresh state, the field name is `source-states`. Taken together, those two conflict with the wording `Producers may selectively choose a subset of their dependencies to record`. We need to nail down these terms. A few other names have been floated in other comment threads, such as `base` and `upstream`.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
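
For readers following the terminology question in the review comment, here is a minimal Python sketch of the difference between "sources" read as direct children and the full dependency graph that includes nested dependencies. Every name in it (`DIRECT_SOURCES`, `mv_sales_summary`, `v_sales`, the `t_*` tables, and the helper functions) is hypothetical and chosen only for illustration; none of it is Iceberg API or part of the spec text under review.

```python
from typing import Dict, List, Set

# Hypothetical catalog: each view or materialized view maps to the objects
# referenced directly in its query definition. Plain tables have no entry.
DIRECT_SOURCES: Dict[str, List[str]] = {
    "mv_sales_summary": ["v_sales", "t_regions"],
    "v_sales": ["t_orders", "t_customers"],
}


def direct_sources(name: str) -> Set[str]:
    """Sources in the 'direct children' sense used by the Terms list."""
    return set(DIRECT_SOURCES.get(name, []))


def dependency_graph(name: str) -> Set[str]:
    """All dependencies of `name`, including nested ones."""
    seen: Set[str] = set()
    stack = list(DIRECT_SOURCES.get(name, []))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Descend into views that have their own sources.
        stack.extend(DIRECT_SOURCES.get(node, []))
    return seen


# A producer that records only the leaf tables (an assumed, illustrative choice).
recorded = {"t_orders", "t_customers", "t_regions"}

print(recorded <= dependency_graph("mv_sales_summary"))  # subset of dependencies
print(recorded <= direct_sources("mv_sales_summary"))    # subset of direct sources
```

In this sketch the recorded set is a subset of the dependency graph but not of the direct sources, which is one way to read the mismatch between the `source-states` field name and the "subset of their dependencies" wording.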
