sfc-gh-ibelianski commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r2593951478
##########
format/view-spec.md:
##########
@@ -42,12 +42,28 @@ An atomic swap of one view metadata file for another provides the basis for maki
 Writers create view metadata files optimistically, assuming that the current metadata location will not be changed before the writer's commit. Once a writer has created an update, it commits by swapping the view's metadata file pointer from the base location to the new location.
+### Materialized Views
+
+Materialized views are a type of view with precomputed results from the view query stored as a table.
+When queried, engines may return the precomputed data for the materialized views, shifting the cost of query execution to the precomputation step.
+
+Iceberg materialized views are implemented as a combination of an Iceberg view and an underlying Iceberg table, the "storage-table", which stores the precomputed data.
+Materialized View metadata is a superset of View metadata with an additional pointer to the storage table. The storage table is an Iceberg table with additional materialized view refresh state metadata.
+Refresh metadata contains information about the "source tables" and/or "source views", which are the tables/views referenced in the query definition of the materialized view.
+During read time, a materialized view (storage table) can be interpreted as "fresh", "stale" or "invalid", depending on the following situations:
+* **fresh** -- The `snapshot_id`s of the last refresh operation match the current `snapshot_id`s of all the source tables, OR all source table snapshots that differ from the last refresh have timestamps within a configured staleness window.

Review Comment:
   I think we agreed during the last meeting to use "delayed view definition/semantics", as it doesn't mandate optimizations and allows for false negatives (an implementation may claim that a view is stale because its freshness logic is naive).
   The wording above mandates a very specific optimization, namely the additional checks needed when the last refresh falls outside of the staleness window.

   Proposal:
   * **Invalid** -- The current version_id of the materialized view does not match the view-version-id recorded in its refresh state. A read operation cannot proceed using the materialized view's data.
   * **Fresh** -- The materialized view is valid and the stored data represents the result set that would have been retrieved if the underlying view query had been executed at some point during the defined staleness window.
   * **Stale** -- The materialized view is valid but not fresh.
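
To make the proposed semantics concrete, here is a rough sketch of how a reader might classify a materialized view. It is only an illustration under stated assumptions, not spec text: `MvState`, `RefreshState`, `classify`, and every field and parameter name are hypothetical, and the two freshness checks are deliberately naive. It also assumes the view query is deterministic, so that unchanged source snapshots imply an unchanged result.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class MvState(Enum):
    INVALID = "invalid"
    FRESH = "fresh"
    STALE = "stale"


@dataclass
class RefreshState:
    """Refresh state recorded on the storage table (hypothetical field names)."""
    view_version_id: int
    refresh_timestamp_ms: int
    source_snapshot_ids: Dict[str, int]


def classify(current_version_id: int,
             refresh: RefreshState,
             current_snapshot_ids: Dict[str, int],
             now_ms: int,
             staleness_window_ms: int) -> MvState:
    # Invalid: the stored data was computed for a different view version.
    if refresh.view_version_id != current_version_id:
        return MvState.INVALID

    # Naive check 1: the refresh ran inside the staleness window, so the
    # stored data is the query result at a point within that window.
    if now_ms - refresh.refresh_timestamp_ms <= staleness_window_ms:
        return MvState.FRESH

    # Naive check 2: no source has a new snapshot since the refresh, so
    # running the query now would return the stored data (assumes a
    # deterministic view query).
    if refresh.source_snapshot_ids == current_snapshot_ids:
        return MvState.FRESH

    # Valid but not provably fresh. This may be a false negative.
    return MvState.STALE
```

Note that this sketch reports stale when the refresh is older than the window and a source has changed, even if every change was committed inside the window. Under the proposed wording that is an acceptable false negative; an engine is free to add the finer-grained snapshot timestamp check as an optimization.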
