Re: Dedicated sync for Iceberg materialized view

Steven Wu Tue, 18 Nov 2025 10:54:11 -0800

A reminder for tomorrow's community sync for the MV spec.
https://calendar.app.google/T4zSk6qKWoy1vV6P7


We have one open question from the last meeting on how `max-stalenesss-ms`
should be interpreted. You can find more details in the meeting notes.
https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.75r8e0rwq02o

Please also bring other topics that we should discuss.

On Sat, Nov 1, 2025 at 10:14 PM Steven Wu <[email protected]> wrote:

> Sorry for the delay. Here are the recording and meeting notes for the MV
> sync meeting on Wednesday, Oct 29.
>
> https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.75r8e0rwq02o
>
> We have started to collect them in the above google doc.
>
> On Mon, Oct 27, 2025 at 8:58 AM Péter Váry <[email protected]>
> wrote:
>
>> If we have materialized views (MVs) and support for incremental change
>> scans, then by introducing a Java-based representation of the view, we can
>> expose a scan API that always returns up-to-date results for the MV.
>>
>> The scan could include multiple tasks:
>>
>>    - A task for reading the current version of the MV.
>>    - An incremental change log scan covering the range between the
>>    snapshot ID of the source table at the time the MV was last refreshed and
>>    its current snapshot ID. Applying the Java representation of the view when
>>    transformations are required.
>>
>> This approach allows us to build an always up-to-date index table/single
>> source MV, using existing components.
>>
>> Benny Chow <[email protected]> ezt írta (időpont: 2025. okt. 24., P,
>> 7:44):
>>
>>> Hi Peter
>>>
>>> I think the current proposal would support your example.  In most
>>> situations, replace table operations after a view is materialized wouldn’t
>>> invalidate the materialization.  However, if the view includes metadata
>>> columns, then the replace operations should invalidate the materialization.
>>>
>>>
>>> This also brings up another important point that engines will differ on
>>> what views can be materialized or not.  For example, maybe metadata columns
>>> are not allowed similar to non deterministic functions like random.  But
>>> some engines like Dremio may allow views that use current date functions.
>>> It should be possible for one engine to materialize a view and another
>>> engine to look at the query tree and decide it’s not a view it supports
>>> materializations on and choose not to use that materialization.
>>>
>>> Thanks
>>> Benny
>>>
>>>
>>>
>>> On Oct 23, 2025, at 8:44 AM, Péter Váry <[email protected]>
>>> wrote:
>>>
>>> 
>>> Hi All,
>>>
>>> I’ve been catching up on the discussion and wanted to share an
>>> observation. One aspect that stands out to me in the proposed staleness
>>> evaluation logic is that snapshots which don’t modify data can still affect
>>> the view’s contents if the view includes metadata columns.
>>>
>>> I was considering using a materialized view as an index for a given
>>> table to accelerate the conversion of equality deletes to position deletes.
>>> For example, the query might look like:
>>>
>>> *SELECT _POS, _FILE, id FROM target_table*
>>>
>>>
>>> During compaction, the materialized view would need to be refreshed to
>>> ensure it reflects the correct data.
>>>
>>> Does this seem like a valid use case? Or should we explicitly exclude
>>> scenarios like this?
>>>
>>> Thanks,
>>> Peter
>>>
>>> Steven Wu <[email protected]> ezt írta (időpont: 2025. okt. 20., H,
>>> 17:30):
>>>
>>>> Walaa,
>>>>
>>>> > while Option 2 is described in your summary as "giving engines
>>>> *flexibility* to determine freshness recursively beyond a source MV",
>>>> that *isn’t achievable* under the MV evaluation model itself.
>>>> Because each MV treats upstream MVs as physical tables, recursion stops
>>>> at the first materialized boundary; *deeper staleness cannot be
>>>> discovered without switching to a logical-view evaluation model, i.e.,
>>>> stepping outside the MV model altogether (note that in Option 3 we can
>>>> determine recursive staleness while still inside the MV model).*
>>>>
>>>> In option 2, when determining the freshness of mv_3, engines can choose
>>>> to recursively evaluate the freshness of mv_1 and mv_2 since they are also
>>>> MVs. But engines can also choose not to.
>>>>
>>>> > This means that there seems to be an implicit “Option 3”. This option
>>>> treats MVs as logical views, i.e., storing only view versions + base table
>>>> snapshot IDs (no MV storage snapshot IDs, no per-path lineage).
>>>>
>>>> In the new option 3 you described, how could the engine update mv3's
>>>> refresh state for base table_a and table_b? unless all connected MVs are
>>>> refreshed and committed in one single transaction, one entry per base table
>>>> doesn't seem feasible. That's the main reason for option 1 to require the
>>>> lineage path information in refresh state for base tables.
>>>>
>>>> It also seems that option 3 can only interpret freshness recursively,
>>>> while today there are engines that support MVs without recursively
>>>> evaluating source MVs.
>>>>
>>>> Thanks,
>>>> Steven
>>>>
>>>>
>>>> On Mon, Oct 20, 2025 at 1:44 AM Walaa Eldin Moustafa <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Steven,
>>>>>
>>>>> Thanks for organizing the series and summarizing the outcome.
>>>>>
>>>>> After re-reading the Option 1/2 proposal, initially I interpreted
>>>>> Option 1 as simply expanding MVs like regular logical views. On closer
>>>>> look, it is actually more complex. It also preserves per-path lineage 
>>>>> state
>>>>> (e.g., multiple entries for the same base table via different parents),
>>>>> which increases expressiveness but significantly increases metadata
>>>>> complexity. So I agree it is not a practical option.
>>>>>
>>>>> This means that there seems to be an implicit “Option 3”. This option
>>>>> treats MVs as logical views, i.e., storing only view versions + base table
>>>>> snapshot IDs (no MV storage snapshot IDs, no per-path lineage). Under this
>>>>> model, mv_3’s metadata might look like:
>>>>>
>>>>> Type   Name     Tracked State
>>>>> -----  -------  -----------------------
>>>>> view   mv_1     view_version_id
>>>>> view   mv_2     view_version_id
>>>>> table  table_a  table_snapshot_id
>>>>> table  table_b  table_snapshot_id
>>>>>
>>>>> This preserves logical semantics and aligns MV behavior with pure
>>>>> views.
>>>>>
>>>>> *If we choose Option 2 (treat source MV as a materialized table), we
>>>>> may have to be consider those constraints:*
>>>>>
>>>>> * Staleness only degrades up the chain. mv_1 and mv_2 may already be
>>>>> stale relative to the base tables, but if mv_3 is refreshed using their
>>>>> storage snapshots, then mv_3 will be marked as fresh under Option 2, even
>>>>> though all three MVs are stale relative to the base tables.
>>>>>
>>>>> * Engines can no longer discover staleness beyond mv_1. Once mv_3 sees
>>>>> mv_1 (or mv_2) as fresh based only on their storage snapshots, it will not
>>>>> expand into mv_1 or mv_2 to check whether they are stale relative to the
>>>>> base tables.
>>>>>
>>>>> * If mv_2 and mv_3 were purely logical views instead of MVs, they
>>>>> would evaluate directly against base tables and return newer data. Under
>>>>> Option 2, the same definitions but materialized upstream produce different
>>>>> data, not just different metadata.
>>>>>
>>>>> Therefore, while Option 2 is described in your summary as "giving
>>>>> engines *flexibility* to determine freshness recursively beyond a
>>>>> source MV", that *isn’t achievable* under the MV evaluation model
>>>>> itself.
>>>>> Because each MV treats upstream MVs as physical tables, recursion
>>>>> stops at the first materialized boundary; *deeper staleness cannot be
>>>>> discovered without switching to a logical-view evaluation model, i.e.,
>>>>> stepping outside the MV model altogether (note that in Option 3 we can
>>>>> determine recursive staleness while still inside the MV model).*
>>>>>
>>>>> Let me know your thoughts. I slightly prefer Option 3. I’m also fine
>>>>> with Option 2, but I don’t think the flexibility to recursively determine
>>>>> freshness actually exists under its evaluation model. Not sure if this
>>>>> changes anyone’s view, but I wanted to clarify how I’m reading it.
>>>>>
>>>>> Thanks,
>>>>> Walaa.
>>>>>
>>>>>
>>>>> On Wed, Oct 8, 2025 at 11:11 PM Benny Chow <[email protected]> wrote:
>>>>>
>>>>>> I just listened to the recording.  I'm the tech lead for MVs at
>>>>>> Dremio and responsible for both refresh management and query rewrites 
>>>>>> with
>>>>>> MVs.
>>>>>>
>>>>>> It's great that we seem to agree that Iceberg MV spec won't require
>>>>>> that MVs always be up to date in order to be usable for query rewrites.
>>>>>> There can be many data consistency issues (as Dan pointed out) but that 
>>>>>> is
>>>>>> the state of affairs today.
>>>>>>
>>>>>> It sounds like we are converging on the following scenarios for an
>>>>>> engine to validate the MV freshness:
>>>>>>
>>>>>> 1.  Use storage table without any validation.  This might be the
>>>>>> extreme "async MV" example.
>>>>>> 2.  Ignore storage table even if one exists because SQL command or
>>>>>> use case requires that.
>>>>>> 3.  Use storage table only if data is not more than x hours old.
>>>>>> This can be achieved with the proposed refresh-start-timestamp-ms which
>>>>>> is currently in the proposed spec.  For this to work with MVs built on 
>>>>>> MVs,
>>>>>> we should probably state in the spec that if a MV is built on another MV,
>>>>>> then it needs to inherit the refresh-start-timestamp-ms of the child MV.
>>>>>> In Steven's example, when building mv3, refresh-start-timestamp-ms needs 
>>>>>> to
>>>>>> be set to the minimum of mv1 or mv2's refresh-start-timestamp-ms.  If 
>>>>>> this
>>>>>> property name is confusing, we can rename it to
>>>>>> "refresh-earliest-table-timestamp-ms".  I originally proposed this 
>>>>>> property
>>>>>> and also listed out other benefits here:
>>>>>> https://github.com/apache/iceberg/pull/11041#discussion_r1779797796
>>>>>> Also, at the time, MVs built on MVs weren't being considered.  Now that 
>>>>>> it
>>>>>> is, I would recommend we have both "refresh-start-timestamp-ms" (when the
>>>>>> refresh was started on the storage table) and
>>>>>> "refresh-earliest-table-timestamp-ms" (used for freshness validation).
>>>>>> 4.  Don't use the storage table if it is older than X hours.  This is
>>>>>> what I had originally proposed for the
>>>>>> *materialization.max-stalessness-ms* view property here:
>>>>>> https://github.com/apache/iceberg/pull/11041#discussion_r1744837644
>>>>>> It wasn't meant to validate the freshness but more to prevent use of a
>>>>>> materialization after some criteria.
>>>>>> 5.  Use storage table if recursive validation passes... i.e.
>>>>>> refresh-state matches the current expanded query tree state.  This is 
>>>>>> what
>>>>>> I think Steven is calling the "synchronous MV".
>>>>>>
>>>>>> For scenario 1-4, it would support the nice use case of an Iceberg
>>>>>> client using a view's data through the storage table without needing to
>>>>>> know how to parse/validate/expand any view SQLs.
>>>>>>
>>>>>> In Dremio's planner, we primarily use scenario 1 and 4 together to
>>>>>> determine MV validity for query rewrite.  Scenario 2 and 5 also apply in
>>>>>> certain situations.  For scenario 3, Dremio only exposes the
>>>>>> "refresh-earliest-table-timestamp-ms" as an fyi to the user but it would 
>>>>>> be
>>>>>> interesting to allow the user to set this time so that they could run
>>>>>> queries and be 100% certain that they were not seeing data older than x
>>>>>> hours.
>>>>>>
>>>>>> Thanks
>>>>>> Benny
>>>>>>
>>>>>> On Wed, Oct 8, 2025 at 3:37 PM Steven Wu <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> correction for a typo.
>>>>>>>
>>>>>>> Prashanth brought up another scenario of compaction/rewrite where a
>>>>>>> new snapshot was added *with* actual data change
>>>>>>> -->
>>>>>>> Prashanth brought up another scenario of compaction/rewrite where a
>>>>>>> new snapshot was added *without* actual data change
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 8, 2025 at 2:12 PM Steven Wu <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Thanks everyone for joining the MV discussion meeting. We will
>>>>>>>> continue to have the recurring sync meeting on Wednesday 9 am (Pacific)
>>>>>>>> every 3 weeks until we get to the finish line where Jan's MV spec PR 
>>>>>>>> [1] is
>>>>>>>> merged. I have scheduled our next meeting on Oct 29 in the Iceberg dev
>>>>>>>> events calendar.
>>>>>>>>
>>>>>>>> Here is the video recording for today's meeting.
>>>>>>>>
>>>>>>>> https://drive.google.com/file/d/1-nfhBPDWLoAFDu5cKP0rwLd_30HB6byR/view?usp=sharing
>>>>>>>>
>>>>>>>> We mostly discussed freshness evaluation. Here is the meeting
>>>>>>>> summary.
>>>>>>>>
>>>>>>>>    1. For tracking the refresh state for the source MV [2], the
>>>>>>>>    consensus is option 2 (treating source MV as a materialized table) 
>>>>>>>> which
>>>>>>>>    would give engines the flexibility on freshness determination 
>>>>>>>> (recursive
>>>>>>>>    beyond source MV or not).
>>>>>>>>    2. Earlier design doc [3] discussed max staleness config. But
>>>>>>>>    it wasn't reflected in the spec PR. The general opinion is to add 
>>>>>>>> the
>>>>>>>>    config to the spec PR. The open question is whether the `
>>>>>>>>    materialization.max-staleness-ms` config should be added to the
>>>>>>>>    view metadata or the storage table metadata. Either can work. We 
>>>>>>>> just need
>>>>>>>>    to decide which makes a little better fit.
>>>>>>>>    3. Prashanth brought up schema change with default value and
>>>>>>>>    how it may affect the MV refresh state (for SQL representation with 
>>>>>>>> select
>>>>>>>>    *). Jan mentioned that snapshot contains schema id when the 
>>>>>>>> snapshot was
>>>>>>>>    created. Engine can compare the snapshot schema id to the source 
>>>>>>>> table
>>>>>>>>    schema id during freshness evaluation. There is no need for 
>>>>>>>> additional
>>>>>>>>    schema info in refresh-state tracking in the storage table.
>>>>>>>>    4. Prashanth brought up another scenario of compaction/rewrite
>>>>>>>>    where a new snapshot was added with actual data change. The general 
>>>>>>>> take is
>>>>>>>>    that the engine can optimize and decide that MV is fresh as the new
>>>>>>>>    snapshot doesn't have any data change.
>>>>>>>>
>>>>>>>>
>>>>>>>> We can add some clarifications in the spec PR for freshness
>>>>>>>> evaluation based on the above discussions.
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/iceberg/pull/11041
>>>>>>>> [2]
>>>>>>>> https://docs.google.com/document/d/1_StBW5hCQhumhIvgbdsHjyW0ED3dWMkjtNzyPp9Sfr8/edit?tab=t.0
>>>>>>>> [3]
>>>>>>>> https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?tab=t.0#heading=h.3wigecex0zls
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Sep 25, 2025 at 9:27 AM Steven Wu <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Iceberg materialized view has been discussed in the community for
>>>>>>>>> a long time. Thanks Jan Kaul for driving the discussion and the spec 
>>>>>>>>> PR. It
>>>>>>>>> has been stalled for a long time due to lack of consensus on 1 or 2 
>>>>>>>>> topics.
>>>>>>>>> In Wed's Iceberg community sync meeting, Talat brought up the 
>>>>>>>>> question on
>>>>>>>>> how to move forward and if we can have a dedicated meeting for MV.
>>>>>>>>>
>>>>>>>>> I have set up a meeting on *Oct 8 (9-10 am Pacific)*. If you
>>>>>>>>> subscribe to the "Iceberg Dev Events" calendar, you should be
>>>>>>>>> able to see it. If not, here is the link:
>>>>>>>>> https://meet.google.com/nfe-guyq-pqf
>>>>>>>>>
>>>>>>>>> We are going to discuss
>>>>>>>>> * remaining open questions
>>>>>>>>> * unresolved concerns
>>>>>>>>> * the next step and hopefully some consensus on moving forward
>>>>>>>>>
>>>>>>>>> MV spec PR is up to date. Jan has incorporated recent feedback.
>>>>>>>>> This should be the base of the discussion.
>>>>>>>>> https://github.com/apache/iceberg/pull/11041
>>>>>>>>> <https://www.google.com/url?q=https://github.com/apache/iceberg/pull/11041&sa=D&source=calendar&usd=2&usg=AOvVaw3w0TjRpwbC17AGzmxZmElM>
>>>>>>>>>
>>>>>>>>> Dev discussion thread (a long-running thread started by Jan).
>>>>>>>>> https://lists.apache.org/thread/y1vlpzbn2x7xookjkffcl08zzyofk5hf
>>>>>>>>> <https://www.google.com/url?q=https://lists.apache.org/thread/y1vlpzbn2x7xookjkffcl08zzyofk5hf&sa=D&source=calendar&usd=2&usg=AOvVaw0fotlsrnRBOb820mA5JRyB>
>>>>>>>>>
>>>>>>>>> The mail archive has broken lineage and doesn't show all replies.
>>>>>>>>> Email subject is "*[DISCUSS] Iceberg Materialzied Views*".
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Steven
>>>>>>>>>
>>>>>>>>>

Re: Dedicated sync for Iceberg materialized view

Reply via email to