Re: Dedicated sync for Iceberg materialized view

Jan Kaul via dev Tue, 25 Nov 2025 23:49:45 -0800

Thank you Steven,

I've included the "max-staleness" in the PR. Please have a look and give feedback on the phrasing.


Thanks,

Jan

On 11/19/25 22:37, Steven Wu wrote:

Thanks everyone for joining today's sync. We had a good discussion on how to interpret the "max staleness" config.


You can find the meeting notes here.
https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.eho7jgm13usg

Recording is also linked in the doc (thanks Kevin).

For the next step, maybe we can collaborate on the MV spec PR to flesh out the exact wording for the staleness config and its semantics.

https://github.com/apache/iceberg/pull/11041/files


On Tue, Nov 18, 2025 at 1:05 PM Benny Chow <[email protected]> wrote:

    Thanks Igor.  The PR has a suggestion for exactly what you
    suggested.  I called it a "*/warm/*" state which is a state where
    stale materialization can still be used.
    https://github.com/apache/iceberg/pull/11041/files#r2474661166

    I think if we continue with the assumption that MVs can only
    reference iceberg tables and views, then it makes sense for the
    max-staleness grace period to be dynamic based on snapshot
    history.   This is what Trino does:
    
https://trino.io/docs/current/connector/iceberg.html?utm_source=chatgpt.com#materialized-views

    If there are non-Iceberg tables in the view SQL, then the grace
    period will have to be based on last refresh which is also what
    Trino describes here:
    
https://trino.io/docs/current/sql/create-materialized-view.html#mv-grace-period

    Should we call out both scenarios in the MV spec?  I think this is
    worth being explicit here.

    Thanks


    On Tue, Nov 18, 2025 at 11:03 AM Igor Belianski
    <[email protected]> wrote:

        Re: max-staleness-ms interpretation
        Proposal:
           A Materialized View (MV) is considered fresh if and only if
        the stored results are equivalent to those that would have
        been obtained by running the MV's defining query at some point
        in time within the interval:
         [CurrentTime - max-staleness-ms, CurrentTime]

        Note: this definition allows for the optimization proposed by
        option 2 (implementing which is definitely a great idea), but
        doesn't mandate it.
         One can also imagine other optimizations that would be
        possible given the definition above, and these would be left
        up to the engines to implement.





        On Tue, Nov 18, 2025 at 10:54 AM Steven Wu
        <[email protected]> wrote:

            A reminder for tomorrow's community sync for the MV spec.
            https://calendar.app.google/T4zSk6qKWoy1vV6P7

            We have one open question from the last meeting on how
            `max-staleness-ms` should be interpreted. You can find
            more details in the meeting notes.
            
https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.75r8e0rwq02o

            Please also bring other topics that we should discuss.

            On Sat, Nov 1, 2025 at 10:14 PM Steven Wu
            <[email protected]> wrote:

                Sorry for the delay. Here are the recording and
                meeting notes for the MV sync meeting on Wednesday,
                Oct 29.
                
https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.75r8e0rwq02o

                We have started to collect them in the above google doc.

                On Mon, Oct 27, 2025 at 8:58 AM Péter Váry
                <[email protected]> wrote:

                    If we have materialized views (MVs) and support
                    for incremental change scans, then by introducing
                    a Java-based representation of the view, we can
                    expose a scan API that always returns up-to-date
                    results for the MV.

                    The scan could include multiple tasks:

                      * A task for reading the current version of the MV.
                      * An incremental change log scan covering the
                        range between the snapshot ID of the source
                        table at the time the MV was last refreshed
                        and its current snapshot ID, applying the Java
                        representation of the view when
                        transformations are required.

                    This approach allows us to build an always
                    up-to-date index table/single source MV, using
                    existing components.
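
The two-task scan described above could be planned roughly as follows. This is only an illustration for a single-source MV: `ScanTask` and `plan_up_to_date_scan` are hypothetical names, not Iceberg's scan API, and the snapshot IDs are assumed to come from the MV's refresh state and the source table metadata.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ScanTask:
    kind: str  # "storage" or "incremental"
    table: str
    from_snapshot_id: Optional[int] = None  # source snapshot at last refresh
    to_snapshot_id: Optional[int] = None    # current source snapshot


def plan_up_to_date_scan(
    storage_table: str,
    source_table: str,
    refreshed_snapshot_id: int,
    current_snapshot_id: int,
) -> List[ScanTask]:
    # Task 1: always read the current version of the materialization.
    tasks = [ScanTask("storage", storage_table)]
    # Task 2: only needed when the source has moved on since the refresh.
    if current_snapshot_id != refreshed_snapshot_id:
        tasks.append(
            ScanTask(
                "incremental",
                source_table,
                from_snapshot_id=refreshed_snapshot_id,
                to_snapshot_id=current_snapshot_id,
            )
        )
    return tasks
```

The reader would still have to apply the view's transformation to the rows from the incremental task before merging them with the storage table rows.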

                    On Fri, Oct 24, 2025 at 7:44 AM Benny Chow
                    <[email protected]> wrote:

                        Hi Peter

                        I think the current proposal would support
                        your example. In most situations, replace
                        table operations after a view is materialized
                        wouldn’t invalidate the materialization. 
                        However, if the view includes metadata
                        columns, then the replace operations should
                        invalidate the materialization.

                        This also brings up another important point
                        that engines will differ on what views can be
                        materialized or not.  For example, maybe
                        metadata columns are not allowed, similar to
                        non-deterministic functions like random.  But
                        some engines like Dremio may allow views that
                        use current date functions.  It should be
                        possible for one engine to materialize a view
                        and another engine to look at the query tree
                        and decide it’s not a view it supports
                        materializations on and choose not to use that
                        materialization.

                        Thanks
                        Benny

                        On Oct 23, 2025, at 8:44 AM, Péter Váry
                        <[email protected]> wrote:

                        
                        Hi All,

                        I’ve been catching up on the discussion and
                        wanted to share an observation. One aspect
                        that stands out to me in the proposed
                        staleness evaluation logic is that snapshots
                        which don’t modify data can still affect the
                        view’s contents if the view includes metadata
                        columns.

                        I was considering using a materialized view
                        as an index for a given table to accelerate
                        the conversion of equality deletes to
                        position deletes. For example, the query
                        might look like:

                            SELECT _POS, _FILE, id FROM target_table


                        During compaction, the materialized view
                        would need to be refreshed to ensure it
                        reflects the correct data.

                        Does this seem like a valid use case? Or
                        should we explicitly exclude scenarios like this?

                        Thanks,
                        Peter

                        On Mon, Oct 20, 2025 at 5:30 PM Steven Wu
                        <[email protected]> wrote:

                            Walaa,

                            > while Option 2 is described in your
                            summary as "giving engines
                            */flexibility/* to determine freshness
                            recursively beyond a source MV", that
                            *isn’t achievable* under the MV
                            evaluation model itself.
                            Because each MV treats upstream MVs as
                            physical tables, recursion stops at the
                            first materialized boundary; *deeper
                            staleness cannot be discovered without
                            switching to a logical-view evaluation
                            model, i.e., stepping outside the MV
                            model altogether /(note that in Option 3
                            we can determine recursive staleness
                            while still inside the MV model)./*

                            In option 2, when determining the
                            freshness of mv_3, engines can choose to
                            recursively evaluate the freshness of
                            mv_1 and mv_2 since they are also MVs.
                            But engines can also choose not to.
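
To make the optional recursion concrete, here is a rough sketch under an assumed layout where mv_1 reads table_a, mv_2 reads table_b, and mv_3 reads mv_1 and mv_2. `Source`, `is_fresh`, and the in-memory `catalog` are hypothetical; under option 2 the refresh state of mv_3 records the storage table snapshot IDs of mv_1 and mv_2.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Source:
    """A base table or an MV storage table (hypothetical model)."""
    current_snapshot_id: int
    # Empty for base tables. For an MV: source name -> snapshot ID that
    # the source had when this MV was last refreshed.
    refresh_state: Dict[str, int] = field(default_factory=dict)


def is_fresh(name: str, catalog: Dict[str, Source], recursive: bool) -> bool:
    for src_name, refreshed_id in catalog[name].refresh_state.items():
        src = catalog[src_name]
        # Direct check: has the source moved since the last refresh?
        if src.current_snapshot_id != refreshed_id:
            return False
        # Optional: also require every source MV to be fresh itself.
        if recursive and not is_fresh(src_name, catalog, recursive=True):
            return False
    return True


catalog = {
    "table_a": Source(2),
    "table_b": Source(7),
    "mv_1": Source(10, {"table_a": 1}),  # stale: table_a moved from 1 to 2
    "mv_2": Source(20, {"table_b": 7}),
    "mv_3": Source(30, {"mv_1": 10, "mv_2": 20}),
}
```

With this catalog, the non-recursive check treats mv_3 as fresh because the storage snapshots of mv_1 and mv_2 are unchanged, while the recursive check reports it as stale because mv_1 lags behind table_a.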

                            > This means that there seems to be an
                            implicit “Option 3”. This option treats
                            MVs as logical views, i.e., storing only
                            view versions + base table snapshot IDs
                            (no MV storage snapshot IDs, no per-path
                            lineage).

                            In the new option 3 you described, how
                            could the engine update mv3's refresh
                            state for base table_a and table_b?
                            unless all connected MVs are refreshed
                            and committed in one single transaction,
                            one entry per base table doesn't seem
                            feasible. That's the main reason for
                            option 1 to require the lineage path
                            information in refresh state for base tables.

                            It also seems that option 3 can only
                            interpret freshness recursively, while
                            today there are engines that support MVs
                            without recursively evaluating source MVs.

                            Thanks,
                            Steven


                            On Mon, Oct 20, 2025 at 1:44 AM Walaa
                            Eldin Moustafa <[email protected]> wrote:

                                Hi Steven,

                                Thanks for organizing the series and
                                summarizing the outcome.

                                After re-reading the Option 1/2
                                proposal, initially I interpreted
                                Option 1 as simply expanding MVs like
                                regular logical views. On closer
                                look, it is actually more complex. It
                                also preserves per-path lineage state
                                (e.g., multiple entries for the same
                                base table via different parents),
                                which increases expressiveness but
                                significantly increases metadata
                                complexity. So I agree it is not a
                                practical option.

                                This means that there seems to be an
                                implicit “Option 3”. This option
                                treats MVs as logical views, i.e.,
                                storing only view versions + base
                                table snapshot IDs (no MV storage
                                snapshot IDs, no per-path lineage).
                                Under this model, mv_3’s metadata
                                might look like:

                                Type   Name     Tracked State
                                -----  -------  -----------------------
                                view   mv_1     view_version_id
                                view   mv_2     view_version_id
                                table  table_a  table_snapshot_id
                                table  table_b  table_snapshot_id

                                This preserves logical semantics and
                                aligns MV behavior with pure views.

                                *If we choose Option 2 (treat source
                                MV as a materialized table), we may
                                have to consider these constraints:*

                                * Staleness only degrades up the
                                chain. mv_1 and mv_2 may already be
                                stale relative to the base tables,
                                but if mv_3 is refreshed using their
                                storage snapshots, then mv_3 will be
                                marked as fresh under Option 2, even
                                though all three MVs are stale
                                relative to the base tables.

                                * Engines can no longer discover
                                staleness beyond mv_1. Once mv_3 sees
                                mv_1 (or mv_2) as fresh based only on
                                their storage snapshots, it will not
                                expand into mv_1 or mv_2 to check
                                whether they are stale relative to
                                the base tables.

                                * If mv_2 and mv_3 were purely
                                logical views instead of MVs, they
                                would evaluate directly against base
                                tables and return newer data. Under
                                Option 2, the same definitions but
                                materialized upstream produce
                                different data, not just different
                                metadata.

                                Therefore, while Option 2 is
                                described in your summary as "giving
                                engines */flexibility/* to determine
                                freshness recursively beyond a source
                                MV", that *isn’t achievable* under
                                the MV evaluation model itself.
                                Because each MV treats upstream MVs
                                as physical tables, recursion stops
                                at the first materialized boundary;
                                *deeper staleness cannot be
                                discovered without switching to a
                                logical-view evaluation model, i.e.,
                                stepping outside the MV model
                                altogether /(note that in Option 3 we
                                can determine recursive staleness
                                while still inside the MV model)./*

                                Let me know your thoughts. I slightly
                                prefer Option 3. I’m also fine with
                                Option 2, but I don’t think the
                                flexibility to recursively determine
                                freshness actually exists under its
                                evaluation model. Not sure if this
                                changes anyone’s view, but I wanted
                                to clarify how I’m reading it.

                                Thanks,
                                Walaa.


                                On Wed, Oct 8, 2025 at 11:11 PM Benny
                                Chow <[email protected]> wrote:

                                    I just listened to the
                                    recording.  I'm the tech lead for
                                    MVs at Dremio and responsible for
                                    both refresh management and query
                                    rewrites with MVs.

                                    It's great that we seem to agree
                                    that Iceberg MV spec won't
                                    require that MVs always be up to
                                    date in order to be usable for
                                    query rewrites.  There can be
                                    many data consistency issues (as
                                    Dan pointed out) but that is the
                                    state of affairs today.

                                    It sounds like we are converging
                                    on the following scenarios for an
                                    engine to validate the MV freshness:

                                    1.  Use storage table without any
                                    validation.  This might be the
                                    extreme "async MV" example.
                                    2.  Ignore storage table even if
                                    one exists because SQL command or
                                    use case requires that.
                                    3.  Use storage table only if
                                    data is not more than x hours
                                    old.  This can be achieved with
                                    the proposed
                                    refresh-start-timestamp-ms
                                    which is currently in the
                                    proposed spec. For this to work
                                    with MVs built on MVs, we should
                                    probably state in the spec that
                                    if a MV is built on another MV,
                                    then it needs to inherit the
                                    refresh-start-timestamp-ms of the
                                    child MV. In Steven's example,
                                    when building mv3,
                                    refresh-start-timestamp-ms needs
                                    to be set to the minimum of mv1's
                                    and mv2's
                                    refresh-start-timestamp-ms. If
                                    this property name is confusing,
                                    we can rename it to
                                    "refresh-earliest-table-timestamp-ms".
                                    I originally proposed this
                                    property and also listed out
                                    other benefits here:
                                    
https://github.com/apache/iceberg/pull/11041#discussion_r1779797796
                                    Also, at the time, MVs built on
                                    MVs weren't being considered.
                                    Now that they are, I would recommend
                                    we have both
                                    "refresh-start-timestamp-ms"
                                    (when the refresh was started on
                                    the storage table) and
                                    "refresh-earliest-table-timestamp-ms"
                                    (used for freshness validation).
                                    4. Don't use the storage table if
                                    it is older than X hours.  This
                                    is what I had originally proposed
                                    for the
                                    *materialization.max-staleness-ms*
                                    view property here:
                                    
https://github.com/apache/iceberg/pull/11041#discussion_r1744837644
                                    It wasn't meant to validate the
                                    freshness but more to prevent use
                                    of a materialization after some
                                    criteria.
                                    5. Use storage table if recursive
                                    validation passes... i.e.
                                    refresh-state matches the current
                                    expanded query tree state. This
                                    is what I think Steven is calling
                                    the "synchronous MV".
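
Scenario 3 and the timestamp inheritance for MVs built on MVs can be sketched as below. The helper names are hypothetical, and the sketch assumes that base tables contribute the refresh start time of the MV that reads them.

```python
from typing import Iterable


def earliest_table_timestamp_ms(
    refresh_start_ms: int,
    source_mv_earliest_ms: Iterable[int] = (),
) -> int:
    """Value to store as "refresh-earliest-table-timestamp-ms".

    refresh_start_ms: when this refresh started reading its sources.
    source_mv_earliest_ms: the same property inherited from each source MV
        (empty when the MV reads base tables only).
    """
    return min([refresh_start_ms, *source_mv_earliest_ms])


def usable_for_scenario_3(
    now_ms: int, earliest_table_ts_ms: int, max_age_ms: int
) -> bool:
    # Use the storage table only if no contributing data is older than max_age_ms.
    return now_ms - earliest_table_ts_ms <= max_age_ms
```

For example, if mv1 and mv2 carry earliest timestamps of 2h and 5h and mv3 starts refreshing at 8h, mv3 inherits 2h, so a query at 9h that tolerates 4 hours of age would skip mv3's storage table.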

                                    Scenarios 1-4 would
                                    support the nice use case of an
                                    Iceberg client using a view's
                                    data through the storage table
                                    without needing to know how to
                                    parse/validate/expand any view SQLs.

                                    In Dremio's planner, we primarily
                                    use scenario 1 and 4 together to
                                    determine MV validity for query
                                    rewrite.  Scenario 2 and 5 also
                                    apply in certain situations. For
                                    scenario 3, Dremio only exposes
                                    the
                                    "refresh-earliest-table-timestamp-ms"
                                    as an fyi to the user but it
                                    would be interesting to allow the
                                    user to set this time so that
                                    they could run queries and be
                                    100% certain that they were not
                                    seeing data older than x hours.

                                    Thanks
                                    Benny

                                    On Wed, Oct 8, 2025 at 3:37 PM
                                    Steven Wu <[email protected]>
                                    wrote:

                                        correction for a typo.

                                        Prashanth brought up another
                                        scenario of
                                        compaction/rewrite where a
                                        new snapshot was added *with*
                                        actual data change
                                        -->
                                        Prashanth brought up another
                                        scenario of
                                        compaction/rewrite where a
                                        new snapshot was added
                                        *without* actual data change


                                        On Wed, Oct 8, 2025 at
                                        2:12 PM Steven Wu
                                        <[email protected]> wrote:

                                            Hi,

                                            Thanks everyone for
                                            joining the MV discussion
                                            meeting. We will continue
                                            to have the recurring
                                            sync meeting on Wednesday
                                            9 am (Pacific) every 3
                                            weeks until we get to the
                                            finish line where Jan's
                                            MV spec PR [1] is merged.
                                            I have scheduled our next
                                            meeting on Oct 29 in the
                                            Iceberg dev events calendar.

                                            Here is the video
                                            recording for today's
                                            meeting.
                                            
https://drive.google.com/file/d/1-nfhBPDWLoAFDu5cKP0rwLd_30HB6byR/view?usp=sharing

                                            We mostly discussed
                                            freshness evaluation.
                                            Here is the meeting summary.

                                             1. For tracking the
                                                refresh state for the
                                                source MV [2], the
                                                consensus is option 2
                                                (treating source MV
                                                as a materialized
                                                table) which would
                                                give engines the
                                                flexibility on
                                                freshness
                                                determination
                                                (recursive beyond
                                                source MV or not).
                                             2. Earlier design doc
                                                [3] discussed max
                                                staleness config. But
                                                it wasn't reflected
                                                in the spec PR. The
                                                general opinion is to
                                                add the config to the
                                                spec PR. The open
                                                question is whether
                                                the
                                                
`materialization.max-staleness-ms`
                                                config should be
                                                added to the view
                                                metadata or the
                                                storage table
                                                metadata. Either can
                                                work. We just need to
                                                decide which makes a
                                                little better fit.
                                             3. Prashanth brought up
                                                schema change with
                                                default value and how
                                                it may affect the MV
                                                refresh state (for
                                                SQL representation
                                                with select *). Jan
                                                mentioned that
                                                snapshot contains
                                                schema id when the
                                                snapshot was created.
                                                Engine can compare
                                                the snapshot schema
                                                id to the source
                                                table schema id
                                                during freshness
                                                evaluation. There is
                                                no need for
                                                additional schema
                                                info in refresh-state
                                                tracking in the
                                                storage table.
                                             4. Prashanth brought up
                                                another scenario of
                                                compaction/rewrite
                                                where a new snapshot
                                                was added with actual
                                                data change. The
                                                general take is that
                                                the engine can
                                                optimize and decide
                                                that MV is fresh as
                                                the new snapshot
                                                doesn't have any data
                                                change.
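
Points 3 and 4 could be combined into one check, sketched here under the assumption that a snapshot whose summary `operation` is `replace` (files rewritten without changing table data, as with compaction) does not affect the MV. `Snapshot` and `still_fresh` are hypothetical names, not part of the spec PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    operation: str  # snapshot summary "operation", e.g. "append", "replace"
    schema_id: int  # schema ID recorded when the snapshot was created


def still_fresh(
    refreshed: Snapshot,
    later: List[Snapshot],
    current_table_schema_id: int,
) -> bool:
    """refreshed: the source snapshot the MV was last refreshed from.
    later: source snapshots committed after it, oldest first.
    """
    # Point 3: a schema change since the refresh may alter results,
    # for example SELECT * after a column with a default value is added.
    if current_table_schema_id != refreshed.schema_id:
        return False
    # Point 4: snapshots that only rewrite files leave the data unchanged.
    return all(s.operation == "replace" for s in later)
```

A view that selects metadata columns would need a stricter rule, since a rewrite changes file paths and row positions even when the data is the same.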


                                            We can add some
                                            clarifications in the
                                            spec PR for freshness
                                            evaluation based on the
                                            above discussions.

                                            [1]
                                            
https://github.com/apache/iceberg/pull/11041
                                            [2]
                                            
https://docs.google.com/document/d/1_StBW5hCQhumhIvgbdsHjyW0ED3dWMkjtNzyPp9Sfr8/edit?tab=t.0
                                            [3]
                                            
https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?tab=t.0#heading=h.3wigecex0zls




                                            On Thu, Sep 25, 2025 at
                                            9:27 AM Steven Wu
                                            <[email protected]> wrote:

                                                Hi all,

                                                Iceberg materialized
                                                view has been
                                                discussed in the
                                                community for a long
                                                time. Thanks Jan Kaul
                                                for driving the
                                                discussion and the
                                                spec PR. It has been
                                                stalled for a long
                                                time due to lack of
                                                consensus on 1 or 2
                                                topics. In Wed's
                                                Iceberg community
                                                sync meeting, Talat
                                                brought up the
                                                question on how to
                                                move forward and if
                                                we can have a
                                                dedicated meeting for MV.

                                                I have set up a
                                                meeting on *Oct 8
                                                (9-10 am Pacific)*.
                                                If you subscribe to
                                                the "Iceberg Dev
                                                Events" calendar, you
                                                should be able to see
                                                it. If not, here is
                                                the link:
                                                
https://meet.google.com/nfe-guyq-pqf

                                                We are going to discuss
                                                * remaining open
                                                questions
                                                * unresolved concerns
                                                * the next step and
                                                hopefully some
                                                consensus on moving
                                                forward

                                                MV spec PR is up to
                                                date. Jan has
                                                incorporated recent
                                                feedback. This should
                                                be the base of the
                                                discussion.
                                                
https://github.com/apache/iceberg/pull/11041
                                                

                                                Dev discussion thread
                                                (a long-running
                                                thread started by Jan).
                                                
https://lists.apache.org/thread/y1vlpzbn2x7xookjkffcl08zzyofk5hf
                                                

                                                The mail archive has
                                                broken lineage and
                                                doesn't show all
                                                replies. Email
                                                subject is
                                                "*[DISCUSS] Iceberg
                                                Materialized Views*".

                                                Thanks,
                                                Steven
