On Oct 23, 2025, at 8:44 AM, Péter Váry
<[email protected]> wrote:
Hi All,
I’ve been catching up on the discussion and
wanted to share an observation. One aspect
that stands out to me in the proposed
staleness evaluation logic is that snapshots
which don’t modify data can still affect the
view’s contents if the view includes metadata
columns.
I was considering using a materialized view
as an index for a given table to accelerate
the conversion of equality deletes to
position deletes. For example, the query
might look like:
/SELECT *_POS*, *_FILE*, id FROM
target_table/
During compaction, the materialized view
would need to be refreshed to ensure it
reflects the correct data.
Does this seem like a valid use case? Or
should we explicitly exclude scenarios like this?
Thanks,
Peter
Steven Wu <[email protected]> ezt írta
(időpont: 2025. okt. 20., H, 17:30):
Walaa,
> while Option 2 is described in your
summary as "giving engines
*/flexibility/* to determine freshness
recursively beyond a source MV", that
*isn’t achievable* under the MV
evaluation model itself.
Because each MV treats upstream MVs as
physical tables, recursion stops at the
first materialized boundary; *deeper
staleness cannot be discovered without
switching to a logical-view evaluation
model, i.e., stepping outside the MV
model altogether /(note that in Option 3
we can determine recursive staleness
while still inside the MV model)./*
In option 2, when determining the
freshness of mv_3, engines can choose to
recursively evaluate the freshness of
mv_1 and mv_2 since they are also MVs.
But engines can also choose not to.
> This means that there seems to be an
implicit “Option 3”. This option treats
MVs as logical views, i.e., storing only
view versions + base table snapshot IDs
(no MV storage snapshot IDs, no per-path
lineage).
In the new option 3 you described, how
could the engine update mv3's refresh
state for base table_a and table_b?
unless all connected MVs are refreshed
and committed in one single transaction,
one entry per base table doesn't seem
feasible. That's the main reason for
option 1 to require the lineage path
information in refresh state for base tables.
It also seems that option 3 can only
interpret freshness recursively, while
today there are engines that support MVs
without recursively evaluating source MVs.
Thanks,
Steven
On Mon, Oct 20, 2025 at 1:44 AM Walaa
Eldin Moustafa <[email protected]> wrote:
Hi Steven,
Thanks for organizing the series and
summarizing the outcome.
After re-reading the Option 1/2
proposal, initially I interpreted
Option 1 as simply expanding MVs like
regular logical views. On closer
look, it is actually more complex. It
also preserves per-path lineage state
(e.g., multiple entries for the same
base table via different parents),
which increases expressiveness but
significantly increases metadata
complexity. So I agree it is not a
practical option.
This means that there seems to be an
implicit “Option 3”. This option
treats MVs as logical views, i.e.,
storing only view versions + base
table snapshot IDs (no MV storage
snapshot IDs, no per-path lineage).
Under this model, mv_3’s metadata
might look like:
Type Name Tracked State
----- ------- -----------------------
view mv_1 view_version_id
view mv_2 view_version_id
table table_a table_snapshot_id
table table_b table_snapshot_id
This preserves logical semantics and
aligns MV behavior with pure views.
*If we choose Option 2 (treat source
MV as a materialized table), we may
have to be consider those constraints:*
* Staleness only degrades up the
chain. mv_1 and mv_2 may already be
stale relative to the base tables,
but if mv_3 is refreshed using their
storage snapshots, then mv_3 will be
marked as fresh under Option 2, even
though all three MVs are stale
relative to the base tables.
* Engines can no longer discover
staleness beyond mv_1. Once mv_3 sees
mv_1 (or mv_2) as fresh based only on
their storage snapshots, it will not
expand into mv_1 or mv_2 to check
whether they are stale relative to
the base tables.
* If mv_2 and mv_3 were purely
logical views instead of MVs, they
would evaluate directly against base
tables and return newer data. Under
Option 2, the same definitions but
materialized upstream produce
different data, not just different
metadata.
Therefore, while Option 2 is
described in your summary as "giving
engines */flexibility/* to determine
freshness recursively beyond a source
MV", that *isn’t achievable* under
the MV evaluation model itself.
Because each MV treats upstream MVs
as physical tables, recursion stops
at the first materialized boundary;
*deeper staleness cannot be
discovered without switching to a
logical-view evaluation model, i.e.,
stepping outside the MV model
altogether /(note that in Option 3 we
can determine recursive staleness
while still inside the MV model)./*
Let me know your thoughts. I slightly
prefer Option 3. I’m also fine with
Option 2, but I don’t think the
flexibility to recursively determine
freshness actually exists under its
evaluation model. Not sure if this
changes anyone’s view, but I wanted
to clarify how I’m reading it.
Thanks,
Walaa.
On Wed, Oct 8, 2025 at 11:11 PM Benny
Chow <[email protected]> wrote:
I just listened to the
recording. I'm the tech lead for
MVs at Dremio and responsible for
both refresh management and query
rewrites with MVs.
It's great that we seem to agree
that Iceberg MV spec won't
require that MVs always be up to
date in order to be usable for
query rewrites. There can be
many data consistency issues (as
Dan pointed out) but that is the
state of affairs today.
It sounds like we are converging
on the following scenarios for an
engine to validate the MV freshness:
1. Use storage table without any
validation. This might be the
extreme "async MV" example.
2. Ignore storage table even if
one exists because SQL command or
use case requires that.
3. Use storage table only if
data is not more than x hours
old. This can be achieved with
the
proposedrefresh-start-timestamp-ms
which is currently in the
proposed spec. For this to work
with MVs built on MVs, we should
probably state in the spec that
if a MV is built on another MV,
then it needs to inherit the
refresh-start-timestamp-ms of the
child MV. In Steven's example,
when building mv3,
refresh-start-timestamp-ms needs
to be set to the minimum of mv1
or mv2's
refresh-start-timestamp-ms. If
this property name is confusing,
we can rename it to
"refresh-earliest-table-timestamp-ms".
I originally proposed this
property and also listed out
other benefits here:
https://github.com/apache/iceberg/pull/11041#discussion_r1779797796
Also, at the time, MVs built on
MVs weren't being considered.
Now that it is, I would recommend
we have both
"refresh-start-timestamp-ms"
(when the refresh was started on
the storage table) and
"refresh-earliest-table-timestamp-ms"
(used for freshness validation).
4. Don't use the storage table if
it is older than X hours. This
is what I had originally proposed
for the
*materialization.max-stalessness-ms*
view property here:
https://github.com/apache/iceberg/pull/11041#discussion_r1744837644
It wasn't meant to validate the
freshness but more to prevent use
of a materialization after some
criteria.
5. Use storage table if recursive
validation passes... i.e.
refresh-state matches the current
expanded query tree state. This
is what I think Steven is calling
the "synchronous MV".
For scenario 1-4, it would
support the nice use case of an
Iceberg client using a view's
data through the storage table
without needing to know how to
parse/validate/expand any view SQLs.
In Dremio's planner, we primarily
use scenario 1 and 4 together to
determine MV validity for query
rewrite. Scenario 2 and 5 also
apply in certain situations. For
scenario 3, Dremio only exposes
the
"refresh-earliest-table-timestamp-ms"
as an fyi to the user but it
would be interesting to allow the
user to set this time so that
they could run queries and be
100% certain that they were not
seeing data older than x hours.
Thanks
Benny
On Wed, Oct 8, 2025 at 3:37 PM
Steven Wu <[email protected]>
wrote:
correction for a typo.
Prashanth brought up another
scenario of
compaction/rewrite where a
new snapshot was added *with*
actual data change
-->
Prashanth brought up another
scenario of
compaction/rewrite where a
new snapshot was added
*without* actual data change
On Wed, Oct 8, 2025 at
2:12 PM Steven Wu
<[email protected]> wrote:
Hi,
Thanks everyone for
joining the MV discussion
meeting. We will continue
to have the recurring
sync meeting on Wednesday
9 am (Pacific) every 3
weeks until we get to the
finish line where Jan's
MV spec PR [1] is merged.
I have scheduled our next
meeting on Oct 29 in the
Iceberg dev events calendar.
Here is the video
recording for today's
meeting.
https://drive.google.com/file/d/1-nfhBPDWLoAFDu5cKP0rwLd_30HB6byR/view?usp=sharing
We mostly discussed
freshness evaluation.
Here is the meeting summary.
1. For tracking the
refresh state for the
source MV [2], the
consensus is option 2
(treating source MV
as a materialized
table) which would
give engines the
flexibility on
freshness
determination
(recursive beyond
source MV or not).
2. Earlier design doc
[3] discussed max
staleness config. But
it wasn't reflected
in the spec PR. The
general opinion is to
add the config to the
spec PR. The open
question is whether
the
`materialization.max-staleness-ms`
config should be
added to the view
metadata or the
storage table
metadata. Either can
work. We just need to
decide which makes a
little better fit.
3. Prashanth brought up
schema change with
default value and how
it may affect the MV
refresh state (for
SQL representation
with select *). Jan
mentioned that
snapshot contains
schema id when the
snapshot was created.
Engine can compare
the snapshot schema
id to the source
table schema id
during freshness
evaluation. There is
no need for
additional schema
info in refresh-state
tracking in the
storage table.
4. Prashanth brought up
another scenario of
compaction/rewrite
where a new snapshot
was added with actual
data change. The
general take is that
the engine can
optimize and decide
that MV is fresh as
the new snapshot
doesn't have any data
change.
We can add some
clarifications in the
spec PR for freshness
evaluation based on the
above discussions.
[1]
https://github.com/apache/iceberg/pull/11041
[2]
https://docs.google.com/document/d/1_StBW5hCQhumhIvgbdsHjyW0ED3dWMkjtNzyPp9Sfr8/edit?tab=t.0
[3]
https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?tab=t.0#heading=h.3wigecex0zls
On Thu, Sep 25, 2025 at
9:27 AM Steven Wu
<[email protected]> wrote:
Hi all,
Iceberg materialized
view has been
discussed in the
community for a long
time. Thanks Jan Kaul
for driving the
discussion and the
spec PR. It has been
stalled for a long
time due to lack of
consensus on 1 or 2
topics. In Wed's
Iceberg community
sync meeting, Talat
brought up the
question on how to
move forward and if
we can have a
dedicated meeting for MV.
I have set up a
meeting on *Oct 8
(9-10 am Pacific)*.
If you subscribe to
the "Iceberg Dev
Events" calendar, you
should be able to see
it. If not, here is
the link:
https://meet.google.com/nfe-guyq-pqf
We are going to discuss
* remaining open
questions
* unresolved concerns
* the next step and
hopefully some
consensus on moving
forward
MV spec PR is up to
date. Jan has
incorporated recent
feedback. This should
be the base of the
discussion.
https://github.com/apache/iceberg/pull/11041
<https://www.google.com/url?q=https://github.com/apache/iceberg/pull/11041&sa=D&source=calendar&usd=2&usg=AOvVaw3w0TjRpwbC17AGzmxZmElM>
Dev discussion thread
(a long-running
thread started by Jan).
https://lists.apache.org/thread/y1vlpzbn2x7xookjkffcl08zzyofk5hf
<https://www.google.com/url?q=https://lists.apache.org/thread/y1vlpzbn2x7xookjkffcl08zzyofk5hf&sa=D&source=calendar&usd=2&usg=AOvVaw0fotlsrnRBOb820mA5JRyB>
The mail archive has
broken lineage and
doesn't show all
replies. Email
subject is
"*[DISCUSS] Iceberg
Materialzied Views*".
Thanks,
Steven