Hi Everyone,

Just a follow up on this thread. Thanks Benny and Micah for the discussion
on the doc [1]. We have been converging more on using UUIDs from the
discussion. The only open question was related to UUIDs (of underlying
views/tables) being stale upon a REPLACE (or DROP and CREATE) operation on
those tables/views. The response to that question is:

(1) While stale, those UUIDs are still good for freshness calculations.
(2) If we really want to refresh them, that might be in scope of a REFRESH
VIEW operation (not to be confused with REFRESH MV).

(2) seems appropriate for a separate discussion since REFRESH VIEW
operation behavior can be concerned with other things as well like
refreshing the schema.

Detailed discussion on this question can be found in this thread in the doc
[2]. If people are okay with the stance above, we can move forward towards
a vote on using UUIDs. I will leave a couple of days for the discussion to
wrap up in this thread/the doc.

[1]
https://docs.google.com/document/d/1-OaPqm8ahVT3_OCbVdAPQ_wZ8I3ToeqU3RLUjcyKQM0/edit
[2]
https://docs.google.com/document/d/1-OaPqm8ahVT3_OCbVdAPQ_wZ8I3ToeqU3RLUjcyKQM0/edit?disco=AAABTXtgBFw

Thanks,
Walaa.


On Thu, Aug 8, 2024 at 4:43 PM Walaa Eldin Moustafa <wa.moust...@gmail.com>
wrote:

> Thanks Benny! We discussed this option during the meeting but we did not
> prefer it because we did not want to leak the SQL identifiers to the
> storage table since SQL identifiers are view concepts and fit better with
> the view.
>
> Thanks,
> Walaa.
>
> On Thu, Aug 8, 2024 at 4:12 PM Benny Chow <btc...@gmail.com> wrote:
>
>> Maybe a third option is to decouple the view lineage and materialization
>> state.
>>
>> The view lineage can just list out the SQL identifiers+ref... we can
>> still decide whether this is just direct children or fully expanded.
>> The materialization state doesn't have to depend on the view lineage
>> (through either sequence ID or UUIDs).  It's a direct mapping of SQL
>> identifiers (possibly with refs) to their corresponding snapshot id or view
>> versions.
>>
>> The materialization state was our original design until Dan brought up
>> the idea to also capture the view lineage.  So, what I am suggesting is
>> that we do both but not couple them together through any sort of sequence
>> ID or UUID.
>>
>> Thanks
>> Benny
>>
>> On Thu, Aug 8, 2024 at 2:04 PM Walaa Eldin Moustafa <
>> wa.moust...@gmail.com> wrote:
>>
>>> Hi Everyone,
>>>
>>> In the last community sync on Materialized Views [1], we agreed to split
>>> the information that is used to determine the materialized view staleness
>>> to two parts: Lineage Information and State Information. We have made a lot
>>> of progress on representing both but one issue remains open:
>>>
>>> Both lineage information and state information express facts about the
>>> child objects of the MV, and to correlate those facts we need some sort of
>>> an ID for those objects.
>>> The open question is whether the ID should be of type sequence ID or
>>> UUID (i.e., the Iceberg UUID of the object).
>>>
>>> I have described the issue in more detail in this doc [2]. I have also
>>> added some thoughts on the advantages of using UUIDs in the same doc.
>>> Further, there is a discussion thread on that topic in the MV spec doc [3].
>>>
>>> Please feel free to comment on the docs or share your thoughts in the
>>> thread. After the discussion we can move forward to a vote.
>>>
>>> Thanks,
>>> Walaa.
>>>
>>> [1] https://lists.apache.org/thread/bhmxo1w1bdp1p2hh842kpm2gy1g5rscp
>>> [2]
>>> https://docs.google.com/document/d/1-OaPqm8ahVT3_OCbVdAPQ_wZ8I3ToeqU3RLUjcyKQM0/edit
>>> [3]
>>> https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?disco=AAABPUgvras
>>>
>>

Reply via email to