Thanks Kevin! Also, since it's been a while since the last sync and there was a free slot on the calendar this Friday, I set up another one. To summarize where we ended up last time, there was general agreement on having a combined data file/DV entry + leveraging the efficient column updates mechanism for DV-churning workloads.
For this discussion, I was thinking we focus on: 1. Picking back up on the file status/change detection debate where we left off at the end of the last discussion. 2. Upgrade path and allowed file content types—specifically how we distinguish older manifests from V4 (some options include a new type definition, format inference, projections). 3. Partition tuple modeling. In the past, we discussed modeling the tuple as stats on expressions (with some nuances around identity-partitioned string/binary truncation). With the expression extensions <https://docs.google.com/document/d/1VthBz0S2I39TeQM8oiF9_gSPQu_gHAjWXvdFpv0QqDk/edit?tab=t.0> and unified table ID space <https://docs.google.com/document/d/1oV8dapKVzB4pZy5pKHUCj5j9i2_1p37BJSeT7hyKPpg/edit?tab=t.0> proposals out for a while, it’d be good to see if there are any remaining questions on this topic, since from offline conversations it sounds like there may be interest in this. Feel free to add on to this if there are other topics you want to discuss! In the interim, I'll make sure the doc is up to date. Thanks, Amogh Jahagirdar On Mon, Mar 30, 2026 at 10:55 AM Kevin Liu <[email protected]> wrote: > Done https://youtu.be/IVPHvZcJ07Q > > Amogh, I also added your Gmail as an owner for the YouTube channel. > > On Mon, Mar 30, 2026 at 8:32 AM Steven Wu <[email protected]> wrote: > >> Amogh, can you upload the video to the YouTube channel? >> https://www.youtube.com/playlist?list=PLkifVhhWtccxt1TE7w_HbNGhY5gpDTaX7 >> >> On Mon, Mar 30, 2026 at 8:28 AM Amogh Jahagirdar <[email protected]> >> wrote: >> >>> Hey, a few folks reached out indicating that I didn't properly share the >>> last v4 metadata tree meeting recording. So sorry about that! Here's >>> the link >>> <https://drive.google.com/file/d/1LhDL0Iy8YR4RN_W3D8APOUtkSBYk61fD/view?usp=drive_link> >>> , >>> do let me know if there are still issues.
>>> >>> On Tue, Mar 3, 2026 at 9:17 AM Steven Wu <[email protected]> wrote: >>> >>>> My takeaway from the conversation is also that we don't need row-level >>>> column updates. Manifest DV can be used for row-level updates instead. >>>> Basically, a file (manifest or data) can be updated via (1) delete vector + >>>> updated rows in a new file (2) column file overlay. Depending on the >>>> percentage of modified rows, engines can choose which way to go. >>>> >>>> On Tue, Mar 3, 2026 at 6:24 AM Gábor Kaszab <[email protected]> >>>> wrote: >>>> >>>>> Thanks for the summary, Micah! I tried to watch the recording linked >>>>> to the calendar event, but apparently I don't have permission to do so. >>>>> Not >>>>> sure about others. >>>>> >>>>> So if I'm not mistaken, one way to reduce the write cost of an UPDATE >>>>> for colocated DVs is to use the column updates. As I see it, there was some >>>>> agreement that row-level partial column updates aren't desired, and we aim >>>>> for at least file-level column updates. This is very useful information >>>>> for >>>>> the other conversation >>>>> <https://lists.apache.org/thread/w90rqyhmh6pb0yxp0bqzgzk1y1rotyny> >>>>> going on for the column update proposal. We can bring this up on the >>>>> column >>>>> update sync tomorrow, but I'm wondering if the consensus on avoiding >>>>> row-level column updates is something we can incorporate into the column >>>>> update proposal too or if it's still up for debate. >>>>> >>>>> Best Regards, >>>>> Gabor >>>>> >>>>> Micah Kornfield <[email protected]> wrote on >>>>> Wed, Feb 25, 2026 at 22:30: >>>>> >>>>>> Just wanted to summarize my main takeaways from Monday's sync. >>>>>> >>>>>> The approach will always colocate DVs with the data files (i.e. >>>>>> every data file row in a manifest has an optional DV reference). This >>>>>> implies that there is not a separate "Deletion manifest".
Rather, in V4 >>>>>> all >>>>>> manifests are "combined" where data files and DVs are colocated. >>>>>> >>>>>> Write amplification is avoided in two ways: >>>>>> 1. For small updates we will need to carry through metadata >>>>>> statistics (and other relevant data file fields) in memory (rescanning >>>>>> these is likely too expensive). Once updates are available they will >>>>>> be >>>>>> written out as a new manifest (either root or leaf) and use metadata DVs to >>>>>> remove the old rows. >>>>>> 2. For larger updates we will only carry through the DV update parts >>>>>> in memory and use column-level updates to replace existing DVs (this >>>>>> would >>>>>> require rescanning the DV columns for any updated manifest to merge with >>>>>> the updated DVs in memory, and then writing out the column update). The >>>>>> consensus on the call was that we didn't want to support partial column >>>>>> updates (a.k.a. merge-on-read column updates). >>>>>> >>>>>> The idea is that engines would decide which path to follow based on >>>>>> the number of affected files. >>>>>> >>>>>> To help understand the implications of the new proposal, I put >>>>>> together a quick spreadsheet [1] to analyze trade-offs between separate >>>>>> deletion manifests and the new approach under scenarios 1 and 2. This >>>>>> represents the worst-case scenario where file updates are uniformly >>>>>> distributed across a single update operation. It does not account for >>>>>> repeated writes (e.g. ongoing compaction). My main takeaway is that >>>>>> keeping at most 1 affiliated DV separate might still help (akin to a >>>>>> merge-on-read column update), but maybe not enough relative to other parts of >>>>>> the >>>>>> system (e.g. the churn on data files) to justify the complexity. >>>>>> >>>>>> Hope this is helpful.
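[Editor's illustration] To make the second write path concrete, here is a minimal Python sketch of merging in-memory DV updates into a rescanned DV column, producing a full replacement column (consistent with the consensus against partial/merge-on-read column updates). The set-based DV representation and all names are illustrative assumptions, not the actual Iceberg format.

```python
def merge_dv_column(existing_dvs, updated_dvs):
    """Merge in-memory DV updates into the rescanned DV column.

    existing_dvs: list where index = manifest row position, value = set of
                  deleted data-file row positions (or None if no DV).
    updated_dvs:  dict mapping manifest row position -> newly deleted positions.

    Because partial (merge-on-read) column updates were ruled out, the
    output is a complete replacement column, not a sparse overlay.
    """
    merged = []
    for pos, dv in enumerate(existing_dvs):
        if pos in updated_dvs:
            base = dv or set()
            merged.append(base | updated_dvs[pos])  # union old and new deletes
        else:
            merged.append(dv)  # unchanged entries are carried through as-is
    return merged
```

The engine would choose this path over rewriting whole manifests when the number of affected files is large but the per-file change is small.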
>>>>>> >>>>>> Micah >>>>>> >>>>>> [1] >>>>>> https://docs.google.com/spreadsheets/d/1klZQxV7ST2C-p9LTMmai_5rtFiyupj6jSLRPRkdI-u8/edit?gid=0#gid=0 >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Feb 19, 2026 at 3:52 PM Amogh Jahagirdar <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> Hey folks, I've set up an additional initial discussion on DVs for >>>>>>> Monday. This topic is fairly complex and there is also now a free >>>>>>> calendar >>>>>>> slot. I think it'd be helpful for us to first make sure we're all on the >>>>>>> same page in terms of what the approach proposed by Anton earlier in the >>>>>>> thread means and the high-level mechanics. I should also have more to >>>>>>> share >>>>>>> on the doc about how the entry structure and change detection could look >>>>>>> in this approach. Then on Thursday we can get into more details and >>>>>>> targeted points of discussion on this topic. >>>>>>> >>>>>>> Thanks, >>>>>>> Amogh Jahagirdar >>>>>>> >>>>>>> On Tue, Feb 17, 2026 at 9:27 PM Amogh Jahagirdar <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Thanks Steven! I've set up some time next Thursday for the >>>>>>>> community to discuss this. We're also looking at how the content entry >>>>>>>> would look in a combined DV with potential column updates for DV >>>>>>>> changes, and how change detection could look in this approach. I >>>>>>>> should have more to share on this by the time of the community >>>>>>>> discussion >>>>>>>> next week. >>>>>>>> We should also consider potential root churn and memory consumption >>>>>>>> stemming from expected root entry inflation due to a combined data >>>>>>>> file + >>>>>>>> DV entry with possible column updates for certain DV workloads; though >>>>>>>> at >>>>>>>> least for memory consumption of stats being held after planning, that >>>>>>>> arguably is an implementation problem for certain integrations.
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Amogh Jahagirdar >>>>>>>> >>>>>>>> On Fri, Feb 13, 2026 at 10:58 AM Steven Wu <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> I wrote up some analysis with back-of-the-envelope calculations >>>>>>>>> about the column update approach for DV colocation. It mainly >>>>>>>>> concerns the >>>>>>>>> 2nd use case: deleting a large number of rows from a small number of >>>>>>>>> files. >>>>>>>>> >>>>>>>>> >>>>>>>>> https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.gvdulzy486n7 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Feb 4, 2026 at 1:02 AM Péter Váry < >>>>>>>>> [email protected]> wrote: >>>>>>>>> >>>>>>>>>> I fully agree with Anton and Steven that we need benchmarks >>>>>>>>>> before choosing any direction. >>>>>>>>>> >>>>>>>>>> I ran some preliminary column‑stitching benchmarks last summer: >>>>>>>>>> >>>>>>>>>> - Results are available in the doc: >>>>>>>>>> >>>>>>>>>> https://docs.google.com/document/d/1OHuZ6RyzZvCOQ6UQoV84GzwVp3UPiu_cfXClsOi03ww >>>>>>>>>> - Code is here: https://github.com/apache/iceberg/pull/13306 >>>>>>>>>> >>>>>>>>>> I’ve summarized the most relevant results at the end of this >>>>>>>>>> email. They show roughly a 10% slowdown on the read path with column >>>>>>>>>> stitching in similar scenarios when using local SSDs. I expect that >>>>>>>>>> in real >>>>>>>>>> deployments the metadata read cost will mostly be driven by blob I/O >>>>>>>>>> (assuming no caching). If blob access becomes the dominant factor in >>>>>>>>>> read >>>>>>>>>> latency, multithreaded fetching should be able to absorb the overhead >>>>>>>>>> introduced by column stitching, resulting in latency similar to the >>>>>>>>>> single‑file layout (unless IO is already the bottleneck) >>>>>>>>>> >>>>>>>>>> We should definitely rerun the benchmarks once we have a clearer >>>>>>>>>> understanding of the intended usage patterns. 
>>>>>>>>>> Thanks, >>>>>>>>>> Peter >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The relevant(ish) results are for 100 columns, with 2 families >>>>>>>>>> with 50-50 columns and local read: >>>>>>>>>> >>>>>>>>>> The base is: >>>>>>>>>> MultiThreadedParquetBenchmark.read 100 0 >>>>>>>>>> false ss 20 3.739 ± 0.096 s/op >>>>>>>>>> >>>>>>>>>> The read for single-threaded: >>>>>>>>>> MultiThreadedParquetBenchmark.read 100 2 >>>>>>>>>> false ss 20 4.036 ± 0.082 s/op >>>>>>>>>> >>>>>>>>>> The read for multi-threaded: >>>>>>>>>> MultiThreadedParquetBenchmark.read 100 2 >>>>>>>>>> true ss 20 4.063 ± 0.080 s/op >>>>>>>>>> >>>>>>>>>> Steven Wu <[email protected]> wrote on Tue, Feb >>>>>>>>>> 3, 2026 at 23:27: >>>>>>>>>>> >>>>>>>>>>> I agree with Anton in this >>>>>>>>>>> <https://docs.google.com/document/d/1jZy4g6UDi3hdblpkSzDnqgzgATFKFoMaHmt4nNH8M7o/edit?disco=AAAByzDx21w> >>>>>>>>>>> comment thread that we probably need to run benchmarks for a few >>>>>>>>>>> common >>>>>>>>>>> scenarios to guide this decision. We need to write down detailed >>>>>>>>>>> plans for >>>>>>>>>>> those scenarios and what we are measuring. Also ideally, we want to >>>>>>>>>>> measure >>>>>>>>>>> using the V4 metadata structure (like Parquet manifest file, column >>>>>>>>>>> stats >>>>>>>>>>> structs, adaptive tree). There are PoC PRs available for column >>>>>>>>>>> stats, >>>>>>>>>>> Parquet manifest, and root manifest. It would probably be tricky to >>>>>>>>>>> piece >>>>>>>>>>> them together to run the benchmark considering the PoC status. We >>>>>>>>>>> also need >>>>>>>>>>> the column stitching capability on the read path to test the column >>>>>>>>>>> file >>>>>>>>>>> approach. >>>>>>>>>>> >>>>>>>>>>> On Tue, Feb 3, 2026 at 1:53 PM Anoop Johnson <[email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> I'm in favor of co-located DV metadata with column file >>>>>>>>>>>> override and not doing affiliated/unaffiliated delete manifests.
>>>>>>>>>>>> This is >>>>>>>>>>>> conceptually similar to strictly affiliated delete manifests with >>>>>>>>>>>> positional joins, and will halve the number of I/Os when there is >>>>>>>>>>>> no DV >>>>>>>>>>>> column override. It is simpler to implement >>>>>>>>>>>> and will speed up reads. >>>>>>>>>>>> >>>>>>>>>>>> Unaffiliated DV manifests are flexible for writers. They reduce >>>>>>>>>>>> the chance of physical conflicts when there are concurrent >>>>>>>>>>>> large/random >>>>>>>>>>>> deletes that change DVs on different files in the same manifest. >>>>>>>>>>>> But the >>>>>>>>>>>> flexibility comes at a read-time cost. If the number of >>>>>>>>>>>> unaffiliated DVs >>>>>>>>>>>> exceeds a threshold, it could cause driver OOMs or require >>>>>>>>>>>> distributed join >>>>>>>>>>>> to pair up DVs with data files. With colocated metadata, manifest >>>>>>>>>>>> DVs can >>>>>>>>>>>> reduce the chance of conflicts up to a certain write size. >>>>>>>>>>>> >>>>>>>>>>>> I assume we will still support unaffiliated manifests for >>>>>>>>>>>> equality deletes, but perhaps we can restrict it to just equality >>>>>>>>>>>> deletes. >>>>>>>>>>>> >>>>>>>>>>>> -Anoop >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Feb 2, 2026 at 4:27 PM Anton Okolnychyi < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> I added the approach with column files to the doc. >>>>>>>>>>>>> >>>>>>>>>>>>> To sum up, separate data and delete manifests with affinity >>>>>>>>>>>>> would perform somewhat on par with co-located DV metadata (a.k.a. >>>>>>>>>>>>> direct >>>>>>>>>>>>> assignment) if we add support for column files when we need to >>>>>>>>>>>>> replace most >>>>>>>>>>>>> or all DVs (use case 1). That said, the support for direct >>>>>>>>>>>>> assignment with >>>>>>>>>>>>> in-line metadata DVs can help us avoid unaffiliated delete >>>>>>>>>>>>> manifests when >>>>>>>>>>>>> we need to replace a few DVs (use case 2). 
>>>>>>>>>>>>> >>>>>>>>>>>>> So the key question is whether we want to allow >>>>>>>>>>>>> unaffiliated delete manifests with DVs... If we don't, then we >>>>>>>>>>>>> would likely >>>>>>>>>>>>> want to have co-located DV metadata and must support efficient >>>>>>>>>>>>> column >>>>>>>>>>>>> updates so as not to regress compared to V2 and V3 for large MERGE jobs >>>>>>>>>>>>> that >>>>>>>>>>>>> modify a small set of records for most files. >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Feb 2, 2026 at 13:20, Anton Okolnychyi < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Anoop, correct, if we keep data and delete manifests >>>>>>>>>>>>>> separate, there is a better way to combine the entries and we >>>>>>>>>>>>>> should NOT >>>>>>>>>>>>>> rely on the referenced data file path. Reconciling by implicit >>>>>>>>>>>>>> position >>>>>>>>>>>>>> will reduce the size of the DV entry (no need to store the >>>>>>>>>>>>>> referenced data >>>>>>>>>>>>>> file path) and will improve the planning performance (no >>>>>>>>>>>>>> equals/hashCode on >>>>>>>>>>>>>> the path). >>>>>>>>>>>>>> >>>>>>>>>>>>>> Steven, I agree. Most notes in the doc pre-date discussions >>>>>>>>>>>>>> we had on column updates. You are right, given that we are >>>>>>>>>>>>>> gravitating >>>>>>>>>>>>>> towards a native way to handle column updates, it seems logical >>>>>>>>>>>>>> to use the >>>>>>>>>>>>>> same approach for replacing DVs, since they’re essentially >>>>>>>>>>>>>> column updates. >>>>>>>>>>>>>> Let me add one more approach to the doc based on what Anurag and >>>>>>>>>>>>>> Peter have >>>>>>>>>>>>>> so far. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sun, Feb 1, 2026 at 20:59, Steven Wu <[email protected]> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Anton, thanks for raising this. I agree this deserves >>>>>>>>>>>>>>> another look.
I added a comment in your doc that we can >>>>>>>>>>>>>>> potentially apply >>>>>>>>>>>>>>> the column update proposal for data file update to the manifest >>>>>>>>>>>>>>> file >>>>>>>>>>>>>>> updates as well, to colocate the data DV and data manifest >>>>>>>>>>>>>>> files. Data DVs >>>>>>>>>>>>>>> can be a separate column in the data manifest file and updated >>>>>>>>>>>>>>> separately >>>>>>>>>>>>>>> in a column file. This is the same as the coalesced positional >>>>>>>>>>>>>>> join that >>>>>>>>>>>>>>> Anoop mentioned. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Sun, Feb 1, 2026 at 4:14 PM Anoop Johnson < >>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thank you for raising this, Anton. I had a similar >>>>>>>>>>>>>>>> observation while prototyping >>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/pull/14533> the >>>>>>>>>>>>>>>> adaptive metadata tree. The overhead of doing a path-based >>>>>>>>>>>>>>>> hash join of a >>>>>>>>>>>>>>>> data manifest with the affiliated delete manifest is high: my >>>>>>>>>>>>>>>> estimate was >>>>>>>>>>>>>>>> that the join adds about 5-10% overhead. The hash table >>>>>>>>>>>>>>>> build/probe alone >>>>>>>>>>>>>>>> takes about 5 ms for manifests with 25K entries. There are >>>>>>>>>>>>>>>> engines that can >>>>>>>>>>>>>>>> do vectorized hash joins that can lower this, but the overhead >>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>> complexity of a SIMD-friendly hash join is non-trivial. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> An alternative to relying on the external file feature in >>>>>>>>>>>>>>>> Parquet, is to make affiliated manifests order-preserving: ie >>>>>>>>>>>>>>>> DVs in an >>>>>>>>>>>>>>>> affiliated delete manifest must appear in the same position as >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> corresponding data file in the data manifest the delete >>>>>>>>>>>>>>>> manifest is >>>>>>>>>>>>>>>> affiliated to. If a data file does not have a DV, the DV >>>>>>>>>>>>>>>> manifest must >>>>>>>>>>>>>>>> store a NULL. 
This would allow us to do positional joins, >>>>>>>>>>>>>>>> which are much >>>>>>>>>>>>>>>> faster. If we wanted, we could even have multiple affiliated >>>>>>>>>>>>>>>> DV manifests >>>>>>>>>>>>>>>> for a data manifest and the reader would do a COALESCED >>>>>>>>>>>>>>>> positional join >>>>>>>>>>>>>>>> (i.e. pick the first non-null value as the DV). It puts the >>>>>>>>>>>>>>>> sorting >>>>>>>>>>>>>>>> responsibility on the writers, but it might be a reasonable >>>>>>>>>>>>>>>> tradeoff. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Also, the options don't necessarily have to be mutually >>>>>>>>>>>>>>>> exclusive. We could still allow affiliated DVs to be "folded" >>>>>>>>>>>>>>>> into the data >>>>>>>>>>>>>>>> manifest (e.g. by background optimization jobs or the writer >>>>>>>>>>>>>>>> itself). That >>>>>>>>>>>>>>>> might be the optimal choice for read-heavy tables because it >>>>>>>>>>>>>>>> will halve the >>>>>>>>>>>>>>>> number of I/Os readers have to make. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>> Anoop >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jan 30, 2026 at 6:03 PM Anton Okolnychyi < >>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I had a chance to catch up on some of the V4 discussions. >>>>>>>>>>>>>>>>> Given that we are getting rid of the manifest list and >>>>>>>>>>>>>>>>> switching to >>>>>>>>>>>>>>>>> Parquet, I wanted to re-evaluate the possibility of direct DV >>>>>>>>>>>>>>>>> assignment >>>>>>>>>>>>>>>>> that we discarded in V3 to avoid regressions. I have put >>>>>>>>>>>>>>>>> together my >>>>>>>>>>>>>>>>> thoughts in a doc [1]. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> TL;DR: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - I think the current V4 proposal that keeps data and >>>>>>>>>>>>>>>>> delete manifests separate but introduces affinity is a solid >>>>>>>>>>>>>>>>> choice for >>>>>>>>>>>>>>>>> cases when we need to replace DVs in many / most files.
I >>>>>>>>>>>>>>>>> outlined an >>>>>>>>>>>>>>>>> approach with column-split Parquet files but it doesn't >>>>>>>>>>>>>>>>> improve the >>>>>>>>>>>>>>>>> performance and takes a dependency on a portion of the Parquet >>>>>>>>>>>>>>>>> spec that is >>>>>>>>>>>>>>>>> not really implemented. >>>>>>>>>>>>>>>>> - Pushing unaffiliated DVs directly into the root to >>>>>>>>>>>>>>>>> replace a small set of DVs is going to be fast on write but >>>>>>>>>>>>>>>>> does require >>>>>>>>>>>>>>>>> resolving where those DVs apply at read time. Using inline >>>>>>>>>>>>>>>>> metadata DVs >>>>>>>>>>>>>>>>> with column-split Parquet files is a little more promising in >>>>>>>>>>>>>>>>> this case as >>>>>>>>>>>>>>>>> it allows us to avoid unaffiliated DVs. That said, it again >>>>>>>>>>>>>>>>> relies on >>>>>>>>>>>>>>>>> something Parquet doesn't implement right now, requires >>>>>>>>>>>>>>>>> changing >>>>>>>>>>>>>>>>> maintenance operations, and yields minimal benefits. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> All in all, the V4 proposal seems like a strict >>>>>>>>>>>>>>>>> improvement over V3, but I insist that we reconsider the usage of >>>>>>>>>>>>>>>>> the referenced >>>>>>>>>>>>>>>>> data file path when resolving DVs to data files. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] - >>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1jZy4g6UDi3hdblpkSzDnqgzgATFKFoMaHmt4nNH8M7o >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Anton >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Sat, Nov 22, 2025 at 13:37, Amogh Jahagirdar < >>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hey all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Here is the meeting recording >>>>>>>>>>>>>>>>>> <https://drive.google.com/file/d/1lG9sM-JTwqcIgk7JsAryXXCc1vMnstJs/view?usp=sharing> >>>>>>>>>>>>>>>>>> and generated meeting summary >>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1e50p8TXL2e3CnUwKMOvm8F4s2PeVMiKWHPxhxOW1fIM/edit?usp=sharing>. >>>>>>>>>>>>>>>>>> Thanks all for attending yesterday!
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Thu, Nov 20, 2025 at 8:49 AM Amogh Jahagirdar < >>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hey folks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I was out for some time, but set up a sync for tomorrow >>>>>>>>>>>>>>>>>>> at 9am PST. For this discussion, I do think it would be >>>>>>>>>>>>>>>>>>> great to focus on >>>>>>>>>>>>>>>>>>> the manifest DV representation, factoring in analyses on >>>>>>>>>>>>>>>>>>> bitmap >>>>>>>>>>>>>>>>>>> representation storage footprints, and the entry structure >>>>>>>>>>>>>>>>>>> considering how >>>>>>>>>>>>>>>>>>> we want to approach change detection. If there are other >>>>>>>>>>>>>>>>>>> topics that people >>>>>>>>>>>>>>>>>>> want to highlight, please do bring those up as well! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I also recognize that this is a bit short term >>>>>>>>>>>>>>>>>>> scheduling, so please do reach out to me if this time is >>>>>>>>>>>>>>>>>>> difficult to work >>>>>>>>>>>>>>>>>>> with; next week is the Thanksgiving holidays here, and >>>>>>>>>>>>>>>>>>> since people would >>>>>>>>>>>>>>>>>>> be travelling/out I figured I'd try to schedule before then. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> Amogh Jahagirdar >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Oct 17, 2025 at 9:03 AM Amogh Jahagirdar < >>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hey folks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Sorry for the delay, here's the recording link >>>>>>>>>>>>>>>>>>>> <https://drive.google.com/file/d/1YOmPROXjAKYAWAcYxqAFHdADbqELVVf2/view> >>>>>>>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>>>>>> last week's discussion. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Amogh Jahagirdar >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, Oct 10, 2025 at 9:44 AM Péter Váry < >>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Same here. >>>>>>>>>>>>>>>>>>>>> Please record if you can. >>>>>>>>>>>>>>>>>>>>> Thanks, Peter >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Fri, Oct 10, 2025, 17:39 Fokko Driesprong < >>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hey Amogh, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks for the write-up. Unfortunately, I won’t be >>>>>>>>>>>>>>>>>>>>>> able to attend. Will it be recorded? Thanks! >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Kind regards, >>>>>>>>>>>>>>>>>>>>>> Fokko >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 7, 2025 at 20:36, Amogh Jahagirdar < >>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hey all, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I've set up time this Friday at 9am PST for another >>>>>>>>>>>>>>>>>>>>>>> sync on single file commits. In terms of what would be >>>>>>>>>>>>>>>>>>>>>>> great to focus on >>>>>>>>>>>>>>>>>>>>>>> for the discussion: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 1. Whether it makes sense or not to eliminate the >>>>>>>>>>>>>>>>>>>>>>> tuple, and instead represent the tuple via >>>>>>>>>>>>>>>>>>>>>>> lower/upper boundaries. As a >>>>>>>>>>>>>>>>>>>>>>> reminder, one of the goals is to avoid tying a >>>>>>>>>>>>>>>>>>>>>>> partition spec to a >>>>>>>>>>>>>>>>>>>>>>> manifest; in the root we can have a mix of files >>>>>>>>>>>>>>>>>>>>>>> spanning different >>>>>>>>>>>>>>>>>>>>>>> partition specs, and even in leaf manifests avoiding >>>>>>>>>>>>>>>>>>>>>>> this coupling can >>>>>>>>>>>>>>>>>>>>>>> enable more desirable clustering of metadata.
>>>>>>>>>>>>>>>>>>>>>>> In the vast majority of cases, we could leverage the >>>>>>>>>>>>>>>>>>>>>>> property that a file is effectively partitioned if the >>>>>>>>>>>>>>>>>>>>>>> lower/upper bounds for a >>>>>>>>>>>>>>>>>>>>>>> given field are equal. The nuance here is with the >>>>>>>>>>>>>>>>>>>>>>> particular case of >>>>>>>>>>>>>>>>>>>>>>> identity-partitioned string/binary columns which can be >>>>>>>>>>>>>>>>>>>>>>> truncated in stats. >>>>>>>>>>>>>>>>>>>>>>> One approach is to require that writers must not >>>>>>>>>>>>>>>>>>>>>>> produce truncated stats >>>>>>>>>>>>>>>>>>>>>>> for identity-partitioned columns. It's also important >>>>>>>>>>>>>>>>>>>>>>> to keep in mind that >>>>>>>>>>>>>>>>>>>>>>> all of this is just for the purpose of reconstructing >>>>>>>>>>>>>>>>>>>>>>> the partition tuple, >>>>>>>>>>>>>>>>>>>>>>> which is only required during equality delete matching. >>>>>>>>>>>>>>>>>>>>>>> Another area we >>>>>>>>>>>>>>>>>>>>>>> need to cover as part of this is exact bounds on >>>>>>>>>>>>>>>>>>>>>>> stats. There are other >>>>>>>>>>>>>>>>>>>>>>> options here as well, such as making all new equality >>>>>>>>>>>>>>>>>>>>>>> deletes in V4 >>>>>>>>>>>>>>>>>>>>>>> global and matching them based on bounds, or keeping >>>>>>>>>>>>>>>>>>>>>>> the tuple but basing each >>>>>>>>>>>>>>>>>>>>>>> tuple on a union schema of all >>>>>>>>>>>>>>>>>>>>>>> partition specs. I am >>>>>>>>>>>>>>>>>>>>>>> adding a separate appendix section outlining the span >>>>>>>>>>>>>>>>>>>>>>> of options here and >>>>>>>>>>>>>>>>>>>>>>> the different tradeoffs. >>>>>>>>>>>>>>>>>>>>>>> Once we get this to a more conclusive state, I'll >>>>>>>>>>>>>>>>>>>>>>> move a summarized version to the main doc. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 2.
@[email protected] <[email protected]> has >>>>>>>>>>>>>>>>>>>>>>> updated the doc with a section >>>>>>>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.rrpksmp8zkb#heading=h.qau0y5xkh9mn> >>>>>>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>> how we can do change detection from the root in a >>>>>>>>>>>>>>>>>>>>>>> variety of write >>>>>>>>>>>>>>>>>>>>>>> scenarios. I've done a review on it, and it covers the >>>>>>>>>>>>>>>>>>>>>>> cases I would >>>>>>>>>>>>>>>>>>>>>>> expect. It'd be good for folks to take a look and >>>>>>>>>>>>>>>>>>>>>>> please give feedback >>>>>>>>>>>>>>>>>>>>>>> before we discuss. Thank you Steven for adding that >>>>>>>>>>>>>>>>>>>>>>> section and all the >>>>>>>>>>>>>>>>>>>>>>> diagrams. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> Amogh Jahagirdar >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Thu, Sep 18, 2025 at 3:19 PM Amogh Jahagirdar < >>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Hey folks just following up from the discussion >>>>>>>>>>>>>>>>>>>>>>>> last Friday with a summary and some next steps: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> 1.) For the various change detection cases, we >>>>>>>>>>>>>>>>>>>>>>>> concluded it's best just to go through those in an >>>>>>>>>>>>>>>>>>>>>>>> offline manner on the >>>>>>>>>>>>>>>>>>>>>>>> doc since it's hard to verify all that correctness in >>>>>>>>>>>>>>>>>>>>>>>> a large meeting >>>>>>>>>>>>>>>>>>>>>>>> setting. >>>>>>>>>>>>>>>>>>>>>>>> 2.) We mostly discussed eliminating the >>>>>>>>>>>>>>>>>>>>>>>> partition tuple. 
On the original proposal, I was >>>>>>>>>>>>>>>>>>>>>>>> mostly aiming for the >>>>>>>>>>>>>>>>>>>>>>>> ability to reconstruct the tuple from the stats >>>>>>>>>>>>>>>>>>>>>>>> for the purpose of >>>>>>>>>>>>>>>>>>>>>>>> equality delete matching (a file is partitioned if the >>>>>>>>>>>>>>>>>>>>>>>> lower and upper >>>>>>>>>>>>>>>>>>>>>>>> bounds are equal); there's some nuance in how we need >>>>>>>>>>>>>>>>>>>>>>>> to handle identity >>>>>>>>>>>>>>>>>>>>>>>> partition values since for string/binary they cannot >>>>>>>>>>>>>>>>>>>>>>>> be truncated. >>>>>>>>>>>>>>>>>>>>>>>> Another potential option is to treat all equality >>>>>>>>>>>>>>>>>>>>>>>> deletes as effectively >>>>>>>>>>>>>>>>>>>>>>>> global and narrow their application based on the stats >>>>>>>>>>>>>>>>>>>>>>>> values. This may >>>>>>>>>>>>>>>>>>>>>>>> require defining tight bounds. I'm still collecting my >>>>>>>>>>>>>>>>>>>>>>>> thoughts on this one. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks folks! Please also let me know if any of the >>>>>>>>>>>>>>>>>>>>>>>> following links are inaccessible for any reason.
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Meeting recording link: >>>>>>>>>>>>>>>>>>>>>>>> https://drive.google.com/file/d/1gv8TrR5xzqqNxek7_sTZkpbwQx1M3dhK/view >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Meeting summary: >>>>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/131N0CDpzZczURxitN0HGS7dTqRxQT_YS9jMECkGGvQU >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 8, 2025 at 3:40 PM Amogh Jahagirdar < >>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Update: I moved the discussion time to this Friday >>>>>>>>>>>>>>>>>>>>>>>>> at 9 am PST since I found out that quite a few folks >>>>>>>>>>>>>>>>>>>>>>>>> involved in the >>>>>>>>>>>>>>>>>>>>>>>>> proposals will be out next week, and I also know some >>>>>>>>>>>>>>>>>>>>>>>>> folks will also be >>>>>>>>>>>>>>>>>>>>>>>>> out the week after that. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>> Amogh J >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 8, 2025 at 8:57 AM Amogh Jahagirdar < >>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Hey folks sorry for the late follow up here, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks @Kevin Liu <[email protected]> for >>>>>>>>>>>>>>>>>>>>>>>>>> sharing the recording link of the previous >>>>>>>>>>>>>>>>>>>>>>>>>> discussion! I've set up another >>>>>>>>>>>>>>>>>>>>>>>>>> sync for next Tuesday 09/16 at 9am PST. This time >>>>>>>>>>>>>>>>>>>>>>>>>> I've set it up from my >>>>>>>>>>>>>>>>>>>>>>>>>> corporate email so we can get recordings and >>>>>>>>>>>>>>>>>>>>>>>>>> transcriptions (and I've made >>>>>>>>>>>>>>>>>>>>>>>>>> sure to keep the meeting invite open so we don't >>>>>>>>>>>>>>>>>>>>>>>>>> have to manually let >>>>>>>>>>>>>>>>>>>>>>>>>> people in). 
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> In terms of next steps, the areas I think >>>>>>>>>>>>>>>>>>>>>>>>>> would be good to focus on for establishing consensus: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> 1. How do we model the manifest entry structure >>>>>>>>>>>>>>>>>>>>>>>>>> so that changes to manifest DVs can be obtained >>>>>>>>>>>>>>>>>>>>>>>>>> easily from the root? There >>>>>>>>>>>>>>>>>>>>>>>>>> are a few options here; the most promising approach >>>>>>>>>>>>>>>>>>>>>>>>>> is to keep an >>>>>>>>>>>>>>>>>>>>>>>>>> additional DV that encodes the diff, i.e. the additional >>>>>>>>>>>>>>>>>>>>>>>>>> positions that have >>>>>>>>>>>>>>>>>>>>>>>>>> been removed from a leaf manifest. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> 2. Modeling partition transforms via expressions >>>>>>>>>>>>>>>>>>>>>>>>>> and establishing a unified table ID space so that we >>>>>>>>>>>>>>>>>>>>>>>>>> can simplify how >>>>>>>>>>>>>>>>>>>>>>>>>> partition tuples may be represented via stats and >>>>>>>>>>>>>>>>>>>>>>>>>> also have a way in the >>>>>>>>>>>>>>>>>>>>>>>>>> future to store stats on any derived column. I have >>>>>>>>>>>>>>>>>>>>>>>>>> a short >>>>>>>>>>>>>>>>>>>>>>>>>> proposal >>>>>>>>>>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1oV8dapKVzB4pZy5pKHUCj5j9i2_1p37BJSeT7hyKPpg/edit?tab=t.0> >>>>>>>>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>>>>>>> this that probably still needs some tightening up on >>>>>>>>>>>>>>>>>>>>>>>>>> the expression >>>>>>>>>>>>>>>>>>>>>>>>>> modeling itself (and some prototyping), but the >>>>>>>>>>>>>>>>>>>>>>>>>> general idea for >>>>>>>>>>>>>>>>>>>>>>>>>> establishing a unified table ID space is covered. >>>>>>>>>>>>>>>>>>>>>>>>>> All feedback welcome!
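[Editor's illustration] A hedged sketch of the diff-DV idea in point 1, with plain Python sets standing in for compressed bitmaps: alongside a leaf manifest's full DV, a second bitmap holds only the positions newly removed since the previous snapshot, so change detection can read the diff from the root rather than comparing full DVs. This assumes deleted positions only accumulate between snapshots; the function names are illustrative, not from the proposal.

```python
def make_diff_dv(previous_dv, current_dv):
    """Positions deleted now but not before: the incremental change.

    Assumes deletes only grow (a position is never un-deleted), which
    matches DV semantics between compactions.
    """
    return current_dv - previous_dv

def apply_diff(previous_dv, diff_dv):
    """A reader reconstructs the current DV from the prior one plus the diff."""
    return previous_dv | diff_dv
```

A changelog reader would scan only the diff DVs to enumerate newly removed leaf-manifest positions per snapshot.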
Thanks,
Amogh Jahagirdar

On Mon, Aug 25, 2025 at 1:34 PM Kevin Liu <[email protected]> wrote:

Thanks Amogh. Looks like the recording for last week's sync is available on YouTube. Here's the link: https://www.youtube.com/watch?v=uWm-p--8oVQ

Best,
Kevin Liu

On Tue, Aug 12, 2025 at 9:10 PM Amogh Jahagirdar <[email protected]> wrote:

Hey folks,

Just following up on this to give the community an update on where we're at and my proposed next steps.

I've been editing and merging the contents from our proposal into the proposal <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0#heading=h.unn922df0zzw> from Russell and others. For any future comments on docs, please comment on the linked proposal. I've also marked our doc in red text so it's clear to redirect to the other proposal as the source of truth for comments.

In terms of next steps:
1. An important design decision is around inline manifest DVs, external manifest DVs, or enabling both. I'm working on measuring different approaches for representing the compressed DV, since that will inform how many entries can reasonably fit in a small root manifest; from that we can derive implications for different write patterns and determine the right approach for storing these manifest DVs.

2. Another key point is determining if/how we can reasonably enable V4 to represent changes in the root manifest, so that readers can effectively infer file-level changes from the root alone.

3. One aspect of the proposal is getting away from the partition tuple requirement in the root, which currently forces an association between a partition spec and a manifest. Partition tuples can instead be modeled as, essentially, column stats, which gives a lot of flexibility in how the manifest is organized. There are important details around field ID spaces here that tie into how the stats are structured.
What we're proposing is a unified expression ID space, which could also benefit us down the line for storing things like virtual columns. I go into this in the proposal, but I'm working on separating out the appropriate parts so that the original proposal can focus on the organization of the content metadata tree rather than on how to solve this particular ID space problem.

4. I'm planning to schedule a recurring community sync starting next Tuesday at 9 am PST, every two weeks. If I get feedback from folks that this time will never work, I can certainly adjust. For some reason I don't have the ability to add to the Iceberg Dev calendar, so I'll figure that out and update the thread when the event is scheduled.
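The kind of measurement described in point 1 can be approximated with a quick experiment: serialize the same set of deleted positions a couple of ways and compare sizes, to get a feel for how many DV entries fit in a root manifest of a given size budget. A stdlib-only sketch, with encodings that are illustrative rather than the proposed format:

```python
# Compare a fixed-width position list against a delta-encoded, DEFLATE-
# compressed list for a synthetic set of deleted positions. The absolute
# numbers are meaningless; the ratio is what informs how many DV entries
# a small root manifest could hold.
import struct
import zlib

positions = sorted(range(0, 100_000, 7))  # synthetic deleted positions

# a) fixed-width 8-byte little-endian positions
fixed = b"".join(struct.pack("<q", p) for p in positions)

# b) delta-encode (store gaps between sorted positions), then compress
deltas = [positions[0]] + [b - a for a, b in zip(positions, positions[1:])]
compressed = zlib.compress(b"".join(struct.pack("<q", d) for d in deltas))

print(len(fixed), len(compressed))  # regular gaps compress dramatically
```

Real DV formats would more likely use something like roaring bitmaps, but even this crude comparison shows how strongly the representation choice drives root manifest capacity.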
Thanks,
Amogh Jahagirdar

On Tue, Jul 22, 2025 at 11:47 AM Russell Spitzer <[email protected]> wrote:

I think this is a great way forward. Starting out with this much parallel development shows that we have a lot of consensus already :)

On Tue, Jul 22, 2025 at 12:42 PM Amogh Jahagirdar <[email protected]> wrote:

Hey folks, just following up on this. It looks like our proposal and the proposal that @Russell Spitzer <[email protected]> shared are pretty well aligned. I was just chatting with Russell about this, and we think it'd be best to combine both proposals and have a single large effort on this. I can also set up a focused community discussion (similar to what we're doing for the other V4 proposals) starting sometime next week just to get things moving, if that works for people.
Thanks,
Amogh Jahagirdar

On Mon, Jul 14, 2025 at 9:48 PM Amogh Jahagirdar <[email protected]> wrote:

Hey Russell,

Thanks for sharing the proposal! A few of us (Ryan, Dan, Anoop, and I) have also been working on a proposal for an adaptive metadata tree structure as part of enabling more efficient one-file commits. From a read of the summary, it's great to see that we're thinking along the same lines about how to tackle this fundamental area!

Here is our proposal:
https://docs.google.com/document/d/1q2asTpq471pltOTC6AsTLQIQcgEsh0AvEhRWnCcvZn0

Thanks,
Amogh Jahagirdar

On Mon, Jul 14, 2025 at 8:08 PM Russell Spitzer <[email protected]> wrote:

Hey y'all!
We (Yi Fang, Steven Wu, and myself) wanted to share some of the thoughts we had on how one-file commits could work in Iceberg. This is pretty much a high-level overview of the concepts we think we need and how Iceberg would behave; we haven't gone very far into the actual implementation and the changes that would need to occur in the SDK to make this happen.

The high-level summary is:

- Manifest lists are out; root manifests take their place
- A root manifest can have data manifests, delete manifests, manifest delete vectors, data delete vectors, and data files
- Manifest delete vectors allow modifying a manifest without rewriting it entirely
- Data files let you append without writing an intermediary manifest
- Having child data and delete manifests lets you still scale

Please take a look if you like:

https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0
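One way to picture the summary above is that a root manifest becomes a heterogeneous list of entries. A minimal sketch, where the entry kinds come from the summary but every field name is an assumption made for illustration:

```python
# Illustrative model of the entry kinds a root manifest could hold, per the
# summary above. Not the v4 spec; field names are invented for the sketch.
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EntryType(Enum):
    DATA_MANIFEST = "data-manifest"
    DELETE_MANIFEST = "delete-manifest"
    MANIFEST_DV = "manifest-delete-vector"
    DATA_DV = "data-delete-vector"
    DATA_FILE = "data-file"


@dataclass
class RootManifestEntry:
    entry_type: EntryType
    path: str                      # the file this entry points at
    target: Optional[str] = None   # for DVs: the manifest or data file they modify


# An append can reference a data file directly, with no intermediary manifest,
# while a manifest DV soft-deletes entries of an existing child manifest.
append = RootManifestEntry(EntryType.DATA_FILE, "s3://bucket/t/data/f1.parquet")
soft_delete = RootManifestEntry(
    EntryType.MANIFEST_DV, "s3://bucket/t/meta/dv1.bin", target="m1.avro"
)
assert append.target is None
assert soft_delete.target == "m1.avro"
```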
I'm excited to see what other proposals and ideas are floating around the community,
Russ

On Wed, Jul 2, 2025 at 6:29 PM John Zhuge <[email protected]> wrote:

Very excited about the idea!

On Wed, Jul 2, 2025 at 1:17 PM Anoop Johnson <[email protected]> wrote:

I'm very interested in this initiative. Micah Kornfield and I presented <https://youtu.be/4d4nqKkANdM?si=9TXgaUIXbq-l8idi&t=1405> on high-throughput ingestion for Iceberg tables at the 2024 Iceberg Summit, which leveraged Google infrastructure like Colossus for efficient appends.

This new proposal is particularly exciting because it offers significant advancements in commit latency and metadata storage footprint. Furthermore, a consistent manifest structure promises to simplify the design and codebase, which is a major benefit.

A related idea I've been exploring is having a loose affinity between data and delete manifests.
While the current separation of data and delete manifests in Iceberg is valuable for avoiding data file rewrites (and stats updates) when deletes change, it does necessitate a join operation during reads. I'd be keen to discuss approaches that could reduce this read-side cost while retaining the benefits of separate manifests.

Best,
Anoop

On Fri, Jun 13, 2025 at 11:06 AM Jagdeep Sidhu <[email protected]> wrote:

Hi everyone,

I am new to the Iceberg community but would love to participate in these discussions to reduce the number of file writes, especially for small writes/commits.

Thank you!
-Jagdeep

On Thu, Jun 5, 2025 at 4:02 PM Anurag Mantripragada <[email protected]> wrote:

We have been hitting all the metadata problems you mentioned, Ryan. I'm on board to help however I can to improve this area.

~ Anurag Mantripragada

On Jun 3, 2025, at 2:22 AM, Huang-Hsiang Cheng <[email protected]> wrote:

I am interested in this idea and looking forward to collaborating.

Thanks,
Huang-Hsiang

On Jun 2, 2025, at 10:14 AM, namratha mk <[email protected]> wrote:

Hello,

I am interested in contributing to this effort.
Thanks,
Namratha

On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <[email protected]> wrote:

Thanks for kicking this thread off, Ryan. I'm interested in helping out here! I've been working on a proposal in this area, and it would be great to collaborate with different folks and exchange ideas, since I think a lot of people are interested in solving this problem.

Thanks,
Amogh Jahagirdar

On Thu, May 29, 2025 at 2:25 PM Ryan Blue <[email protected]> wrote:

Hi everyone,

Like Russell's recent note, I'm starting a thread to connect those of us interested in the idea of changing Iceberg's metadata in v4 so that, in most cases, committing a change only requires writing one additional metadata file.
*Idea: One-file commits*

The current Iceberg metadata structure requires writing at least one manifest and a new manifest list to produce a new snapshot. The goal of this work is to allow more flexibility by letting the manifest list layer store data and delete files. As a result, only one file write would be needed before committing the new snapshot. In addition, this work will also try to explore:

- Avoiding small manifests that must be read in parallel and later compacted (metadata maintenance changes)
- Extending metadata skipping to use aggregated column ranges that are compatible with geospatial data (manifest metadata)
- Using soft deletes to avoid rewriting existing manifests (metadata DVs)

If you're interested in these problems, please reply!
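The one-file-commit mechanics can be illustrated with a toy model. Everything here is an assumption for the sketch (field names, shapes); the point is only that when the root can reference data files directly, an append writes a single new metadata file instead of a manifest plus a manifest list:

```python
# Toy model of a one-file append commit: the new root manifest carries
# forward references to existing child manifests untouched and embeds the
# newly appended data files directly, so exactly one metadata file is
# written before the commit.

def commit_append(prev_root, new_data_files):
    """Build the next root manifest; returns (new_root, metadata_files_written)."""
    new_root = {
        # existing child manifests are referenced, not rewritten
        "manifests": list(prev_root["manifests"]),
        # appended files go straight into the root, no intermediary manifest
        "data_files": list(prev_root["data_files"]) + list(new_data_files),
    }
    return new_root, 1  # only the new root manifest itself is written


root0 = {"manifests": ["m1.avro"], "data_files": []}
root1, writes = commit_append(root0, ["f1.parquet"])
assert writes == 1
assert root1 == {"manifests": ["m1.avro"], "data_files": ["f1.parquet"]}
```

Contrast this with the current structure, where the same append would write a new manifest for `f1.parquet` and then a new manifest list, i.e. at least two metadata files.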
Ryan
