Hi all,

It seems this thread has become conflated with the metadata representation discussion <https://lists.apache.org/thread/7jryw9dfvc02s411twn4o7s5gjrybfxg>. While all the points raised here are noted, let’s continue those specific parts of the conversation in the metadata thread.

Regarding data representation, we discussed the following during this <https://www.youtube.com/watch?v=kuxFBm-j5hw&t=3s> sync:

- Implementation Details: Specific writer implementation details, such as choosing between dense or sparse representations, will be left to individual engines.
- Specification Scope: The specification will not mandate these internal implementation choices, provided that engines adhere to writing the explicit *_pos* column.

Please let me know if you have concerns.

~ Anurag

On Tue, Jun 2, 2026 at 11:44 AM Xiening Dai <[email protected]> wrote:

> We also need to think about the DV-only case.
>
> If we have f0 with dv0, then we do a column update and generate f1. Do we also bump the sequence number for f0 in this case? There are multiple options:
>
> 1) We bump the sequence number, then we will need to copy dv0 into dv1 and assign the same sequence number to dv1 so that the delete positions won't get lost.
> 2) We don't bump the sequence number, then we don't need to rewrite dv0 and everything would remain working. But this creates a small inconsistency with the eq-delete case, and requires special-case handling on the write path.
> 3) We bump the sequence number for both data file f0 and dv0. We don't need to rewrite the DV, but instead we bump the sequence number for the DV as well.
>
> I'd suggest we write down these details in a spec change proposal and examine the read/write workflow carefully.
>
> On 2026/06/02 12:42:10 Gábor Kaszab wrote:
> > Thanks for the summary, Amogh!
> >
> > I think the missing building block to make this eq-delete rewrite work is the decision made yesterday, to bump the base file-level sequence number when adding a column file. With this, we can make sure that after we have rewritten the eq-deletes into DVs in the process of adding column files, we don't have to apply the eq-deletes we had previously on the base file.
> >
> > Just some thoughts on implementation:
> >
> > - Write path in general: When writing the update file, we designed this in the PoC to receive _path and _pos from the base file. With this we can identify if some positions are missing and we can convert them into DVs.
> > - Trailing deletes: The tricky part is when trailing rows are deleted. I see 2 approaches to get around this:
> >   - Broadcast base file row counts to writers (this is done by the PoC): When we receive the last row from the base file with pos X, but we know there are more rows in the base file, we have to add the trailing positions to the DV.
> >   - Enrich the input rows fed to the writer with the "_deleted" metadata column. False => write to update file, true => write pos to DV.
> >
> > Regards,
> > Gabor
> >
> > Amogh Jahagirdar <[email protected]> wrote (on Mon, Jun 1, 2026, 22:48):
> >
> > > > The real challenge comes from the read path. In the case when we have a data file f0, an equality delete file d0, and column file f1, and the materialized dv d1. How do we reconcile the deletes during read? If we don't do anything special, following the existing spec (based on sequence number rule), we would apply d0 on f0, and then apply d1 on f1, which should still give us the correct results as both d0 and d1 represent the same set of positions. But this is undesired because we don't want to load and re-evaluate the old column values. So we need a change in the spec so that in this scenario the new d1 supersedes the existing equality delete file (d0).
> > >
> > > So given the following invariants/rules:
> > >
> > > 1. In a dense representation, column updates must carry over all active values for the column (and there's a _pos column referencing the position from the original base file).
> > > 2. Column updates must know what rows were deleted (either to omit the row or materialize the default value).
> > > 3. Data sequence numbers are updated on column appends/updates (this would be a spec change in v4). I think reusing the same seq. number is key since we don't have a different sequence number definition that's temporal in dimension for delete matching and another one that's not temporal but for column updates. Having a single sequence number simplifies a lot of this.
> > > 4. The requirement that a column update must also rewrite existing equality deletes into DV.
> > >
> > > I think this combination (and the fact that DVs are 1:1 with data files) naturally addresses this because f1 in this example would have the column values for all the active rows. Then the DV v1 just deletes row positions as usual. There's never a need to actually read the old column values in this model.
> > >
> > > There's a broader discussion around eliminating new equality deletes in v4, but in that case this rule would still apply to handle older equality deletes from v3 and earlier + column updates on older data files as well.
> > >
> > > We actually talked about this a bit in today's V4 AMT sync <https://youtu.be/7mVes-6pM1c?t=861>
> > >
> > > Thanks,
> > > Amogh Jahagirdar
> > >
> > > On Mon, Jun 1, 2026 at 12:17 PM Xiening Dai <[email protected]> wrote:
> > >
> > >> > but we should develop some concreteness around how feasible it is for engines to produce the DVs on the column update.
> > >>
> > >> Actually I don't think this would be a problem. As mentioned, in order to generate a correct column file, we already need to produce the correct set of deleted positions, and we just need an extra step to materialize these positions into a DV.
> > >>
> > >> The real challenge comes from the read path. In the case when we have a data file f0, an equality delete file d0, and column file f1, and the materialized dv d1. How do we reconcile the deletes during read? If we don't do anything special, following the existing spec (based on sequence number rule), we would apply d0 on f0, and then apply d1 on f1, which should still give us the correct results as both d0 and d1 represent the same set of positions. But this is undesired because we don't want to load and re-evaluate the old column values. So we need a change in the spec so that in this scenario the new d1 supersedes the existing equality delete file (d0).
> > >>
> > >> On 2026/05/29 23:21:33 Amogh Jahagirdar wrote:
> > >> > One approach that’s helped me reason about all this is to treat each base file as its own little mini‑table inside the larger table: the row range of the base file keyed by row_id, and column files/deletes just layer on top. Once a row is deleted in that mini‑table, it stays deleted in that mini‑table’s state (whether that’s via equality deletes, or DVs), and column updates are just layering changed or additional columns on top of whatever rows are still there. Then I can reason about "what are desirable properties of this mini-table".
> > >> >
> > >> > Once I look at it that way, stacking equality deletes with column updates on the same column, and then forcing the write path to read all the older column files when producing new column updates, feels like the worst outcome; and it gets worse the more column updates there are for the column. It blows up complexity and performance and compromises the value of efficient column updates.
> > >> >
> > >> > If we eliminate that option, I think we’re left with two high‑level approaches:
> > >> >
> > >> > 1. Equality deletes cannot be allowed with column updates. This simplifies both the read and write paths when column update files are present. I would generally prefer this option but there is a legitimate problem around the “how” for checking for the presence of equality deletes. We can’t rely on snapshot summaries, which means we’d have to look at delete manifests to really know if equality deletes exist. There were ideas in the V4 AMT sync about constraining equality deletes to be in the root manifest; in that model, the amount of work needed to check for equality deletes is bounded by the root size. I’d keep that as a separate open question because there are other challenges with requiring equality deletes to only appear in the root manifest, especially on the upgrade path.
> > >> > 2. After an equality delete, subsequent updates must produce a DV. As Xiening highlighted, once you’ve had an equality delete on a column, any subsequent updates on that column would be required to produce a DV (or positional delete) for the deleted positions at the new sequence number, making the original equality delete obsolete. This is attractive because it’s not too constraining for writers: they’re already doing the work of reconciling deleted positions to decide what to write into the column file, so the additional work is basically emitting the DV. The main thing to think through is how exactly the plumbing to engines looks, but in theory it’s just a matter of plumbing through explicitly deleted positions (or, less ideally, inferring them from a sentinel value in the tuple).
> > >> >
> > >> > So far I’m leaning towards option 2, but we should develop some concreteness around how feasible it is for engines to produce the DVs on the column update. Again, it should all be theoretically possible based on plumbing deleted positions; we shouldn't let implementations drive the spec, but I think sniff testing the practicality of it is well worth it to make sure that restriction is reasonably implementable.
> > >> >
> > >> > Interested in hearing what others think about this one.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Amogh Jahagirdar
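
For illustration only: the "broadcast base file row counts to writers" approach for trailing deletes, quoted above, could look roughly like the sketch below. Everything named here is hypothetical and not taken from the PoC or from any Iceberg API: the function name `split_update_rows`, the input shape (pairs of `_pos` and the new column value, arriving in ascending `_pos` order), and the use of a plain list of integers in place of a real deletion vector encoding. It only shows how gaps in `_pos`, plus the known base row count, are enough to recover the deleted positions.

```python
from typing import Any, Iterable, List, Tuple


def split_update_rows(
    rows: Iterable[Tuple[int, Any]],
    base_row_count: int,
) -> Tuple[List[Tuple[int, Any]], List[int]]:
    """Split writer input into column-file rows and deleted positions.

    Assumptions (hypothetical, for illustration):
    - `rows` yields (_pos, new_value) only for rows still alive in the
      base file, in strictly ascending _pos order.
    - `base_row_count` is the total row count of the base file,
      broadcast to the writer ahead of time.
    """
    column_rows: List[Tuple[int, Any]] = []
    deleted_positions: List[int] = []
    expected = 0  # next base-file position we expect to see

    for pos, value in rows:
        if pos < expected or pos >= base_row_count:
            raise ValueError(f"unexpected position {pos}")
        # Any positions skipped between `expected` and `pos` were deleted.
        deleted_positions.extend(range(expected, pos))
        column_rows.append((pos, value))
        expected = pos + 1

    # Trailing deletes: the input ended before the base file did.
    deleted_positions.extend(range(expected, base_row_count))
    return column_rows, deleted_positions


if __name__ == "__main__":
    # Base file has 6 rows; rows 1, 4 and 5 were deleted earlier.
    cols, dv = split_update_rows([(0, "a"), (2, "c"), (3, "d")], 6)
    print(cols)
    print(dv)
```

The alternative from the same message, a "_deleted" metadata column on the input rows, would replace the gap detection with a per-row flag check, so the writer would not need the row count at all.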
