Hi all,

It seems this thread has become conflated with the metadata representation
discussion
<https://lists.apache.org/thread/7jryw9dfvc02s411twn4o7s5gjrybfxg>. While
all the points raised here are noted, let’s continue those specific parts
of the conversation in the metadata thread.

Regarding data representation, we discussed the following during this
<https://www.youtube.com/watch?v=kuxFBm-j5hw&t=3s> sync:

   - Implementation details: Specific writer implementation details, such
   as choosing between dense and sparse representations, will be left to
   individual engines.
   - Specification scope: The specification will not mandate these
   internal implementation choices, provided that engines write
   the explicit *_pos* column.
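
To illustrate the two layouts, here is a minimal sketch in Python. It assumes
that "dense" means carrying over every active row's value and "sparse" means
writing only the changed rows; the thread does not define sparse, so that part
is an assumption. The function names and the in-memory (pos, value) pairs are
hypothetical stand-ins, not Iceberg APIs. In both layouts each row carries an
explicit _pos, so a reader can overlay values the same way.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical stand-in for a column file: a list of (_pos, value) pairs.
ColumnFile = List[Tuple[int, Optional[str]]]


def write_dense(current: Dict[int, Optional[str]],
                updates: Dict[int, Optional[str]],
                deleted: Set[int]) -> ColumnFile:
    """Dense layout: carry over the value of every active (non-deleted) row."""
    merged = {**current, **updates}
    return [(pos, merged[pos]) for pos in sorted(merged) if pos not in deleted]


def write_sparse(updates: Dict[int, Optional[str]],
                 deleted: Set[int]) -> ColumnFile:
    """Sparse layout (assumed meaning): write only the rows that changed."""
    return [(pos, updates[pos]) for pos in sorted(updates) if pos not in deleted]


def read_column(base: Dict[int, Optional[str]],
                column_file: ColumnFile,
                deleted: Set[int]) -> Dict[int, Optional[str]]:
    """Overlay the column file on the base values by _pos, skipping deleted rows."""
    overlay = dict(column_file)
    return {pos: overlay.get(pos, value)
            for pos, value in base.items() if pos not in deleted}


base = {0: "a", 1: "b", 2: "c", 3: "d"}
updates = {1: "B"}
deleted = {3}

dense = write_dense(base, updates, deleted)
sparse = write_sparse(updates, deleted)
```

With the explicit _pos in place, read_column returns the same result for both
layouts, which is why the choice can stay an engine-level detail.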

Please let me know if you have concerns.

~ Anurag


On Tue, Jun 2, 2026 at 11:44 AM Xiening Dai <[email protected]> wrote:

> We also need to think about the DV-only case.
>
> If we have f0 with dv0, and then do a column update that generates f1, do
> we also bump the sequence number for f0? There are multiple options:
>
> 1) We bump the sequence number. Then we need to copy dv0 into dv1 and
> assign the same sequence number to dv1 so that the delete positions don't
> get lost.
> 2) We don't bump the sequence number. Then we don't need to rewrite dv0
> and everything keeps working. But this creates a small inconsistency
> with the eq-delete case and requires special-case handling in the write
> path.
> 3) We bump the sequence number for both the data file f0 and dv0. We don't
> need to rewrite the DV; instead we bump its sequence number as well.
>
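
The three options above can be checked against the sequence-number matching
rules. This is a minimal sketch, assuming the current matching rules as I
read them: a DV applies when the data file's data sequence number is less
than or equal to the DV's, and an equality delete applies only to data files
with a strictly lower sequence number. TrackedFile, the helper functions, and
the concrete sequence numbers are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrackedFile:
    """Hypothetical stand-in for a tracked file: a name and a data sequence number."""
    name: str
    seq: int


def dv_applies(data_file: TrackedFile, dv: TrackedFile) -> bool:
    # Assumed rule: a DV applies when data_file.seq <= dv.seq.
    return data_file.seq <= dv.seq


def eq_delete_applies(data_file: TrackedFile, eq_delete: TrackedFile) -> bool:
    # Assumed rule: an equality delete applies only to strictly older data files.
    return data_file.seq < eq_delete.seq


# Starting state: f0 written at seq 1, dv0 at seq 2.
# The column update that produces f1 commits at seq 3.
UPDATE_SEQ = 3

# Option 1: bump f0 and copy dv0 into dv1 at the new sequence number.
option1 = dv_applies(TrackedFile("f0", UPDATE_SEQ), TrackedFile("dv1", UPDATE_SEQ))

# Option 2: leave both f0 and dv0 at their old sequence numbers.
option2 = dv_applies(TrackedFile("f0", 1), TrackedFile("dv0", 2))

# Option 3: bump the sequence number of both f0 and dv0 without rewriting dv0.
option3 = dv_applies(TrackedFile("f0", UPDATE_SEQ), TrackedFile("dv0", UPDATE_SEQ))

# What the options guard against: bumping f0 alone leaves dv0 behind.
lost = dv_applies(TrackedFile("f0", UPDATE_SEQ), TrackedFile("dv0", 2))

# Eq-delete case: an older equality delete d0 (seq 2) stops matching once f0 is bumped.
eq_still_applies = eq_delete_applies(TrackedFile("f0", UPDATE_SEQ), TrackedFile("d0", 2))
```

Under these assumed rules all three options keep the delete positions
applicable, while bumping f0 alone would drop them. The last check shows the
eq-delete side: once f0 is bumped, the old equality delete no longer matches,
so its positions have to live in a DV.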
> I'd suggest we write these details down in a spec change proposal and
> examine the read/write workflow carefully.
>
> On 2026/06/02 12:42:10 Gábor Kaszab wrote:
> > Thanks for the summary, Amogh!
> >
> > I think the missing building block to make this eq-delete rewrite work is
> > the decision made yesterday to bump the base file-level sequence number
> > when adding a column file. With this, we can make sure that after we have
> > rewritten the eq-deletes into DVs in the process of adding column files,
> > we don't have to apply the eq-deletes we had previously on the base file.
> >
> > Just some thoughts on implementation:
> >
> >    - Write path in general: When writing the update file, we designed the
> >    PoC so that the writer receives _path and _pos from the base file. With
> >    this we can identify whether some positions are missing and convert
> >    them into DVs.
> >    - Trailing deletes: The tricky part is when trailing rows are deleted.
> >    I see 2 approaches to get around this:
> >       - Broadcast base file row counts to writers (this is what the PoC
> >       does): When we have received the last row from the base file with
> >       pos X, but we know there are more rows in the base file, we have to
> >       add the trailing positions to the DV.
> >       - Enrich the input rows fed to the writer with the "_deleted"
> >       metadata column. False => write to update file, true => write pos
> >       to DV.
> >
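
The two approaches above can be sketched as follows. This is a minimal
illustration in Python, not the PoC code: the function names are
hypothetical, the DV is modelled as a plain list of positions, and the first
function assumes _pos values arrive in ascending order.

```python
from typing import Iterable, List, Optional, Tuple


def deleted_positions(received_pos: Iterable[int], base_row_count: int) -> List[int]:
    """Approach 1: infer deletes from gaps in _pos plus the broadcast row count."""
    deleted: List[int] = []
    expected = 0
    for pos in received_pos:  # assumed to be in ascending order
        # Every position skipped between the expected and the received one is deleted.
        deleted.extend(range(expected, pos))
        expected = pos + 1
    # Trailing deletes: positions after the last received row, up to the row count.
    deleted.extend(range(expected, base_row_count))
    return deleted


def split_by_deleted_flag(
    rows: Iterable[Tuple[int, Optional[str], bool]]
) -> Tuple[List[Tuple[int, Optional[str]]], List[int]]:
    """Approach 2: route each (_pos, value, _deleted) row by its _deleted flag."""
    update_rows: List[Tuple[int, Optional[str]]] = []
    dv_positions: List[int] = []
    for pos, value, is_deleted in rows:
        if is_deleted:
            dv_positions.append(pos)           # true => write pos to DV
        else:
            update_rows.append((pos, value))   # false => write to update file
    return update_rows, dv_positions
```

For example, with 7 rows in the base file and received positions 0, 1, 3, 4,
the first approach yields positions 2, 5 and 6 for the DV. Without the row
count, the trailing positions 5 and 6 would go unnoticed.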
> > Regards,
> > Gabor
> >
> > Amogh Jahagirdar <[email protected]> wrote (on Mon, Jun 1, 2026,
> > 22:48):
> >
> > > >The real challenge comes from the read path. In the case when we have a
> > > data file f0, an equality delete file d0, a column file f1, and the
> > > materialized DV d1, how do we reconcile the deletes during read? If we
> > > don't do anything special, following the existing spec (based on the
> > > sequence number rule), we would apply d0 on f0, and then apply d1 on f1,
> > > which should still give us the correct results as both d0 and d1
> > > represent the same set of positions. But this is undesired because we
> > > don't want to load and re-evaluate the old column values. So we need a
> > > change in the spec so that in this scenario the new d1 supersedes the
> > > existing equality delete file (d0).
> > >
> > > So given the following invariants/rules:
> > >
> > > 1. In a dense representation, column updates must carry over all active
> > > values for the column (and there's a _pos column referencing the
> > > position from the original base file).
> > > 2. Column updates must know which rows were deleted (either to omit the
> > > row or to materialize the default value).
> > > 3. Data sequence numbers are updated on column appends/updates (this
> > > would be a spec change in v4). I think reusing the same sequence number
> > > is key, since then we don't have one sequence number definition that's
> > > temporal in dimension for delete matching and another one that's not
> > > temporal but for column updates. Having a single sequence number
> > > simplifies a lot of this.
> > > 4. The requirement that a column update must also rewrite existing
> > > equality deletes into DVs.
> > >
> > > I think this combination (and the fact that DVs are 1:1 with data
> > > files) naturally addresses this, because f1 in this example would have
> > > the column values for all the active rows. Then the DV d1 just deletes
> > > row positions as usual. There's never a need to actually read the old
> > > column values in this model.
> > >
> > > There's a broader discussion around eliminating new equality deletes in
> > > v4, but in that case this rule would still apply to handle older equality
> > > deletes from v3 and earlier, plus column updates on older data files as
> > > well.
> > >
> > > We actually talked about this a bit in today's V4 AMT sync
> > > <https://youtu.be/7mVes-6pM1c?t=861>
> > >
> > > Thanks,
> > > Amogh Jahagirdar
> > >
> > > On Mon, Jun 1, 2026 at 12:17 PM Xiening Dai <[email protected]> wrote:
> > >
> > >> > but we should develop some concreteness around how feasible it is for
> > >> > engines to produce the DVs on the column update.
> > >>
> > >> Actually I don't think this would be a problem. As mentioned, in order
> > >> to generate a correct column file, we already need to produce the
> > >> correct set of deleted positions, and we just need an extra step to
> > >> materialize these positions into a DV.
> > >>
> > >> The real challenge comes from the read path. In the case when we have a
> > >> data file f0, an equality delete file d0, a column file f1, and the
> > >> materialized DV d1, how do we reconcile the deletes during read? If we
> > >> don't do anything special, following the existing spec (based on the
> > >> sequence number rule), we would apply d0 on f0, and then apply d1 on f1,
> > >> which should still give us the correct results as both d0 and d1
> > >> represent the same set of positions. But this is undesired because we
> > >> don't want to load and re-evaluate the old column values. So we need a
> > >> change in the spec so that in this scenario the new d1 supersedes the
> > >> existing equality delete file (d0).
> > >>
> > >> On 2026/05/29 23:21:33 Amogh Jahagirdar wrote:
> > >> > One approach that’s helped me reason about all this is to treat each
> > >> > base file as its own little mini-table inside the larger table: the
> > >> > row range of the base file keyed by row_id, and column files/deletes
> > >> > just layer on top. Once a row is deleted in that mini-table, it stays
> > >> > deleted in that mini-table’s state (whether that’s via equality
> > >> > deletes or DVs), and column updates are just layering changed or
> > >> > additional columns on top of whatever rows are still there. Then I can
> > >> > reason about "what are desirable properties of this mini-table".
> > >> >
> > >> > Once I look at it that way, stacking equality deletes with column
> > >> > updates on the same column, and then forcing the write path to read
> > >> > all the older column files when producing new column updates, feels
> > >> > like the worst outcome, and it gets worse the more column updates
> > >> > there are for the column. It blows up complexity, hurts performance,
> > >> > and compromises the value of efficient column updates.
> > >> >
> > >> > If we eliminate that option, I think we’re left with two high‑level
> > >> > approaches:
> > >> >
> > >> >    1. Equality deletes cannot be allowed with column updates. This
> > >> >    simplifies both the read and write paths when column update files
> > >> >    are present. I would generally prefer this option, but there is a
> > >> >    legitimate problem around how to check for the presence of
> > >> >    equality deletes. We can’t rely on snapshot summaries, which means
> > >> >    we’d have to look at delete manifests to really know if equality
> > >> >    deletes exist. There were ideas in the V4 AMT sync about
> > >> >    constraining equality deletes to be in the root manifest; in that
> > >> >    model, the amount of work needed to check for equality deletes is
> > >> >    bounded by the root size. I’d keep that as a separate open question
> > >> >    because there are other challenges with requiring equality deletes
> > >> >    to only appear in the root manifest, especially on the upgrade
> > >> >    path.
> > >> >    2. After an equality delete, subsequent updates must produce a DV.
> > >> >    As Xiening highlighted, once you’ve had an equality delete on a
> > >> >    column, any subsequent updates on that column would be required to
> > >> >    produce a DV (or positional delete) for the deleted positions at
> > >> >    the new sequence number, making the original equality delete
> > >> >    obsolete. This is attractive because it’s not too constraining for
> > >> >    writers: they’re already doing the work of reconciling deleted
> > >> >    positions to decide what to write into the column file, so the
> > >> >    additional work is basically emitting the DV. The main thing to
> > >> >    think through is how exactly the plumbing to engines looks, but in
> > >> >    theory it’s just a matter of plumbing through explicitly deleted
> > >> >    positions (or, less ideally, inferring them from a sentinel value
> > >> >    in the tuple).
> > >> >
> > >> >
> > >> > So far I’m leaning towards option 2, but we should develop some
> > >> > concreteness around how feasible it is for engines to produce the DVs
> > >> > on the column update. Again, this should all be theoretically possible
> > >> > based on plumbing deleted positions; we shouldn't let implementations
> > >> > drive the spec, but I think sniff-testing the practicality of it is
> > >> > well worth it to make sure that restriction is reasonably
> > >> > implementable.
> > >> >
> > >> > Interested in hearing what others think about this one.
> > >> >
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Amogh Jahagirdar
> > >> >
> > >>
> > >
> >
>
