Hi everyone,

I was wondering if it would be of interest to start a thread on a proposal
for a tombstone feature in Iceberg.
It could perhaps be less strict than a general-case implementation of
tombstones [*1*].
I'd like at least a couple of things to be covered in this thread:

   1. [*Opportunity*] I was wondering whether others on this list would find
   such a feature useful, and/or whether folks would support such a feature
   being provided by Iceberg.
   2. [*Feasibility*] I'm also wondering whether there's any intersection
   between a tombstone feature (i.e. filtering by column predicate) and the
   upcoming Upsert spec/implementation, or whether the two serve different
   use-cases and are best kept separate, related only indirectly through the
   accidental complexity they might add to each other :)

The current Iceberg codebase is quite generous with respect to the
Open/Closed principle, and we've been doing some spikes to implement such a
feature in a new data source. I thought I'd share some touchpoints of our
work so far (I'd gladly share more detail if the community is interested):
> [*extension*] tombstoning would be implemented as a column/values
predicate, associated with some specific metadata (snapshot id, version?)
and basic metrics (e.g. count, basic histograms). I'm mostly thinking that
any tombstone operation would be accompanied by a compaction task, so the
metadata and metrics would help in building generic solutions for optimally
scheduling these maintenance tasks. Tombstones could be modeled/programmed
against the org.apache.iceberg.Table interface.
> [*atomic guarantee*] a simple solution is to make the tombstone operation
atomic by assigning a new snapshot summary property that points to an
immutable file holding the tombstone predicates/expressions
> [*new API*] append tombstones
> [*new API*] remove tombstones
> [*new API*] append files and add tombstones
> [*new API*] append files and remove tombstones
> [*new API*] vacuum tombstones - a task that cleans up tombstoned rows and
evicts the associated tombstone metadata as well. Maybe not `vacuum`,
though (I remember reading about that term on this list in a different
context, and it's probably reserved for a different Iceberg feature,
right?)
> [*extend*] extend the Spark reader/writer to account for tombstone-based
filtering and tombstone committing, respectively. Writing may prove easier
to implement than reading; reading comes in many flavours, so applying a
filter expression may not currently be as accessible as one would prefer
when extending on top of Iceberg [*2*]
> [*extend*] removing snapshots should also drop their associated tombstone
files (where available)
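To make the read-path idea concrete, here is a minimal, self-contained
sketch of filtering rows against a set of column/values tombstone
predicates. All names are hypothetical illustrations, not actual Iceberg
APIs; a real implementation would push the predicate down into the reader
rather than filter materialized rows.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class TombstoneFilterSketch {

    // A row is tombstoned if, for any tombstoned column, the row's value
    // for that column is among the tombstoned values.
    public static boolean isTombstoned(Map<String, String> row,
                                       Map<String, Set<String>> tombstones) {
        return tombstones.entrySet().stream().anyMatch(e -> {
            String value = row.get(e.getKey());
            return value != null && e.getValue().contains(value);
        });
    }

    public static void main(String[] args) {
        // Hypothetical tombstone: hide every row where device_id = 'd-123'.
        Map<String, Set<String>> tombstones =
            Map.of("device_id", Set.of("d-123"));
        List<Map<String, String>> rows = List.of(
            Map.of("device_id", "d-123", "reading", "1"),
            Map.of("device_id", "d-456", "reading", "2"));

        List<Map<String, String>> visible = rows.stream()
            .filter(r -> !isTombstoned(r, tombstones))
            .collect(Collectors.toList());
        System.out.println(visible.size()); // prints 1
    }
}
```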

[*1*] I believe that in a general tombstone implementation, filtering is
applied only to data that was added prior to the tombstone
setting/assignment operation. That might prove quite difficult to implement
with Iceberg, so it could be considered a specialization of the more
generic and basic use-case of tombstone support as filtering by column
values regardless of the order of operations (data vs. tombstone).
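The distinction in [*1*] can be sketched as two visibility rules. The use
of snapshot ids for ordering here is my own illustration of the two
semantics, not existing Iceberg behaviour:

```java
public class TombstoneSemantics {

    // General tombstone semantics: a row is hidden only if it matches the
    // predicate AND was written at or before the tombstone's snapshot.
    public static boolean hiddenGeneral(boolean matchesPredicate,
                                        long rowSnapshotId,
                                        long tombstoneSnapshotId) {
        return matchesPredicate && rowSnapshotId <= tombstoneSnapshotId;
    }

    // Proposed simplification: a row is hidden whenever it matches the
    // predicate, regardless of when it was written.
    public static boolean hiddenSimplified(boolean matchesPredicate) {
        return matchesPredicate;
    }

    public static void main(String[] args) {
        // A matching row written in snapshot 7, tombstone from snapshot 3:
        System.out.println(hiddenGeneral(true, 7, 3));  // prints false
        System.out.println(hiddenSimplified(true));     // prints true
    }
}
```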
[*2*] We could benefit from adding an extension point/hook into row-level
filtering that we could leverage to translate tombstone options into
row-level filter predicates in
https://github.com/apache/incubator-iceberg/blob/6048e5a794242cb83871e3838d0d40aa71e36a91/spark/src/main/java/org/apache/iceberg/spark/source/Reader.java#L438-L439
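As a rough illustration of what such a translation might produce (the
method name and the SQL-ish expression syntax are assumptions on my part,
not an existing hook):

```java
import java.util.List;
import java.util.stream.Collectors;

public class TombstoneToFilter {

    // Hypothetical translation of a tombstone (column + tombstoned values)
    // into a row-level filter expression a reader hook could apply.
    public static String toFilterExpr(String column, List<String> values) {
        String quoted = values.stream()
            .map(v -> "'" + v + "'")
            .collect(Collectors.joining(", "));
        return "NOT (" + column + " IN (" + quoted + "))";
    }

    public static void main(String[] args) {
        System.out.println(toFilterExpr("device_id", List.of("d-1", "d-2")));
        // prints NOT (device_id IN ('d-1', 'd-2'))
    }
}
```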

/Filip
