@Ryan Blue <rb...@netflix.com> your evaluation of the use-case is fairly
accurate, with the caveat that during compaction we reduce both rows and
delete predicates. So to both your point and OpenInx's, it seems that the
row-level delete support as currently spec'ed is a superset of what we've
managed to support with the "tombstone extension". Our implementation may
also be considered a somewhat naive approach, given its lack of sorting
support and of the extra metadata that could enable highly efficient
filtering based on range indices for key columns, etc.
After revisiting the spec, I have a few quick points I could use some
guidance on, mostly to make sure we can reason about an eventual
transition plan over to the community's delete support in Iceberg:

   - I take it that it's highly recommended we align with the current
   sorting spec, so that tombstone columns are basically key columns?
   Sorting seems to be a precondition for efficient deletes at least, right?
   - It also looks like update/delete/upsert operations with predicates on
   arbitrary columns are called out as nice-to-have requirements. Is that
   because such a feature could introduce unpredictable performance
   degradation for read/compaction operations, or is there something else
   that determines whether it gets supported?
   - One fuzzy aspect is the implication of deleting rows by, say, an IN
   predicate and then running compaction. Compaction will remove all the
   affected rows matching the delete predicate up to that point [1]. But
   will filtering still be applied to matching rows that land after
   compaction has completed (e.g. after deleting WHERE user IN ('a', 'b'),
   new rows for user 'b' are appended post-compaction), or do rows landing
   after compaction become visible?
   - And one real edge-case scenario, tightly coupled to the previous
   question: given we delete by an IN predicate, is it reasonable for
   Iceberg to accept an append of files whose key columns indicate that the
   rows completely match ALL of the deleted data per the IN predicate? I'm
   asking because that data will never effectively be read; it only ends up
   putting pressure on the compaction process (if any).

I will do my best to stay close to the current efforts to get
update/delete/upsert support into Iceberg.

[1] Assuming compaction runs against a particular snapshot id and commits
against a possibly later one, provided no conflict surfaces wrt the list
of files matching the filtering predicates.
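
To make [1] concrete, here is a sketch of that commit pattern using
Iceberg's existing rewrite API (the helper name is mine; Iceberg's
optimistic concurrency fails the commit if a conflicting change to the
replaced files lands in between):

    import org.apache.iceberg.DataFile;
    import org.apache.iceberg.Table;
    import java.util.Set;

    // Compaction planned against one snapshot; the rewrite may commit
    // against a later snapshot as long as no conflicting change touched
    // the files being replaced.
    static void compact(Table table, Set<DataFile> replaced, Set<DataFile> compacted) {
      table.newRewrite()
          .rewriteFiles(replaced, compacted)
          .commit();
    }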

/Filip

On Tue, Jan 28, 2020 at 1:34 AM Miao Wang <miw...@adobe.com.invalid> wrote:

> Hi Ryan,
>
>
>
> Just found your comment in my junk mail box.
>
>
>
> ` It sounds like a tombstone is independent of the table snapshot, like
> blacklisting a row. ` -> It goes into the snapshot. It was first developed
> for an Adobe use case which is not a row-level delete marker. However, it
> is extensible/generic enough to support row-level deletes.
>
>
>
> @Filip Bocse <fbo...@adobe.com> can provide more details on Tombstone
> design and use case.
>
>
>
> Miao
>
>
>
> *From: *Ryan Blue <rb...@netflix.com.INVALID>
> *Reply-To: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>, "
> rb...@netflix.com" <rb...@netflix.com>
> *Date: *Tuesday, January 21, 2020 at 5:40 PM
> *To: *Iceberg Dev List <dev@iceberg.apache.org>
> *Subject: *Re: Iceberg tombstone?
>
>
>
> Hi everyone,
>
>
>
> Thanks for bringing this up, Filip. I've been thinking about it over the
> last few days. One thing that I'm not clear about is how
> exactly tombstones differ from the row-level deletes that we're already
> working on. It sounds like a tombstone is independent of the table
> snapshot, like blacklisting a row. So if I tombstone user = 'rdblue', it
> doesn't just apply to all the data currently in the table; it also deletes
> any future additions of that user. Is that correct?
>
>
>
> If that's right, then I think that could be added fairly easily using the
> existing filter code and our Table interface to wrap existing readers. On
> the other hand, if that's not what you want then I'm not sure how it
> differs from the existing plan to add row-level deletes. Can you clarify?
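>
> A minimal sketch of that wrapping idea (the helper is hypothetical; it
> assumes the tombstone is expressed as an Iceberg filter Expression):
>
>     import org.apache.iceberg.Table;
>     import org.apache.iceberg.TableScan;
>     import org.apache.iceberg.expressions.Expression;
>     import org.apache.iceberg.expressions.Expressions;
>
>     // Plans a scan that skips tombstoned rows by negating the tombstone
>     // predicate and ANDing it into the scan's row filter.
>     static TableScan scanWithoutTombstones(Table table, Expression tombstone) {
>       return table.newScan().filter(Expressions.not(tombstone));
>     }
>
> For example, scanWithoutTombstones(table, Expressions.equal("user",
> "rdblue")) would filter that user out of every read, past and future.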
>
>
>
> rb
>
>
>
> On Wed, Jan 15, 2020 at 11:04 AM Miao Wang <miw...@adobe.com.invalid>
> wrote:
>
> Hi Filip,
>
>
>
> I vote for this feature because it is extensible toward [1].
>
>
>
> Besides OpenInx's comments, can you provide a high-level description of
> how tombstone aligns with milestone [1]?
>
>
>
> *[1]. https://github.com/apache/incubator-iceberg/milestone/4*
>
>
>
> Thanks!
>
>
>
> Miao
>
>
>
> *From: *OpenInx <open...@gmail.com>
> *Reply-To: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>
> *Date: *Tuesday, January 14, 2020 at 6:34 PM
> *To: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>
> *Subject: *Re: Iceberg tombstone?
>
>
>
> Hi Filip
>
>
>
> So your team has implemented the tombstone feature in your internal
> branch. From my understanding, the tombstone you mean is similar to the
> delete marker in HBase, so I think you're trying to implement the
> update/delete feature. For this part, Anton and Miguel have a design doc
> [1]; IMO your work should intersect with it.
>
>
>
> Another question: one rule which we shouldn't break (my personal view)
> is the open file format rule, meaning data is encoded in an open format
> and can be viewed using non-Iceberg tools. Will your implementation
> follow that rule? Do you encode the tombstones in your own format, or
> just put them into a separate file and mark it as a tombstone file?
>
>
>
> For the column predicate filter or row-level filter, I'm not familiar
> with this part. Mind providing more details?  :-)
>
>
>
> Thanks for your information :-)
>
>
>
> [1].
> https://docs.google.com/document/d/1Pk34C3diOfVCRc-sfxfhXZfzvxwum1Odo-6Jj9mwK38/edit#
>
>
>
>
>
> On Tue, Jan 14, 2020 at 10:20 PM Filip <filip....@gmail.com> wrote:
>
> Hi everyone,
>
> I was wondering if there would be interest in starting a thread on a
> proposal for a tombstone feature in Iceberg.
> It could well be less strict than a general-case implementation of
> tombstones [*1*].
> I would like at least a couple of things to be covered in this thread:
>
>    1. [*Opportunity*] I was wondering if others on this list would find
>    such a feature useful and/or would support such a feature being
>    provided by Iceberg
>    2. [*Feasibility*] Also wondering whether there's any intersection
>    between a tombstone feature (i.e. filtering by column predicate) and the
>    upcoming Upsert spec/implementation, or whether the two serve different
>    use-cases and are best kept separate, intersecting only indirectly, I
>    guess, by the sheer implications of accidental complexity :)
>
> The current Iceberg codebase is quite generous wrt the Open/Closed
> principle, and we've been doing some spikes to implement such a feature in
> a new datasource. I thought I'd share some touchpoints of our work so far
> (I'd gladly share more if the community is interested):
> [*extension*] tombstoning, implemented as a column/values predicate,
> should be associated with some specific metadata (snapshot id, version?)
> and basic metrics (i.e. count, basic histograms) - mostly thinking that
> any tombstone operation is accompanied by a compaction task, so metadata
> and metrics would help with building generic solutions for optimally
> scheduling these maintenance tasks - tombstones could be modeled/
> programmed against the org.apache.iceberg.Table interface
> [*atomic guarantee*] a simple solution is to make the tombstone
> operation atomic by assigning a new snapshot summary property pointing to
> a file reference of an immutable file holding the tombstone
> predicates/expressions (see the sketch after this list)
> [*new API*] append tombstones
> [*new API*] remove tombstones
> [*new API*] append files and add tombstones
> [*new API*] append files and remove tombstones
> [*new API*] vacuum tombstones - a task to clean up tombstoned rows and
> evict the associated tombstone metadata as well; and maybe not `vacuum`
> (I remember reading about this on this list in a different context and
> it's probably reserved for a different Iceberg feature, right?) - a rough
> interface covering these operations is sketched after this list
> [*extend*] extend the Spark reader/writer to account for tombstone-based
> filtering and tombstone committing respectively - writing may prove
> easier to implement than reading; reading comes in many flavours, so
> applying a filter expression may not be as accessible at the moment as
> one would prefer when extending on top of Iceberg [*2*]
> [*extend*] removing snapshots would also drop their associated tombstone
> files (where available)
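>
> A rough sketch of the extension API described in the items above (the
> interface and method names are hypothetical, not existing Iceberg API):
>
>     import org.apache.iceberg.expressions.Expression;
>     import java.util.List;
>     import java.util.Map;
>
>     // Hypothetical companion to org.apache.iceberg.Table; each mutation
>     // would commit atomically like other table operations.
>     interface TombstoneOperations {
>       // register tombstone predicates plus metadata (snapshot id,
>       // version) and basic metrics (counts, histograms)
>       void appendTombstones(List<Expression> predicates, Map<String, String> metadata);
>
>       // drop tombstone predicates that are no longer needed
>       void removeTombstones(List<Expression> predicates);
>
>       // rewrite data files to physically remove tombstoned rows and
>       // evict the now-redundant tombstone metadata
>       void vacuumTombstones();
>     }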
>
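> For the atomic-guarantee item, a minimal sketch of committing appends
> and a tombstone file reference together (the property key and file
> layout are assumptions; SnapshotUpdate.set() is existing Iceberg API):
>
>     import org.apache.iceberg.AppendFiles;
>     import org.apache.iceberg.DataFile;
>     import org.apache.iceberg.Table;
>
>     // "append files and add tombstones" in one atomic commit: write the
>     // tombstone predicates to an immutable file first, then record its
>     // location as a snapshot summary property on the same append.
>     static void appendWithTombstones(Table table, DataFile file, String tombstoneFileRef) {
>       AppendFiles append = table.newAppend();
>       append.appendFile(file);
>       append.set("tombstone.file-ref", tombstoneFileRef); // hypothetical key
>       append.commit();
>     }
>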
> [*1*] I believe that in a general tombstone implementation the filtering
> is applied only to data that was added prior to the tombstone setting/
> assignment operation. That might prove quite difficult to implement with
> Iceberg, and it could be considered a specialization of the more generic
> and basic use-case of adding tombstone support as filtering by column
> values regardless of the order of operations (data vs tombstone).
> [*2*] We could benefit from adding an extension point/hook into
> row-level filtering that we could leverage to translate tombstone options
> into row-level filter predicates in
> https://github.com/apache/incubator-iceberg/blob/6048e5a794242cb83871e3838d0d40aa71e36a91/spark/src/main/java/org/apache/iceberg/spark/source/Reader.java#L438-L439
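>
> For illustration, the translation itself could be as simple as folding
> the tombstone predicates into one residual Expression (a sketch of the
> idea, not the actual hook):
>
>     import org.apache.iceberg.expressions.Expression;
>     import org.apache.iceberg.expressions.Expressions;
>     import java.util.List;
>
>     // Folds tombstone predicates into a single residual expression that
>     // a reader could AND into its row-level filter.
>     static Expression tombstoneResidual(List<Expression> tombstones) {
>       Expression keep = Expressions.alwaysTrue();
>       for (Expression t : tombstones) {
>         keep = Expressions.and(keep, Expressions.not(t));
>       }
>       return keep;
>     }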
>
>
>
> /Filip
>
>
>
>
> --
>
> Ryan Blue
>
> Software Engineer
>
> Netflix
>


-- 
Filip Bocse
