Re: Row-level delete sync notes - July 2019

Ryan Blue Wed, 07 Aug 2019 09:41:54 -0700

Thanks for the update Erik!

In addition, Anton has opened PR #351
<https://github.com/apache/incubator-iceberg/pull/351> to update the API so
that we can implement eager row-level overwrites. I think that's the only
part that needs to be done for the eager overwrite case because the rest of
the overwrite would be handled by query engines.


I've also opened a few issues and a Row-level Delete Milestone
<https://github.com/apache/incubator-iceberg/milestone/4> to track the
features. Feel free to open more issues or pull requests for what you're
working on and link them to that milestone.

On Wed, Aug 7, 2019 at 7:48 AM Erik Wright <erik.wri...@shopify.com> wrote:

> Hi Folks,
>
> I've been on holiday (and will be again next week) but I've started taking
> some steps internally to dedicate some engineering time to this project.
> Around the last week of August I expect to be able to dedicate some
> meaningful time each week to this.
>
> On Thu, Jul 18, 2019 at 3:47 PM Ryan Blue <rb...@netflix.com.invalid>
> wrote:
>
>> Hi everyone, sorry it took a while for me to get these notes sent out.
>> Please reply with discussion or corrections.
>>
>> *Attendees*:
>>
>> Ryan Blue
>> Anjali Norwood
>> Jacques Nadeau
>> Anton Okolnychyi
>> David Muto
>> Erik Wright
>> Owen O’Malley
>>
>> *Topics*:
>>
>>    - Quick summary of the current approach with sequence numbers
>>    - Should global delete be supported?
>>    - What is the scope of deletes within a snapshot?
>>    - Should synthetic delete files use sequence numbers?
>>    - What should be used as a record identifier? Does offset work?
>>    - What is the format of a delete diff?
>>    - What is the scope of a delete diff in a table?
>>    - How will per-file delete diffs work?
>>    - Next steps
>>
>> *Discussion*:
>>
>>    - Quick summary: we agree that Iceberg will add sequence numbers to
>>    metadata to scope deletes across snapshots (time). Deletes will apply to
>>    all files with an earlier sequence number. Iceberg will use two
>>    formats, a synthetic delete diff using file/offset and an equality delete
>>    diff using a set of values to match to row data.
>>    - Global deletes
>>       - Ryan: this is for GDPR. Data layout is for query performance,
>>       not delete performance. Deleting a record that could be anywhere
>>       should be possible without eagerly scanning all data files in tables
>>       that are tens of petabytes
>>       - Owen: customers have this use case as well, it should be
>>       supported
>>       - Erik (I think): these are slow to apply because they probably
>>       are not sorted
>>       - Jacques: hash-set deletes are not a format constraint, it is an
>>       engine constraint
>>       - Ryan: we can’t always depend on sorting. That is an
>>       optimization, but engines may need to use a hash-set
>>       - Owen (I think): table maintenance and delete compaction are
>>       important to keep merge costs low
>>    - Scope of deletes within a snapshot:
>>       - Erik suggested using all the same metadata as data files
>>       - Partition data will be used to scope deletes within a snapshot.
>>    - Sequence numbers and synthetic delete files:
>>       - Anton: will sequence numbers be used?
>>       - Ryan: Yes. More ways to eliminate delete diffs that do not need
>>       to be applied is good for performance. Simpler to always apply the same
>>       rules, too.
>>       - Also, reusing files (or un-deleting files) could be a
>>       correctness problem.
>>    - Synthetic deletes and offsets: What should be used as a record
>>    identifier?
>>       - The confusion was that “offset” could be interpreted as byte
>>       offset in a file. The intent was row position. Will use “position”
>>       from now on.
>>    - What is the format of a delete diff?
>>       - Equality deletes: the data columns to match. For example, a, b
>>       for a = ? and b = ? with ? filled in by row data
>>       - Positional deletes: file name and row position (sparse format,
>>       multiple data files covered by a single delete file)
>>       - Jacques (I think): How would a positional delete file apply to
>>       just one data file?
>>       - Erik: Delete files should also have column lower/upper bounds
>>       for the deleted rows. This can be copied and merged for data files to
>>       also use stats to eliminate deletes that do not need to be applied
>>       - Erik: The latest write-up uses all existing data file metadata
>>       fields, unchanged
>>       - Ryan: that’s a clever idea and would help performance
>>       - It wouldn’t be possible to add a filename field to lower/upper
>>       bounds, so this wouldn’t work for scoping to a single file
>>       - Should a single file name, list of file names, or bloom filter
>>       be added to identify data files? Maybe as a future optimization
>>    - How will scope work for global deletes?
>>       - Ryan: use a manifest with a different partition spec. The
>>       unpartitioned spec for global delete files.
>>       - Erik: that could be used to apply deletes to other levels as
>>       well. For example, a table partitioned with bucketing could encode
>>       deletes with a partition spec that doesn’t include the bucketing level
>>       to apply to all buckets.
>>       - Ryan: that is difficult because Iceberg would need to decide
>>       that one partition contains another to apply diffs across partitions.
>>       This complicates scope, so maybe we should only allow global and
>>       partition-level for now.
>>    - Next steps:
>>       - Add preconditions for table format version (done!)
>>       - Add sequence numbers to metadata and update the spec (compatible
>>       with v1)
>>       - Define and document the delete diff formats (Anton and Erik)
>>       - Update readers to apply deletes
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>
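
The scoping rule from the quoted notes (a delete applies only to data files
with an earlier sequence number, and is either global or limited to one
partition) can be sketched roughly as follows. This is an illustration only,
not Iceberg code: DataFile, DeleteFile, applies, and their fields are
hypothetical names, and a partition of None is assumed here to stand for a
delete file written with the unpartitioned spec.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataFile:
    # Hypothetical stand-in for a data file entry in table metadata.
    path: str
    sequence_number: int
    partition: Tuple


@dataclass(frozen=True)
class DeleteFile:
    # Hypothetical stand-in for a delete diff entry.
    # partition=None is assumed to mean a global (unpartitioned) delete.
    path: str
    sequence_number: int
    partition: Optional[Tuple]


def applies(delete: DeleteFile, data: DataFile) -> bool:
    """Return True if the delete diff must be merged when reading the data file."""
    # Scope in time: only data files with an earlier sequence number are affected.
    if data.sequence_number >= delete.sequence_number:
        return False
    # Scope within a snapshot: global deletes match everything,
    # partition-level deletes match only the same partition.
    return delete.partition is None or delete.partition == data.partition
```

Only the global and partition-level scopes are modeled, matching the
suggestion in the notes to allow just those two for now.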

-- 
Ryan Blue
Software Engineer
Netflix
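
For readers following the two delete diff formats described in the notes, here
is a minimal sketch of how a reader might apply them to one data file. It
assumes rows are plain dicts and that both kinds of deletes have been loaded
into in-memory hash sets (the engine-side approach mentioned in the
discussion). The function name apply_deletes and its parameters are
hypothetical, not part of any Iceberg API.

```python
def apply_deletes(file_path, rows, positional, equality_columns, equality_values):
    """Return the rows of one data file that survive both kinds of deletes.

    file_path:        name of the data file being read
    rows:             list of dicts, in file order
    positional:       set of (file name, row position) pairs
    equality_columns: column names to match, e.g. ("a", "b")
    equality_values:  set of value tuples, one per deleted key
    """
    kept = []
    for position, row in enumerate(rows):
        # Positional delete: matches on file name and row position.
        if (file_path, position) in positional:
            continue
        # Equality delete: matches on the values of the chosen columns.
        key = tuple(row[column] for column in equality_columns)
        if key in equality_values:
            continue
        kept.append(row)
    return kept


rows = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}, {"a": 3, "b": "z"}]
survivors = apply_deletes(
    "f1.parquet",
    rows,
    positional={("f1.parquet", 0)},
    equality_columns=("a", "b"),
    equality_values={(3, "z")},
)
```

In this example the first row is removed by position and the last by equality
on (a, b), so only the middle row remains.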
