I think the summary looks mostly correct: we can already do "low latency"
appends, and deletes by primary key or by any predicate that resolves to
finitely many equality constraints, without the need to scan all rows.
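
To make the equality-constraint delete path concrete, here is a toy
merge-on-read model in plain Python (this is not the Iceberg API; all names
below are illustrative only): a delete appends a tiny equality-delete
record instead of rewriting data files, and the scan filters matching keys
out at read time.

```python
# Toy sketch of equality-delete semantics: delete by primary key without
# scanning or rewriting data files. Names are illustrative, not Iceberg API.

data_files = [
    [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}],   # data file 1
    [{"id": 3, "v": "c"}, {"id": 4, "v": "d"}],   # data file 2
]

# A "low latency" delete just appends a small equality-delete entry:
# delete every row whose primary key `id` equals 3.
eq_deletes = [{"id": 3}]

def scan(data_files, eq_deletes):
    """Merge-on-read: apply equality deletes at scan time."""
    deleted = {tuple(sorted(d.items())) for d in eq_deletes}
    for data_file in data_files:
        for row in data_file:
            key = tuple(sorted({"id": row["id"]}.items()))
            if key not in deleted:
                yield row

print([r["id"] for r in scan(data_files, eq_deletes)])  # [1, 2, 4]
```

The point is that the delete itself touches no data file; the cost moves to
read time, which is what keeps the write latency low.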

A secondary index would be useful for scans, which in turn are used to
generate delete files for complex delete predicates. This is still in
design; we should probably push progress in this area a bit more, since I
haven't heard any updates in a few months:
- https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ
- https://docs.google.com/document/d/11o3T7XQVITY_5F9Vbri9lF9oJjDZKjHIso7K8tEaFfY
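
As a rough illustration of the secondary-index idea (a hypothetical sketch,
not anything from the design docs above): an inverted index from column
value to (file, position) lets a complex delete predicate be turned
directly into position delete files, avoiding a full table scan.

```python
# Hypothetical sketch: a secondary index mapping a column value to
# (file, position) pairs, used to generate position deletes.
from collections import defaultdict

rows = [  # (file, position, category)
    ("f1.parquet", 0, "x"), ("f1.parquet", 1, "y"),
    ("f2.parquet", 0, "y"), ("f2.parquet", 1, "z"),
]

# Build a secondary index on `category`.
index = defaultdict(list)
for data_file, pos, category in rows:
    index[category].append((data_file, pos))

# "DELETE WHERE category = 'y'" becomes an index lookup that yields the
# (file, position) pairs to write into a position delete file.
position_deletes = index["y"]
print(position_deletes)  # [('f1.parquet', 1), ('f2.parquet', 0)]
```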

For metadata, I agree with Yufei that a change set approach sounds like
quite a lot of extra complexity just to make the commit phase a little bit
faster. I understand that your single commit time is ultimately bounded by
the time to rewrite that file, but this almost sounds like doing
merge-on-read even for metadata, which would be cool but likely overkill
for most users. In my mind, the metadata layer should be managed by the
admin on an ongoing basis, and it is much simpler to periodically optimize
the number of manifests to control the size of the manifest list file
rewritten during each commit, which would also benefit scan planning
performance. I am curious about your "low latency" requirement: do you
have any actual numbers you need to hit, and if so, could you share them
here?
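
To show what I mean by periodically optimizing the number of manifests,
here is a toy sketch (the threshold and names are made up for illustration;
this is not how Iceberg's manifest rewrite is implemented): merging many
small manifests keeps the manifest list short, so the file rewritten on
each commit stays small.

```python
# Toy sketch of manifest compaction: greedily merge small manifests up to
# a target size so the manifest list stays short. Threshold is illustrative.
TARGET_MANIFEST_SIZE = 4  # max data files per merged manifest

def compact(manifests):
    """Greedily merge adjacent manifests up to the target size."""
    merged, current = [], []
    for m in manifests:
        if len(current) + len(m) <= TARGET_MANIFEST_SIZE:
            current.extend(m)
        else:
            merged.append(current)
            current = list(m)
    if current:
        merged.append(current)
    return merged

# Ten tiny manifests (one data file each) collapse into three.
manifests = [[f"data-{i}.parquet"] for i in range(10)]
print(len(compact(manifests)))  # 3
```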

-Jack Ye

On Tue, Jun 22, 2021 at 5:20 PM Yufei Gu <[email protected]> wrote:

> For 3, it may not be worth adding extra complexity by introducing a
> "change set", unless we get solid data that shows writing a "change set" is
> faster than a complete rewrite.
>
> Best,
>
> Yufei
>
> `This is not a contribution`
>
>
> On Tue, Jun 22, 2021 at 12:43 PM Sreeram Garlapati <
> [email protected]> wrote:
>
>> Hello Iceberg devs!
>>
>> Did any of you solve "low latency writes to Iceberg"? Overall, it boiled
>> down to 2 questions:
>> 1. Is there a way to add indexes to an Iceberg table to support
>> equality-based filters (please see point #2 below for more explanation)?
>> 2. Is there a workstream to support writing deltas of metadata changes
>> (please see point #3 below)?
>>
>> Best,
>> Sreeram
>>
>> https://github.com/apache/iceberg/issues/2723
>>
>> *Truly appreciate any inputs.*
>> Supporting low latency writes to an Iceberg table entails the below
>> sub-problems:
>>
>>    1. Optimizing the data payload: optimizing the data payload to be
>>    written to the table.
>>       - There is an optimization that is specific to the write pattern:
>>          1. Appends: in the case of appends, this is already solved, as
>>          Iceberg always writes new inserts as new files.
>>          2. Deletes (/Upserts): in the case of deletes (or upserts, which
>>          are broken down into insert + delete in 2.0), this problem is
>>          solved as well.
>>       - File format: there is another optimization knob at the file
>>       format level. It might not make sense to generate data in a
>>       columnar format here; all the time and space spent on encoding,
>>       storing stats, etc. (assuming the writes are a small number of
>>       rows, e.g. < 5) can be saved. So, thankfully, for these low-latency
>>       writes to an Iceberg table, the AVRO file format can be used.
>>    2. Locating the records that need to be updated with low latency: in
>>    the case of upserts, locating the records that need to be updated is
>>    the key problem to be solved.
>>       - One popular solution is to maintain indexes to support the
>>       equality filters used for upserts. Do you know if there is any
>>       ongoing effort for this?
>>    3. Optimizing the metadata payload: for every write to an Iceberg
>>    table, the schema file & manifest list file are rewritten. To further
>>    reduce the payload, we could potentially write the "change set" here.
>>    Is this the current direction of thought? *If so, pointers to any
>>    work stream in this regard are truly appreciated.*
>>
>>
