It looks like some people did not see the email thread responses. I have copied all conversations to the linked GitHub issue; let's continue the discussion there. -Jack Ye
On Wed, Jun 23, 2021 at 12:15 AM Jack Ye <[email protected]> wrote:

> I think the summary looks mostly correct: we can already do "low latency"
> append, and delete by primary key or by any predicate that resolves to
> finitely many equality constraints, without the need to scan all rows.
>
> A secondary index would be useful for scans, which are subsequently used
> to generate delete files for complex delete predicates. This is still in
> design; we should probably push progress in this domain a bit more,
> because I haven't heard any update since a few months ago:
> - https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ
> - https://docs.google.com/document/d/11o3T7XQVITY_5F9Vbri9lF9oJjDZKjHIso7K8tEaFfY
>
> For metadata, I agree with Yufei that a change-set approach sounds like a
> lot of extra complexity just to make the commit phase a little faster. I
> understand that your single-commit time is ultimately bounded by the time
> it takes to rewrite that file, but this almost sounds like merge-on-read
> for metadata, which would be cool but is likely overkill for most users.
> In my mind, the metadata layer should be managed continually by the
> admin, and it is much simpler to periodically optimize the number of
> manifests to control the size of the manifest list file written during
> each commit, which also benefits scan planning performance. I am curious
> about your "low latency" requirement: do you have any actual numbers you
> need to hit, and could you share them here?
>
> -Jack Ye
>
> On Tue, Jun 22, 2021 at 5:20 PM Yufei Gu <[email protected]> wrote:
>
>> For 3, it may not be worth adding extra complexity by introducing a
>> "change set" unless we get solid data showing that writing a "change
>> set" is faster than a complete rewrite.
>>
>> Best,
>>
>> Yufei
>>
>> `This is not a contribution`
>>
>> On Tue, Jun 22, 2021 at 12:43 PM Sreeram Garlapati <
>> [email protected]> wrote:
>>
>>> Hello Iceberg devs!
>>>
>>> Did any of you solve "low latency writes to Iceberg"? Overall, it
>>> boils down to two questions:
>>> 1. Is there a way to add indexes to an Iceberg table to support
>>>    equality-based filters (please see point #2 below for more
>>>    explanation)?
>>> 2. Is there a workstream to support writing deltas of metadata
>>>    changes (please see point #3 below)?
>>>
>>> Best,
>>> Sreeram
>>>
>>> https://github.com/apache/iceberg/issues/2723
>>>
>>> *Truly appreciate any inputs.*
>>> Supporting low-latency writes to an Iceberg table entails the
>>> following sub-problems:
>>>
>>> 1. Optimizing the data payload written to the table.
>>>    - There is an optimization specific to the write pattern:
>>>      1. Appends: already solved, since Iceberg always writes new
>>>         inserts as new files.
>>>      2. Deletes (/upserts): in the case of deletes (or upserts, which
>>>         are broken down into insert + delete in format v2), this
>>>         problem is solved as well.
>>>    - File format: there is another optimization knob at the file
>>>      format level. It might not make sense to generate data in a
>>>      columnar format here; all the time and space spent on encoding,
>>>      storing stats, etc. (assuming each write is a small number of
>>>      rows, e.g. < 5) can be saved. So, thankfully, the AVRO file
>>>      format can be used for these low-latency writes.
>>> 2. Locating the records that need to be updated with low latency: in
>>>    the case of upserts, locating the records that need to be updated
>>>    is the key problem to be solved.
>>>    - One popular solution is to maintain indexes that support the
>>>      equality filters used for upserts. Do you know of any ongoing
>>>      effort for this?
>>> 3. Optimizing the metadata payload: for every write to an Iceberg
>>>    table, the schema file & manifest list file are rewritten. To push
>>>    the payload down further, we could potentially write a "change
>>>    set" here. Is this the current direction of thought? *If so,
>>>    pointers to any work stream in this regard are truly appreciated.*
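[Editor's note: the equality-delete mechanism discussed in the thread above (points 1-2, and Jack's reply) can be sketched as a toy model. This is plain Python illustrating the merge-on-read concept only; the structures and `scan` function are illustrative and are not the real Iceberg API.]

```python
# Toy model of Iceberg-style merge-on-read equality deletes (format v2).
# Data files are immutable; a delete is recorded as an equality predicate
# (here: on a primary key), and readers filter matching rows at scan time,
# so no existing data file needs to be rewritten on the write path.
# All names here are illustrative, not the real Iceberg API.

data_file_1 = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
data_file_2 = [{"id": 3, "v": "c"}]

# An "equality delete file": rows whose key matches are considered deleted.
equality_deletes = [{"id": 2}]

def scan(data_files, deletes, key="id"):
    """Merge-on-read: stream live rows, skipping any row whose key
    appears in an equality delete file."""
    deleted_keys = {d[key] for d in deletes}
    for data_file in data_files:
        for row in data_file:
            if row[key] not in deleted_keys:
                yield row

# An upsert of id=2 is written as delete(id=2) + append(new row);
# the original data files stay untouched.
live = list(scan([data_file_1, data_file_2], equality_deletes))
print([r["id"] for r in live])  # [1, 3]
```

This also shows why point 2 (an index) matters: the write path above never scans existing rows, but without an index, finding which rows an arbitrary predicate touches still requires a scan.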

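[Editor's note: the point-3 trade-off debated above (full manifest-list rewrite per commit vs. appending a "change set") can be made concrete with a toy model. This is plain Python with hypothetical structures, not Iceberg's metadata format; it only illustrates where the cost moves.]

```python
# Toy model of the metadata trade-off. Today, each commit rewrites the
# manifest list; a hypothetical "change set" would persist only the delta,
# shifting the merge work to readers (merge-on-read for metadata).

def commit_full_rewrite(manifest_list, new_manifest):
    # Write cost ~ O(len(manifest_list)): the whole list is rewritten,
    # but readers get a single, ready-to-use file.
    return manifest_list + [new_manifest]

def commit_change_set(change_sets, new_manifest):
    # Write cost ~ O(1): only a small delta file is persisted.
    return change_sets + [[new_manifest]]

def read_with_change_sets(base, change_sets):
    # Readers now pay the merge cost on every scan instead.
    merged = list(base)
    for delta in change_sets:
        merged.extend(delta)
    return merged

base = ["m1", "m2"]
full = commit_full_rewrite(base, "m3")               # rewritten in full
deltas = commit_change_set([], "m3")                 # appended as a delta
assert full == read_with_change_sets(base, deltas)   # same logical view
print(full)  # ['m1', 'm2', 'm3']
```

Both paths yield the same logical manifest list; the dispute in the thread is whether the O(1) write is worth the added reader-side complexity, versus simply keeping the manifest list small through periodic compaction.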