+1, great doc. Thank you.

On Thu, Jan 22, 2026 at 8:53 PM Ahmed Abualsaud via dev <[email protected]>
wrote:

> Hey everyone,
>
> I’d like to share a design proposal for a new Iceberg Incremental CDC
> Source in Apache Beam:
>
> *Design Doc*:
> https://docs.google.com/document/d/1_W6nDpiHKCk2oKrs-ICBn5IEeYm1AXhE0ItaK2-rrng
> *Draft PR*: https://github.com/apache/beam/pull/37191
>
> Currently, Beam’s IcebergIO supports streaming reads for append-only
> snapshots. This proposal introduces a native streaming source capable of
> processing full CDC events (inserts, updates, and deletes) using Iceberg’s
> IncrementalChangelogScan API.
>
> The doc starts with an intro to Iceberg CDC, then covers some
> performance optimizations, specifically using snapshot, partition, and file
> metadata to bypass expensive shuffles when possible.
>
> Would appreciate any feedback or thoughts on this approach!
>
> Thanks,
> Ahmed Abualsaud
>
