Thanks for sharing the doc! I left some comments, but I think the general
design looks great and I'm excited to see this new source in Beam.

Best,
Ahmed

On Tue, May 12, 2026 at 2:28 PM Chamikara Jayalath via dev <
[email protected]> wrote:

> Hi all,
>
> I'm looking into implementing a Delta Lake [1] source for Apache Beam.
>
> Some of the highlights are listed below.
>
> * Add support for reading data from an existing Delta Lake table (at HEAD,
> which could be past the latest checkpoint).
> * Support reading from a specific checkpoint (latest or past).
> * Use the new Delta Kernel API to implement the source.
> * Support parallelized reading via initial splitting and/or dynamic work
> rebalancing.
> * Support for Beam managed I/O - this will automatically make the
> connector available to Python SDK and will also allow runners to manage the
> version of the connector.
>
> A design doc is available here:
> https://s.apache.org/beam-delta-lake-source
>
> Please let me know if you have any comments/questions.
>
> Thanks,
> Cham
>
> [1] https://delta.io/
>
