Thanks for the comments. An initial prototype is available here:
https://github.com/apache/beam/pull/38560

Please let me know if there are any additional comments on the doc or the
prototype.

Please see here for the GitHub issue and the task breakdown:
https://github.com/apache/beam/issues/21100

Thanks,
Cham

On Wed, May 13, 2026 at 4:55 PM Ahmed Abualsaud <[email protected]>
wrote:

> Thanks for sharing the doc! Left some comments but I think the general
> design looks great and I'm excited to see this new source in Beam.
>
> Best,
> Ahmed
>
> On Tue, May 12, 2026 at 2:28 PM Chamikara Jayalath via dev <
> [email protected]> wrote:
>
>> Hi all,
>>
>> I'm looking into implementing a Delta Lake [1] source for Apache Beam.
>>
>> Some of the highlights are listed below.
>>
>> * Add support for reading data from an existing Delta Lake table (at HEAD,
>> which could be past the latest checkpoint).
>> * Support reading from a specific checkpoint (latest or past).
>> * Use the new Delta Kernel API to implement the source.
>> * Support parallelized reading via initial splitting and/or dynamic work
>> rebalancing.
>> * Support for Beam managed I/O - this will automatically make the
>> connector available to Python SDK and will also allow runners to manage the
>> version of the connector.
>>
>> A design doc is available here:
>> https://s.apache.org/beam-delta-lake-source
>>
>> Please let me know if you have any comments/questions.
>>
>> Thanks,
>> Cham
>>
>> [1] https://delta.io/
>>
>
