Thanks for the comments. An initial prototype is available here: https://github.com/apache/beam/pull/38560
Please let me know if there are any additional comments on the doc or the prototype.

Please see here for the GitHub issue and the task breakdown: https://github.com/apache/beam/issues/21100

Thanks,
Cham

On Wed, May 13, 2026 at 4:55 PM Ahmed Abualsaud <[email protected]> wrote:

> Thanks for sharing the doc! Left some comments but I think the general
> design looks great and I'm excited to see this new source in Beam.
>
> Best,
> Ahmed
>
> On Tue, May 12, 2026 at 2:28 PM Chamikara Jayalath via dev <
> [email protected]> wrote:
>
>> Hi all,
>>
>> I'm looking into implementing a Delta Lake [1] source for Apache Beam.
>>
>> Some of the highlights are listed below.
>>
>> * Add support for reading data from an existing Delta Lake table (at HEAD,
>> which could be past the latest checkpoint).
>> * Support reading from a specific checkpoint (latest or past).
>> * Use the new Delta Kernel API to implement the source.
>> * Support parallelized reading via initial splitting and/or dynamic work
>> rebalancing.
>> * Support for Beam managed I/O - this will automatically make the
>> connector available to Python SDK and will also allow runners to manage the
>> version of the connector.
>>
>> A design doc is available here:
>> https://s.apache.org/beam-delta-lake-source
>>
>> Please let me know if you have any comments/questions.
>>
>> Thanks,
>> Cham
>>
>> [1] https://delta.io/
