Hey Jan, thanks for calling that out. Ideally, I would have liked to allow more time for extensive engagement, as we do with other proposals.
The reason for the accelerated timeline on this PR is to pave the way for a "Getting Started" section for Beam on the Iceberg website, ideally before their 1.10.0 release cut on June 23rd. Integrating IcebergIO with Beam SQL is crucial for reaching a wider audience, and getting this piece in before the Beam release cut (today/tomorrow) would allow us to showcase SQL support sooner.

As Talat mentioned, the underlying Java implementation can be iterated on, which means our immediate priority for consensus is the SQL syntax itself.

I've put together an "after-the-fact" document to provide more context, including a scouting report on how other frameworks handle catalog management, and the proposed Beam SQL syntax. I hope this helps kickstart the discussion.

https://docs.google.com/document/d/16P0JrcJ28KSoMMpLYExWPZaala7CE4Ezen-jC_ly3M4/edit?tab=t.0

Best,
Ahmed

On Wed, Jun 11, 2025 at 3:18 PM Talat Uyarer <ta...@apache.org> wrote:

> Hi Ahmed,
>
> Thank you so much for this change. I have been waiting for these recent
> SQL changes for a while.
>
> Going forward, I agree with Jan about having a design doc to outline these
> changes. The underlying Java implementation is largely hidden from users,
> so that can be changed in the future, but as a community we should agree on
> the proposed SQL syntax.
>
> Jan, as a Beam user and a small contributor, I've also been waiting
> for this feature. And if you don't mind, can we get Ahmed's changes in this
> version?
>
> Thanks
>
> On 2025/06/11 18:42:40 Jan Lukavský wrote:
> > Hi Ahmed,
> >
> > this is a great effort which is no doubt greatly needed by the Beam
> > project as a whole. On the other hand I think we should try to establish
> > a way to pull the community into the discussion process. Could you sum
> > up the PR (not small) into a design document where we can have a
> > discussion about the goals, alternative solutions, already tried ways,
> > etc? This would be really cool!
> >
> > Best,
> >
> > Jan
> >
> > On 6/10/25 16:12, Ahmed Abualsaud via dev wrote:
> > > Hey all,
> > >
> > > I was integrating our Java IcebergIO with Beam SQL (PR #34799
> > > <https://github.com/apache/beam/pull/34799>) and got blocked on the
> > > fact that Beam SQL currently lacks a "Catalog" concept. This is
> > > fundamental to modern data architectures like Iceberg, where catalogs
> > > are used to manage table metadata and enable broad ecosystem integration.
> > > To address this gap, I've opened a new PR (#35223
> > > <https://github.com/apache/beam/pull/35223>), which introduces the
> > > *Catalog* and *CatalogManager* interfaces, enabling support for:
> > >
> > >   * CREATE CATALOG my_catalog TYPE 'local' PROPERTIES (...)
> > >   * SET CATALOG my_catalog
> > >   * DROP CATALOG my_catalog
> > >
> > > I left a more detailed overview in the PR description.
> > >
> > > My hope is that this foundational change will benefit not just
> > > IcebergIO, but also other IOs and future Beam SQL integrations.
> > >
> > > Please take a look and share any feedback, especially regarding major
> > > architectural concerns. I'm working on a short timeline, so minor
> > > enhancements can be noted for follow-up PRs.
> > >
> > > Thank you!
> > > Ahmed