Hi Ahmed,

thanks for such a quick write-up. This is a pretty good start! I left some comments, but if we are under time pressure, I think we can release "something" as long as we clearly mark it as experimental (or, better, unstable), so that users know its current state.

WDYT?

 Jan

On 6/12/25 02:38, Ahmed Abualsaud via dev wrote:
Hey Jan, thanks for calling that out. Ideally I would have liked to give more time for extensive engagement like we do with other proposals.

The reason for the accelerated timeline on this PR is to pave the way for a "Getting Started" section for Beam on the Iceberg website, ideally before their 1.10.0 release cut on June 23rd. Integrating IcebergIO with Beam SQL is crucial for reaching a wider audience, and getting this piece in before the Beam release cut (today/tomorrow) would allow us to showcase SQL support sooner.

As Talat mentioned, the underlying Java implementation can be iterated on, which means our immediate priority for consensus is the SQL syntax itself.

I've put together an "after-the-fact" document to provide more context, including a scouting report on how other frameworks handle catalog management, and the proposed Beam SQL syntax. I hope this helps kickstart the discussion.

https://docs.google.com/document/d/16P0JrcJ28KSoMMpLYExWPZaala7CE4Ezen-jC_ly3M4/edit?tab=t.0

Best,
Ahmed


On Wed, Jun 11, 2025 at 3:18 PM Talat Uyarer <ta...@apache.org> wrote:

    Hi Ahmed,

    Thank you so much for this change. I have been waiting for these
    recent SQL changes for a while.

    Going forward, I agree with Jan about having a design doc to
    outline these changes. The underlying Java implementation is
    largely hidden from users, so that can be changed in the future,
    but as a community we should agree on the proposed SQL syntax.

    Jan, as a Beam user and a small contributor, I've also been
    waiting for this feature. If you don't mind, can we get
    Ahmed's changes into this version?

    Thanks

    On 2025/06/11 18:42:40 Jan Lukavský wrote:
    > Hi Ahmed,
    >
    > this is a great effort which is without doubt greatly needed by the
    > Beam project as a whole. On the other hand, I think we should try to
    > establish a way to pull the community into the discussion process.
    > Could you sum up the PR (not small) into a design document where we
    > can have a discussion about the goals, alternative solutions, already
    > tried ways, etc? This would be really cool!
    >
    > Best,
    >
    >   Jan
    >
    > On 6/10/25 16:12, Ahmed Abualsaud via dev wrote:
    > > Hey all,
    > >
    > > I was integrating our Java IcebergIO with Beam SQL (PR #34799
    > > <https://github.com/apache/beam/pull/34799>) and got blocked on the
    > > fact that Beam SQL currently lacks a "Catalog" concept. Catalogs are
    > > fundamental to modern data architectures like Iceberg, where they are
    > > used to manage table metadata and enable broad ecosystem integration.
    > > To address this gap, I've opened a new PR (#35223
    > > <https://github.com/apache/beam/pull/35223>), which introduces the
    > > *Catalog* and *CatalogManager* interfaces, enabling support for:
    > >
    > >  * |CREATE CATALOG my_catalog TYPE 'local' PROPERTIES (...)|
    > >  * |SET CATALOG my_catalog|
    > >  * |DROP CATALOG my_catalog|
    > >
    > > I left a more detailed overview in the PR description.
    > >
    > > My hope is that this foundational change will benefit not just
    > > IcebergIO, but also other IOs and future Beam SQL integrations.
    > >
    > > Please take a look and share any feedback, especially regarding
    > > major architectural concerns. I'm working on a short timeline, so
    > > minor enhancements can be noted for follow-up PRs.
    > >
    > > Thank you!
    > > Ahmed
