Regarding the doc and CATALOG syntax: I like it. Added a couple comments.

Kenn

On Mon, Jun 16, 2025 at 10:39 AM Kenneth Knowles <k...@apache.org> wrote:

> Aside about the annotations: we used to have @Experimental annotations but
> users didn't actually notice and do things differently, mostly, and we
> always forgot to "graduate" things to stable, so we removed them. I agree
> with the intention though - it would be nice to let people know that this
> is new and it may change. My impression is that we've moved into a phase of
> the project's life where we go ahead and make changes as one-off judgement
> calls rather than by annotation-driven policy.
>
> Kenn
>
> On Mon, Jun 16, 2025 at 7:36 AM Jan Lukavský <je...@seznam.cz> wrote:
>
>> Hi Ahmed,
>>
>> yes, essentially. Better would be something like @Unstable, but we don't
>> have such annotation, currently. So the @Internal one must do.
>>
>> Jan
>>
>> On 6/13/25 20:58, Ahmed Abualsaud via dev wrote:
>>
>> Yeah makes sense, I managed to annotate the new Catalog and
>> CatalogManager interfaces with "@Internal" before the release was cut. Is
>> that what y'all had in mind?
>>
>> Best,
>> Ahmed
>>
>> On Fri, Jun 13, 2025 at 8:59 AM Jan Lukavský <je...@seznam.cz> wrote:
>>
>>> Hi Ahmed,
>>>
>>> thanks for such a quick write-up. This is a pretty good start! I left
>>> some comments, but if we have time pressure, I think we can release
>>> "something", but clearly mark it as experimental (or better unstable), so
>>> that users know what is the current state.
>>>
>>> WDYT?
>>>
>>> Jan
>>>
>>> On 6/12/25 02:38, Ahmed Abualsaud via dev wrote:
>>>
>>> Hey Jan, thanks for calling that out. Ideally I would have liked to give
>>> more time for extensive engagement like we do with other proposals.
>>>
>>> The reason for the accelerated timeline on this PR is to pave the way
>>> for a "Getting Started" section for Beam on the Iceberg website, ideally
>>> before their 1.10.0 release cut on June 23rd. Integrating IcebergIO with
>>> Beam SQL is crucial for reaching a wider audience, and getting this piece
>>> in before the Beam release cut (today/tomorrow) would allow us to showcase
>>> SQL support sooner.
>>>
>>> As Talat mentioned, the underlying Java implementation can be iterated
>>> on, which means our immediate priority for consensus is the SQL syntax
>>> itself.
>>>
>>> I've put together an "after-the-fact" document to provide more context,
>>> including a scouting report on how other frameworks handle catalog
>>> management, and the proposed Beam SQL syntax. I hope this helps kickstart
>>> the discussion.
>>>
>>> https://docs.google.com/document/d/16P0JrcJ28KSoMMpLYExWPZaala7CE4Ezen-jC_ly3M4/edit?tab=t.0
>>>
>>> Best,
>>> Ahmed
>>>
>>> On Wed, Jun 11, 2025 at 3:18 PM Talat Uyarer <ta...@apache.org> wrote:
>>>
>>>> Hi Ahmed,
>>>>
>>>> Thank you so much for this change. I have been waiting for these recent
>>>> SQL changes for a while.
>>>>
>>>> Going forward, I agree with Jan about having a design doc to outline
>>>> these changes. The underlying Java implementation is largely hidden from
>>>> users, so that can be changed in the future, but as a community we should
>>>> agree on the proposed SQL syntax.
>>>>
>>>> Jan, as a Beam user and a small contributor, I've also been
>>>> waiting for this feature. And if you don't mind, can we get Ahmed's changes
>>>> in this version?
>>>>
>>>> Thanks
>>>>
>>>> On 2025/06/11 18:42:40 Jan Lukavský wrote:
>>>> > Hi Ahmed,
>>>> >
>>>> > this is a great effort which is by no doubt greatly needed by the
>>>> > Beam project as a whole. On the other hand I think we should try to
>>>> > establish a way to pull the community into the discussion process.
>>>> > Could you sum up the PR (not small) into a design document where we
>>>> > can have a discussion about the goals, alternative solutions, already
>>>> > tried ways, etc? This would be really cool!
>>>> >
>>>> > Best,
>>>> >
>>>> > Jan
>>>> >
>>>> > On 6/10/25 16:12, Ahmed Abualsaud via dev wrote:
>>>> > > Hey all,
>>>> > >
>>>> > > I was integrating our Java IcebergIO with Beam SQL (PR #34799
>>>> > > <https://github.com/apache/beam/pull/34799>) and got blocked on the
>>>> > > fact that Beam SQL currently lacks a "Catalog" concept. This is
>>>> > > fundamental to modern data architectures like Iceberg, where
>>>> > > catalogs are used to manage table metadata and enable broad
>>>> > > ecosystem integration.
>>>> > > To address this gap, I've opened a new PR (#35223
>>>> > > <https://github.com/apache/beam/pull/35223>), which introduces the
>>>> > > *Catalog* and *CatalogManager* interfaces, enabling support for:
>>>> > >
>>>> > > * CREATE CATALOG my_catalog TYPE 'local' PROPERTIES (...)
>>>> > > * SET CATALOG my_catalog
>>>> > > * DROP CATALOG my_catalog
>>>> > >
>>>> > > I left a more detailed overview in the PR description.
>>>> > >
>>>> > > My hope is that this foundational change will benefit not just
>>>> > > IcebergIO, but also other IOs and future Beam SQL integrations.
>>>> > >
>>>> > > Please take a look and share any feedback, especially regarding
>>>> > > major architectural concerns. I'm working on a short timeline, so
>>>> > > minor enhancements can be noted for follow-up PRs.
>>>> > >
>>>> > > Thank you!
>>>> > > Ahmed