+1 to this deprecation. Thanks for putting together a clear summary. FWIW it also has significantly worse performance than the Calcite SQL dialect, since it calls out to a ZetaSQL subprocess for most calculations, and that is less optimized than Beam's Fn API.
Kenn

On Tue, Mar 25, 2025 at 4:18 PM Robert Bradshaw via user <u...@beam.apache.org> wrote:

> I'm in favor of deprecating this and cleaning it up, but it depends on
> usage. I suspect it is low (or possibly non-existent, especially as there's
> little upside to moving away from the default). I cc'd user@ just in case
> anyone wants to chime in there. This may be a good thing to add to our
> release notes as well (perhaps we can get it in the one that's just about
> to go out).
>
> Unless there is strong, justified pushback, I'd get the deprecation status
> documented (e.g. in the javadocs, on the website) right away. For actual
> removal, I agree with the idea of waiting until it actually causes issues
> or we move to the next major Beam release, though I might push back on 2.66
> as being a bit too quick, even if the first condition is hit before then,
> and would give people at least a quarter's notice.
>
> - Robert
>
>
> On Mon, Mar 24, 2025 at 2:27 PM Yi Hu via dev <dev@beam.apache.org> wrote:
>
>> Hi everyone,
>>
>> I would like to open a discussion on deprecating Beam SQL's ZetaSQL
>> component [1]. Beam SQL currently supports two SQL dialects, (i) Apache
>> Calcite and (ii) ZetaSQL; see the documentation [2]. I propose the
>> deprecation for the following reasons:
>>
>> - Development of the ZetaSQL dialect in Beam has effectively stalled since
>> early 2022 (see change history [3]).
>>
>> - Despite its incomplete support status, no new bug or feature request has
>> been opened since we migrated to GitHub Issues, suggesting minimal
>> adoption [4].
>>
>> - We still need to keep zetasql up-to-date when its dependencies conflict
>> with other Google dependencies; as a result, the ZetaSQL component adds
>> maintenance burden when upgrading GCP-BOM (e.g. [5]).
>>
>> - One of the main reasons for using the ZetaSQL dialect, per [2], was:
>>
>> > ZetaSQL is more compatible with BigQuery, so it’s especially useful in
>> > pipelines that write to or read from BigQuery tables.
>>
>> Today, BigQuery supports using GoogleSQL (open-sourced as ZetaSQL) to
>> query data stored outside of BigQuery via the BigQuery Connections API /
>> federated queries [6, 7]. This largely provides an alternative to using
>> Beam's ZetaSQL to interact with BigQuery.
>>
>> For these reasons, I propose initiating the process of deprecating
>> Beam SQL's ZetaSQL component. Two decisions need to be made:
>>
>> First, agree on when to document the deprecated status of the ZetaSQL
>> component in the javadoc and on the Beam website. I currently recommend
>> doing it in the release that HEAD currently belongs to, Beam 2.65.0 (cut
>> April 30, 2025).
>>
>> Second, stop publishing ZetaSQL artifacts. This is a breaking change, and
>> I think we can leave the deprecated status as is until one of the
>> following situations emerges, whichever comes first, and no earlier than
>> Beam 2.66.0 (cut Jun 11, 2025):
>>
>> - Continued support for the ZetaSQL component involves significant burden,
>> such as conflicts with other Beam dependencies, supported Java versions,
>> etc., or
>> - Beam moves to the next major release (3).
>>
>> Thanks for your attention; any input is welcome!
>>
>> Regards,
>> Yi
>>
>> [1] https://github.com/apache/beam/tree/master/sdks/java/extensions/sql/zetasql
>> [2] https://beam.apache.org/documentation/dsls/sql/overview/
>> [3] https://github.com/benEng/beam/commits/master/sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/SupportedZetaSqlBuiltinFunctions.java
>> [4] https://github.com/apache/beam/issues?q=is%3Aissue%20%20label%3Azetasql%20
>> [5] https://github.com/apache/beam/pull/32902
>> [6] https://cloud.google.com/bigquery/docs/connections-api-intro
>> [7] https://cloud.google.com/bigquery/docs/federated-queries-intro
>>
>> --
>>
>> Yi Hu, (he/him/his)
>>
>> Software Engineer