This is an automated email from the ASF dual-hosted git repository. ron pushed a commit to branch release-2.1 in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/release-2.1 by this push:
     new 16ffc02bf3e  [FLINK-38017][release] Add release note for release 2.1 (#26778)

16ffc02bf3e is described below

commit 16ffc02bf3ec92a4426677cf3cf29c0f7d4ee00b
Author: Ron <ron9....@gmail.com>
AuthorDate: Sat Jul 19 10:24:32 2025 +0800

    [FLINK-38017][release] Add release note for release 2.1 (#26778)

    (cherry picked from commit 7309fe832cb303c9adf6bfb441790e3d293908fe)
---
 docs/content.zh/_index.md                  |   3 +-
 docs/content.zh/release-notes/flink-2.1.md | 185 +++++++++++++++++++++++++++++
 docs/content/_index.md                     |   1 +
 docs/content/release-notes/flink-2.1.md    | 185 +++++++++++++++++++++++++++++
 4 files changed, 373 insertions(+), 1 deletion(-)

diff --git a/docs/content.zh/_index.md b/docs/content.zh/_index.md
index 14693f69724..b2fa4429de6 100644
--- a/docs/content.zh/_index.md
+++ b/docs/content.zh/_index.md
@@ -86,7 +86,8 @@ under the License.
 For some reason Hugo will only allow linking to the
 release notes if there is a leading '/' and file extension.
 -->
-请参阅 [Flink 2.0]({{< ref "/release-notes/flink-2.0.md" >}}),
+请参阅 [Flink 2.1]({{< ref "/release-notes/flink-2.1.md" >}}),
+[Flink 2.0]({{< ref "/release-notes/flink-2.0.md" >}}),
 [Flink 1.20]({{< ref "/release-notes/flink-1.20.md" >}}),
 [Flink 1.19]({{< ref "/release-notes/flink-1.19.md" >}}),
 [Flink 1.18]({{< ref "/release-notes/flink-1.18.md" >}}),

diff --git a/docs/content.zh/release-notes/flink-2.1.md b/docs/content.zh/release-notes/flink-2.1.md
new file mode 100644
index 00000000000..b5e1b4f2f80
--- /dev/null
+++ b/docs/content.zh/release-notes/flink-2.1.md
@@ -0,0 +1,185 @@
---
title: "Release Notes - Flink 2.1"
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Release notes - Flink 2.1

These release notes discuss important aspects, such as configuration, behavior, or dependencies,
that changed between Flink 2.0 and Flink 2.1. Please read these notes carefully if you are
planning to upgrade to Flink 2.1.

### Table SQL / API

#### Model DDLs using Table API

##### [FLINK-37548](https://issues.apache.org/jira/browse/FLINK-37548)

Flink 2.0 introduced dedicated syntax for AI models, enabling users to define models as easily as
creating catalog objects and to invoke them like standard functions or table functions in SQL
statements. Flink 2.1 adds Table API support for model DDLs, allowing users to define and manage
AI models programmatically in both Java and Python.
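As an illustration, here is a minimal Java sketch of defining a model programmatically. The
`CREATE MODEL` option keys shown (`provider`, `endpoint`, `model`) and the commented-out Table API
call are illustrative assumptions rather than verified signatures; see FLINK-37548 and the model
DDL documentation for the actual entry points.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ModelDdlExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Model DDL via SQL, available since Flink 2.0. The WITH options below
        // are placeholders for whatever the chosen provider actually requires.
        tEnv.executeSql(
                "CREATE MODEL sentiment_model"
                        + " INPUT (content STRING)"
                        + " OUTPUT (label STRING)"
                        + " WITH ("
                        + "   'provider' = 'openai',"
                        + "   'endpoint' = 'https://api.openai.com/v1/chat/completions',"
                        + "   'model' = 'gpt-4o-mini',"
                        + "   'api-key' = '...'"
                        + " )");

        // Flink 2.1 adds an equivalent programmatic path in the Table API,
        // e.g. a descriptor-based createModel(...) call (names assumed):
        // tEnv.createModel("sentiment_model", ModelDescriptor.forProvider("openai")...);
    }
}
```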
#### Realtime AI Function

##### [FLINK-34992](https://issues.apache.org/jira/browse/FLINK-34992), [FLINK-37777](https://issues.apache.org/jira/browse/FLINK-37777)

Building on the AI model DDL, we also extended the `ML_PREDICT` table-valued function (TVF) to
perform real-time model inference in SQL queries, seamlessly applying machine learning models to
data streams. The implementation supports both Flink's built-in model providers (OpenAI) and
interfaces for users to define custom model providers, accelerating Flink's evolution from a
real-time data processing engine to a unified real-time AI platform. Looking ahead, we plan to
introduce more AI functions to unlock an end-to-end experience for real-time data processing,
model training, and inference.

For more details about the capabilities and usage, see
Flink's [Model Inference](https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/dev/table/sql/queries/model-inference/) documentation.

#### Variant Type

##### [FLINK-37922](https://issues.apache.org/jira/browse/FLINK-37922)

VARIANT is a new data type for semi-structured data (e.g., JSON). It can store any semi-structured
value, including ARRAY, MAP (with STRING keys), and scalar types, while preserving field type
information in a JSON-like structure. Unlike ROW and STRUCTURED types, VARIANT provides superior
flexibility for handling deeply nested and evolving schemas.

Users can use `PARSE_JSON` or `TRY_PARSE_JSON` to convert JSON-formatted VARCHAR data to VARIANT.
In addition, table formats like Apache Paimon now support the VARIANT type, enabling users to
efficiently process semi-structured data in the lakehouse using Flink SQL.

#### Structured Type Enhancements

##### [FLINK-37861](https://issues.apache.org/jira/browse/FLINK-37861)

User-defined objects can now be declared via STRUCTURED TYPE directly in `CREATE TABLE` DDL
statements, resolving critical type equivalence issues and significantly improving API usability.

#### Delta Join

##### [FLINK-37836](https://issues.apache.org/jira/browse/FLINK-37836)

This release introduces a new DeltaJoin operator for stream processing jobs, along with
optimizations for simple streaming join pipelines. Compared to a traditional streaming join, a
delta join requires significantly less state, effectively mitigating issues related to large
state, including resource bottlenecks, slow checkpointing, and lengthy job recovery times. This
feature is enabled by default. More details can be found at
[Delta Join](https://cwiki.apache.org/confluence/display/FLINK/FLIP-486%3A+Introduce+A+New+DeltaJoin).

#### Multiple Regular Joins

##### [FLINK-37859](https://issues.apache.org/jira/browse/FLINK-37859)

Streaming Flink jobs with multiple cascaded streaming joins often experience operational
instability and performance degradation due to large state sizes. This release introduces a
multi-join operator (`StreamingMultiJoinOperator`) that drastically reduces state size by
eliminating intermediate results. The operator achieves this by processing joins across all input
streams simultaneously within a single operator instance, storing only raw input records instead
of propagated join output.

This "zero intermediate state" approach primarily targets state reduction, offering substantial
benefits in resource consumption and operational stability. The feature is available for pipelines
with multiple INNER/LEFT joins that share at least one common join key; enable it with
`SET 'table.optimizer.multi-join.enabled' = 'true'`.
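As a minimal Java sketch, enabling the optimization and running a cascade of joins on a shared
key; the `Orders`, `Payments`, and `Shipments` tables are hypothetical and assumed to be already
registered in the catalog:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MultiJoinExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Opt in to the multi-join operator (off by default in 2.1).
        tEnv.getConfig().set("table.optimizer.multi-join.enabled", "true");

        // Cascaded joins sharing the common key `order_id` can be planned into
        // a single StreamingMultiJoinOperator that stores only the raw inputs.
        tEnv.executeSql(
                "SELECT o.order_id, p.status, s.carrier"
                        + " FROM Orders o"
                        + " JOIN Payments p ON o.order_id = p.order_id"
                        + " LEFT JOIN Shipments s ON o.order_id = s.order_id")
                .print();
    }
}
```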
#### Async Lookup Join Enhancements

##### [FLINK-37874](https://issues.apache.org/jira/browse/FLINK-37874)

Async lookup joins can now process records in order per upsert key (the unique key of the input
stream deduced by the planner) while still processing different keys in parallel, achieving better
throughput on changelog data streams.

#### Sink Reuse

##### [FLINK-37227](https://issues.apache.org/jira/browse/FLINK-37227)

Within a single Flink job, when multiple `INSERT INTO` statements update identical columns
(different columns will be supported in the next release) of a target table, the planner will
optimize the execution plan and merge the sink nodes to achieve reuse. This is a great usability
improvement for users combining partial-update features with data lake storages like Apache
Paimon.

#### Support Smile Format for Compiled Plan Serialization

##### [FLINK-37341](https://issues.apache.org/jira/browse/FLINK-37341)

This release adds Smile binary format support for compiled plans, providing a memory-efficient
alternative to JSON for serialization and deserialization. JSON remains the default; to use the
Smile format, call the `CompiledPlan#asSmileBytes` and `PlanReference#fromSmileBytes` methods.

### Runtime

#### Add Pluggable Batching for Async Sink

##### [FLINK-37298](https://issues.apache.org/jira/browse/FLINK-37298)

This release introduces a pluggable batching mechanism for the async sink, allowing users to
define custom batch-write strategies tailored to their specific requirements.

#### Split-level Watermark Metrics

##### [FLINK-37410](https://issues.apache.org/jira/browse/FLINK-37410)

This release adds split-level watermark metrics, covering watermark progress and per-split state
gauges, to enhance watermark observability:

- `currentWatermark`: the last watermark this split has received.
- `activeTimeMsPerSecond`: the time this split is active per second.
- `pausedTimeMsPerSecond`: the time this split is paused due to watermark alignment per second.
- `idleTimeMsPerSecond`: the time this split is marked idle by idleness detection per second.
- `accumulatedActiveTimeMs`: accumulated time this split was active since being registered.
- `accumulatedPausedTimeMs`: accumulated time this split was paused since being registered.
- `accumulatedIdleTimeMs`: accumulated time this split was idle since being registered.

### Connectors

#### Introduce SQL Connector for Keyed State

##### [FLINK-36929](https://issues.apache.org/jira/browse/FLINK-36929)

This release introduces a new connector for keyed state. The connector allows users to query keyed
state directly from a checkpoint or savepoint using Flink SQL, making it easier to inspect, debug,
and validate the state of Flink jobs without custom tooling. This feature is especially useful for
analyzing long-running jobs and validating state migrations.

### Python

#### Add support for Python 3.12 and remove support for Python 3.8

##### [FLINK-37823](https://issues.apache.org/jira/browse/FLINK-37823), [FLINK-37776](https://issues.apache.org/jira/browse/FLINK-37776)

PyFlink 2.1 adds support for Python 3.12 and removes support for Python 3.8.

### Dependency upgrades

#### Upgrade flink-shaded version to 20.0

##### [FLINK-37376](https://issues.apache.org/jira/browse/FLINK-37376)

Bump the flink-shaded version to 20.0 to support the Smile format.
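Since this flink-shaded upgrade is what enables the Smile format, here is a rough Java sketch of
the compiled-plan round-trip described under "Support Smile Format for Compiled Plan
Serialization" above. The `Source`/`Sink` tables and the file path are hypothetical, and the exact
shapes of `asSmileBytes`/`fromSmileBytes` are assumed from these notes rather than verified.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.flink.table.api.CompiledPlan;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.PlanReference;
import org.apache.flink.table.api.TableEnvironment;

public class SmilePlanExample {
    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Compile an INSERT statement into a plan (tables assumed registered).
        CompiledPlan plan =
                tEnv.compilePlanSql("INSERT INTO Sink SELECT * FROM Source");

        // Serialize with the Smile binary format instead of the default JSON.
        Path planFile = Paths.get("/tmp/plan.smile");
        Files.write(planFile, plan.asSmileBytes());

        // Later: restore the plan from the Smile bytes and execute it.
        byte[] bytes = Files.readAllBytes(planFile);
        tEnv.loadPlan(PlanReference.fromSmileBytes(bytes)).execute();
    }
}
```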
#### Upgrade Parquet version to 1.15.3

##### [FLINK-37760](https://issues.apache.org/jira/browse/FLINK-37760)

Bump the Parquet version to 1.15.3 to resolve the parquet-avro module vulnerability reported in
[CVE-2025-30065](https://nvd.nist.gov/vuln/detail/CVE-2025-30065).
\ No newline at end of file

diff --git a/docs/content/_index.md b/docs/content/_index.md
index 05d5380b269..9051660db58 100644
--- a/docs/content/_index.md
+++ b/docs/content/_index.md
@@ -87,6 +87,7 @@ For some reason Hugo will only allow linking to the
 release notes if there is a leading '/' and file extension.
 -->
 See the release notes for
+[Flink 2.1]({{< ref "/release-notes/flink-2.1.md" >}}),
 [Flink 2.0]({{< ref "/release-notes/flink-2.0.md" >}}),
 [Flink 1.20]({{< ref "/release-notes/flink-1.20.md" >}}),
 [Flink 1.19]({{< ref "/release-notes/flink-1.19.md" >}}),

diff --git a/docs/content/release-notes/flink-2.1.md b/docs/content/release-notes/flink-2.1.md
new file mode 100644
index 00000000000..b5e1b4f2f80
--- /dev/null
+++ b/docs/content/release-notes/flink-2.1.md
@@ -0,0 +1,185 @@
---
title: "Release Notes - Flink 2.1"
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Release notes - Flink 2.1

These release notes discuss important aspects, such as configuration, behavior, or dependencies,
that changed between Flink 2.0 and Flink 2.1. Please read these notes carefully if you are
planning to upgrade to Flink 2.1.

### Table SQL / API

#### Model DDLs using Table API

##### [FLINK-37548](https://issues.apache.org/jira/browse/FLINK-37548)

Flink 2.0 introduced dedicated syntax for AI models, enabling users to define models as easily as
creating catalog objects and to invoke them like standard functions or table functions in SQL
statements. Flink 2.1 adds Table API support for model DDLs, allowing users to define and manage
AI models programmatically in both Java and Python.

#### Realtime AI Function

##### [FLINK-34992](https://issues.apache.org/jira/browse/FLINK-34992), [FLINK-37777](https://issues.apache.org/jira/browse/FLINK-37777)

Building on the AI model DDL, we also extended the `ML_PREDICT` table-valued function (TVF) to
perform real-time model inference in SQL queries, seamlessly applying machine learning models to
data streams. The implementation supports both Flink's built-in model providers (OpenAI) and
interfaces for users to define custom model providers, accelerating Flink's evolution from a
real-time data processing engine to a unified real-time AI platform. Looking ahead, we plan to
introduce more AI functions to unlock an end-to-end experience for real-time data processing,
model training, and inference.
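For illustration, a minimal `ML_PREDICT` sketch in Java. The TVF call shape follows the
model-inference documentation linked below, but the `Reviews` table, its `content` column, the
model name, and the `label` output column are all hypothetical:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MlPredictExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Real-time inference over a stream: apply `sentiment_model` to every
        // row of `Reviews`, passing the `content` column as the model input.
        tEnv.executeSql(
                "SELECT content, label"
                        + " FROM ML_PREDICT("
                        + "   TABLE Reviews,"
                        + "   MODEL sentiment_model,"
                        + "   DESCRIPTOR(content))")
                .print();
    }
}
```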
For more details about the capabilities and usage, see
Flink's [Model Inference](https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/dev/table/sql/queries/model-inference/) documentation.

#### Variant Type

##### [FLINK-37922](https://issues.apache.org/jira/browse/FLINK-37922)

VARIANT is a new data type for semi-structured data (e.g., JSON). It can store any semi-structured
value, including ARRAY, MAP (with STRING keys), and scalar types, while preserving field type
information in a JSON-like structure. Unlike ROW and STRUCTURED types, VARIANT provides superior
flexibility for handling deeply nested and evolving schemas.

Users can use `PARSE_JSON` or `TRY_PARSE_JSON` to convert JSON-formatted VARCHAR data to VARIANT.
In addition, table formats like Apache Paimon now support the VARIANT type, enabling users to
efficiently process semi-structured data in the lakehouse using Flink SQL.

#### Structured Type Enhancements

##### [FLINK-37861](https://issues.apache.org/jira/browse/FLINK-37861)

User-defined objects can now be declared via STRUCTURED TYPE directly in `CREATE TABLE` DDL
statements, resolving critical type equivalence issues and significantly improving API usability.

#### Delta Join

##### [FLINK-37836](https://issues.apache.org/jira/browse/FLINK-37836)

This release introduces a new DeltaJoin operator for stream processing jobs, along with
optimizations for simple streaming join pipelines. Compared to a traditional streaming join, a
delta join requires significantly less state, effectively mitigating issues related to large
state, including resource bottlenecks, slow checkpointing, and lengthy job recovery times. This
feature is enabled by default. More details can be found at
[Delta Join](https://cwiki.apache.org/confluence/display/FLINK/FLIP-486%3A+Introduce+A+New+DeltaJoin).

#### Multiple Regular Joins

##### [FLINK-37859](https://issues.apache.org/jira/browse/FLINK-37859)

Streaming Flink jobs with multiple cascaded streaming joins often experience operational
instability and performance degradation due to large state sizes. This release introduces a
multi-join operator (`StreamingMultiJoinOperator`) that drastically reduces state size by
eliminating intermediate results. The operator achieves this by processing joins across all input
streams simultaneously within a single operator instance, storing only raw input records instead
of propagated join output.

This "zero intermediate state" approach primarily targets state reduction, offering substantial
benefits in resource consumption and operational stability. The feature is available for pipelines
with multiple INNER/LEFT joins that share at least one common join key; enable it with
`SET 'table.optimizer.multi-join.enabled' = 'true'`.

#### Async Lookup Join Enhancements

##### [FLINK-37874](https://issues.apache.org/jira/browse/FLINK-37874)

Async lookup joins can now process records in order per upsert key (the unique key of the input
stream deduced by the planner) while still processing different keys in parallel, achieving better
throughput on changelog data streams.

#### Sink Reuse

##### [FLINK-37227](https://issues.apache.org/jira/browse/FLINK-37227)

Within a single Flink job, when multiple `INSERT INTO` statements update identical columns
(different columns will be supported in the next release) of a target table, the planner will
optimize the execution plan and merge the sink nodes to achieve reuse. This is a great usability
improvement for users combining partial-update features with data lake storages like Apache
Paimon.
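As a sketch (the source and target tables are hypothetical and assumed registered), two
`INSERT INTO` statements updating the same columns of one target, submitted as a single job whose
sink nodes the planner can merge:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;

public class SinkReuseExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Both statements target the same columns of `Target`, so the planner
        // merges the two sink nodes into one instead of writing twice.
        StatementSet statements = tEnv.createStatementSet();
        statements.addInsertSql("INSERT INTO Target SELECT id, amount FROM SourceA");
        statements.addInsertSql("INSERT INTO Target SELECT id, amount FROM SourceB");
        statements.execute();
    }
}
```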
#### Support Smile Format for Compiled Plan Serialization

##### [FLINK-37341](https://issues.apache.org/jira/browse/FLINK-37341)

This release adds Smile binary format support for compiled plans, providing a memory-efficient
alternative to JSON for serialization and deserialization. JSON remains the default; to use the
Smile format, call the `CompiledPlan#asSmileBytes` and `PlanReference#fromSmileBytes` methods.

### Runtime

#### Add Pluggable Batching for Async Sink

##### [FLINK-37298](https://issues.apache.org/jira/browse/FLINK-37298)

This release introduces a pluggable batching mechanism for the async sink, allowing users to
define custom batch-write strategies tailored to their specific requirements.

#### Split-level Watermark Metrics

##### [FLINK-37410](https://issues.apache.org/jira/browse/FLINK-37410)

This release adds split-level watermark metrics, covering watermark progress and per-split state
gauges, to enhance watermark observability:

- `currentWatermark`: the last watermark this split has received.
- `activeTimeMsPerSecond`: the time this split is active per second.
- `pausedTimeMsPerSecond`: the time this split is paused due to watermark alignment per second.
- `idleTimeMsPerSecond`: the time this split is marked idle by idleness detection per second.
- `accumulatedActiveTimeMs`: accumulated time this split was active since being registered.
- `accumulatedPausedTimeMs`: accumulated time this split was paused since being registered.
- `accumulatedIdleTimeMs`: accumulated time this split was idle since being registered.

### Connectors

#### Introduce SQL Connector for Keyed State

##### [FLINK-36929](https://issues.apache.org/jira/browse/FLINK-36929)

This release introduces a new connector for keyed state. The connector allows users to query keyed
state directly from a checkpoint or savepoint using Flink SQL, making it easier to inspect, debug,
and validate the state of Flink jobs without custom tooling. This feature is especially useful for
analyzing long-running jobs and validating state migrations.

### Python

#### Add support for Python 3.12 and remove support for Python 3.8

##### [FLINK-37823](https://issues.apache.org/jira/browse/FLINK-37823), [FLINK-37776](https://issues.apache.org/jira/browse/FLINK-37776)

PyFlink 2.1 adds support for Python 3.12 and removes support for Python 3.8.

### Dependency upgrades

#### Upgrade flink-shaded version to 20.0

##### [FLINK-37376](https://issues.apache.org/jira/browse/FLINK-37376)

Bump the flink-shaded version to 20.0 to support the Smile format.

#### Upgrade Parquet version to 1.15.3

##### [FLINK-37760](https://issues.apache.org/jira/browse/FLINK-37760)

Bump the Parquet version to 1.15.3 to resolve the parquet-avro module vulnerability reported in
[CVE-2025-30065](https://nvd.nist.gov/vuln/detail/CVE-2025-30065).
\ No newline at end of file