Re: [PR] [FLINK-38661][docs] Add 2.2.0 release note [flink]

via GitHub Mon, 01 Dec 2025 17:04:25 -0800


lincoln-lil commented on code in PR #27235:
URL: https://github.com/apache/flink/pull/27235#discussion_r2579210747



##########
docs/content.zh/release-notes/flink-2.2.md:
##########
@@ -0,0 +1,224 @@
+---
+title: "Release Notes - Flink 2.2"
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Release notes - Flink 2.2
+
+These release notes discuss important aspects, such as configuration, behavior 
or dependencies,
+that changed between Flink 2.1 and Flink 2.2. Please read these notes 
carefully if you are
+planning to upgrade your Flink version to 2.2.
+
+### Table SQL / API
+
+#### Support VECTOR_SEARCH in Flink SQL
+
+##### [FLINK-38422](https://issues.apache.org/jira/browse/FLINK-38422)
+
+Apache Flink has supported leveraging LLM capabilities through the 
`ML_PREDICT` function in Flink SQL
+since version 2.1, enabling users to perform semantic analysis in a simple and 
efficient way. This
+integration has been technically validated in scenarios such as log 
classification and real-time
+question-answering systems. However, the current architecture allows Flink to 
only use embedding
+models to convert unstructured data (e.g., text, images) into high-dimensional 
vector features,
+which are then persisted to downstream storage systems. It lacks real-time 
online querying and
+similarity analysis capabilities for vector spaces. The VECTOR_SEARCH function 
is provided in Flink
+2.2 to enable users to perform streaming vector similarity searches and 
real-time context retrieval
+directly within Flink.
+
+See more details about the capabilities and usages of
+Flink's [Vector 
Search](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/sql/queries/vector-search/).
+
+#### Realtime AI Function
+
+##### [FLINK-38104](https://issues.apache.org/jira/browse/FLINK-38104)
+
+Apache Flink has supported leveraging LLM capabilities through the 
`ML_PREDICT` function in Flink SQL
+since version 2.1. In Flink 2.2, the Table API also supports model inference 
operations that allow
+you to integrate machine learning models directly into your data processing 
pipelines.
+
+#### Materialized Table
+
+##### [FLINK-38532](https://issues.apache.org/jira/browse/FLINK-38532), 
[FLINK-38311](https://issues.apache.org/jira/browse/FLINK-38311)
+
+Materialized Table is a new table type introduced in Flink SQL, aimed at 
simplifying both batch and
+stream data pipelines, providing a consistent development experience. By 
specifying data freshness
+and query when creating Materialized Table, the engine automatically derives 
the schema for the
+materialized table and creates corresponding data refresh pipeline to achieve 
the specified freshness.
+
+From Flink 2.2, the FRESHNESS clause is not a mandatory part of the CREATE 
MATERIALIZED TABLE and
+CREATE OR ALTER MATERIALIZED TABLE DDL statements. Flink 2.2 introduces a new 
MaterializedTableEnricher
+interface. This provides a formal extension point for customizable default 
logic, allowing advanced
+users and vendors to implement "smart" default behaviors (e.g., inferring 
freshness from upstream tables).
+
+Besides this, users can use `DISTRIBUTED INTO` or`DISTRIBUTED INTO` to support 
bucketing concept
+for Materialized tables. Users can use `SHOW MATERIALIZED TABLES` to show all 
Materialized tables.
+
+#### SinkUpsertMaterializer V2
+
+##### [FLINK-38459](https://issues.apache.org/jira/browse/FLINK-38459)
+
+SinkUpsertMaterializer is an operator in Flink that reconciles out-of-order changelog events before
+sending them to an upsert sink. Performance of this operator degrades exponentially in some cases.
+Flink 2.2 introduces a new implementation that is optimized for such cases.
+
+#### Delta Join
+
+##### [FLINK-38495](https://issues.apache.org/jira/browse/FLINK-38495), 
[FLINK-38511](https://issues.apache.org/jira/browse/FLINK-38511), 
[FLINK-38556](https://issues.apache.org/jira/browse/FLINK-38556)
+
+Apache Flink 2.1 introduced a new delta join operator to mitigate the challenges caused by
+big state in regular joins. It replaces the large state maintained by regular joins with a
+bidirectional lookup-based join that directly reuses data from the source tables.
+
+Flink 2.2 enhances support for converting more SQL patterns into delta joins. 
Delta joins now
+support consuming CDC sources without DELETE operations, and allow projection 
and filter operations
+after the source. Additionally, delta joins include support for caching, which 
helps reduce requests
+to external storage.
+
+See more details about the capabilities and usages of Flink's
+[Delta 
Joins](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/tuning/#delta-joins).
+
+#### SQL Types
+
+##### [FLINK-20539](https://issues.apache.org/jira/browse/FLINK-20539), 
[FLINK-38181](https://issues.apache.org/jira/browse/FLINK-38181)
+
+Before Flink 2.2, row types defined in SQL, e.g. `SELECT CAST(f AS ROW<i NOT NULL>)`, ignored
+the `NOT NULL` constraint. This was more aligned with the SQL standard but caused many type
+inconsistencies and cryptic error messages when working on nested data. For example, it prevented
+using rows in computed columns or join keys. The new behavior takes the nullability into consideration.
+The config option `table.legacy-nested-row-nullability` allows restoring the old behavior if required,
+but it is recommended to update existing queries that previously ignored constraints.
+
+Casting to TIME type now considers the correct precision (0-3). Casting 
incorrect strings to time 
+(e.g. where the hour component is higher than 24) leads to a runtime exception 
now. Casting between 
+BINARY and VARBINARY should now correctly consider the target length.
+
+### Runtime
+
+#### Balanced Tasks Scheduling
+
+##### [FLINK-31757](https://issues.apache.org/jira/browse/FLINK-31757)
+
+Flink 2.2 introduces a balanced tasks scheduling strategy that balances task load across TMs and
+reduces job bottlenecks.
+
+See more details about the capabilities and usages of
+Flink's [Balanced Tasks 
Scheduling](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/deployment/tasks-scheduling/balanced_tasks_scheduling/).
+
+#### Enhanced Job History Retention Policies for HistoryServer
+
+##### [FLINK-38229](https://issues.apache.org/jira/browse/FLINK-38229)
+
+Before Flink 2.2, HistoryServer supported only a quantity-based job archive retention policy, which
+is insufficient for scenarios that require time-based retention or combined rules. Users can now combine
+the new configuration option `historyserver.archive.retained-ttl` with `historyserver.archive.retained-jobs`
+to cover more scenarios.
+
+#### Metrics
+
+##### [FLINK-38158](https://issues.apache.org/jira/browse/FLINK-38158), 
[FLINK-38353](https://issues.apache.org/jira/browse/FLINK-38353)
+
+Since 2.2.0, users can assign custom metric variables to each operator/transformation used in the
+job. Metric reporters later convert those variables to tags/labels, allowing users to
+tag/label a specific operator's metrics. For example, you can use this to name and differentiate sources.
+
+Users can now control the level of detail of checkpoint spans via [traces.checkpoint.span-detail-level](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/deployment/config/#traces-checkpoint-span-detail-level).
+The highest levels report a tree of spans for each task and subtask. Reported custom spans can now contain
+child spans. See more details in [Traces](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/ops/traces/).
+
+#### Introduce Event Reporting
+
+##### [FLINK-37426](https://issues.apache.org/jira/browse/FLINK-37426)
+
+Since 2.1.0 users are able to report custom events using the EventReporters. 
Since 2.2.0 Flink reports
+some built-in/system events.
+
+#### Use UniqueKeys instead of Upsertkeys for state management

Review Comment:
   This part should also be placed in the `Table SQL / API` section.
   



##########
docs/content.zh/release-notes/flink-2.2.md:
##########
@@ -0,0 +1,224 @@
+---
+title: "Release Notes - Flink 2.2"
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Release notes - Flink 2.2
+
+These release notes discuss important aspects, such as configuration, behavior 
or dependencies,
+that changed between Flink 2.1 and Flink 2.2. Please read these notes 
carefully if you are
+planning to upgrade your Flink version to 2.2.
+
+### Table SQL / API
+
+#### Support VECTOR_SEARCH in Flink SQL
+
+##### [FLINK-38422](https://issues.apache.org/jira/browse/FLINK-38422)
+
+Apache Flink has supported leveraging LLM capabilities through the 
`ML_PREDICT` function in Flink SQL
+since version 2.1, enabling users to perform semantic analysis in a simple and 
efficient way. This
+integration has been technically validated in scenarios such as log 
classification and real-time
+question-answering systems. However, the current architecture allows Flink to 
only use embedding
+models to convert unstructured data (e.g., text, images) into high-dimensional 
vector features,
+which are then persisted to downstream storage systems. It lacks real-time 
online querying and
+similarity analysis capabilities for vector spaces. The VECTOR_SEARCH function 
is provided in Flink
+2.2 to enable users to perform streaming vector similarity searches and 
real-time context retrieval
+directly within Flink.
+
+See more details about the capabilities and usages of
+Flink's [Vector 
Search](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/sql/queries/vector-search/).
+
+#### Realtime AI Function
+
+##### [FLINK-38104](https://issues.apache.org/jira/browse/FLINK-38104)
+
+Apache Flink has supported leveraging LLM capabilities through the 
`ML_PREDICT` function in Flink SQL
+since version 2.1. In Flink 2.2, the Table API also supports model inference 
operations that allow
+you to integrate machine learning models directly into your data processing 
pipelines.
+
+#### Materialized Table
+
+##### [FLINK-38532](https://issues.apache.org/jira/browse/FLINK-38532), 
[FLINK-38311](https://issues.apache.org/jira/browse/FLINK-38311)
+
+Materialized Table is a new table type introduced in Flink SQL, aimed at 
simplifying both batch and
+stream data pipelines, providing a consistent development experience. By 
specifying data freshness
+and query when creating Materialized Table, the engine automatically derives 
the schema for the
+materialized table and creates corresponding data refresh pipeline to achieve 
the specified freshness.
+
+From Flink 2.2, the FRESHNESS clause is not a mandatory part of the CREATE 
MATERIALIZED TABLE and
+CREATE OR ALTER MATERIALIZED TABLE DDL statements. Flink 2.2 introduces a new 
MaterializedTableEnricher
+interface. This provides a formal extension point for customizable default 
logic, allowing advanced
+users and vendors to implement "smart" default behaviors (e.g., inferring 
freshness from upstream tables).
+
+Besides this, users can use `DISTRIBUTED INTO` or`DISTRIBUTED INTO` to support 
bucketing concept

Review Comment:
   `DISTRIBUTED INTO` appears duplicated here; please check whether one of them should be a different clause.
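
   For reference, a minimal sketch of the two bucketing forms, assuming the intended pair is `DISTRIBUTED BY` and `DISTRIBUTED INTO` as in the regular `CREATE TABLE` syntax. The table names, columns, and bucket counts are hypothetical, and the exact clause order for materialized tables should be verified against the 2.2 docs:

   ```sql
   -- Hypothetical example: hash-distribute rows by a column into 4 buckets.
   CREATE MATERIALIZED TABLE orders_by_user
   DISTRIBUTED BY HASH(user_id) INTO 4 BUCKETS
   FRESHNESS = INTERVAL '1' MINUTE
   AS SELECT user_id, COUNT(*) AS order_cnt FROM orders GROUP BY user_id;

   -- Hypothetical example: fix only the bucket count and leave the
   -- distribution strategy to the connector. FRESHNESS is omitted here,
   -- which the notes above describe as allowed from Flink 2.2.
   CREATE MATERIALIZED TABLE orders_bucketed
   DISTRIBUTED INTO 4 BUCKETS
   AS SELECT user_id, amount FROM orders;
   ```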
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
