xuyangzhong commented on code in PR #2030:
URL: https://github.com/apache/fluss/pull/2030#discussion_r2570397980


##########
website/docs/engine-flink/delta-joins.md:
##########
@@ -172,11 +177,16 @@ Refer to the [Delta Join 
Issue](https://issues.apache.org/jira/browse/FLINK-3783
 
 #### Limitations
 
-- The primary key or the prefix lookup key of the tables must be included as 
part of the equivalence conditions in the join.
+- The primary key or the prefix key of the tables must be included as part of 
the equivalence conditions in the join.
 - The join must be a INNER join.
-- The downstream nodes of the join can accept duplicate changes, such as a 
sink that provides UPSERT mode.
-- When consuming a CDC stream, the join key used in the delta join must be 
part of the primary key.
-- All filters must be applied on the upsert key, and neither filters nor 
projections should contain non-deterministic functions.
+- The downstream node of the join must support idempotent updates, typically 
it's an upsert sink and should not have a `SinkUpsertMaterializer` node before 
it.
+  - Flink planner automatically inserts a `SinkUpsertMaterializer` when the 
sink’s primary key does not fully cover the upstream update key.
+  - You can learn more details about `SinkUpsertMaterializer` by reading this 
[blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
+- Since delta join does not support to handle update-before messages, it is 
necessary to ensure that the entire pipeline can safely discard update-before 
messages. That means when consuming a CDC stream:

Review Comment:
   In Flink, when consuming a changelog source, the source operator may emit 
`insert`, `update-before`, `update-after`, and `delete` messages 
(`update-before` and `update-after` both originate from an update statement in 
the storage engine). 
   
   What I mean here is that the delta join operator cannot handle 
(consume) the `update-before` messages emitted by the source operator.
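
To make the point concrete, here is a minimal, self-contained sketch of what "safely discarding update-before messages" means for a changelog. It is not Flink code: the `Kind` enum is a hypothetical stand-in for the four changelog kinds (Flink's own type is `RowKind`), and `Change` and `dropUpdateBefore` are illustrative names only. It assumes the downstream side is keyed by `id` and applies changes as upserts, so the `update-after` row alone is enough to reach the correct final state.

```java
import java.util.ArrayList;
import java.util.List;

public class ChangelogSketch {
    // Hypothetical stand-in for the four changelog kinds a source may emit.
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // One changelog message: the kind of change plus the row it carries.
    static final class Change {
        final Kind kind;
        final int id;
        final String value;

        Change(Kind kind, int id, String value) {
            this.kind = kind;
            this.id = id;
            this.value = value;
        }
    }

    // Keeps every message except UPDATE_BEFORE, preserving the original order.
    static List<Change> dropUpdateBefore(List<Change> changes) {
        List<Change> kept = new ArrayList<>();
        for (Change c : changes) {
            if (c.kind != Kind.UPDATE_BEFORE) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // An insert, then one update (emitted as a before/after pair), then a delete.
        List<Change> changelog = List.of(
                new Change(Kind.INSERT, 1, "a"),
                new Change(Kind.UPDATE_BEFORE, 1, "a"),
                new Change(Kind.UPDATE_AFTER, 1, "b"),
                new Change(Kind.DELETE, 1, "b"));

        for (Change c : dropUpdateBefore(changelog)) {
            System.out.println(c.kind + " id=" + c.id + " value=" + c.value);
        }
    }
}
```

With an upsert-style consumer keyed by `id`, the filtered stream still yields the same final result, which is why the limitation can be stated in terms of the pipeline tolerating the loss of `update-before` messages.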



