[
https://issues.apache.org/jira/browse/FLINK-31408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tzu-Li (Gordon) Tai closed FLINK-31408.
---------------------------------------
Resolution: Fixed
> Add EXACTLY_ONCE support to upsert-kafka
> ----------------------------------------
>
> Key: FLINK-31408
> URL: https://issues.apache.org/jira/browse/FLINK-31408
> Project: Flink
> Issue Type: New Feature
> Components: Connectors / Kafka
> Reporter: Alex Sorokoumov
> Assignee: Alex Sorokoumov
> Priority: Major
> Labels: pull-request-available
> Fix For: kafka-3.1.0
>
>
> The {{upsert-kafka}} connector should support optional {{EXACTLY_ONCE}}
> delivery semantics.
> [upsert-kafka
> docs|https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/upsert-kafka/#consistency-guarantees]
> suggest that the connector tolerates the duplicate records produced under
> {{{}AT_LEAST_ONCE{}}} delivery. However, there are at least two reasons to
> configure the connector with {{{}EXACTLY_ONCE{}}}.
> First, the topic might have other, non-Flink consumers that would rather
> not receive duplicate records.
> Second, multiple {{upsert-kafka}} producers might cause keys to roll back
> to previous values. Consider a scenario with two producing jobs, A and B,
> writing to the same topic with {{AT_LEAST_ONCE}}, and a consuming job
> reading from the topic. Both producers write unique, monotonically
> increasing sequences to the same key. Job A writes
> {{{}x=a1,a2,a3,a4,a5,...{}}} and Job B writes {{{}x=b1,b2,b3,b4,b5,...{}}}.
> With this setup, we can have the following sequence:
> # Job A produces x=a5.
> # Job B produces x=b5.
> # Job A produces the duplicate write x=a5.
> The consuming job would observe {{x}} going to {{{}a5{}}}, then to
> {{{}b5{}}}, then back to {{{}a5{}}}. {{EXACTLY_ONCE}} would prevent this
> behavior.
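> A minimal sketch of what opting into the stronger guarantee could look
> like in DDL, assuming the option names mirror the regular {{kafka}}
> connector's {{sink.delivery-guarantee}} and {{sink.transactional-id-prefix}}
> options (the table name, topic, and formats below are illustrative
> assumptions, not confirmed by this issue):
> {code:sql}
> -- Hypothetical upsert-kafka sink; option names assumed to mirror the
> -- regular kafka connector's delivery-guarantee options.
> CREATE TABLE sequence_sink (
>   `key` STRING,
>   `seq` STRING,
>   PRIMARY KEY (`key`) NOT ENFORCED
> ) WITH (
>   'connector' = 'upsert-kafka',
>   'topic' = 'sequences',
>   'properties.bootstrap.servers' = 'localhost:9092',
>   'key.format' = 'json',
>   'value.format' = 'json',
>   -- EXACTLY_ONCE rides on Kafka transactions, so each producing job needs
>   -- its own transactional id prefix to avoid fencing between jobs A and B.
>   'sink.delivery-guarantee' = 'exactly-once',
>   'sink.transactional-id-prefix' = 'job-a-'
> );
> {code}
> With both jobs writing transactionally, the duplicate write in step 3 is
> never committed, so consumers never observe {{x}} rolling back.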