[ https://issues.apache.org/jira/browse/FLINK-31408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tzu-Li (Gordon) Tai updated FLINK-31408:
----------------------------------------
    Fix Version/s: kafka-3.1.0

> Add EXACTLY_ONCE support to upsert-kafka
> ----------------------------------------
>
>                 Key: FLINK-31408
>                 URL: https://issues.apache.org/jira/browse/FLINK-31408
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / Kafka
>            Reporter: Alex Sorokoumov
>            Assignee: Alex Sorokoumov
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: kafka-3.1.0
>
>
> The {{upsert-kafka}} connector should support optional {{EXACTLY_ONCE}} delivery semantics.
> The [upsert-kafka docs|https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/upsert-kafka/#consistency-guarantees] suggest that the connector handles duplicate records produced under {{AT_LEAST_ONCE}} delivery. However, there are at least two reasons to configure the connector with {{EXACTLY_ONCE}}.
> First, there might be other, non-Flink consumers of the topic that would rather not receive duplicated records.
> Second, multiple {{upsert-kafka}} producers might cause keys to roll back to previous values. Consider a scenario with two producing jobs, A and B, writing to the same topic with {{AT_LEAST_ONCE}}, and a consuming job reading from that topic. Both producers write unique, monotonically increasing sequences to the same key. Job A writes {{x=a1,a2,a3,a4,a5,...}} and job B writes {{x=b1,b2,b3,b4,b5,...}}. With this setup, the following sequence is possible:
> # Job A produces x=a5.
> # Job B produces x=b5.
> # Job A produces the duplicate write x=a5.
> The consuming job would observe {{x}} going to {{a5}}, then to {{b5}}, then back to {{a5}}. {{EXACTLY_ONCE}} delivery would prevent this behavior.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
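The rollback in the numbered steps above can be sketched as a toy simulation (plain Python, not Flink or Kafka client code; the `deliver` helper and the `(producer, seq)` dedup key are illustrative assumptions standing in for exactly-once deduplication):

```python
def deliver(log, dedup=False):
    """Return the values a consumer observes for key x.

    dedup=True loosely models EXACTLY_ONCE: a (producer, seq) pair
    already seen is dropped instead of reaching the consumer again.
    dedup=False models AT_LEAST_ONCE: every delivery is observed.
    """
    seen = set()
    observed = []
    for producer, seq, value in log:
        if dedup and (producer, seq) in seen:
            continue  # duplicate delivery suppressed
        seen.add((producer, seq))
        observed.append(value)
    return observed

# Job A's write of x=a5 is delivered twice under AT_LEAST_ONCE,
# interleaved with job B's write of x=b5.
log = [("A", 5, "a5"), ("B", 5, "b5"), ("A", 5, "a5")]

print(deliver(log))              # x appears to roll back from b5 to a5
print(deliver(log, dedup=True))  # final observed value stays b5
```

The point of the sketch is that deduplication on the consumer side cannot fix this: both deliveries of `x=a5` are legitimate-looking upserts, so only suppressing the duplicate at write time (as {{EXACTLY_ONCE}} would) preserves the last-writer-wins semantics.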