[
https://issues.apache.org/jira/browse/FLINK-39218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Danny Cranmer reassigned FLINK-39218:
-------------------------------------
Assignee: Aleksandr Savonin
> Add CLI tool to manage lingering Kafka transactions
> ---------------------------------------------------
>
> Key: FLINK-39218
> URL: https://issues.apache.org/jira/browse/FLINK-39218
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Kafka
> Reporter: Aleksandr Savonin
> Assignee: Aleksandr Savonin
> Priority: Major
>
> When a Flink job that uses the KafkaSink with the EXACTLY_ONCE delivery
> guarantee stops unexpectedly (e.g., it crashes), it can leave Kafka
> transactions in the ONGOING state. These lingering transactions block
> downstream consumers that use the read_committed isolation level, because
> the Last Stable Offset (LSO) cannot advance past them.
>
> Currently, there is no built-in way to resolve this without restarting the
> original Flink job or waiting for the transaction timeout.
>
> This ticket adds a dedicated CLI tool
> (flink-connector-kafka-transaction-tool) packaged as a self-contained
> uber-jar that allows operators to manually resolve stuck transactions:
> * Abort: Connects with the same transactional.id to fence the previous
> producer, forcing the broker to abort the open transaction.
> * Commit: Resumes a specific transaction using the exact producerId and
> epoch from Flink checkpoint state/logs and commits it.
> The tool is implemented as a new Maven module
> (flink-connector-kafka-transaction-tool).
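
To illustrate why a single lingering transaction stalls read_committed consumers, and why aborting it unblocks them, here is a minimal toy model. It is not Kafka client code and not part of the proposed tool: the class LsoSketch, the Txn type, and all offsets are hypothetical, and it assumes the usual definition of the LSO as the first offset of the earliest still-open transaction, or the high watermark when no transaction is open.

```java
import java.util.Arrays;
import java.util.List;

public class LsoSketch {
    /** Hypothetical model of one transaction on a single partition. */
    static final class Txn {
        final String transactionalId;
        final long firstOffset;   // offset of the first record written by the transaction
        final boolean ongoing;    // true while the transaction is neither committed nor aborted

        Txn(String transactionalId, long firstOffset, boolean ongoing) {
            this.transactionalId = transactionalId;
            this.firstOffset = firstOffset;
            this.ongoing = ongoing;
        }
    }

    /**
     * Returns the offset up to which a read_committed consumer may read:
     * the first offset of the earliest ongoing transaction, or the high
     * watermark if no transaction is ongoing.
     */
    static long lastStableOffset(long highWatermark, List<Txn> txns) {
        long lso = highWatermark;
        for (Txn t : txns) {
            if (t.ongoing && t.firstOffset < lso) {
                lso = t.firstOffset;
            }
        }
        return lso;
    }

    public static void main(String[] args) {
        long highWatermark = 120L;

        // One finished transaction and one left open by a crashed job.
        List<Txn> beforeAbort = Arrays.asList(
                new Txn("sink-0", 40L, false),
                new Txn("sink-1", 75L, true));
        System.out.println("LSO with lingering transaction: "
                + lastStableOffset(highWatermark, beforeAbort));

        // After the open transaction is aborted, nothing holds the LSO back.
        List<Txn> afterAbort = Arrays.asList(
                new Txn("sink-0", 40L, false),
                new Txn("sink-1", 75L, false));
        System.out.println("LSO after abort: "
                + lastStableOffset(highWatermark, afterAbort));
    }
}
```

In this model the consumer is held at offset 75 even though the log extends to 120. Once the open transaction is resolved, the LSO moves up to the high watermark.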