[
https://issues.apache.org/jira/browse/NIFI-15614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pierre Villard updated NIFI-15614:
----------------------------------
Status: Patch Available (was: Open)
> ConsumeKafka - Duplicate messages during consumer group rebalance
> -----------------------------------------------------------------
>
> Key: NIFI-15614
> URL: https://issues.apache.org/jira/browse/NIFI-15614
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 2.8.0
> Reporter: Pierre Villard
> Assignee: Pierre Villard
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When using ConsumeKafka with Kafka3ConnectionService, duplicate messages may
> be processed when a consumer group rebalance occurs.
> NIFI-15464 addressed a related issue by deferring offset commits during
> rebalance. The fix stored revoked partitions in onPartitionsRevoked() and had
> the processor call commitOffsetsForRevokedPartitions() after its session
> commit.
> The deferred commit approach is not enough because by the time poll() returns
> and the processor attempts to commit, the consumer is no longer part of an
> active group. Kafka rejects the commit with RebalanceInProgressException,
> offsets are rolled back, and messages are re-consumed as duplicates.
> The Kafka consumer is only in a valid state to commit offsets during the
> onPartitionsRevoked() callback. Once this callback returns, the consumer's
> group membership is revoked and any commit attempt will fail.
> We need to implement synchronous offset commit inside onPartitionsRevoked()
> callback, similar to how NiFi 1.x handled rebalances in ConsumerLease. This
> requires introducing a callback mechanism to ensure the NiFi session is
> committed before Kafka offsets are committed, preventing both data loss and
> duplicates.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)