Pierre Villard created NIFI-15614:
-------------------------------------
Summary: ConsumeKafka - Duplicate messages during consumer group
rebalance
Key: NIFI-15614
URL: https://issues.apache.org/jira/browse/NIFI-15614
Project: Apache NiFi
Issue Type: Bug
Components: Extensions
Affects Versions: 2.8.0
Reporter: Pierre Villard
Assignee: Pierre Villard
When using ConsumeKafka with Kafka3ConnectionService, duplicate messages may be
processed when a consumer group rebalance occurs.
NIFI-15464 addressed a related issue by deferring offset commits during
rebalance. The fix stored revoked partitions in onPartitionsRevoked() and had
the processor call commitOffsetsForRevokedPartitions() after its session commit.
The deferred commit approach is not enough because by the time poll() returns
and the processor attempts to commit, the consumer is no longer part of an
active group. Kafka rejects the commit with RebalanceInProgressException,
offsets are rolled back, and messages are re-consumed as duplicates.
The Kafka consumer is only in a valid state to commit offsets during the
onPartitionsRevoked() callback. Once this callback returns, the consumer's
group membership is revoked and any commit attempt will fail.
We need to implement synchronous offset commit inside onPartitionsRevoked()
callback, similar to how NiFi 1.x handled rebalances in ConsumerLease. This
requires introducing a callback mechanism to ensure the NiFi session is
committed before Kafka offsets are committed, preventing both data loss and
duplicates.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)