Pierre Villard created NIFI-15614:
-------------------------------------

             Summary: ConsumeKafka - Duplicate messages during consumer group rebalance
                 Key: NIFI-15614
                 URL: https://issues.apache.org/jira/browse/NIFI-15614
             Project: Apache NiFi
          Issue Type: Bug
          Components: Extensions
    Affects Versions: 2.8.0
            Reporter: Pierre Villard
            Assignee: Pierre Villard


When using ConsumeKafka with Kafka3ConnectionService, duplicate messages may be 
processed when a consumer group rebalance occurs.

NIFI-15464 addressed a related issue by deferring offset commits during 
rebalance. The fix stored revoked partitions in onPartitionsRevoked() and had 
the processor call commitOffsetsForRevokedPartitions() after its session commit.

The deferred commit approach is not enough because by the time poll() returns 
and the processor attempts to commit, the consumer is no longer part of an 
active group. Kafka rejects the commit with RebalanceInProgressException, 
offsets are rolled back, and messages are re-consumed as duplicates.

For the partitions being revoked, the Kafka consumer is only in a valid state 
to commit offsets during the onPartitionsRevoked() callback. Once this callback 
returns, the consumer no longer owns those partitions and any commit attempt 
for them will fail.

We need to implement a synchronous offset commit inside the 
onPartitionsRevoked() callback, similar to how NiFi 1.x handled rebalances in 
ConsumerLease. This requires introducing a callback mechanism to ensure the 
NiFi session is committed before Kafka offsets are committed, preventing both 
data loss and duplicates.
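
A minimal sketch of the ordering described above, not the actual NiFi change. 
Every type and name in it (RebalanceCommitSketch, SessionCommitCallback, 
OffsetCommitter, CommitOnRevokeListener, recordPolled, demo) is a hypothetical 
stand-in so the example compiles without kafka-clients. In real code the hook 
would presumably be ConsumerRebalanceListener.onPartitionsRevoked() and the 
commit would be KafkaConsumer.commitSync() with a map of TopicPartition to 
OffsetAndMetadata. Partitions are plain strings here for brevity.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Sketch only: all types below are hypothetical stand-ins, not NiFi or kafka-clients code.
class RebalanceCommitSketch {

    /** Stand-in for the callback through which the processor commits its NiFi session. */
    interface SessionCommitCallback {
        void commitSession(Collection<String> revokedPartitions);
    }

    /** Stand-in for the synchronous offset commit offered by the Kafka consumer. */
    interface OffsetCommitter {
        void commitSync(Map<String, Long> offsets);
    }

    /** Mirrors the shape of a rebalance listener's onPartitionsRevoked() hook. */
    static class CommitOnRevokeListener {
        // Next offset to commit per partition, for records polled but not yet committed.
        private final Map<String, Long> uncommittedOffsets = new HashMap<>();
        private final SessionCommitCallback sessionCallback;
        private final OffsetCommitter committer;

        CommitOnRevokeListener(SessionCommitCallback sessionCallback, OffsetCommitter committer) {
            this.sessionCallback = sessionCallback;
            this.committer = committer;
        }

        void recordPolled(String partition, long nextOffset) {
            uncommittedOffsets.put(partition, nextOffset);
        }

        void onPartitionsRevoked(Collection<String> partitions) {
            // Collect pending offsets for the revoked partitions only.
            Map<String, Long> toCommit = new HashMap<>();
            for (String partition : partitions) {
                Long offset = uncommittedOffsets.remove(partition);
                if (offset != null) {
                    toCommit.put(partition, offset);
                }
            }
            if (toCommit.isEmpty()) {
                return;
            }
            // 1. Commit the NiFi session first, so offsets never get ahead of the data.
            sessionCallback.commitSession(toCommit.keySet());
            // 2. Commit offsets before this callback returns. If step 1 throws,
            //    this line is never reached and no offsets are committed.
            committer.commitSync(toCommit);
        }
    }

    /** Runs one simulated revocation and returns the order of the two commits. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        CommitOnRevokeListener listener = new CommitOnRevokeListener(
                parts -> events.add("session-commit " + new TreeSet<>(parts)),
                offsets -> events.add("offset-commit " + new TreeMap<>(offsets)));
        listener.recordPolled("topic-0", 42L);
        listener.recordPolled("topic-1", 7L);
        listener.onPartitionsRevoked(List.of("topic-0"));
        return events;
    }
}
```

In this sketch a failed session commit propagates out of the callback before 
any offset commit is attempted, so the worst case is re-consumption rather 
than data loss. Offsets for partitions that are not revoked (topic-1 in the 
demo) stay pending for the regular commit path.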



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
