Chia-Ping Tsai created KAFKA-18160:
--------------------------------------

             Summary: Interrupting or waking up onPartitionsAssigned in 
AsyncConsumer can cause the ConsumerRebalanceListenerCallbackCompletedEvent to 
be skipped, potentially leading to corrupted added partitions
                 Key: KAFKA-18160
                 URL: https://issues.apache.org/jira/browse/KAFKA-18160
             Project: Kafka
          Issue Type: Improvement
            Reporter: Chia-Ping Tsai
            Assignee: Chia-Ping Tsai


I noticed this issue when testing KAFKA-17962. It includes two bugs listed 
below.

*ConsumerRebalanceListenerCallbackCompletedEvent is skipped*

`invokeRebalanceCallbacks` can throw WakeupException or InterruptException [0], 
and neither is handled. Hence, the event 
`ConsumerRebalanceListenerCallbackCompletedEvent` is NOT sent to the background 
thread.

*Solution*: We should use try-catch blocks to propagate both 
InterruptException and WakeupException to the background thread.
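
A minimal sketch of the idea, not the actual patch: the callback is wrapped so that a completion event is always handed to the background thread, carrying the exception if one was thrown. `CompletedEvent`, `backgroundQueue` and `invokeAndReport` are hypothetical stand-ins for illustration only and are not the names used in AsyncKafkaConsumer. Both WakeupException and InterruptException are unchecked in Kafka, so a plain RuntimeException models them here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class CallbackCompletionSketch {

    /** Hypothetical stand-in for ConsumerRebalanceListenerCallbackCompletedEvent. */
    static final class CompletedEvent {
        final Optional<RuntimeException> error;

        CompletedEvent(Optional<RuntimeException> error) {
            this.error = error;
        }
    }

    /** Hypothetical stand-in for the event queue read by the background thread. */
    static final List<CompletedEvent> backgroundQueue = new ArrayList<>();

    /**
     * Runs the rebalance callback and always enqueues a completion event,
     * even when the callback is interrupted or woken up.
     */
    static void invokeAndReport(Runnable callback) {
        RuntimeException failure = null;
        try {
            callback.run();
        } catch (RuntimeException e) {
            // Remember the failure so the background thread can see it,
            // then rethrow so the caller still observes the exception.
            failure = e;
            throw e;
        } finally {
            // Runs on both the success path and the failure path.
            backgroundQueue.add(new CompletedEvent(Optional.ofNullable(failure)));
        }
    }
}
```

With this shape the background thread is never left waiting for a completion event that will not arrive, while the application thread still sees the original exception.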


*corrupted added partitions*

In the next iteration of invokeRebalanceCallbacks, non-fetchable assigned 
partitions are treated as owned partitions [1]. This results in "empty" 
partitions being passed to the listener, meaning that the listener never 
receives the correctly added partitions after the first execution fails. 
Consequently, this causes the test_pause_and_resume_sink (KAFKA-17962) to 
become unstable when using AsyncConsumer.

*Solution*: We should add only partitions where pendingOnAssignedCallback is 
false to the owned partitions.
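
The filtering described above could look roughly like the following sketch. It models the subscription state as a plain map from partition name to its pendingOnAssignedCallback flag. The class and method names (`OwnedPartitionsSketch`, `ownedPartitions`, `addedPartitions`) and the use of strings for partitions are assumptions made for illustration and do not mirror AbstractMembershipManager.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class OwnedPartitionsSketch {

    /**
     * Returns only the partitions whose onPartitionsAssigned callback has
     * already completed, i.e. pendingOnAssignedCallback is false.
     */
    static Set<String> ownedPartitions(Map<String, Boolean> pendingOnAssignedCallback) {
        Set<String> owned = new TreeSet<>();
        for (Map.Entry<String, Boolean> entry : pendingOnAssignedCallback.entrySet()) {
            if (!entry.getValue()) {
                owned.add(entry.getKey());
            }
        }
        return owned;
    }

    /** Partitions in the new assignment that are not yet owned. */
    static Set<String> addedPartitions(Set<String> assigned, Set<String> owned) {
        Set<String> added = new TreeSet<>(assigned);
        added.removeAll(owned);
        return added;
    }
}
```

Under this assumption, a partition whose callback was interrupted stays out of the owned set, so the next iteration still reports it to the listener as added.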


[0] 
https://github.com/apache/kafka/blob/2d39d5be64d4f5b6446f4b9ec3f32b039707d9d1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L2046

[1] 
https://github.com/apache/kafka/blob/2d39d5be64d4f5b6446f4b9ec3f32b039707d9d1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractMembershipManager.java#L828




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
