Justine Olshan created KAFKA-15468:
--------------------------------------

             Summary: Prevent transaction coordinator reloads on already loaded leaders
                 Key: KAFKA-15468
                 URL: https://issues.apache.org/jira/browse/KAFKA-15468
             Project: Kafka
          Issue Type: Task
            Reporter: Justine Olshan
            Assignee: Justine Olshan
I was doing some research on txn coordinator loading and found that on a single roll, a coordinator was loaded up to 5x on a single broker! (For reference, it should only load once on the preferred leader, plus once on any temporary leader while that broker is down.)

Looking into TopicDelta, I saw this check used to identify "new leaders":

(prevPartition == null || prevPartition.partitionEpoch != entry.getValue().partitionEpoch)

I don't think this is correct, because the partition epoch can change for reasons other than becoming the leader (i.e. ISR/follower changes).

Here is some more information on the scenario I encountered; the coordinator was on broker id 1:

6 Sep 2023 @ 09:42:55.782 UTC message:[Transaction State Manager 1]: Finished loading 62 transaction metadata from __transaction_state-13 in 114 milliseconds, of which 0 milliseconds was spent in the scheduler.
6 Sep 2023 @ 09:45:41.328 UTC message:[Transaction State Manager 1]: Finished loading 62 transaction metadata from __transaction_state-13 in 30 milliseconds, of which 0 milliseconds was spent in the scheduler.
6 Sep 2023 @ 09:49:42.863 UTC message:[Transaction State Manager 1]: Finished loading 62 transaction metadata from __transaction_state-13 in 990 milliseconds, of which 2 milliseconds was spent in the scheduler. (correct load)
6 Sep 2023 @ 09:51:10.868 UTC message:[Transaction State Manager 1]: Finished loading 62 transaction metadata from __transaction_state-13 in 182 milliseconds, of which 144 milliseconds was spent in the scheduler.
6 Sep 2023 @ 09:53:53.576 UTC message:[Transaction State Manager 1]: Finished loading 62 transaction metadata from __transaction_state-13 in 177 milliseconds, of which 143 milliseconds was spent in the scheduler.

Following the logs, I found:
1. kafka-3 shuts down and is removed from the ISR
2. kafka-3 restarts and rejoins the ISR
3. kafka-1 shuts down and unloads, then restarts and loads (correct)
4. kafka-2 shuts down and is removed from the ISR
5. kafka-2 rejoins the ISR

There are two aspects to this problem:
1. TopicDelta shows a change whenever the partition epoch changes, but partition epoch changes can occur even when the leader doesn't change.
2. A leader epoch can change without electing a new leader. In this case, we should check whether the transaction coordinator has already loaded the partition to avoid reloads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
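To make the first aspect concrete, here is a minimal standalone sketch (not the actual Kafka code; the `Partition` class, its fields, and the method names are hypothetical simplifications of TopicDelta's logic). It shows how an ISR change, which bumps only the partition epoch, passes the partition-epoch check and looks like a new leader, while a check on the leader id itself does not fire:

```java
// Hypothetical simplification of the TopicDelta "new leader" check.
public class TopicDeltaSketch {
    static final class Partition {
        final int leader;         // broker id of the current leader
        final int leaderEpoch;    // bumped on leader election
        final int partitionEpoch; // bumped on ANY partition change, incl. ISR updates

        Partition(int leader, int leaderEpoch, int partitionEpoch) {
            this.leader = leader;
            this.leaderEpoch = leaderEpoch;
            this.partitionEpoch = partitionEpoch;
        }
    }

    // Condition as described in the ticket: any partitionEpoch bump
    // is reported as a "new leader".
    static boolean looksLikeNewLeader(Partition prev, Partition next) {
        return prev == null || prev.partitionEpoch != next.partitionEpoch;
    }

    // Stricter condition: only report when the leader id actually changed.
    static boolean leaderActuallyChanged(Partition prev, Partition next) {
        return prev == null || prev.leader != next.leader;
    }

    public static void main(String[] args) {
        // An ISR shrink/expand bumps partitionEpoch 10 -> 11 but leader stays broker 1.
        Partition before = new Partition(1, 5, 10);
        Partition afterIsrChange = new Partition(1, 5, 11);

        System.out.println(looksLikeNewLeader(before, afterIsrChange));    // true: spurious reload
        System.out.println(leaderActuallyChanged(before, afterIsrChange)); // false: no reload needed
    }
}
```

Even with the stricter check, the second aspect remains: a leader-epoch bump without a leader change should still be guarded by asking the transaction coordinator whether the partition is already loaded.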