Justine Olshan created KAFKA-15468:
--------------------------------------

             Summary: Prevent transaction coordinator reloads on already loaded leaders
                 Key: KAFKA-15468
                 URL: https://issues.apache.org/jira/browse/KAFKA-15468
             Project: Kafka
          Issue Type: Task
            Reporter: Justine Olshan
            Assignee: Justine Olshan


I was doing some research on txn coordinator loading and found that during a 
single roll, a coordinator was loaded up to 5x on a single broker! (For 
reference, it should only load once on the preferred leader, plus once on any 
temporary leader that takes over while the preferred leader's broker is down.)

I was looking into TopicDelta and I saw this check used to identify “new leaders”: 
(prevPartition == null || prevPartition.partitionEpoch != 
entry.getValue().partitionEpoch). I don’t think this is correct because the 
partition epoch can change for reasons other than becoming a leader (e.g. 
ISR/follower changes).

Here’s some more information about the scenario I encountered. The 
coordinator was on broker id 1:
6 Sep 2023 @ 09:42:55.782 UTC message:[Transaction State Manager 1]: Finished 
loading 62 transaction metadata from __transaction_state-13 in 114 
milliseconds, of which 0 milliseconds was spent in the scheduler.

6 Sep 2023 @ 09:45:41.328 UTC message:[Transaction State Manager 1]: Finished 
loading 62 transaction metadata from __transaction_state-13 in 30 milliseconds, 
of which 0 milliseconds was spent in the scheduler.

6 Sep 2023 @ 09:49:42.863 UTC message:[Transaction State Manager 1]: Finished 
loading 62 transaction metadata from __transaction_state-13 in 990 
milliseconds, of which 2 milliseconds was spent in the scheduler.
(correct load)

6 Sep 2023 @ 09:51:10.868 UTC message:[Transaction State Manager 1]: Finished 
loading 62 transaction metadata from __transaction_state-13 in 182 
milliseconds, of which 144 milliseconds was spent in the scheduler.

6 Sep 2023 @ 09:53:53.576 UTC message:[Transaction State Manager 1]: Finished 
loading 62 transaction metadata from __transaction_state-13 in 177 
milliseconds, of which 143 milliseconds was spent in the scheduler.

Following the logs, I found this sequence of events:
1. kafka-3 shuts down and is removed from the ISR
2. kafka-3 restarts and rejoins the ISR
3. kafka-1 shuts down and unloads, then restarts and loads (correct)
4. kafka-2 shuts down and is removed from the ISR
5. kafka-2 restarts and rejoins the ISR

There are two aspects to this problem:
1. TopicDelta shows a change whenever the partition epoch changes. Partition 
epoch changes can occur even when the leader doesn't change.
2. A leader epoch can change without electing a new leader. In this case, we 
should check if the transaction coordinator has already loaded to avoid reloads.
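
The second point could be sketched as a small guard that remembers which 
partitions are already loaded on this broker. This is an illustration under 
assumptions only: CoordinatorLoadGuardSketch, onBecomeLeader and onResign are 
hypothetical names, not part of Kafka's TransactionStateManager.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: hypothetical guard, not Kafka's actual implementation. */
class CoordinatorLoadGuardSketch {

    // Partition id -> leader epoch most recently seen for a loaded partition.
    private final Map<Integer, Integer> loadedEpochs = new HashMap<>();

    /**
     * Called when this broker is reported as leader of a partition.
     * Returns true if the caller should load transaction state from the log.
     */
    boolean onBecomeLeader(int partition, int leaderEpoch) {
        Integer loaded = loadedEpochs.get(partition);
        if (loaded != null) {
            // Already loaded here: remember the newer epoch but skip the reload.
            if (leaderEpoch > loaded) {
                loadedEpochs.put(partition, leaderEpoch);
            }
            return false;
        }
        loadedEpochs.put(partition, leaderEpoch);
        return true;
    }

    /** Called when this broker stops leading the partition and unloads it. */
    void onResign(int partition) {
        loadedEpochs.remove(partition);
    }

    public static void main(String[] args) {
        CoordinatorLoadGuardSketch guard = new CoordinatorLoadGuardSketch();
        System.out.println(guard.onBecomeLeader(13, 7)); // true: first load
        System.out.println(guard.onBecomeLeader(13, 8)); // false: epoch bump, same leader
        guard.onResign(13);
        System.out.println(guard.onBecomeLeader(13, 9)); // true: load again after unload
    }
}
```

With a guard like this, a leader epoch bump on a partition the broker already 
leads would not trigger another load, while a real unload followed by 
re-election still would.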



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
