showuon commented on PR #16118:
URL: https://github.com/apache/kafka/pull/16118#issuecomment-2146749642

   I was trying to find the root cause of this problem: why does it fail 
after the upgrade, but not without it? My understanding is that before the 
upgrade, the topic image doesn't have a dirID for the assignment. After the 
upgrade, the assignment has the dirID. So in `ReplicaManager#applyDelta`, 
we'll have directoryId changes in `localChanges`, which will trigger an 
`AssignmentEvent` 
[here](https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ReplicaManager.scala#L2748).
 With that, we'll get the unexpected `NOT_LEADER_OR_FOLLOWER` error. I also 
confirmed that this issue exists even without your change in this PR. The 
reproduction steps are:
   
   
      1. Launch a 3.6.0 controller and a 3.6.0 broker (Broker A) in KRaft mode;
      2. Create a topic with 1 partition;
      ~~3. Launch a 3.6.0 broker (Broker B) in KRaft mode and reassign the 
partition from step 2 to Broker B;~~
      4. Upgrade Broker B to 3.7.0;
      5. Upgrade Broker A and the controllers to 3.7.0;
      6. Upgrade the MV to 3.7: `./bin/kafka-features.sh --bootstrap-server 
localhost:9092 upgrade --metadata 3.7`;
      7. Reassign the partition from step 2 to Broker A (or B).
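
To make the suspected mechanism concrete, here is a toy Java model of the comparison described above. It is only a sketch of my understanding, not the actual `ReplicaManager` code: the class `DirIdDeltaSketch`, the method `directoryIdChanged`, and the use of a plain broker-to-directory map are all hypothetical. It shows why an image with no dirID recorded and an image with a dirID look like a directory change for a replica that never moved.

```java
import java.util.Map;
import java.util.Objects;

public class DirIdDeltaSketch {
    /**
     * Hypothetical check: returns true when the directory ID recorded for
     * brokerId differs between two metadata images. A missing entry
     * (null) stands for "no dirID recorded", as assumed for pre-upgrade images.
     */
    public static boolean directoryIdChanged(Map<Integer, String> before,
                                             Map<Integer, String> after,
                                             int brokerId) {
        return !Objects.equals(before.get(brokerId), after.get(brokerId));
    }

    public static void main(String[] args) {
        int brokerA = 1; // made-up broker id
        // Pre-upgrade image: the assignment carries no directory IDs.
        Map<Integer, String> preUpgrade = Map.of();
        // Post-upgrade image: the same replica now has a directory ID (made-up value).
        Map<Integer, String> postUpgrade = Map.of(brokerA, "dir-uuid-1");

        // The replica did not move, yet the comparison reports a change.
        System.out.println(directoryIdChanged(preUpgrade, postUpgrade, brokerA));
        // Two images that both carry the same dirID report no change.
        System.out.println(directoryIdChanged(postUpgrade, postUpgrade, brokerA));
    }
}
```

In this model the first comparison is the case I suspect is reached after the MV upgrade, where the "change" is only the dirID appearing for the first time.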
   
   
   So I think we need a proper solution that fixes this at the root. I will 
create another ticket to track this issue. That said, I think this PR already 
fixes the issue reported in JIRA. Let's complete it! :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

