mumrah opened a new pull request, #15918:
URL: https://github.com/apache/kafka/pull/15918

   When becoming the active KRaftMigrationDriver, there is another race 
condition similar to KAFKA-16171. This time, the race is due to a stale read 
from ZK. After writing to `/controller` and `/controller_epoch`, it is possible 
that a read on `/migration` is not linearized with the writes that were just 
made. In other words, we get a stale read on `/migration`. This leads to an 
inability to sync metadata to ZK due to an incorrect zkVersion on the migration 
ZNode. 
   
   The non-linearizability of reads is in fact documented behavior for ZK, so 
we need to handle it.
   
   To fix the stale read, this patch adds a write to `/migration` after 
updating `/controller` and `/controller_epoch`. This allows us to learn the 
correct zkVersion for the migration ZNode before leaving the BECOME_CONTROLLER 
state.
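
   To illustrate why a write helps where a read does not, here is a minimal 
sketch using a toy in-memory model. It is not the ZK client API and not the 
code in this patch: `ToyZk`, `writeIfVersion`, and the other names are 
hypothetical, and the lagging follower is simulated by hand. The point it shows 
is that a version returned by a write reflects the leader's state, while a 
version obtained from a read may not.

```java
import java.util.HashMap;
import java.util.Map;

class StaleReadSketch {
    /** Toy model (hypothetical): writes hit the leader; reads may come from a lagging follower. */
    static class ToyZk {
        private final Map<String, Integer> leader = new HashMap<>();
        private final Map<String, Integer> follower = new HashMap<>();

        void create(String path) {
            leader.put(path, 0);
            follower.put(path, 0);
        }

        /** Unconditional write; returns the node's new version. */
        int write(String path) {
            int next = leader.get(path) + 1;
            leader.put(path, next);
            return next;
        }

        /** Conditional write; applied only if expectedVersion is current on the leader. */
        boolean writeIfVersion(String path, int expectedVersion) {
            if (leader.get(path) != expectedVersion) {
                return false;
            }
            leader.put(path, expectedVersion + 1);
            return true;
        }

        /** Read served by the follower, which may lag behind the leader. */
        int readVersion(String path) {
            return follower.get(path);
        }
    }

    /** A ToyZk whose follower has not yet seen the latest /migration write. */
    static ToyZk zkWithLaggingFollower() {
        ToyZk zk = new ToyZk();
        zk.create("/migration");
        zk.write("/migration"); // leader at version 1, follower still at 0
        return zk;
    }

    /** Trust the read, then attempt a version-gated sync. */
    static boolean syncAfterStaleRead() {
        ToyZk zk = zkWithLaggingFollower();
        int version = zk.readVersion("/migration"); // stale: 0
        return zk.writeIfVersion("/migration", version);
    }

    /** Write first, then use the version that the write returned. */
    static boolean syncAfterExtraWrite() {
        ToyZk zk = zkWithLaggingFollower();
        int version = zk.write("/migration"); // reflects leader state: 2
        return zk.writeIfVersion("/migration", version);
    }

    public static void main(String[] args) {
        System.out.println("sync after stale read succeeded: " + syncAfterStaleRead());
        System.out.println("sync after extra write succeeded: " + syncAfterExtraWrite());
    }
}
```

   In this model the version-gated sync is rejected after the stale read and 
accepted after the extra write.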
   
   This patch also adds a check on the current leader epoch when running 
certain events in KRaftMigrationDriver. Historically, we did not include this 
check because it is not necessary for correctness. Writes to ZK are gated on 
the `/controller_epoch` zkVersion, and RPCs sent to brokers are gated on the 
controller epoch. However, during a time of rapid failover, there is a lot of 
processing happening on the controller (i.e., full metadata sync to ZK and full 
UMRs sent to brokers), so it is best to avoid running events we know will fail.
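
   The idea of skipping events that are known to be stale can be sketched as 
follows. This is an assumption-laden illustration, not the actual 
KRaftMigrationDriver code: `EpochGateSketch`, `gated`, and the event names are 
hypothetical, and the real driver may capture and compare epochs differently.

```java
import java.util.ArrayList;
import java.util.List;

class EpochGateSketch {
    private int currentLeaderEpoch;
    final List<String> log = new ArrayList<>();

    void onLeaderChange(int newEpoch) {
        currentLeaderEpoch = newEpoch;
    }

    /** Wraps work so that it runs only if the leader epoch is unchanged since enqueue time. */
    Runnable gated(String name, int epochAtEnqueue, Runnable work) {
        return () -> {
            if (epochAtEnqueue != currentLeaderEpoch) {
                // The downstream write or RPC would be rejected anyway; skip the work.
                log.add("skipped " + name);
                return;
            }
            work.run();
            log.add("ran " + name);
        };
    }

    /** Enqueues one event per epoch across a failover, then runs both. */
    static List<String> demo() {
        EpochGateSketch driver = new EpochGateSketch();
        List<Runnable> queue = new ArrayList<>();

        driver.onLeaderChange(5);
        queue.add(driver.gated("sync@5", 5, () -> { }));

        driver.onLeaderChange(6); // failover before the first event has run
        queue.add(driver.gated("sync@6", 6, () -> { }));

        for (Runnable event : queue) {
            event.run();
        }
        return driver.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

   The check is an optimization only: in this sketch, as in the description 
above, correctness still rests on the epoch gating applied to the writes and 
RPCs themselves.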
   
   There is also a small fix in here to improve the logging of ZK operations. 
The log messages are changed to past tense to reflect the fact that they have 
already happened by the time the log message is created.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.