[ 
https://issues.apache.org/jira/browse/KAFKA-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen updated KAFKA-16247:
------------------------------
    Description: 
We are deploying 3 controllers and 3 brokers, following the steps in the 
[doc|https://kafka.apache.org/documentation/#kraft_zk_migration]. When moving 
from the "Enabling the migration on the brokers" state to the "Migrating brokers 
to KRaft" state, the first rolled broker becomes out-of-sync and never becomes 
in-sync again.
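
For reference, at the "Enabling the migration on the brokers" step each broker is 
still a ZK broker with migration enabled and the KRaft quorum configured; a rough 
example based on the doc (host names, ports and ids are placeholders, not our 
actual deployment):

{code}
# ZK-mode broker with migration enabled (placeholder values from the doc)
broker.id=0
listeners=PLAINTEXT://:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
inter.broker.protocol.version=3.7

# Enable the migration
zookeeper.metadata.migration.enable=true

# Still connected to ZooKeeper
zookeeper.connect=localhost:2181

# KRaft controller quorum
controller.quorum.voters=3000@localhost:9093
controller.listener.names=CONTROLLER
{code}

At the "Migrating brokers to KRaft" step, the doc roughly has us switch to 
process.roles=broker plus node.id and drop zookeeper.connect, 
zookeeper.metadata.migration.enable and inter.broker.protocol.version, then 
restart; the issue shows up on the first broker restarted this way.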


From the logs, we can see a couple of "reject alterPartition" errors, but they 
only happened twice. In theory, the leader should add the follower back into the 
ISR as long as the follower keeps fetching, since we have no clients writing 
data. But we can't figure out why it didn't fetch.


Logs: https://gist.github.com/showuon/64c4dcecb238a317bdbdec8db17fd494

===
update Feb. 14

After further investigating the logs, I think the reason the replica is not 
added back into the ISR is that the alterPartition request got a non-retriable 
error from the controller:


{code:java}
Failed to alter partition to PendingExpandIsr(newInSyncReplicaId=0, 
sentLeaderAndIsr=LeaderAndIsr(leader=1, leaderEpoch=4, 
isrWithBrokerEpoch=List(BrokerState(brokerId=1, brokerEpoch=-1), 
BrokerState(brokerId=2, brokerEpoch=-1), BrokerState(brokerId=0, 
brokerEpoch=-1)), leaderRecoveryState=RECOVERED, partitionEpoch=7), 
leaderRecoveryState=RECOVERED, 
lastCommittedState=CommittedPartitionState(isr=Set(1, 2), 
leaderRecoveryState=RECOVERED)) because the partition epoch is invalid. 
Partition state may be out of sync, awaiting new the latest metadata. 
(kafka.cluster.Partition) 
[zk-broker-1-to-controller-alter-partition-channel-manager]
{code}

Since it's a non-retriable error, the broker keeps the state as pending and 
waits for a later LeaderAndIsr update, as described 
[here|https://github.com/apache/kafka/blob/d24abe0edebad37e554adea47408c3063037f744/core/src/main/scala/kafka/cluster/Partition.scala#L1876C1-L1876C41].
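
To make that concrete, here is a rough, hypothetical sketch of the flow 
(simplified names loosely borrowed from the log line above; this is not the 
actual Partition.scala code):

{code:scala}
// Hypothetical, simplified sketch of the flow described above. This is NOT the
// real kafka.cluster.Partition code; it only illustrates why the pending state
// is kept after a non-retriable AlterPartition error.
object AlterPartitionSketch {

  case class CommittedPartitionState(isr: Set[Int])
  case class PendingExpandIsr(newInSyncReplicaId: Int,
                              lastCommittedState: CommittedPartitionState)

  // Only a few errors cause the broker to resend the AlterPartition request.
  private def isRetriable(error: String): Boolean = error match {
    case "NOT_CONTROLLER" | "REQUEST_TIMED_OUT" => true
    case _                                      => false // e.g. "partition epoch is invalid"
  }

  /** Returns true if the request should be resent. Either way the pending state
   *  is kept until a metadata / LeaderAndIsr update commits or reverts it. */
  def handleAlterPartitionError(pending: PendingExpandIsr, error: String): Boolean =
    if (isRetriable(error)) {
      true // resend the AlterPartition request with the same pending state
    } else {
      // Non-retriable, e.g. an invalid partition epoch: keep the pending state
      // and wait for a later LeaderAndIsr update with the new partition epoch.
      println(s"Failed to alter partition to $pending because of $error; awaiting latest metadata")
      false
    }
}
{code}

The point is that nothing on the broker side retries the expansion after this 
error; progress depends entirely on a later metadata/LeaderAndIsr update 
arriving, which ties into the questions below.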

Log analysis: https://gist.github.com/showuon/5514cbb995fc2ae6acd5858f69c137bb

So the questions become:
1. Why does the controller increase the partition epoch?
2. When the leader receives the LeaderAndIsr request from the controller, it 
ignores the request because the leader epoch is identical, even though the 
partition epoch has been updated. Is this behavior expected? Will it affect 
later alterPartition requests? (A rough sketch of this check follows below.)
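
To make question 2 concrete, here is a hypothetical sketch of the comparison the 
logs suggest (simplified names; not the actual ReplicaManager/Partition code):

{code:scala}
// Hypothetical sketch of the leader-epoch check behind question 2. This is NOT
// the real ReplicaManager/Partition code; it only illustrates what the logs suggest.
object LeaderAndIsrCheckSketch {

  case class PartitionState(leaderEpoch: Int, partitionEpoch: Int, isr: Set[Int])

  /** Returns the partition state the broker ends up with after a LeaderAndIsr request. */
  def handleLeaderAndIsr(current: PartitionState, request: PartitionState): PartitionState =
    if (request.leaderEpoch > current.leaderEpoch) {
      // Newer leader epoch: apply the new leader, ISR and partition epoch.
      request
    } else {
      // Identical (or older) leader epoch: the request is ignored, so the broker
      // keeps its old partition epoch even though the controller has bumped it.
      current
    }
}
{code}

If that is what happens here, the leader is left with the stale partition epoch 
(7 in the log above), and every subsequent AlterPartition it sends would be 
rejected the same way.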


> replica keep out-of-sync after migrating broker to KRaft
> --------------------------------------------------------
>
>                 Key: KAFKA-16247
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16247
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.7.0
>            Reporter: Luke Chen
>            Priority: Major
>         Attachments: KAFKA-16247.zip
>
>


