Did you have unclean leader election enabled here?

best,
Colin

On Mon, Apr 7, 2025, at 11:49, Julian Bergner wrote:
> Hi,
>
> We primarily operate a Kafka cluster consisting of 3 brokers (IDs: 4,
> 5, 6) running KRaft in version 3.9.0. However, we tested the same
> scenario in a ZooKeeper-managed cluster (3.9.0) to confirm behaviour
> consistency and identified a difference between the two.
>
> Consider the following scenario:
>
> * A topic (foo) with 1 partition and replication factor 2
> * Initial ISR for partition 0 is [6, 5] (leader is broker 6)
>
> When performing a partition reassignment from [6, 5] to [4, 5]
> (removing the current leader 6 and introducing broker 4, previously not
> part of ISR, as the new leader), we observe the following in KRaft:
>
> * An "UNCLEAN partition change" event is logged, despite having
>   unclean.leader.election.enable explicitly set to false on all brokers
>   and controllers.
> * The metric
>   kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec
>   increments continuously and does not reset to 0.
>
> Relevant logs:
>
> DEBUG [QuorumController id=3] Node 6 has altered ISR for foo-0 to [5,
> 4]. (org.apache.kafka.controller.ReplicationControlManager)
>
> INFO [QuorumController id=3] AlterPartition request from node 6 for
> foo-0 completed the ongoing partition reassignment and triggered a
> leadership change. Returning NEW_LEADER_ELECTED.
> (org.apache.kafka.controller.ReplicationControlManager)
>
> INFO [QuorumController id=3] UNCLEAN partition change for foo-0 with
> topic ID 0nBbSaN0QWy_hmOnfNLNrg: replicas: [4, 5, 6] -> [4, 5],
> directories: [fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A,
> Osd62auxOyraCSUnTJXWTw] -> [fwoolkW969wV61_D5mIIkg,
> TFl5RsmVwdUxDjTwUb202A], isr: [6, 5] -> [5, 4], removingReplicas: [6]
> -> [], addingReplicas: [4] -> [], leader: 6 -> 4, leaderEpoch: 1 -> 2,
> partitionEpoch: 3 -> 4
> (org.apache.kafka.controller.ReplicationControlManager)
>
> INFO [QuorumController id=3] Replayed partition assignment change
> PartitionChangeRecord(partitionId=0, topicId=0nBbSaN0QWy_hmOnfNLNrg,
> isr=[5, 4], leader=4, replicas=[4, 5], removingReplicas=[],
> addingReplicas=[], leaderRecoveryState=-1,
> directories=[fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A],
> eligibleLeaderReplicas=null, lastKnownElr=null) for topic foo
> (org.apache.kafka.controller.ReplicationControlManager)
>
> Performing this exact same reassignment test in a ZooKeeper-managed
> cluster did not result in any increment in the
> UncleanLeaderElectionsPerSec metric, nor was any similar "UNCLEAN
> partition change" log observed.
>
> Our expectation was that with unclean.leader.election.enable set to
> false, the controller should prevent any unclean leader elections and
> ISR should only include replicas that were previously synchronized.
>
> Could you confirm if this behaviour difference is expected in KRaft or
> if it might indicate an issue or misconfiguration?
>
> Thanks!
> Julian
>
> ________________________________
> Ultra Tendency International GmbH - Amtsgericht Stendal: HRB 26409 -
> Geschäftsführer/CEO: Dr. Robert Neumann
> August-Bebel-Str. 46, 39326 Colbitz, Germany -
> https://ultratendency.com - i...@ultratendency.com
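
For anyone trying to reproduce the scenario above, here is a minimal sketch of the reassignment plan for foo-0, in the JSON layout that kafka-reassign-partitions.sh reads via --reassignment-json-file. The helper names build_reassignment and leader_in_isr are hypothetical and exist only for this illustration. The membership check at the end only restates the ISR and leader values from the quoted "UNCLEAN partition change" log line; it is not a claim about how the controller decides that a change is unclean.

```python
import json


def build_reassignment(topic, partition, replicas):
    """Build a reassignment plan for one partition (hypothetical helper)."""
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        ],
    }


def leader_in_isr(leader, isr):
    """Return True if the given broker ID appears in the ISR list."""
    return leader in isr


# Target assignment from the report: move foo-0 from [6, 5] to [4, 5].
plan = build_reassignment("foo", 0, [4, 5])
print(json.dumps(plan, indent=2))

# Values copied from the quoted log line: isr [6, 5] -> [5, 4], leader 6 -> 4.
old_isr, new_isr, new_leader = [6, 5], [5, 4], 4
print("new leader in old ISR:", leader_in_isr(new_leader, old_isr))
print("new leader in new ISR:", leader_in_isr(new_leader, new_isr))
```

Saving the printed JSON to a file and passing it to the tool with --execute should start the same reassignment, assuming a topic named foo with the initial assignment described in the report.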