Did you have unclean leader election enabled here?

best,
Colin

On Mon, Apr 7, 2025, at 11:49, Julian Bergner wrote:
> Hi,
> We primarily operate a Kafka cluster consisting of 3 brokers (IDs: 4, 
> 5, 6) running KRaft on version 3.9.0. However, we tested the same 
> scenario in a ZooKeeper-managed cluster (3.9.0) to confirm behaviour 
> consistency and identified a difference between the two.
> Consider the following scenario:
>
>   *   A topic (foo) with 1 partition and replication factor 2
>   *   Initial ISR for partition 0 is [6, 5] (leader is broker 6)
>
> When performing a partition reassignment from [6, 5] to [4, 5] 
> (removing the current leader 6 and introducing broker 4, previously not 
> part of the ISR, as the new leader), we observe the following in KRaft:
>
>   *   An "UNCLEAN partition change" event is logged, despite having 
> unclean.leader.election.enable explicitly set to false on all brokers 
> and controllers.
>   *   The metric 
> kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec 
> increments continuously and does not reset to 0.
>
> Relevant logs:
>
> DEBUG [QuorumController id=3] Node 6 has altered ISR for foo-0 to [5, 
> 4]. (org.apache.kafka.controller.ReplicationControlManager)
> INFO [QuorumController id=3] AlterPartition request from node 6 for 
> foo-0 completed the ongoing partition reassignment and triggered a 
> leadership change. Returning NEW_LEADER_ELECTED. 
> (org.apache.kafka.controller.ReplicationControlManager)
> INFO [QuorumController id=3] UNCLEAN partition change for foo-0 with 
> topic ID 0nBbSaN0QWy_hmOnfNLNrg: replicas: [4, 5, 6] -> [4, 5], 
> directories: [fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A, 
> Osd62auxOyraCSUnTJXWTw] -> [fwoolkW969wV61_D5mIIkg, 
> TFl5RsmVwdUxDjTwUb202A], isr: [6, 5] -> [5, 4], removingReplicas: [6] 
> -> [], addingReplicas: [4] -> [], leader: 6 -> 4, leaderEpoch: 1 -> 2, 
> partitionEpoch: 3 -> 4 
> (org.apache.kafka.controller.ReplicationControlManager)
> INFO [QuorumController id=3] Replayed partition assignment change 
> PartitionChangeRecord(partitionId=0, topicId=0nBbSaN0QWy_hmOnfNLNrg, 
> isr=[5, 4], leader=4, replicas=[4, 5], removingReplicas=[], 
> addingReplicas=[], leaderRecoveryState=-1, 
> directories=[fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A], 
> eligibleLeaderReplicas=null, lastKnownElr=null) for topic foo 
> (org.apache.kafka.controller.ReplicationControlManager)
>
> Performing this exact same reassignment test in a ZooKeeper-managed 
> cluster did not result in any increment in the 
> UncleanLeaderElectionsPerSec metric, nor was any similar "UNCLEAN 
> partition change" log observed.
>
> Our expectation was that with unclean.leader.election.enable set to 
> false, the controller should prevent any unclean leader elections and 
> the ISR should only include replicas that were previously synchronized.
>
> Could you confirm if this behaviour difference is expected in KRaft or 
> if it might indicate an issue or misconfiguration?
> Thanks!
> Julian
>
> ________________________________
> Ultra Tendency International GmbH - Amtsgericht Stendal: HRB 26409 - 
> Geschäftsführer/CEO: Dr. Robert Neumann
> August-Bebel-Str. 46, 39326 Colbitz, Germany - 
> https://ultratendency.com - i...@ultratendency.com
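
For anyone trying to reproduce the scenario in the quoted message, the reassignment from [6, 5] to [4, 5] can be written as a plan file for kafka-reassign-partitions.sh. The following is only a minimal sketch: the helper name build_reassignment and the file name reassign.json are hypothetical, and the original message does not say which tool or API was actually used to trigger the reassignment.

```python
import json


def build_reassignment(topic, partition, replicas):
    """Build a reassignment plan in the JSON layout read by
    kafka-reassign-partitions.sh (hypothetical helper, for illustration)."""
    return {
        "version": 1,
        "partitions": [
            # The first replica in the list is the preferred leader.
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        ],
    }


# Scenario from the quoted message: topic foo, partition 0, target replicas [4, 5].
plan = build_reassignment("foo", 0, [4, 5])
plan_json = json.dumps(plan, indent=2)
print(plan_json)

# Assumed usage, with a placeholder bootstrap address:
#   kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
#       --reassignment-json-file reassign.json --execute
```

Saving the printed JSON as reassign.json and running the command in the trailing comment against a test cluster should start the same kind of reassignment; whether the "UNCLEAN partition change" log line then appears is exactly the behaviour in question.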
