Re: [PR] KAFKA-12670: KRaft support for unclean.leader.election.enable [kafka]

via GitHub Tue, 13 Aug 2024 04:09:29 -0700


soarez commented on code in PR #16866:
URL: https://github.com/apache/kafka/pull/16866#discussion_r1715073632



##########
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java:
##########
@@ -1632,21 +1632,58 @@ boolean arePartitionLeadersImbalanced() {
     }
 
     /**
-     * Attempt to elect a preferred leader for all topic partitions which have a leader that is not the preferred replica.
+     * Check if we can do an election for partitions with no leader or a leader other than the preferred one.
      *
      * The response() method in the return object is true if this method returned without electing all possible preferred replicas.
      * The quorum controller should reschedule this operation immediately if it is true.
      *
      * @return All of the election records and if there may be more available preferred replicas to elect as leader
      */
-    ControllerResult<Boolean> maybeBalancePartitionLeaders() {
+    ControllerResult<Boolean> maybeAdjustPartitionLeaders() {
         List<ApiMessageAndVersion> records = new ArrayList<>();
+        maybeTriggerUncleanLeaderElectionForLeaderlessPartitions(records, maxElectionsPerImbalance);
+        maybeTriggerLeaderChangeForPartitionsWithoutPreferredLeader(records, maxElectionsPerImbalance);
+        boolean rescheduleImmediately = records.size() >= maxElectionsPerImbalance;
+        return ControllerResult.of(records, rescheduleImmediately);
+    }
 
-        boolean rescheduleImmediately = false;
+    /**
+     * Trigger unclean leader election for partitions without leader (visiable for testing)
+     *
+     * @param records  The record list to append to.
+     */
+    void maybeTriggerUncleanLeaderElectionForLeaderlessPartitions(
+        List<ApiMessageAndVersion> records,
+        int maxElections
+    ) {
+        Iterator<TopicIdPartition> iterator = brokersToIsrs.partitionsWithNoLeader();
+        for (TopicIdPartition topicIdPartition = iterator.next();
+             iterator.hasNext() && records.size() < maxElections; ) {
+            TopicControlInfo topic = topics.get(topicIdPartition.topicId());
+            if (configurationControl.uncleanLeaderElectionEnabledForTopic(topic.name)) {
+                ApiError result = electLeader(topic.name, topicIdPartition.partitionId(),
+                        ElectionType.UNCLEAN, records);
+                if (result.error().equals(Errors.NONE)) {
+                    log.error("Triggered unclean leader election for offline partition {}-{}.",
+                            topic.name, topicIdPartition.partitionId());
+                } else if (result.error().equals(Errors.ELIGIBLE_LEADERS_NOT_AVAILABLE)) {
+                    log.warn("Cannot trigger unclean leader election for offline partition {}-{}: {}",
+                            topic.name, topicIdPartition.partitionId(), Errors.ELIGIBLE_LEADERS_NOT_AVAILABLE);
+                }
+            } else if (log.isTraceEnabled()) {
+                log.info("Cannot trigger unclean leader election for offline partition {}-{} " +

Review Comment:
   This should be logged at trace level, to match the enclosing `log.isTraceEnabled()` check.
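
A minimal sketch of the point, assuming `java.util.logging` as a stand-in for the SLF4J-style logger used in the PR (`Level.FINEST` plays the role of trace). `TraceGuardSketch`, `CapturingHandler`, and `logSkippedElection` are hypothetical names, and the message stops where the quoted diff stops. It emits the record at the same level the guard tests for:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class TraceGuardSketch {
    /** Hypothetical handler that stores records so their level can be inspected. */
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            records.add(record);
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    /** Logs at the level the guard checks, so guard and call cannot disagree. */
    static void logSkippedElection(Logger log, String topic, int partition) {
        if (log.isLoggable(Level.FINEST)) {
            log.log(Level.FINEST,
                "Cannot trigger unclean leader election for offline partition {0}-{1}",
                new Object[] {topic, partition});
        }
    }

    public static void main(String[] args) {
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false);
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);

        log.setLevel(Level.FINEST);   // "trace" enabled: one record is captured
        logSkippedElection(log, "foo", 0);

        log.setLevel(Level.INFO);     // "trace" disabled: nothing more is captured
        logSkippedElection(log, "foo", 1);

        System.out.println(handler.records.size() + " record(s), level "
            + handler.records.get(0).getLevel());
    }
}
```

With the guard and the call at different levels, as in the diff, the message would show up at INFO only when trace happens to be enabled.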



##########
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java:
##########
@@ -1632,21 +1632,58 @@ boolean arePartitionLeadersImbalanced() {
     }
 
     /**
-     * Attempt to elect a preferred leader for all topic partitions which have a leader that is not the preferred replica.
+     * Check if we can do an election for partitions with no leader or a leader other than the preferred one.
      *
      * The response() method in the return object is true if this method returned without electing all possible preferred replicas.
      * The quorum controller should reschedule this operation immediately if it is true.
      *
      * @return All of the election records and if there may be more available preferred replicas to elect as leader
      */
-    ControllerResult<Boolean> maybeBalancePartitionLeaders() {
+    ControllerResult<Boolean> maybeAdjustPartitionLeaders() {
         List<ApiMessageAndVersion> records = new ArrayList<>();
+        maybeTriggerUncleanLeaderElectionForLeaderlessPartitions(records, maxElectionsPerImbalance);
+        maybeTriggerLeaderChangeForPartitionsWithoutPreferredLeader(records, maxElectionsPerImbalance);
+        boolean rescheduleImmediately = records.size() >= maxElectionsPerImbalance;
+        return ControllerResult.of(records, rescheduleImmediately);
+    }
 
-        boolean rescheduleImmediately = false;
+    /**
+     * Trigger unclean leader election for partitions without leader (visiable for testing)
+     *
+     * @param records  The record list to append to.
+     */
+    void maybeTriggerUncleanLeaderElectionForLeaderlessPartitions(
+        List<ApiMessageAndVersion> records,
+        int maxElections
+    ) {
+        Iterator<TopicIdPartition> iterator = brokersToIsrs.partitionsWithNoLeader();
+        for (TopicIdPartition topicIdPartition = iterator.next();
+             iterator.hasNext() && records.size() < maxElections; ) {
+            TopicControlInfo topic = topics.get(topicIdPartition.topicId());
+            if (configurationControl.uncleanLeaderElectionEnabledForTopic(topic.name)) {
+                ApiError result = electLeader(topic.name, topicIdPartition.partitionId(),
+                        ElectionType.UNCLEAN, records);
+                if (result.error().equals(Errors.NONE)) {
+                    log.error("Triggered unclean leader election for offline partition {}-{}.",
+                            topic.name, topicIdPartition.partitionId());
+                } else if (result.error().equals(Errors.ELIGIBLE_LEADERS_NOT_AVAILABLE)) {
+                    log.warn("Cannot trigger unclean leader election for offline partition {}-{}: {}",
+                            topic.name, topicIdPartition.partitionId(), Errors.ELIGIBLE_LEADERS_NOT_AVAILABLE);
+                }

Review Comment:
   Could the error be of some other type? If so, should we log at ERROR level in that case?
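
One possible shape for a catch-all branch, as a sketch only. `ElectionError` is a hypothetical stand-in for the relevant subset of Kafka's `Errors`, `describe` is a made-up helper that returns the level and message instead of logging, and whether `electLeader` can actually return anything else here is the open question:

```java
final class ElectionResultLoggingSketch {
    /** Hypothetical stand-in for the subset of Kafka's Errors relevant here. */
    enum ElectionError { NONE, ELIGIBLE_LEADERS_NOT_AVAILABLE, UNKNOWN_TOPIC_OR_PARTITION }

    /** Returns "LEVEL message" so the chosen level is easy to inspect. */
    static String describe(ElectionError error, String topic, int partition) {
        String tp = topic + "-" + partition;
        switch (error) {
            case NONE:
                return "ERROR Triggering unclean leader election for offline partition " + tp + ".";
            case ELIGIBLE_LEADERS_NOT_AVAILABLE:
                return "WARN Cannot trigger unclean leader election for offline partition "
                    + tp + ": " + error;
            default:
                // Assumption: any other error is unexpected, so surface it at ERROR
                // rather than dropping it silently.
                return "ERROR Unexpected error while triggering unclean leader election "
                    + "for offline partition " + tp + ": " + error;
        }
    }

    public static void main(String[] args) {
        for (ElectionError error : ElectionError.values()) {
            System.out.println(describe(error, "foo", 0));
        }
    }
}
```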



##########
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java:
##########
@@ -1632,21 +1632,58 @@ boolean arePartitionLeadersImbalanced() {
     }
 
     /**
-     * Attempt to elect a preferred leader for all topic partitions which have a leader that is not the preferred replica.
+     * Check if we can do an election for partitions with no leader or a leader other than the preferred one.
      *
      * The response() method in the return object is true if this method returned without electing all possible preferred replicas.
      * The quorum controller should reschedule this operation immediately if it is true.
      *
      * @return All of the election records and if there may be more available preferred replicas to elect as leader
      */
-    ControllerResult<Boolean> maybeBalancePartitionLeaders() {
+    ControllerResult<Boolean> maybeAdjustPartitionLeaders() {
         List<ApiMessageAndVersion> records = new ArrayList<>();
+        maybeTriggerUncleanLeaderElectionForLeaderlessPartitions(records, maxElectionsPerImbalance);
+        maybeTriggerLeaderChangeForPartitionsWithoutPreferredLeader(records, maxElectionsPerImbalance);
+        boolean rescheduleImmediately = records.size() >= maxElectionsPerImbalance;
+        return ControllerResult.of(records, rescheduleImmediately);
+    }
 
-        boolean rescheduleImmediately = false;
+    /**
+     * Trigger unclean leader election for partitions without leader (visiable for testing)
+     *
+     * @param records  The record list to append to.
+     */
+    void maybeTriggerUncleanLeaderElectionForLeaderlessPartitions(
+        List<ApiMessageAndVersion> records,
+        int maxElections
+    ) {
+        Iterator<TopicIdPartition> iterator = brokersToIsrs.partitionsWithNoLeader();
+        for (TopicIdPartition topicIdPartition = iterator.next();
+             iterator.hasNext() && records.size() < maxElections; ) {
+            TopicControlInfo topic = topics.get(topicIdPartition.topicId());
+            if (configurationControl.uncleanLeaderElectionEnabledForTopic(topic.name)) {
+                ApiError result = electLeader(topic.name, topicIdPartition.partitionId(),
+                        ElectionType.UNCLEAN, records);
+                if (result.error().equals(Errors.NONE)) {
+                    log.error("Triggered unclean leader election for offline partition {}-{}.",

Review Comment:
   Strictly speaking, the partition change records haven't yet been committed, 
so in rare circumstances, the wording in this log line will be misleading.
   
   ```suggestion
                       log.error("Triggering unclean leader election for offline partition {}-{}.",
   ```



##########
metadata/src/main/java/org/apache/kafka/controller/ClusterControlManager.java:
##########
@@ -297,12 +310,17 @@ private ClusterControlManager(
         this.controllerRegistrations = new TimelineHashMap<>(snapshotRegistry, 0);
         this.directoryToBroker = new TimelineHashMap<>(snapshotRegistry, 0);
         this.brokerUncleanShutdownHandler = brokerUncleanShutdownHandler;
+        this.nodeId = nodeId;
     }
 
     ReplicaPlacer replicaPlacer() {
         return replicaPlacer;
     }
 
+    public int nodeId() {
+        return nodeId;
+    }

Review Comment:
   What's this for? I don't see it used anywhere.



##########
metadata/src/main/java/org/apache/kafka/controller/metrics/ControllerMetadataMetrics.java:
##########
@@ -54,6 +56,8 @@ public final class ControllerMetadataMetrics implements AutoCloseable {
         "KafkaController", "MetadataErrorCount");
     private static final MetricName ZK_MIGRATION_STATE = getMetricName(
         "KafkaController", "ZkMigrationState");
+    private static final MetricName UNCLEAN_LEADER_ELECTIONS_PER_SEC = getMetricName(
+            "ControllerStats", "UncleanLeaderElectionsPerSec");

Review Comment:
   Even though it looks inconsistent with the metrics above, this is correct. The existing metric uses a `typeName` of `ControllerStats`, not `KafkaController`.
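
For context, a sketch of why the `typeName` matters, assuming the usual `group:type=...,name=...` JMX layout and the `kafka.controller` group. `MetricNameSketch` and `mbeanName` are hypothetical helpers for illustration, not Kafka's `getMetricName`:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class MetricNameSketch {
    /** Builds a JMX object name in the "group:type=...,name=..." layout. */
    static ObjectName mbeanName(String group, String typeName, String name) {
        try {
            return new ObjectName(group + ":type=" + typeName + ",name=" + name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid metric name parts", e);
        }
    }

    public static void main(String[] args) {
        ObjectName unclean =
            mbeanName("kafka.controller", "ControllerStats", "UncleanLeaderElectionsPerSec");
        ObjectName errors =
            mbeanName("kafka.controller", "KafkaController", "MetadataErrorCount");

        // The two metrics share a group but differ in the "type" key, so a
        // dashboard querying the existing name only matches the first one.
        System.out.println(unclean.getKeyProperty("type"));
        System.out.println(errors.getKeyProperty("type"));
    }
}
```

Changing the type to `KafkaController` would publish the rate under a different object name than the one existing monitoring queries.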


