[ 
https://issues.apache.org/jira/browse/KAFKA-19785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18042713#comment-18042713
 ] 

David commented on KAFKA-19785:
-------------------------------

After reviewing the issue, I identified a section that might be contributing to 
the problem.

*Potentially related code:*

In {{raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java}}, method 
{{maybeTransition}} ([line 2525|https://github.com/apache/kafka/blob/985bc99521dd22bbf620591b8db8613c54f596b2/raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java#L2525]):
 The condition 
{code:java}
if (!hasConsistentLeader(epoch, leaderId)) return;
{code} 
appears to be too strict. On the first valid leader discovery ({{currentLeader}} 
is empty and the epoch matches), {{hasConsistentLeader}} returns false, and 
execution takes the path that throws the {{IllegalStateException}} at line 2526. 
The raft IO thread treats that exception as a fatal fault, which leaves the 
broker fenced.

*Possible fix:*

{code}
--- raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java
+++ raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java
@@ -2521,9 +2521,15 @@ private void maybeTransition(
            OptionalInt leaderId,
            int epoch,
            Endpoints leaderEndpoints,
            long currentTimeMs
        ) {
-        if (!hasConsistentLeader(epoch, leaderId))
-            return;
+        // Permit initial leader discovery: same epoch, but no current leader set.
+        if (!hasConsistentLeader(epoch, leaderId)) {
+            if (currentLeader.isEmpty() && leaderId.isPresent() && epoch == currentEpoch) {
+                // first-time leader assignment; proceed to transition logic
+            } else {
+                return;
+            }
+        }
            // Existing logic to apply state transition...
        }
{code}
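
To make the intent of the proposed guard easier to check in isolation, here is a 
minimal stand-alone sketch of the same condition. It is not Kafka code: the class 
{{LeaderDiscoverySketch}} and the method {{shouldProceed}} are hypothetical names, 
and the result of {{hasConsistentLeader}} is passed in as a boolean rather than 
re-implemented, so the sketch only models the extra branch suggested above.

```java
import java.util.OptionalInt;

/** Hypothetical stand-alone model of the proposed guard; not part of Kafka. */
final class LeaderDiscoverySketch {

    /**
     * Returns true when processing should continue to the transition logic.
     *
     * @param consistent    value hasConsistentLeader(epoch, leaderId) is assumed to return
     * @param currentLeader leader currently known locally (empty if none)
     * @param currentEpoch  epoch currently known locally
     * @param leaderId      leader carried by the request or response
     * @param epoch         epoch carried by the request or response
     */
    static boolean shouldProceed(
        boolean consistent,
        OptionalInt currentLeader,
        int currentEpoch,
        OptionalInt leaderId,
        int epoch
    ) {
        if (consistent) {
            return true;
        }
        // Proposed exception: first-time leader discovery within the same epoch.
        return currentLeader.isEmpty() && leaderId.isPresent() && epoch == currentEpoch;
    }

    public static void main(String[] args) {
        // Values from the reported exception message:
        // leader OptionalInt[3], epoch 55, current leader OptionalInt.empty, epoch 55.
        System.out.println(
            shouldProceed(false, OptionalInt.empty(), 55, OptionalInt.of(3), 55)); // true

        // A different leader is already known for the same epoch: still rejected.
        System.out.println(
            shouldProceed(false, OptionalInt.of(2), 55, OptionalInt.of(3), 55)); // false
    }
}
```

With the values from the reported exception the sketch proceeds, while a 
conflicting leader in the same epoch is still rejected. Whether the real 
{{hasConsistentLeader}} returns false for the reported case still needs to be 
confirmed against the source.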

Please let me know if this assessment aligns with your understanding.

> Two Kafka brokers were not active in 3 node cluster setup
> ---------------------------------------------------------
>
>                 Key: KAFKA-19785
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19785
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Sravani
>            Priority: Major
>
> Hi Team,
> We are facing a Kafka issue in which two of the Kafka brokers were fenced and 
> Kafka was unable to process messages. We are using Kafka 4.0.0. 
> The errors are shown below.
>  
> Sep 22 09:41:42 s99ossv152 kafka[42245]: [2025-09-22 07:41:42,419] ERROR Encountered fatal fault: Unexpected error in raft IO thread (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
> Sep 22 09:41:42 s99ossv152 kafka[42245]: java.lang.IllegalStateException: Received request or response with leader OptionalInt[3] and epoch 55 *which is inconsistent with current leader* OptionalInt.empty and epoch 55
> Sep 22 09:41:42 s99ossv152 kafka[42245]:     at org.apache.kafka.raft.KafkaRaftClient.maybeTransition(KafkaRaftClient.java:2528) ~[kafka-raft-4.0.0.jar:?]
> Sep 22 09:41:42 s99ossv152 kafka[42245]:     at org.apache.kafka.raft.KafkaRaftClient.maybeHandleCommonResponse(KafkaRaftClient.java:2484) ~[kafka-raft-4.0.0.jar:?]
> Sep 22 09:41:42 s99ossv152 kafka[42245]:     at org.apache.kafka.raft.KafkaRaftClient.handleFetchResponse(KafkaRaftClient.java:1707) ~[kafka-raft-4.0.0.jar:?]
> Sep 22 09:41:42 s99ossv152 kafka[42245]:     at org.apache.kafka.raft.KafkaRaftClient.handleResponse(KafkaRaftClient.java:2568) ~[kafka-raft-4.0.0.jar:?]
> Sep 22 09:41:42 s99ossv152 kafka[42245]:     at org.apache.kafka.raft.KafkaRaftClient.handleInboundMessage(KafkaRaftClient.java:2724) ~[kafka-raft-4.0.0.jar:?]
> Sep 22 09:41:42 s99ossv152 kafka[42245]:     at org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:3460) ~[kafka-raft-4.0.0.jar:?]
> Sep 22 09:41:42 s99ossv152 kafka[42245]:     at org.apache.kafka.raft.KafkaRaftClientDriver.doWork(KafkaRaftClientDriver.java:64) [kafka-raft-4.0.0.jar:?]
> Sep 22 09:41:42 s99ossv152 kafka[42245]:     at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:136) [kafka-server-common-4.0.0.jar:?]
> The metrics below show a FencedBrokerCount of 2.0:
> kafka_controller_KafkaController_Value{name="ActiveBrokerCount",} 1.0
> kafka_controller_KafkaController_Value{name="GlobalTopicCount",} 23.0
> kafka_controller_KafkaController_Value{name="FencedBrokerCount",} 2.0
> Please help us to resolve this issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
