[jira] [Commented] (KAFKA-6281) Kafka JavaAPI Producer failed with NotLeaderForPartitionException

Anil (JIRA) Mon, 04 Jun 2018 10:17:19 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500542#comment-16500542
 ]


Anil commented on KAFKA-6281:
-----------------------------

For my topic, leader is node 1. 
During the same time I got this exception again , I see in the 
server-state-change log, that the Leader has been changed to "Node 2". Is it 
due to Node 1 going down ?. 

If yes, is the same reason why I got exception "This server is not the leader 
for that partition " from Producer ?.  Here the leader is actually node 1, but 
some how, for the node 2 there's a wrong signal sent, saying that node is the 
Leader. Please clarify on this.

 

_--Exception : 2 --_
_-2018-06-04T10:15:15,522 [kafka-producer-network-thread | producer-1] ERROR 
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is 
not the leader for that topic-partition._

{color:#FF0000}stage-change.log from node 1 :{color}

_*[2018-06-04 10:14:49,272] TRACE Broker 1 received LeaderAndIsr request 
PartitionState(controllerEpoch=43, leader=2, leaderEpoch=23, isr=[2], 
zkVersion=568, replicas=[1, 2]) correlation id 1 from controller 2 epoch 43 for 
partition [MY_TOPIC_NAME,0] (state.change.logger)*_
_[2018-06-04 10:14:49,273] TRACE Broker 1 handling LeaderAndIsr request 
correlationId 1 from controller 2 epoch 43 starting the become-follower 
transition for partition [MY_TOPIC_NAME,0] (state.change.logger)_
_[2018-06-04 10:14:49,277] TRACE Broker 1 stopped fetchers as part of 
become-follower request from controller 2 epoch 43 with correlation id 1 for 
partition [MY_TOPIC_NAME,0] (state.change.logger)_
_[2018-06-04 10:14:49,282] TRACE Broker 1 truncated logs and checkpointed 
recovery boundaries for partition [MY_TOPIC_NAME,0] as part of become-follower 
request with correlation id 1 from controller 2 epoch 43 (state.change.logger)_
_[2018-06-04 10:14:49,284] TRACE Broker 1 started fetcher to new leader as part 
of become-follower request from controller 2 epoch 43 with correlation id 1 for 
partition [MY_TOPIC_NAME,0] (state.change.logger)_
_[2018-06-04 10:14:49,286] TRACE Broker 1 completed LeaderAndIsr request 
correlationId 1 from controller 2 epoch 43 for the become-follower transition 
for partition [MY_TOPIC_NAME,0] (state.change.logger)_

 

{color:#FF0000}stage-change.log from node 2 :{color}

 

_[2018-06-04 10:14:48,562] TRACE Controller 2 epoch 43 changed partition 
[MY_TOPIC_NAME,0] state from OnlinePartition to OfflinePartition 
(state.change.logger)_
_[2018-06-04 10:14:48,753] TRACE Controller 2 epoch 43 started leader election 
for partition [MY_TOPIC_NAME,0] (state.change.logger)_
_[2018-06-04 10:14:48,757] TRACE Controller 2 epoch 43 elected leader 2 for 
Offline partition [MY_TOPIC_NAME,0] (state.change.logger)_
_[2018-06-04 10:14:48,757] TRACE Controller 2 epoch 43 changed partition 
[MY_TOPIC_NAME,0] from OfflinePartition to OnlinePartition with leader 2 
(state.change.logger)_
_[2018-06-04 10:14:48,795] TRACE Controller 2 epoch 43 sending become-leader 
LeaderAndIsr request (Leader:2,ISR:2,LeaderEpoch:23,ControllerEpoch:43) to 
broker 2 for partition [MY_TOPIC_NAME,0] (state._
_change.logger)_
_[2018-06-04 10:14:48,796] TRACE Broker 2 received LeaderAndIsr request 
PartitionState(controllerEpoch=43, leader=2, leaderEpoch=23, isr=[2], 
zkVersion=568, replicas=[1, 2]) correlation id 341 from_ 
_controller 2 epoch 43 for partition [MY_TOPIC_NAME,0] (state.change.logger)_
_[2018-06-04 10:14:48,796] TRACE Broker 2 handling LeaderAndIsr request 
correlationId 341 from controller 2 epoch 43 starting the become-leader 
transition for partition [MY_TOPIC_NAME,0] (state.change_
_.logger)_
_[2018-06-04 10:14:48,798] TRACE Broker 2 stopped fetchers as part of 
become-leader request from controller 2 epoch 43 with correlation id 341 for 
partition [MY_TOPIC_NAME,0] (state.change.logger)_
_[2018-06-04 10:14:48,798] TRACE Broker 2 completed LeaderAndIsr request 
correlationId 341 from controller 2 epoch 43 for the become-leader transition 
for partition [MY_TOPIC_NAME,0] (state.change.log_
_ger)_
_[2018-06-04 10:14:48,976] TRACE Controller 2 epoch 43 changed state of replica 
1 for partition [MY_TOPIC_NAME,0] from OnlineReplica to OfflineReplica 
(state.change.logger)_
_[2018-06-04 10:14:49,251] TRACE Controller 2 epoch 43 sending become-leader 
LeaderAndIsr request (Leader:2,ISR:2,LeaderEpoch:23,ControllerEpoch:43) to 
broker 2 for partition [MY_TOPIC_NAME,0] (state._
_change.logger)_
_[2018-06-04 10:14:49,261] TRACE Controller 2 epoch 43 changed state of replica 
1 for partition [MY_TOPIC_NAME,0] from OfflineReplica to OnlineReplica 
(state.change.logger)_
_[2018-06-04 10:14:49,267] TRACE Controller 2 epoch 43 sending become-follower 
LeaderAndIsr request (Leader:2,ISR:2,LeaderEpoch:23,ControllerEpoch:43) to 
broker 1 for partition [MY_TOPIC_NAME,0] (stat_
_e.change.logger)_
_[2018-06-04 10:14:49,292] TRACE Broker 2 received LeaderAndIsr request 
PartitionState(controllerEpoch=43, leader=2, leaderEpoch=23, isr=[2], 
zkVersion=568, replicas=[1, 2]) correlation id 343 from_ 
_controller 2 epoch 43 for partition [MY_TOPIC_NAME,0] (state.change.logger)_
_[2018-06-04 10:14:49,293] WARN Broker 2 ignoring LeaderAndIsr request from 
controller 2 with correlation id 343 epoch 43 for partition [MY_TOPIC_NAME,0] 
since its associated leader epoch 23 is not hi_
_gher than the current leader epoch 23 (state.change.logger)_

> Kafka JavaAPI Producer failed with NotLeaderForPartitionException
> -----------------------------------------------------------------
>
>                 Key: KAFKA-6281
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6281
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Anil
>            Priority: Major
>         Attachments: server1-controller.log, server2-controller.log
>
>
> We are running Kafka (vesion kafka_2.11-0.10.1.0) in a 2 node cluster. We 
> have 2 producers (Java API) acting on different topics. Each topic has single 
> partition. The topic where we had this issue, has one consumer running. This 
> set up has been running fine for 3 months, and we saw this issue. All the 
> suggested cases/solutions for this issue in other forums don't seem to apply 
> for my scenario.
> Exception at producer;
> {code}
> -2017-11-25T17:40:33,035 [kafka-producer-network-thread | producer-1] ERROR 
> client.producer.BingLogProducerCallback - Encountered exception in sending 
> message ; > org.apache.kafka.common.errors.NotLeaderForPartitionException: 
> This server is not the leader for that topic-partition.
> {code}
> We haven't enabled retries for the messages, because this is transactional 
> data and we want to maintain the order.
> Producer config:
> {code}
> bootstrap.servers : server1ip:9092
> acks :all
> retries : 0
> linger.ms :0
> buffer.memory :10240000
> max.request.size :1024000
> key.serializer : org.apache.kafka.common.serialization.StringSerializer
> value.serializer : org.apache.kafka.common.serialization.StringSerializer
> {code}
> We are connecting to server1 at both producer and consumer. The controller 
> log at server2 indicates there is some shutdown happened at during sametime, 
> but I dont understand why this happened.
> {color:red}[2017-11-25 17:31:44,776] DEBUG [Controller 2]: topics not in 
> preferred replica Map() (kafka.controller.KafkaController) [2017-11-25 
> 17:31:44,776] TRACE [Controller 2]: leader imbalance ratio for broker 2 is 
> 0.000000 (kafka.controller.KafkaController) [2017-11-25 17:31:44,776] DEBUG 
> [Controller 2]: topics not in preferred replica Map() 
> (kafka.controller.KafkaController) [2017-11-25 17:31:44,776] TRACE 
> [Controller 2]: leader imbalance ratio for broker 1 is 0.000000 
> (kafka.controller.KafkaController) [2017-11-25 17:34:18,314] INFO 
> [SessionExpirationListener on 2], ZK expired; shut down all controller 
> components and try to re-elect 
> (kafka.controller.KafkaController$SessionExpirationListener) [2017-11-25 
> 17:34:18,317] DEBUG [Controller 2]: Controller resigning, broker id 2 
> (kafka.controller.KafkaController) [2017-11-25 17:34:18,317] DEBUG 
> [Controller 2]: De-registering IsrChangeNotificationListener 
> (kafka.controller.KafkaController) [2017-11-25 17:34:18,317] INFO 
> [delete-topics-thread-2], Shutting down 
> (kafka.controller.TopicDeletionManager$DeleteTopicsThread) [2017-11-25 
> 17:34:18,317] INFO [delete-topics-thread-2], Stopped 
> (kafka.controller.TopicDeletionManager$DeleteTopicsThread) [2017-11-25 
> 17:34:18,318] INFO [delete-topics-thread-2], Shutdown completed 
> (kafka.controller.TopicDeletionManager$DeleteTopicsThread) [2017-11-25 
> 17:34:18,318] INFO [Partition state machine on Controller 2]: Stopped 
> partition state machine (kafka.controller.PartitionStateMachine) [2017-11-25 
> 17:34:18,318] INFO [Replica state machine on controller 2]: Stopped replica 
> state machine (kafka.controller.ReplicaStateMachine) [2017-11-25 
> 17:34:18,318] INFO [Controller-2-to-broker-2-send-thread], Shutting down 
> (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,318] INFO 
> [Controller-2-to-broker-2-send-thread], Stopped 
> (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO 
> [Controller-2-to-broker-2-send-thread], Shutdown completed 
> (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO 
> [Controller-2-to-broker-1-send-thread], Shutting down 
> (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO 
> [Controller-2-to-broker-1-send-thread], Stopped 
> (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO 
> [Controller-2-to-broker-1-send-thread], Shutdown completed 
> (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO 
> [Controller 2]: Broker 2 resigned as the controller 
> (kafka.controller.KafkaController) [2017-11-25 17:34:18,353] DEBUG 
> [IsrChangeNotificationListener] Fired!!! 
> (kafka.controller.IsrChangeNotificationListener) [2017-11-25 17:34:18,353] 
> DEBUG [IsrChangeNotificationListener] Fired!!! 
> (kafka.controller.IsrChangeNotificationListener) [2017-11-25 17:34:18,354] 
> INFO [BrokerChangeListener on Controller 2]: Broker change listener fired for 
> path /brokers/ids with children 1,2 
> (kafka.controller.ReplicaStateMachine$BrokerChangeListener) [2017-11-25 
> 17:34:18,355] DEBUG [DeleteTopicsListener on 2]: Delete topics listener fired 
> for topics to be deleted 
> (kafka.controller.PartitionStateMachine$DeleteTopicsListener) [2017-11-25 
> 17:34:18,362] INFO [AddPartitionsListener on 2]: Partition modification 
> triggered {"version":1,"partitions":{"0":[1]}} for path /brokers/topics/ESQ 
> (kafka.controller.PartitionStateMachine$PartitionModificationsListener) 
> [2017-11-25 17:34:18,368] INFO [AddPartitionsListener on 2]: Partition 
> modification triggered {"version":1,"partitions":{"0":[1]}} for path 
> /brokers/topics/Test1 
> (kafka.controller.PartitionStateMachine$PartitionModificationsListener) 
> [2017-11-25 17:34:18,369] INFO [AddPartitionsListener on 2]: Partition 
> modification triggered {"version":1,"partitions":{"0":[2]}} for path 
> /brokers/topics/ImageQ 
> (kafka.controller.PartitionStateMachine$PartitionModificationsListener) 
> [2017-11-25 17:34:18,374] INFO [AddPartitionsListener on 2]: Partition 
> modification triggered 
> {"version":1,"partitions":{"8":[1,2],"4":[1,2],"9":[2,1],"5":[2,1],"6":[1,2],"1":[2,1],"0":[1,2],"2":[1,2],"7":[2,1],"3":[2,1]}}
>  for path /brokers/topics/NMS_NotifyQ 
> (kafka.controller.PartitionStateMachine$PartitionModificationsListener) 
> [2017-11-25 17:34:18,375] INFO [AddPartitionsListener on 2]: Partition 
> modification triggered {"version":1,"partitions":{"0":[1]}} for path 
> /brokers/topics/TempBinLogReqQ @{color}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (KAFKA-6281) Kafka JavaAPI Producer failed with NotLeaderForPartitionException

Reply via email to