[ 
https://issues.apache.org/jira/browse/KAFKA-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4460:
-------------------------------
    Labels: reliability  (was: )

> Consumer stops getting messages when partition leader dies
> ----------------------------------------------------------
>
>                 Key: KAFKA-4460
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4460
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.10.0.1
>            Reporter: Bernhard Bonigl
>              Labels: reliability
>
> I have a setup consisting of 2 Kafka brokers (0 and 1) using ZooKeeper, a 
> Spring Boot application with producers, and a Spring Boot application with 
> consumers.
> The topic has 5 partitions and a replication factor of 2. Both brokers are in 
> sync, and the partitions have alternating leaders (although it doesn't matter).
> The Spring Boot Kafka configuration is set up as follows:
> {code}
> kafka.address: localhost:9092,localhost:9093
> kafka.numberOfConsumers: 20
> {code}
> Where Broker 0 uses port 9092 and Broker 1 uses port 9093.
> ----
> When events are sent, they are consumed just fine. When Broker 0 is killed, all 
> partitions get Broker 1 as their leader; however, the consumers stop consuming 
> events until Broker 0 is back. This happens nearly every time; it usually 
> takes at most 3 attempts of alternately killing the leading broker to 
> reach the error state.
> The console log is spammed by the coordinators. It looks like the 
> coordinator representing Broker 0 is marked as dead, then instantly 
> rediscovered and used again, many times over, and only at the end is the other 
> broker discovered. When the switch works, the log is only minimally spammed 
> and the other broker is discovered very quickly.
> This gist contains the log of the application when the problem occurs. The 
> first line is one of our own log entries, indicating a successfully consumed 
> message. After that, Broker 0 (localhost:9092) is killed, and you can see the 
> log spam described above. At the end, localhost:9093 is discovered; however, 
> no further messages are consumed. After that I killed the application.
> ----
> I also discovered this unresolved Stack Overflow question, which seems to 
> describe the same problem.
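
To reproduce the report outside Spring Boot, the configuration quoted above could be approximated with a plain consumer configuration. This is a minimal sketch, not the reporter's actual code: the class name, the helper `consumerProps`, the group id `kafka-4460-repro`, and the choice of string deserializers are all assumptions. Only standard Kafka consumer property keys are used, as plain strings, so the snippet compiles without the client library.

```java
import java.util.Properties;

// Hypothetical helper, not taken from the reporter's application.
class ConsumerConfigSketch {

    // Builds consumer properties that list both brokers from the report,
    // so the client can bootstrap from either one.
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId); // assumed group id
        // Assumed: string keys and values; the report does not say.
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props =
                consumerProps("localhost:9092,localhost:9093", "kafka-4460-repro");
        // Print the resulting configuration for inspection.
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Passing these properties to `org.apache.kafka.clients.consumer.KafkaConsumer` and starting 20 such consumers in the same group would roughly mirror `kafka.numberOfConsumers: 20`. Whether that reproduces the stall has not been verified here.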



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
