Hello, I'm trying to run a single Kafka broker with a few topics: 1 broker, 1 partition per topic, 1 replica. I've been using the spotify/kafka Docker Hub image, which apparently just downloads a Kafka release (0.8.2.1 in my case) and starts it with the default config plus advertised host settings added.
When I start Kafka like this, it works fine for a number of days. Occasionally, and seemingly at random, it enters a state where my clients receive LeaderNotAvailable exceptions for all topics. Once the Kafka server enters this state, I haven't found any way to get it back to a healthy state. If I restart the server, it immediately works fine again for a few days.

The behavior is identical whether I run on my development laptop or on Amazon's ECS service. I have a feeling that it happens more often on my laptop when I put it to sleep (so VirtualBox and the Docker daemon inside it might be affected somehow), but over the past few weeks no such failure has happened, despite daily usage and the laptop sleeping.

I googled a bit, and it seems to happen when Kafka can't reach itself through the address specified as the advertised host. I've verified that the host is available (i.e. I can connect to it using those settings), and all DNS/networking/etc. seems to work fine. For example, I can docker exec into the container and use telnet to reach ZooKeeper's port 2181 or Kafka's port 9092, using the addresses from the server.properties file.

I also tried running kafka-preferred-replica-election, which succeeds on the first try and says that the election process has started for all topics. However, that process apparently never finishes, so subsequent executions of the command abort because an election is already in progress.

I've checked all the logs from Kafka and ZooKeeper, and there is nothing alarming there either.

Any idea where I could dig next? How can I troubleshoot this when it happens again? What should I check or execute?

PS. While I consider myself relatively strong in the devops area, my experience with Kafka is very minimal, so please comment even on the most basic details, as I'm likely to miss them.

--
Best regards from
Kamil Burzynski