Re: topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread Ismael Juma
There are indeed some known issues in the Controller that require care to avoid. Onur has recently contributed a PR that simplifies the concurrency model of the Controller: https://github.com/apache/kafka/commit/bb663d04febcadd4f120e0ff5c5919ca8bf7e971 This is a good first step and will be part

Re: topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread Michal Borowiecki
Hi James, This "Cached zkVersion [x] not equal to that in zookeeper" issue bit us once in production and I found these tickets to be relevant: KAFKA-2729, KAFKA-3042, KAFKA-3083
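
To see which partitions are hitting the "Cached zkVersion" message, a broker log can be scanned with something like the following. This is only a rough sketch: the regular expression is modelled on the single log line quoted in this thread (0.10.1.0), other Kafka versions may word the message differently, and the function name `count_zk_version_mismatches` is made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the log line quoted in this thread; treat the exact
# wording as an assumption for any other Kafka version.
ZK_VERSION_MISMATCH = re.compile(
    r"Partition \[(?P<topic>[^\]]+),(?P<partition>\d+)\] on broker (?P<broker>\d+): "
    r"Cached zkVersion \[(?P<version>\d+)\] not equal to that in zookeeper"
)


def count_zk_version_mismatches(lines):
    """Count mismatch messages per (topic, partition, broker)."""
    counts = Counter()
    for line in lines:
        match = ZK_VERSION_MISMATCH.search(line)
        if match:
            key = (
                match.group("topic"),
                int(match.group("partition")),
                int(match.group("broker")),
            )
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "INFO Partition [easypost.syslog,7] on broker 1: Cached zkVersion [647] "
        "not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)",
        "INFO some unrelated log line",
    ]
    for (topic, partition, broker), n in count_zk_version_mismatches(sample).items():
        print(f"{topic}-{partition} on broker {broker}: {n} occurrence(s)")
```

In practice the `lines` argument would be an open broker log file; partitions that keep reappearing in the counts are the ones whose ISR updates are being skipped.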

Re: topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread James Brown
For what it's worth, shutting down the entire cluster and then restarting it did address this issue. I'd love anyone's thoughts on what the "correct" fix would be here. On Fri, Apr 28, 2017 at 10:58 AM, James Brown wrote: > The following is also appearing in the logs a

Re: topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread James Brown
The following is also appearing in the logs a lot, if anyone has any ideas: INFO Partition [easypost.syslog,7] on broker 1: Cached zkVersion [647] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition) On Fri, Apr 28, 2017 at 10:43 AM, James Brown

topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread James Brown
We're running 0.10.1.0 on a five-node cluster. I was in the process of migrating some topics from having two replicas to having three replicas when two of the five machines in this cluster crashed (brokers 2 and 3). After restarting them, all of the topics that were previously assigned to them are