Yes, I gave it several minutes.
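
For reference, here is the quick check I run before restarting the producer, to see whether a leader has actually been elected for every partition (a minimal sketch; it assumes ZooKeeper on localhost:2181 and the test2 topic from the thread below):

  # Verify that every partition of "test2" has a live leader before producing.
  ./kafka-topics.sh --describe --topic test2 --zookeeper localhost:2181

  # Healthy output shows a broker id in the Leader column for each partition:
  #   Topic: test2  Partition: 1  Leader: 0  Replicas: 0,1  Isr: 0,1
  # A missing leader shows up as "Leader: -1" (or "none", depending on the
  # version); producing while that is the case triggers
  # LeaderNotAvailableException.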
On Sat, Jun 14, 2014 at 2:18 PM, Michael G. Noll <mich...@michael-noll.com> wrote:

> Have you given Kafka some time to re-elect a new leader for the
> "missing" partition when you re-try steps 1-5?
>
> See here:
>
> > If you do, you should be able to go through steps
> > 1-8 without seeing LeaderNotAvailableExceptions (you may need to give
> > Kafka some time to re-elect the remaining, second broker as the new
> > leader for the first broker's partitions though).
>
> Best,
> Michael
>
>
> On 06/12/2014 08:43 PM, Prakash Gowri Shankor wrote:
> > So if we go back to the 2-broker case, I tried your suggestion with
> > replication-factor 2:
> >
> > ./kafka-topics.sh --topic test2 --create --partitions 3 --zookeeper
> > localhost:2181 --replication-factor 2
> >
> > When I repeat steps 1-5 I still see the exception. When I go to step 8
> > (back to 2 brokers), I don't see it.
> > Here is my topic description:
> >
> > ./kafka-topics.sh --describe --topic test2 --zookeeper localhost:2181
> >
> > Topic:test2  PartitionCount:3  ReplicationFactor:2  Configs:
> >   Topic: test2  Partition: 0  Leader: 1  Replicas: 1,0  Isr: 1,0
> >   Topic: test2  Partition: 1  Leader: 0  Replicas: 0,1  Isr: 0,1
> >   Topic: test2  Partition: 2  Leader: 1  Replicas: 1,0  Isr: 1,0
> >
> >
> > On Wed, Jun 11, 2014 at 3:20 PM, Michael G. Noll <
> > michael+st...@michael-noll.com> wrote:
> >
> >> In your second case (1-broker cluster and putting your laptop to sleep)
> >> these exceptions should be transient and disappear after a while.
> >>
> >> In the logs you should see ZK session expirations (hence the
> >> initial/transient exceptions, which in this case are expected and ok),
> >> followed by new ZK sessions being established.
> >>
> >> So this case is (or should be?) very different from your case number 1.
> >>
> >> --Michael
> >>
> >>
> >>> On 11.06.2014, at 23:13, Prakash Gowri Shankor <
> >> prakash.shan...@gmail.com> wrote:
> >>>
> >>> Thanks for your response, Michael.
> >>>
> >>> In step 3, I am actually stopping the entire cluster and restarting it
> >>> without the 2nd broker. But I see your point. When I look in
> >>> /tmp/kafka-logs-2 (which is the log dir for the 2nd broker), I see it
> >>> holds test2-1 (i.e., the 1st partition of the test2 topic).
> >>> For /tmp/kafka-logs (which is the log dir for the first broker), I see
> >>> it holds test2-0 and test2-2 (the 0th and 2nd partitions of the test2
> >>> topic).
> >>> So it would seem that Kafka is missing the leader for partition 1 and
> >>> hence throwing the exception on the producer side.
> >>> Let me try your replication suggestion.
> >>>
> >>> While all of the above might explain the exception in the case of 2
> >>> brokers, there are still times when I see it with just a single broker.
> >>> In this case, I start from a normal working cluster with 1 broker only.
> >>> Then I put my machine into sleep/hibernation. On wake, I shut down
> >>> the cluster (for sanity) and restart.
> >>> On restart, I start seeing this exception. In this case I only have one
> >>> broker. I still create the topic the way I described earlier.
> >>> I understand this is not the ideal production topology, but it's
> >>> annoying to see it during development.
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Wed, Jun 11, 2014 at 1:40 PM, Michael G. Noll <
> >> mich...@michael-noll.com> wrote:
> >>>
> >>>> Prakash,
> >>>>
> >>>> you are configuring the topic with a replication factor of only 1,
> >>>> i.e. no additional replica beyond "the original one". This
> >>>> replication setting of 1 means that only one of the two brokers will
> >>>> ever host the (single) replica -- which is implied to also be the
> >>>> leader in-sync replica -- of a given partition.
> >>>>
> >>>> In step 3 you are disabling one of the two brokers. Because this
> >>>> stopped broker is the only broker that hosts one or more of the 3
> >>>> partitions you configured (I can't tell which partition(s) it is, but
> >>>> you can find out by --describe'ing the topic), your Kafka cluster --
> >>>> which is now running in a degraded state -- will miss the leader of
> >>>> those affected partitions. And because you set the replication factor
> >>>> to 1, the remaining, second broker will never take over the
> >>>> leadership of those partitions from the stopped broker. Hence you
> >>>> will keep getting the LeaderNotAvailableExceptions until you restart
> >>>> the stopped broker in step 7.
> >>>>
> >>>> So to me it looks as if the behavior of Kafka is actually correct and
> >>>> as expected.
> >>>>
> >>>> If you want to "rectify" your test setup, try increasing the
> >>>> replication factor from 1 to 2. If you do, you should be able to go
> >>>> through steps 1-8 without seeing LeaderNotAvailableExceptions (you
> >>>> may need to give Kafka some time to re-elect the remaining, second
> >>>> broker as the new leader for the first broker's partitions though).
> >>>>
> >>>> Hope this helps,
> >>>> Michael
> >>>>
> >>>>
> >>>>
> >>>>> On 06/11/2014 07:49 PM, Prakash Gowri Shankor wrote:
> >>>>> Yes, here are the steps:
> >>>>>
> >>>>> Create the topic as: ./kafka-topics.sh --topic test2 --create
> >>>>> --partitions 3 --zookeeper localhost:2181 --replication-factor 1
> >>>>>
> >>>>> 1) Start the cluster with 2 brokers, 3 consumers.
> >>>>> 2) Don't start any producer.
> >>>>> 3) Shut down the cluster and disable one broker from starting.
> >>>>> 4) Restart the cluster with 1 broker, 3 consumers.
> >>>>> 5) Start the producer and send messages. I see this exception.
> >>>>> 6) Shut down the cluster.
> >>>>> 7) Enable the 2nd broker.
> >>>>> 8) Restart the cluster with 2 brokers, 3 consumers, and the one
> >>>>> producer, and send messages. Now I don't see the exception.
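
P.S. One more thing that has helped me while testing: the 0.8-era producer
can be told to retry instead of failing fast while a leader election is in
flight. A sketch of the relevant producer.properties settings -- the values
are guesses for a dev box, and the second broker's port (9093) is an
assumption about the local setup:

  # Let the producer ride out transient LeaderNotAvailableExceptions
  # while a new leader is being elected (property names are from the
  # 0.8 producer config).
  # Both local brokers; port 9093 for the second broker is an assumption.
  metadata.broker.list=localhost:9092,localhost:9093
  # Default is 3; retry long enough for the election to complete.
  message.send.max.retries=10
  # Default is 100 ms; back off a bit longer between attempts.
  retry.backoff.ms=500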