I've been doing some testing with an async producer. If I start the producer with no zk cluster present, it does what I expect: it waits a limited time looking for the zk cluster, gives up after the zk.connectiontimeout.ms setting (6000ms by default), and fails to send a message. However, if I start with zk up and a good connection to both zk and kafka, and then shut down the zk cluster, the producer never seems to stop accepting messages to send.
As long as kafka stays up and running, even with zk no longer available, my producer sends messages and my consumer can consume them. However, if I then stop kafka as well, my producer happily keeps accepting messages, and calls to producer.send() never fail. It's clearly no longer able to send any messages at that point, so I assume it will eventually just start dropping messages on the floor?

I would have expected that once both zk and kafka are unavailable, things would revert to the initial startup case, where it tries for 6000ms and then throws an exception on send.

Thoughts? What's the expected behavior for an async producer when the buffered messages can't be sent? I think it's fine if they are just lost, but should it be possible to block further messages from being accepted once the system has detected a problem communicating with zk/kafka?

Also, if I cleanly shut down an async producer (e.g. call producer.close()), should it make a best effort to send out any buffered messages before shutting down? Or will it quit immediately, dropping any buffered messages on the floor?

Jason
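
P.S. To make the two questions concrete, here is a minimal Java sketch of the behavior I would expect. It is not Kafka's code and says nothing about how the producer is actually implemented: BoundedAsyncBuffer, send, and drainOnClose are hypothetical names, and a plain ArrayBlockingQueue stands in for the producer's internal buffer. The assumption is that nothing drains the buffer while zk/kafka are unreachable, so the buffer fills up and send() starts reporting failure after a bounded wait.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Illustrative model only; a hypothetical stand-in for an async producer's buffer. */
class BoundedAsyncBuffer {
    private final BlockingQueue<String> queue;
    private final long enqueueTimeoutMs;

    BoundedAsyncBuffer(int capacity, long enqueueTimeoutMs) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.enqueueTimeoutMs = enqueueTimeoutMs;
    }

    /**
     * Tries to buffer a message. Returns false if the buffer stayed full
     * for the whole timeout, i.e. the caller learns the send did not succeed.
     */
    boolean send(String message) {
        try {
            return queue.offer(message, enqueueTimeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and report failure.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /**
     * Best-effort close: removes everything still buffered, in FIFO order,
     * and returns it so it could be flushed rather than silently dropped.
     */
    List<String> drainOnClose() {
        List<String> pending = new ArrayList<>();
        queue.drainTo(pending);
        return pending;
    }
}
```

With a capacity of 2 and nothing consuming from the queue, the first two send() calls return true and the third returns false once the timeout elapses. That is the "fail after a bounded wait" behavior I was expecting from producer.send(). Likewise, drainOnClose() hands back the two buffered messages, which is what I mean by a best-effort flush on close.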