[
https://issues.apache.org/jira/browse/KAFKA-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gwen Shapira reassigned KAFKA-1082:
-----------------------------------
Assignee: Gwen Shapira
> zkclient dies after UnknownHostException in zk reconnect
> --------------------------------------------------------
>
> Key: KAFKA-1082
> URL: https://issues.apache.org/jira/browse/KAFKA-1082
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.7.2, 0.8.0
> Reporter: Anatoly Fayngelerin
> Assignee: Gwen Shapira
> Attachments: KAFKA-1082.patch
>
>
> Moving this here from the dev list:
> I've run into the following issue with the Kafka server. The zkclient lib
> seems to die silently if there is an UnknownHostException (or any IOException)
> while reconnecting the ZK session. I've filed a bug about this with the
> zkclient lib (https://github.com/sgroschupf/zkclient/issues/23). The
> ramifications for Kafka were the silent loss of all ephemeral nodes
> associated with the affected process.
> It is fairly easy to reproduce this locally using the following steps:
> -- Configure a local Kafka broker to connect to a local ZK instance using a
> DNS alias (e.g. add "127.0.0.1 kafka-test-dns" to your /etc/hosts).
> -- Start the broker, observe that ephemeral nodes have been added to ZK
> -- Suspend the broker process, preventing it from sending heartbeats to the
> ZK instance. Observe the loss of ephemeral nodes in ZK.
> -- Remove the DNS alias (e.g. comment out the /etc/hosts line).
> -- Upon resuming the broker, the UnknownHostException is logged. After this
> point, the server cannot re-establish its ZK connection. Re-enabling the
> alias, for example, does not resume normal operation. The broker continues
> accepting requests, without participating in the ZK protocols.
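
The failure mode described above (a reconnect attempt abandoned for good after one failed hostname lookup) can be contrasted with a loop that treats UnknownHostException as transient. The following is only a minimal sketch under stated assumptions: it is not the attached KAFKA-1082.patch and it does not use any zkclient API. The class name ReconnectSketch, the Resolver interface, and the resolveWithRetry method are hypothetical names introduced for illustration, and the lookup is simulated so that the alias appears to come back on the third attempt.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ReconnectSketch {

    /** Hypothetical abstraction over the DNS lookup so a failure can be simulated. */
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /**
     * Retries the lookup, treating UnknownHostException as transient.
     * Returns the resolved address, or rethrows the last failure once
     * maxAttempts lookups have failed.
     */
    static InetAddress resolveWithRetry(Resolver resolver, String host,
                                        int maxAttempts, long backoffMs)
            throws UnknownHostException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return resolver.resolve(host);
            } catch (UnknownHostException e) {
                // Log and keep going instead of letting the reconnect die here.
                last = e;
                System.err.println("attempt " + attempt + " failed to resolve " + host);
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulate the alias being absent for two lookups and then restored.
        int[] calls = {0};
        Resolver flaky = host -> {
            if (++calls[0] < 3) {
                throw new UnknownHostException(host);
            }
            // Builds the address from raw bytes; no real DNS lookup is performed.
            return InetAddress.getByAddress(host, new byte[] {127, 0, 0, 1});
        };
        InetAddress addr = resolveWithRetry(flaky, "kafka-test-dns", 5, 10L);
        System.out.println("resolved after " + calls[0] + " attempts: "
                + addr.getHostAddress());
    }
}
```

In a real client the resolver would presumably be InetAddress::getByName and the retry would wrap the whole session re-establishment, with a backoff that suits the deployment. Whether the retry should be bounded or unbounded is a design choice this sketch leaves open.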