[ 
https://issues.apache.org/jira/browse/KAFKA-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461448#comment-16461448
 ] 

David Glasser commented on KAFKA-6843:
--------------------------------------

Sorry, I may be wrong here. I think the default for these properties got 
changed in Java 1.7 or so to be OK, and the real problem is a Zookeeper issue 
(ZOOKEEPER-2184) which is hopefully fixed in Kafka 1.1 (KAFKA-5473).  We are 
still on 1.0 and are planning to upgrade now.

> Document issue with DNS TTL
> ---------------------------
>
>                 Key: KAFKA-6843
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6843
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: David Glasser
>            Priority: Major
>
> We run Kafka and Zookeeper in Google Kubernetes Engine. We recently had 
> serious problems with our brokers when GKE replaced our cluster 
> (cycling both Zookeeper and Kafka in parallel).  Kafka (1.0) brokers lost the 
> ability to talk to Zookeeper, and eventually failed their controlled 
> shutdown, leading to slow startup times for the new broker and outages for 
> our system.
> We eventually tracked this down to the fact that (at least in our 
> environment) the default JVM DNS caching behavior is to cache results 
> forever.  We rely on DNS to connect to Zookeeper, and the DNS resolution 
> changes when the Zookeeper pods are replaced.
> The fix is straightforward: set the property networkaddress.cache.ttl or 
> sun.net.inetaddr.ttl to make the caching non-infinite (caching forever is the 
> documented default when a "security manager" is installed). See 
> [https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html] 
> for details.
> I think this gotcha should be documented. Probably at 
> [https://kafka.apache.org/11/documentation/#java] ? I'm happy to submit a PR 
> if people agree this is the right place.  (I suppose somehow fixing this in 
> code would be nice too.)
> By the way, if you search the Apache issue tracker for 
> [networkaddress.cache.ttl|https://issues.apache.org/jira/browse/JAMES-774?jql=text%20~%20%22%5C%22networkaddress.cache.ttl%5C%22%22],
>  you'll learn that this is a common issue faced by many Apache Java projects.
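
For anyone who wants to pin the TTL explicitly rather than rely on the JVM default, here is a minimal sketch in Java. It is an illustration only, not something Kafka or Zookeeper ships: the class name DnsCacheTtl, the helper configure() and the 30-second value are all made up for the example. It assumes the property is set before the JVM does its first hostname lookup, since the cache policy is read once.

```java
import java.security.Security;

public class DnsCacheTtl {
    // Hypothetical helper (not part of Kafka or Zookeeper): sets the JVM-wide
    // TTL, in seconds, for successful DNS lookups and returns the stored value.
    // Call it before any hostname is resolved, because the JVM reads the cache
    // policy once when its address cache is first initialized.
    static String configure(int ttlSeconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(ttlSeconds));
        return Security.getProperty("networkaddress.cache.ttl");
    }

    public static void main(String[] args) {
        // 30 seconds is an arbitrary example value, not a recommendation.
        System.out.println("networkaddress.cache.ttl=" + configure(30));
    }
}
```

The same effect should be achievable without code changes by passing the system property form on the JVM command line, for example -Dsun.net.inetaddr.ttl=30, or by editing networkaddress.cache.ttl in the JRE's java.security file.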


