Mikaël Cluseau created KAFKA-2426:
-------------------------------------

             Summary: A Kafka node tries to connect to itself through its 
advertised hostname
                 Key: KAFKA-2426
                 URL: https://issues.apache.org/jira/browse/KAFKA-2426
             Project: Kafka
          Issue Type: Bug
          Components: network
    Affects Versions: 0.8.2.1
         Environment: Docker https://github.com/wurstmeister/kafka-docker, 
managed by a Kubernetes cluster, with an "iptables proxy".
            Reporter: Mikaël Cluseau
            Assignee: Jun Rao


Hi,

When used behind a firewall, Apache Kafka nodes try to connect to 
themselves using their advertised hostnames. This means that if you have a 
service IP managed by the Docker host using *only* iptables DNAT rules, the 
node's connection to "itself" times out.

This happens in any setup where a host DNATs the service IP to the 
instance's IP and sends the packet back out the same interface, over a Linux 
bridge port not configured in "hairpin" mode. It's because of this: 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_forward.c#n30
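
To check whether a given bridge is affected, the per-port hairpin flag can be 
read from sysfs. The following is only a minimal sketch: the helper name 
ports_without_hairpin and the bridge name "docker0" are illustrative 
assumptions, and it assumes the usual Linux layout where each bridge port 
appears under /sys/class/net/<bridge>/brif/<port>/hairpin_mode.

```python
from pathlib import Path


def ports_without_hairpin(bridge, sysfs_root="/sys/class/net"):
    """Return the names of ports on `bridge` whose hairpin_mode flag is not 1.

    Hypothetical helper for illustration; assumes the standard sysfs layout
    <sysfs_root>/<bridge>/brif/<port>/hairpin_mode.
    """
    brif = Path(sysfs_root) / bridge / "brif"
    missing = []
    for port in sorted(brif.iterdir()):
        # The kernel exposes the flag as "0" or "1" followed by a newline.
        flag = (port / "hairpin_mode").read_text().strip()
        if flag != "1":
            missing.append(port.name)
    return missing


# Example (on the Docker host, assuming the bridge is named "docker0"):
#   print(ports_without_hairpin("docker0"))
```

Any port listed by such a check would drop a DNATed packet that has to go 
back out the port it arrived on.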

The relevant part of the Kubernetes issue is here: 
https://github.com/BenTheElder/kubernetes/issues/3#issuecomment-123925060 .

The timeout means that even if a partition's leader is elected, it then 
fails to accept writes from the other members, causing a write lock and 
generating very heavy logs (as fast as Kafka usually is, but through log4j 
this time ;)).

This also means that the normal Docker case works by going through the 
userspace-proxy, which necessarily impacts performance.

The workaround for us was to add a "127.0.0.2 {advertised hostname}" entry 
to /etc/hosts in the container startup script.
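
As an illustration of that workaround, here is a minimal sketch of an 
idempotent /etc/hosts update. The function name add_hosts_entry is 
hypothetical, and reading the hostname from a KAFKA_ADVERTISED_HOST_NAME 
environment variable is an assumption about how the container is configured, 
not something confirmed for this setup.

```python
def add_hosts_entry(hosts_text, hostname, address="127.0.0.2"):
    """Return hosts_text with an "address hostname" line appended.

    Hypothetical helper for illustration. If `hostname` is already mapped
    on some line, the text is returned unchanged so repeated container
    starts do not pile up duplicate entries.
    """
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address + names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return hosts_text
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + "{} {}\n".format(address, hostname)


# Example use in a startup script (needs write access to /etc/hosts;
# the environment variable name is an assumption):
#   import os
#   name = os.environ["KAFKA_ADVERTISED_HOST_NAME"]
#   with open("/etc/hosts") as fh:
#       updated = add_hosts_entry(fh.read(), name)
#   with open("/etc/hosts", "w") as fh:
#       fh.write(updated)
```

With such an entry in place, the node resolves its own advertised hostname to 
a loopback address, so the connection to "itself" never leaves the container.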



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
