[ https://issues.apache.org/jira/browse/KAFKA-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
barney updated KAFKA-1832:
--------------------------
    Description: 
h3. How to reproduce the problem:
* producer configuration:
** producer.type=async
** metadata.broker.list=not.existed.com:9092
Make sure the host 'not.existed.com' does not exist in the DNS server or in /etc/hosts;
* send a lot of messages continuously using the above producer

After a while this causes '*java.net.SocketException: Too many open files*'. You can also use '*lsof -p $pid|wc -l*' to watch the count of open files, which keeps increasing until it reaches the system limit (check it with 'ulimit -n').

h3. Problem cause:
{code:title=kafka.network.BlockingChannel|borderStyle=solid}
channel.connect(new InetSocketAddress(host, port))
{code}
This line throws '*java.nio.channels.UnresolvedAddressException*' when the broker host does not resolve, and at that point the field '*connected*' is still false.
In *kafka.producer.SyncProducer*, 'disconnect()' does not invoke 'blockingChannel.disconnect()' because 'blockingChannel.isConnected' is false, which means the FileDescriptor is created but never closed.

h3. More:
When the broker is a non-existent IP (for example: metadata.broker.list=1.1.1.1:9092) instead of a non-existent host, the problem does not appear.
In SocketChannelImpl.connect(), 'Net.checkAddress()' is outside the try-catch block while 'Net.connect()' is inside it; that makes the difference.

h3. Temporary Solution:
{code:title=kafka.network.BlockingChannel|borderStyle=solid}
try {
  channel.connect(new InetSocketAddress(host, port))
} catch {
  case e: UnresolvedAddressException =>
    // close the channel so its file descriptor is released before rethrowing
    disconnect()
    throw e
}
{code}


> Async Producer will cause 'java.net.SocketException: Too many open files' when broker host does not exist
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1832
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1832
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>    Affects Versions: 0.8.1, 0.8.1.1
>         Environment: linux
>            Reporter: barney
>            Assignee: Jun Rao
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
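The leaked-descriptor behaviour described in the issue can be reproduced outside Kafka with plain java.nio. This is a minimal sketch, not Kafka code: the class name and the 'not.existed.invalid' host are illustrative (the '.invalid' TLD is reserved and never resolves, standing in for metadata.broker.list=not.existed.com:9092).

```java
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

public class FdLeakDemo {
    public static void main(String[] args) throws Exception {
        SocketChannel channel = SocketChannel.open(); // allocates a file descriptor
        boolean stillOpen = false;
        try {
            // The InetSocketAddress fails to resolve, so connect() throws
            // UnresolvedAddressException without ever marking the channel connected
            channel.connect(new InetSocketAddress("not.existed.invalid", 9092));
        } catch (UnresolvedAddressException e) {
            stillOpen = channel.isOpen(); // true: the descriptor was never released
        }
        System.out.println("descriptor still open after failure: " + stillOpen);
        channel.close(); // the explicit close that the proposed fix adds
    }
}
```

Run in a loop without the final close(), each iteration leaks one descriptor, which is why 'lsof -p $pid|wc -l' climbs until the 'ulimit -n' limit is hit.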