[ 
https://issues.apache.org/jira/browse/KAFKA-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-1832.
------------------------------
    Resolution: Fixed

Fixed in KAFKA-1041.

> Async Producer will cause 'java.net.SocketException: Too many open files' 
> when broker host does not exist
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1832
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1832
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>    Affects Versions: 0.8.1, 0.8.1.1
>         Environment: linux
>            Reporter: barney
>            Assignee: Jun Rao
>
> h3.How to reproduce the problem:
> * producer configuration:
> ** producer.type=async
> ** metadata.broker.list=not.existed.com:9092
> Make sure the host '*not.existed.com*' cannot be resolved through the DNS 
> server or /etc/hosts;
> * send a lot of messages continuously using the above producer
> After a while this causes '*java.net.SocketException: Too many open files*'. 
> You can also use '*lsof -p $pid|wc -l*' to watch the count of open files, 
> which keeps increasing until it reaches the system limit (check it with 
> '*ulimit -n*').
> h3.Problem cause:
> {code:title=kafka.network.BlockingChannel|borderStyle=solid} 
> channel.connect(new InetSocketAddress(host, port))
> {code}
> This line throws '*java.nio.channels.UnresolvedAddressException*' when the 
> broker host cannot be resolved, and at that point the field '*connected*' is 
> still false.
> In *kafka.producer.SyncProducer*, '*disconnect()*' does not invoke 
> '*blockingChannel.disconnect()*' because '*blockingChannel.isConnected*' is 
> false, so the file descriptor is created but never closed.
> h3.More:
> When the broker is a non-existent IP (for example: 
> metadata.broker.list=1.1.1.1:9092) instead of a non-existent host, the 
> problem does not appear.
> In *SocketChannelImpl.connect()*, '*Net.checkAddress()*' is outside the 
> try-catch block but '*Net.connect()*' is inside it. A failure in 
> '*Net.connect()*' therefore closes the channel, while the exception from 
> '*Net.checkAddress()*' leaves it open; that makes the difference.
> h3.Temporary Solution:
> {code:title=kafka.network.BlockingChannel|borderStyle=solid} 
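> try {
>     channel.connect(new InetSocketAddress(host, port))
> } catch {
>     case e: UnresolvedAddressException =>
>         // connect() failed before 'connected' was set, so clean up here
>         // to release the file descriptor before rethrowing
>         disconnect()
>         throw e
> }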
> {code}
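
The root cause and the workaround described above can be checked outside Kafka with a minimal sketch in plain Java against the JDK's NIO API. It is not Kafka's code: the class UnresolvedConnectDemo and its method names are hypothetical, and InetSocketAddress.createUnresolved() is used only to get an unresolved address without relying on a DNS lookup. The first helper reports whether the channel is still open after connect() rejects the address; the second applies the close-before-rethrow idea from the temporary solution.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

// Hypothetical demo class; not part of Kafka.
class UnresolvedConnectDemo {

    // No cleanup on failure: reports whether the channel is still open
    // right after connect() throws UnresolvedAddressException.
    static boolean staysOpenAfterFailedConnect(InetSocketAddress address) {
        try {
            SocketChannel channel = SocketChannel.open();
            try {
                channel.connect(address);
                return false; // address resolved; nothing to observe
            } catch (UnresolvedAddressException e) {
                return channel.isOpen(); // evaluated before the finally block
            } finally {
                channel.close(); // keep the demo itself from leaking
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same idea as the temporary solution: close the channel, then rethrow.
    static void connectOrClose(SocketChannel channel, InetSocketAddress address)
            throws IOException {
        try {
            channel.connect(address);
        } catch (UnresolvedAddressException e) {
            channel.close();
            throw e;
        }
    }

    // With cleanup: reports whether the channel is still open after the
    // failed connect went through connectOrClose().
    static boolean staysOpenWithCleanup(InetSocketAddress address) {
        try {
            SocketChannel channel = SocketChannel.open();
            try {
                connectOrClose(channel, address);
                return false; // address resolved; nothing to observe
            } catch (UnresolvedAddressException e) {
                return channel.isOpen();
            } finally {
                channel.close(); // no-op if already closed
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress address =
                InetSocketAddress.createUnresolved("not.existed.com", 9092);
        System.out.println("open without cleanup: " + staysOpenAfterFailedConnect(address));
        System.out.println("open with cleanup:    " + staysOpenWithCleanup(address));
    }
}
```

If the first line prints true and the second false, the channel (and its file descriptor) survives the failed connect unless the caller closes it explicitly. That matches the leak described in this issue.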



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
