[ https://issues.apache.org/jira/browse/KAFKA-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pedro Gontijo updated KAFKA-7757: --------------------------------- Description: We upgraded from 0.10.2.2 to 2.1.0 (a cluster with 3 brokers) After a while (hours) 2 brokers start to throw: {code:java} java.io.IOException: Connection to NN was disconnected before the response was read at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97) at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:97) at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190) at kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241) at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130) at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129) at scala.Option.foreach(Option.scala:257) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) {code} The problem has happened with all brokers. File descriptors start to pile up and if I do not restart it throws "Too many open files" and crashes. {code:java} ERROR Error while accepting connection (kafka.network.Acceptor) java.io.IOException: Too many open files in system at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) at kafka.network.Acceptor.accept(SocketServer.scala:460) at kafka.network.Acceptor.run(SocketServer.scala:403) at java.lang.Thread.run(Thread.java:748) {code} After some hours the issue happens again... was: We upgraded from 0.10.2.2 to 2.1.0 (a cluster with 3 brokers) After a while (hours) 2 brokers start to throw: {code:java} java.io.IOException: Connection to NN was disconnected before the response was read at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97) at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:97) at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190) at kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241) at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130) at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129) at scala.Option.foreach(Option.scala:257) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) {code} The problem has happened with all brokers. File descriptors start to pile up and if I do not restart it throws "Too many open files" and crashes. {code:java} ERROR Error while accepting connection (kafka.network.Acceptor) java.io.IOException: Too many open files in system at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) at kafka.network.Acceptor.accept(SocketServer.scala:460) at kafka.network.Acceptor.run(SocketServer.scala:403) at java.lang.Thread.run(Thread.java:748) {code} > Too many open files after java.io.IOException: Connection to n was > disconnected before the response was read > ------------------------------------------------------------------------------------------------------------ > > Key: KAFKA-7757 > URL: https://issues.apache.org/jira/browse/KAFKA-7757 > Project: Kafka > Issue Type: Bug > Components: core > Affects Versions: 2.1.0 > Reporter: Pedro Gontijo > Priority: Major > > We upgraded from 0.10.2.2 to 2.1.0 (a cluster with 3 brokers) > After a while (hours) 2 brokers start to throw: > {code:java} > java.io.IOException: Connection to NN was disconnected before the response > was read > at > org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97) > at > kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:97) > at > kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190) > at > kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241) > at > kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130) > at > kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129) > at scala.Option.foreach(Option.scala:257) > at > kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129) > at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) > {code} > The problem has happened with all brokers. > File descriptors start to pile up and if I do not restart it throws "Too many > open files" and crashes. > {code:java} > ERROR Error while accepting connection (kafka.network.Acceptor) > java.io.IOException: Too many open files in system > at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) > at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) > at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) > at kafka.network.Acceptor.accept(SocketServer.scala:460) > at kafka.network.Acceptor.run(SocketServer.scala:403) > at java.lang.Thread.run(Thread.java:748) > {code} > > After some hours the issue happens again... > -- This message was sent by Atlassian JIRA (v7.6.3#76005)