How many files does each broker process itself have open? You can find this with 'ls -l /proc/<processid>/fd'.
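For example, a minimal sketch (assuming the broker runs the usual kafka.Kafka main class; adjust the pgrep pattern to match your setup):

    # find the broker's PID and count its open file descriptors
    BROKER_PID=$(pgrep -f kafka.Kafka | head -n 1)
    ls /proc/$BROKER_PID/fd | wc -l

A system-wide 'lsof | wc -l' is only a rough proxy for this, since lsof also lists entries that don't consume a descriptor (cwd, txt, memory-mapped files), so the per-process /proc count is the more direct check.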
On Tue, Jun 24, 2014 at 10:18 PM, Lung, Paul <pl...@ebay.com> wrote:
> Hi All,
>
> I just upgraded my cluster from 0.8.1 to 0.8.1.1. I’m seeing the following
> error messages on the same 3 brokers once in a while:
>
> [2014-06-24 21:43:44,711] ERROR Error in acceptor (kafka.network.Acceptor)
> java.io.IOException: Too many open files
>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>         at kafka.network.Acceptor.accept(SocketServer.scala:200)
>         at kafka.network.Acceptor.run(SocketServer.scala:154)
>         at java.lang.Thread.run(Thread.java:679)
> [2014-06-24 21:43:44,711] ERROR Error in acceptor (kafka.network.Acceptor)
> java.io.IOException: Too many open files
>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>         at kafka.network.Acceptor.accept(SocketServer.scala:200)
>         at kafka.network.Acceptor.run(SocketServer.scala:154)
>         at java.lang.Thread.run(Thread.java:679)
>
> When this happens, these 3 brokers essentially go out of sync when you do
> a “kafka-topics.sh --describe”.
>
> I tracked the number of open files by doing “watch -n 1 'sudo lsof | wc -l'”,
> which basically counts all open files on the system. The numbers for the
> systems are basically in the 6000 range, with one system going to 9000. I
> presume the 9000 machine is the controller. Looking at the ulimit of the
> user, both the hard limit and the soft limit for open files is 100,000.
> Using sysctl, the max file is fs.file-max = 9774928. So we seem to be way
> under the limit.
>
> What am I missing here? Is there some JVM limit around 10K open files or
> something?
>
> Paul Lung
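One more thing worth ruling out: the 100,000 ulimit you see in a login shell is not necessarily the limit the broker is running with, since a process keeps the limits it inherited at startup (e.g. from an init script or a different user). Assuming <pid> is the broker's PID, you can read the effective limit directly:

    # soft/hard "Max open files" actually applied to the running broker
    grep 'Max open files' /proc/<pid>/limits
    # watch the broker's own fd count, analogous to your system-wide lsof loop
    watch -n 1 "ls /proc/<pid>/fd | wc -l"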