At Tue, 05 Jan 2010 10:36:27 +0200, Imri Zvik <im...@inter.net.il> wrote:
> > i have a high load DNS server running bind 9.4.3 on RH -
> > yesterday we experienced a problem with the bind (the bind froze) , and
> > when looking at the logs i saw the following error :
> >
> > named error: socket: file descriptor exceeds limit (4096/4096)
> >
> > i looked at my OS file descriptor limit and using ulimit -n - 1024 .
> >
> > where the number 4096 come from?

It's the hard-coded default maximum number of file descriptors (which
is nearly equal to the maximum allowable number of open sockets).

> If I'm not mistaken, you should either recompile with a higher value for
> ISC_SOCKET_MAXSOCKETS or restart named with the -S <maxsockets> argument.

I'm afraid it's yes and no.

Yes, you can raise the hard-coded default value with the -S command
line option.

(I'm afraid) no, I suspect it won't solve the problem. In my
experience, 4096 should be sufficient even for a very busy server. If
named still consumes all available sockets, it more likely means
there's some unexpected serious error (a bug) that can't be mitigated
by raising that limit.

I've heard similar reports (named seemingly consuming all available
sockets and then "freezing"), but unfortunately I couldn't reproduce
the problem myself, and since it seems to be quite rare I haven't
figured out the cause.

One possible workaround you may want to try is to *disable* epoll, the
more efficient I/O API on Linux, when building BIND:

    ./configure --disable-epoll

This means named will use the less efficient select API, but depending
on the machine's power and the server load, it may provide acceptable
performance and more stable behavior, as select is (seemingly) the
more stable API.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.
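
P.S. If you want to check whether named really creeps toward the 4096 limit before it freezes, something like the following may help. It is only a rough sketch, assuming a Linux host with /proc mounted; the function names are made up for illustration and are not part of BIND. It counts the descriptors a process has open and compares that number against the limit from the log message. It won't diagnose the underlying bug.

```python
import os
import sys

# 4096 is the limit quoted in the log message; if named was started
# with -S, pass that value instead.
NAMED_FD_LIMIT = 4096


def count_open_fds(pid):
    """Return how many descriptors process `pid` has open (Linux /proc only)."""
    # Each entry in /proc/<pid>/fd is one open descriptor.
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_usage(pid, limit=NAMED_FD_LIMIT):
    """Return (open_count, fraction_of_limit) for process `pid`."""
    n = count_open_fds(pid)
    return n, n / limit


if __name__ == "__main__":
    # Pass the pid of named as the first argument; with no argument the
    # script inspects itself, which is only useful as a smoke test.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    n, frac = fd_usage(pid)
    print(f"pid {pid}: {n} descriptors open ({frac:.1%} of {NAMED_FD_LIMIT})")
```

Reading /proc/<pid>/fd for another user's process normally requires root (or running as the same user as named). Running this periodically and logging the result should show whether the count climbs steadily or jumps suddenly before a freeze.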