Hi

Having suffered from the same problem a year or so ago, I dug into this by following the call chain, and cgi.py is the source of the 'too many fd' problem.

For an explanation read the comment starting at line 417 in cgi.py which reads:

    The class is subclassable, mostly for the purpose of overriding
    the make_file() method, which is called internally to come up with
    a file open for reading and writing.  This makes it possible to
    override the default choice of storing all files in a temporary
    directory and unlinking them as soon as they have been opened.

The trick used here is that an fd stays usable for some time even after the file behind it has been unlinked. It takes a while before all those unlinked files are collected and their fds released, but they will be collected eventually. The number of fds a process needs when using cgi.py (which twisted uses) therefore depends on the burst rate of requests, because every request gets a FieldStorage by default and therefore an fd.
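To illustrate the unlink trick, here is a minimal sketch. It is not the code from cgi.py; the function name unlinked_tempfile_demo is made up for this example, and it assumes a POSIX system (on Windows, unlinking an open file fails). It shows that the file name disappears right away, while the fd stays usable until the file object is closed.

```python
import os
import tempfile


def unlinked_tempfile_demo(payload=b"request body"):
    """Write payload to a temp file that is unlinked while still open.

    Hypothetical helper for illustration only; POSIX semantics assumed.
    """
    fd, path = tempfile.mkstemp()
    os.unlink(path)                       # the name is removed immediately...
    name_gone = not os.path.exists(path)

    with os.fdopen(fd, "w+b") as f:       # ...but the open fd still works
        f.write(payload)
        f.seek(0)
        data = f.read()

    # Leaving the with-block closed the file, which releases the fd.
    try:
        os.fstat(fd)
        fd_released = False
    except OSError:
        fd_released = True

    return name_gone, data, fd_released
```

Until the close happens, each such file costs one fd, which is why a burst of requests can exhaust the per-process limit.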

The only solution I found is to raise the number of allowed fds per process and per machine; how to do that depends on the OS:

MS Windows: if the CRT is used, the limit is hardcoded to 2048; otherwise it is limited by memory.

On *nixes use 'ulimit -a' or 'sysctl -a | grep files' to print the system values, usually something along the lines of kern.maxfiles=10000

Per machine:
/etc/sysctl.conf contains the values for the kernel preset when booting.

Per process:
/etc/login.conf usually contains a variable called openfiles-max
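The per-process limit can also be checked from inside the Python process itself. The following is a rough sketch using the standard resource module (Unix only); the helper names fd_limits and raise_soft_fd_limit are my own for this example. It can only raise the soft limit up to the hard limit, so the hard limit still has to be configured in the OS as described above.

```python
import resource


def fd_limits():
    """Return the (soft, hard) limits on open fds for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_fd_limit(wanted):
    """Raise the soft fd limit towards `wanted`, never above the hard limit.

    Hypothetical helper for illustration; returns the limits in effect
    afterwards.
    """
    soft, hard = fd_limits()
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    # Only touch the limit if it is finite and actually lower than wanted.
    if soft != resource.RLIM_INFINITY and wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return fd_limits()
```

Calling raise_soft_fd_limit(8192) early at startup would be one way to use it, assuming the hard limit allows that value.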

On my OpenBSD production system (avg load 30 req/sec) the values are:

kern.maxfiles=10000

openfiles-max=8192
openfiles-cur=8192

which allows smooth operation of two twisted processes on a dual core machine.

HTH, Werner

FYI the output of top:

load averages: 0.34, 0.31, 0.31 08:55:01
31 processes:  1 running, 29 idle, 1 on processor
CPU0 states: 10.8% user, 0.0% nice, 2.6% system, 0.0% interrupt, 86.6% idle
CPU1 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Memory: Real: 325M/608M act/tot  Free: 2913M  Swap: 0K/4096M used/tot

PID  UNAME PRI NICE  SIZE   RES STATE    WAIT  TIME   CPU    COMMAND
4562 www   2   0     125M   97M sleep/0  poll  242:39 11.82% python2.5
6506 www   2   0     205M  181M run/0    -     34:20   2.00% python2.5


Phil Mayers wrote:
This is a bit vague, and I wanted to get some feedback before I submit a ticket.

We have a long-running twisted / nevow process that basically has:

 root
  \- RPC2 - a twisted.web.xmlrpc.XMLRPC sub-class
  \- ui   - nevow pages

The thing hung up over the weekend with "too many open file descriptors" and before I killed it I did an "lsof"; lots of the files were:

python25 20163 nsg 31u REG 253,0 370 3276854 /tmp/tmp5QJivu (deleted)

...and "cat /proc/20163/fd/31" shows:

<?xml version='1.0'?>
<methodCall>
<methodName>classify_maclist</methodName>
<params>
<param>
<value><string>HORPROD</string></value>
</param>
<param>
<value><array><data>
<value><string>xxxx</string></value>
</data></array></value>
</param>
<param>
<value><int>-1</int></value>
</param>
<param>
<value><int>5</int></value>
</param>
</params>
</methodCall>

...which is an XMLRPC call from a Zope server on another machine to this process. I presume the t.w.http.Request content is getting written to a tempfile, but I can't understand why - the Content-Length is tiny (<400 bytes).

I can't seem to reproduce this in a sample application though; does anyone have any ideas how I can narrow down the problem?

_______________________________________________
Twisted-web mailing list
[email protected]
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web
