Re: Problem with file descriptor handling in httpd 2.3.1
On Jan 4, 2009, at 11:57 AM, Rainer Jung wrote:

> Here's the gdb story: When the content file gets opened, its cleanup is
> correctly registered with the request pool. Later in core_filters.c at the
> end of function ap_core_output_filter() line 528 we call
> setaside_remaining_output(). This goes down the stack via ap_save_brigade(),
> file_bucket_setaside() to apr_file_setaside(). This kills the cleanup for
> the request pool and adds it instead to the transaction (=connection) pool.
> There we are.
>
> 2.2.x has a different structure, although I can also see two calls to
> ap_save_brigade() in ap_core_output_filter(), but they use different pools
> as new targets, namely a deferred_write_pool resp. input_pool.

Uggg... so we need to do the 'same' with the 2.3/2.4 arch as well...
Re: Problem with file descriptor handling in httpd 2.3.1
On 01/04/2009 06:28 PM, Rainer Jung wrote:
> On 04.01.2009 17:57, Rainer Jung wrote:
>> When the content file gets opened, its cleanup is correctly registered
>> with the request pool. Later in core_filters.c at the end of function
>> ap_core_output_filter() line 528 we call setaside_remaining_output().
>
> ...
>
>> 2.2.x has a different structure, although I can also see two calls to
>> ap_save_brigade() in ap_core_output_filter(), but they use different
>> pools as new targets, namely a deferred_write_pool resp. input_pool.
>
> And the code already contains the appropriate hint:
>
> static void setaside_remaining_output(...)
> {
>     ...
>     if (make_a_copy) {
>         /* XXX should this use a separate deferred write pool, like
>          * the original ap_core_output_filter?
>          */
>         ap_save_brigade(f, &(ctx->buffered_bb), &bb, c->pool);
>     ...
> }

Thanks for the analysis and good catch. Maybe I have a look into this by tomorrow.

Regards

Rüdiger
Re: Problem with file descriptor handling in httpd 2.3.1
On 04.01.2009 17:57, Rainer Jung wrote:
> When the content file gets opened, its cleanup is correctly registered
> with the request pool. Later in core_filters.c at the end of function
> ap_core_output_filter() line 528 we call setaside_remaining_output().
>
> ...
>
> 2.2.x has a different structure, although I can also see two calls to
> ap_save_brigade() in ap_core_output_filter(), but they use different
> pools as new targets, namely a deferred_write_pool resp. input_pool.

And the code already contains the appropriate hint:

static void setaside_remaining_output(...)
{
    ...
    if (make_a_copy) {
        /* XXX should this use a separate deferred write pool, like
         * the original ap_core_output_filter?
         */
        ap_save_brigade(f, &(ctx->buffered_bb), &bb, c->pool);
    ...
}
Re: Problem with file descriptor handling in httpd 2.3.1
On 04.01.2009 15:04, Ruediger Pluem wrote:
> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>> On 04.01.2009 00:36, Paul Querna wrote:
>>> Rainer Jung wrote:
>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>>>> many open files". I used strace and the problem looks like this:
>>>>
>>>> - The test case is using ab with HTTP keep alive, concurrency 20 and a
>>>> small file, so doing about 2000 requests per second.
>>>> MaxKeepAliveRequests=100 (Default)
>>>>
>>>> - the file leading to EMFILE is the static content file, which can be
>>>> observed to be open more than 1000 times in parallel although ab
>>>> concurrency is only 20
>>>>
>>>> - From looking at the code it seems the file is closed during a
>>>> cleanup function associated to the request pool, which is triggered by
>>>> an EOR bucket
>>>>
>>>> Now what happens under KeepAlive is that the content files are kept
>>>> open longer than the handling of the request, more precisely until the
>>>> closing of the connection. So when MaxKeepAliveRequests*Concurrency >
>>>> MaxNumberOfFDs we run out of file descriptors.
>>>>
>>>> I observed the behaviour with 2.3.1 on Linux (SLES10 64Bit) with
>>>> Event, Worker and Prefork. I didn't yet have the time to retest with
>>>> 2.2.
>>> It should only happen in 2.3.x/trunk because the EOR bucket is a new
>>> feature to let MPMs do async writes once the handler has finished
>>> running.
>>>
>>> And yes, this sounds like a nasty bug.
>> I verified I can't reproduce with the same platform and 2.2.11.
>>
>> Not sure I understand the EOR asynchronicity good enough to analyze the
>> root cause.
> Can you try the following patch please?

Here's the gdb story: When the content file gets opened, its cleanup is correctly registered with the request pool. Later in core_filters.c at the end of function ap_core_output_filter() line 528 we call setaside_remaining_output(). This goes down the stack via ap_save_brigade(), file_bucket_setaside() to apr_file_setaside(). This kills the cleanup for the request pool and adds it instead to the transaction (=connection) pool. There we are.

2.2.x has a different structure, although I can also see two calls to ap_save_brigade() in ap_core_output_filter(), but they use different pools as new targets, namely a deferred_write_pool resp. input_pool.

So now we know how it happens, but I don't have an immediate idea how to solve it.

Regards,

Rainer
Re: Problem with file descriptor handling in httpd 2.3.1
On 04.01.2009 16:22, Rainer Jung wrote:
> On 04.01.2009 15:56, Ruediger Pluem wrote:
>> On 01/04/2009 03:48 PM, Rainer Jung wrote:
>>> On 04.01.2009 15:40, Ruediger Pluem wrote:
>>>> On 01/04/2009 03:26 PM, Rainer Jung wrote:
>>>>> On 04.01.2009 14:14, Ruediger Pluem wrote:
>>>>>> On 01/04/2009 11:24 AM, Rainer Jung wrote:
>>>>>>> On 04.01.2009 01:51, Ruediger Pluem wrote:
>>>>>>>> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>>>>>>>>> On 04.01.2009 00:36, Paul Querna wrote:
>>>>>>>>>> Rainer Jung wrote:
>>>>>>>>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE:
>>>>>>>>>>> "Too many open files". I used strace and the problem looks like
>>>>>>>>>>> this:
>>>>>>>>>>>
>>>>>>>>>>> - The test case is using ab with HTTP keep alive, concurrency
>>>>>>>>>>> 20 and a small file, so doing about 2000 requests per second.
>>>>>>>> What is the exact size of the file?
>>>>>>> It is the index.html, via URL /, so size is 45 Bytes.
>>>>>> Can you try if you run in the same problem on 2.2.x with a file of
>>>>>> size 257 bytes?
>>>>> I tried on the same type of system with event MPM and 2.2.11. Can't
>>>>> reproduce even with content file of size 257 bytes.
>>>> Possibly you need to increase the number of threads per process with
>>>> event MPM and the number of concurrent requests from ab.
>>> I increased the maximum KeepAlive Requests and the KeepAlive timeout a
>>> lot and during a longer running test I see always exactly as many open
>>> FDs for the content file in /proc/PID/fd as I had concurrency in ab.
>>> So it seems the FDs always get closed before handling the next request
>>> in the connection.
>>>
>>> After testing the patch, I'll try it again with 257 bytes on 2.2.11
>>> with prefork or worker.
>> IMHO this cannot happen with prefork on 2.2.x. So I guess it is not
>> worth testing.
>>
>> It still confuses me that this happens on trunk as it looks like that
>> ab does not do pipelining.
> The strace log shows that the sequence really is
>
> - new connection
> - read request
> - open file
> - send response
> - log request
>
> repeat this sequence a lot of times (maybe as long as KeepAlive is
> active) and then there are a lot of close() for the content files.
>
> Not sure about the exact thing that triggers the close. So I don't
> necessarily see pipelining (in the sense of sending more requests before
> responses return) being necessary.
>
> I tested your patch (worker, trunk): It does not help. I then added an
> error log statement directly after the requests++ and it shows this
> number is always "1".

I can now even reproduce without load. Simply open a connection and send hand crafted KeepAlive requests via telnet. The file descriptors are kept open as long as the connection is alive.

I'll run under the debugger to see how the stack looks when the file gets closed. Since the logging is done much earlier (directly after each request) the problem does not seem to be directly related to EOR. It looks like somehow the close file cleanup does not run when the request pool is destroyed, or maybe it is registered with the connection pool. gdb should help.

More later.

Rainer
Re: Problem with file descriptor handling in httpd 2.3.1
On 04.01.2009 15:56, Ruediger Pluem wrote:
> On 01/04/2009 03:48 PM, Rainer Jung wrote:
>> On 04.01.2009 15:40, Ruediger Pluem wrote:
>>> On 01/04/2009 03:26 PM, Rainer Jung wrote:
>>>> On 04.01.2009 14:14, Ruediger Pluem wrote:
>>>>> On 01/04/2009 11:24 AM, Rainer Jung wrote:
>>>>>> On 04.01.2009 01:51, Ruediger Pluem wrote:
>>>>>>> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>>>>>>>> On 04.01.2009 00:36, Paul Querna wrote:
>>>>>>>>> Rainer Jung wrote:
>>>>>>>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE:
>>>>>>>>>> "Too many open files". I used strace and the problem looks like
>>>>>>>>>> this:
>>>>>>>>>>
>>>>>>>>>> - The test case is using ab with HTTP keep alive, concurrency
>>>>>>>>>> 20 and a small file, so doing about 2000 requests per second.
>>>>>>> What is the exact size of the file?
>>>>>> It is the index.html, via URL /, so size is 45 Bytes.
>>>>> Can you try if you run in the same problem on 2.2.x with a file of
>>>>> size 257 bytes?
>>>> I tried on the same type of system with event MPM and 2.2.11. Can't
>>>> reproduce even with content file of size 257 bytes.
>>> Possibly you need to increase the number of threads per process with
>>> event MPM and the number of concurrent requests from ab.
>> I increased the maximum KeepAlive Requests and the KeepAlive timeout a
>> lot and during a longer running test I see always exactly as many open
>> FDs for the content file in /proc/PID/fd as I had concurrency in ab.
>> So it seems the FDs always get closed before handling the next request
>> in the connection.
>>
>> After testing the patch, I'll try it again with 257 bytes on 2.2.11
>> with prefork or worker.
> IMHO this cannot happen with prefork on 2.2.x. So I guess it is not
> worth testing.
>
> It still confuses me that this happens on trunk as it looks like that
> ab does not do pipelining.

The strace log shows that the sequence really is

- new connection
- read request
- open file
- send response
- log request

repeat this sequence a lot of times (maybe as long as KeepAlive is active) and then there are a lot of close() for the content files. Not sure about the exact thing that triggers the close. So I don't necessarily see pipelining (in the sense of sending more requests before responses return) being necessary.

I tested your patch (worker, trunk): It does not help. I then added an error log statement directly after the requests++ and it shows this number is always "1".

Regards,

Rainer
Re: Problem with file descriptor handling in httpd 2.3.1
On 01/04/2009 03:48 PM, Rainer Jung wrote:
> On 04.01.2009 15:40, Ruediger Pluem wrote:
>> On 01/04/2009 03:26 PM, Rainer Jung wrote:
>>> On 04.01.2009 14:14, Ruediger Pluem wrote:
>>>> On 01/04/2009 11:24 AM, Rainer Jung wrote:
>>>>> On 04.01.2009 01:51, Ruediger Pluem wrote:
>>>>>> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>>>>>>> On 04.01.2009 00:36, Paul Querna wrote:
>>>>>>>> Rainer Jung wrote:
>>>>>>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE:
>>>>>>>>> "Too many open files". I used strace and the problem looks like
>>>>>>>>> this:
>>>>>>>>>
>>>>>>>>> - The test case is using ab with HTTP keep alive, concurrency 20
>>>>>>>>> and a small file, so doing about 2000 requests per second.
>>>>>> What is the exact size of the file?
>>>>> It is the index.html, via URL /, so size is 45 Bytes.
>>>> Can you try if you run in the same problem on 2.2.x with a file of
>>>> size 257 bytes?
>>> I tried on the same type of system with event MPM and 2.2.11. Can't
>>> reproduce even with content file of size 257 bytes.
>> Possibly you need to increase the number of threads per process with
>> event MPM and the number of concurrent requests from ab.
> I increased the maximum KeepAlive Requests and the KeepAlive timeout a
> lot and during a longer running test I see always exactly as many open
> FDs for the content file in /proc/PID/fd as I had concurrency in ab. So
> it seems the FDs always get closed before handling the next request in
> the connection.
>
> After testing the patch, I'll try it again with 257 bytes on 2.2.11 with
> prefork or worker.

IMHO this cannot happen with prefork on 2.2.x. So I guess it is not worth testing.

It still confuses me that this happens on trunk as it looks like that ab does not do pipelining.

Regards

Rüdiger
Re: Problem with file descriptor handling in httpd 2.3.1
On 04.01.2009 15:40, Ruediger Pluem wrote:
> On 01/04/2009 03:26 PM, Rainer Jung wrote:
>> On 04.01.2009 14:14, Ruediger Pluem wrote:
>>> On 01/04/2009 11:24 AM, Rainer Jung wrote:
>>>> On 04.01.2009 01:51, Ruediger Pluem wrote:
>>>>> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>>>>>> On 04.01.2009 00:36, Paul Querna wrote:
>>>>>>> Rainer Jung wrote:
>>>>>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE:
>>>>>>>> "Too many open files". I used strace and the problem looks like
>>>>>>>> this:
>>>>>>>>
>>>>>>>> - The test case is using ab with HTTP keep alive, concurrency 20
>>>>>>>> and a small file, so doing about 2000 requests per second.
>>>>> What is the exact size of the file?
>>>> It is the index.html, via URL /, so size is 45 Bytes.
>>> Can you try if you run in the same problem on 2.2.x with a file of
>>> size 257 bytes?
>> I tried on the same type of system with event MPM and 2.2.11. Can't
>> reproduce even with content file of size 257 bytes.
> Possibly you need to increase the number of threads per process with
> event MPM and the number of concurrent requests from ab.

I increased the maximum KeepAlive Requests and the KeepAlive timeout a lot and during a longer running test I see always exactly as many open FDs for the content file in /proc/PID/fd as I had concurrency in ab. So it seems the FDs always get closed before handling the next request in the connection.

After testing the patch, I'll try it again with 257 bytes on 2.2.11 with prefork or worker.

Regards,

Rainer
Re: Problem with file descriptor handling in httpd 2.3.1
On 01/04/2009 03:26 PM, Rainer Jung wrote:
> On 04.01.2009 14:14, Ruediger Pluem wrote:
>> On 01/04/2009 11:24 AM, Rainer Jung wrote:
>>> On 04.01.2009 01:51, Ruediger Pluem wrote:
>>>> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>>>>> On 04.01.2009 00:36, Paul Querna wrote:
>>>>>> Rainer Jung wrote:
>>>>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>>>>>>> many open files". I used strace and the problem looks like this:
>>>>>>>
>>>>>>> - The test case is using ab with HTTP keep alive, concurrency 20
>>>>>>> and a small file, so doing about 2000 requests per second.
>>>> What is the exact size of the file?
>>> It is the index.html, via URL /, so size is 45 Bytes.
>> Can you try if you run in the same problem on 2.2.x with a file of
>> size 257 bytes?
> I tried on the same type of system with event MPM and 2.2.11. Can't
> reproduce even with content file of size 257 bytes.

Possibly you need to increase the number of threads per process with event MPM and the number of concurrent requests from ab.

Regards

Rüdiger
Re: Problem with file descriptor handling in httpd 2.3.1
On 04.01.2009 14:14, Ruediger Pluem wrote:
> On 01/04/2009 11:24 AM, Rainer Jung wrote:
>> On 04.01.2009 01:51, Ruediger Pluem wrote:
>>> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>>>> On 04.01.2009 00:36, Paul Querna wrote:
>>>>> Rainer Jung wrote:
>>>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>>>>>> many open files". I used strace and the problem looks like this:
>>>>>>
>>>>>> - The test case is using ab with HTTP keep alive, concurrency 20
>>>>>> and a small file, so doing about 2000 requests per second.
>>> What is the exact size of the file?
>> It is the index.html, via URL /, so size is 45 Bytes.
> Can you try if you run in the same problem on 2.2.x with a file of size
> 257 bytes?

I tried on the same type of system with event MPM and 2.2.11. Can't reproduce even with content file of size 257 bytes. The same file with trunk immediately reproduces the problem.

Will try your patch/hack next.

Thanks

Rainer
Re: Problem with file descriptor handling in httpd 2.3.1
On 01/04/2009 12:49 AM, Rainer Jung wrote:
> On 04.01.2009 00:36, Paul Querna wrote:
>> Rainer Jung wrote:
>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>>> many open files". I used strace and the problem looks like this:
>>>
>>> - The test case is using ab with HTTP keep alive, concurrency 20 and a
>>> small file, so doing about 2000 requests per second.
>>> MaxKeepAliveRequests=100 (Default)
>>>
>>> - the file leading to EMFILE is the static content file, which can be
>>> observed to be open more than 1000 times in parallel although ab
>>> concurrency is only 20
>>>
>>> - From looking at the code it seems the file is closed during a
>>> cleanup function associated to the request pool, which is triggered by
>>> an EOR bucket
>>>
>>> Now what happens under KeepAlive is that the content files are kept
>>> open longer than the handling of the request, more precisely until the
>>> closing of the connection. So when MaxKeepAliveRequests*Concurrency >
>>> MaxNumberOfFDs we run out of file descriptors.
>>>
>>> I observed the behaviour with 2.3.1 on Linux (SLES10 64Bit) with
>>> Event, Worker and Prefork. I didn't yet have the time to retest with
>>> 2.2.
>> It should only happen in 2.3.x/trunk because the EOR bucket is a new
>> feature to let MPMs do async writes once the handler has finished
>> running.
>>
>> And yes, this sounds like a nasty bug.
> I verified I can't reproduce with the same platform and 2.2.11.
>
> Not sure I understand the EOR asynchronicity good enough to analyze the
> root cause.

Can you try the following patch please?
Index: server/core_filters.c
===================================================================
--- server/core_filters.c	(Revision 731238)
+++ server/core_filters.c	(Arbeitskopie)
@@ -367,6 +367,7 @@

 #define THRESHOLD_MIN_WRITE 4096
 #define THRESHOLD_MAX_BUFFER 65536
+#define MAX_REQUESTS_QUEUED 10

 /* Optional function coming from mod_logio, used for logging of output
  * traffic
@@ -381,6 +382,7 @@
     apr_bucket_brigade *bb;
     apr_bucket *bucket, *next;
     apr_size_t bytes_in_brigade, non_file_bytes_in_brigade;
+    int requests;

     /* Fail quickly if the connection has already been aborted. */
     if (c->aborted) {
@@ -466,6 +468,7 @@

     bytes_in_brigade = 0;
     non_file_bytes_in_brigade = 0;
+    requests = 0;
     for (bucket = APR_BRIGADE_FIRST(bb); bucket != APR_BRIGADE_SENTINEL(bb);
          bucket = next) {
         next = APR_BUCKET_NEXT(bucket);
@@ -501,11 +504,22 @@
                 non_file_bytes_in_brigade += bucket->length;
             }
         }
+        else if (bucket->type == &ap_bucket_type_eor) {
+            /*
+             * Count the number of requests still queued in the brigade.
+             * Pipelining of a high number of small files can cause
+             * a high number of open file descriptors, which if it happens
+             * on many threads in parallel can cause us to hit the OS limits.
+             */
+            requests++;
+        }
     }

-    if (non_file_bytes_in_brigade >= THRESHOLD_MAX_BUFFER) {
+    if ((non_file_bytes_in_brigade >= THRESHOLD_MAX_BUFFER)
+        || (requests > MAX_REQUESTS_QUEUED)) {
         /* ### Writing the entire brigade may be excessive; we really just
-         * ### need to send enough data to be under THRESHOLD_MAX_BUFFER.
+         * ### need to send enough data to be under THRESHOLD_MAX_BUFFER or
+         * ### under MAX_REQUESTS_QUEUED
          */
         apr_status_t rv = send_brigade_blocking(net->client_socket, bb,
                                                 &(ctx->bytes_written), c);

This is still some sort of a hack, but maybe helpful to understand if this is the problem.
Regards

Rüdiger
Re: Problem with file descriptor handling in httpd 2.3.1
On 01/04/2009 11:24 AM, Rainer Jung wrote:
> On 04.01.2009 01:51, Ruediger Pluem wrote:
>> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>>> On 04.01.2009 00:36, Paul Querna wrote:
>>>> Rainer Jung wrote:
>>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>>>>> many open files". I used strace and the problem looks like this:
>>>>>
>>>>> - The test case is using ab with HTTP keep alive, concurrency 20 and
>>>>> a small file, so doing about 2000 requests per second.
>> What is the exact size of the file?
> It is the index.html, via URL /, so size is 45 Bytes.

Can you try if you run in the same problem on 2.2.x with a file of size 257 bytes?

Regards

Rüdiger
Re: Problem with file descriptor handling in httpd 2.3.1
On 04.01.2009 01:51, Ruediger Pluem wrote:
> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>> On 04.01.2009 00:36, Paul Querna wrote:
>>> Rainer Jung wrote:
>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>>>> many open files". I used strace and the problem looks like this:
>>>>
>>>> - The test case is using ab with HTTP keep alive, concurrency 20 and
>>>> a small file, so doing about 2000 requests per second.
> What is the exact size of the file?

It is the index.html, via URL /, so size is 45 Bytes.

Configuration is very close to original, except for:

40c40
< Listen myhost:8000
---
> Listen 80
455,456c455,456
< EnableMMAP off
< EnableSendfile off
---
> #EnableMMAP off
> #EnableSendfile off

(because installation is on NFS, but the problem also occurs with those switches on)

The following modules are loaded:

LoadModule authn_file_module modules/mod_authn_file.so
LoadModule authn_anon_module modules/mod_authn_anon.so
LoadModule authn_core_module modules/mod_authn_core.so
LoadModule authz_host_module modules/mod_authz_host.so
LoadModule authz_groupfile_module modules/mod_authz_groupfile.so
LoadModule authz_user_module modules/mod_authz_user.so
LoadModule authz_owner_module modules/mod_authz_owner.so
LoadModule authz_core_module modules/mod_authz_core.so
LoadModule access_compat_module modules/mod_access_compat.so
LoadModule auth_basic_module modules/mod_auth_basic.so
LoadModule auth_digest_module modules/mod_auth_digest.so
LoadModule log_config_module modules/mod_log_config.so
LoadModule env_module modules/mod_env.so
LoadModule mime_magic_module modules/mod_mime_magic.so
LoadModule cern_meta_module modules/mod_cern_meta.so
LoadModule expires_module modules/mod_expires.so
LoadModule headers_module modules/mod_headers.so
LoadModule ident_module modules/mod_ident.so
LoadModule usertrack_module modules/mod_usertrack.so
LoadModule unique_id_module modules/mod_unique_id.so
LoadModule setenvif_module modules/mod_setenvif.so
LoadModule version_module modules/mod_version.so
LoadModule mime_module modules/mod_mime.so
LoadModule unixd_module modules/mod_unixd.so
LoadModule status_module modules/mod_status.so
LoadModule autoindex_module modules/mod_autoindex.so
LoadModule asis_module modules/mod_asis.so
LoadModule info_module modules/mod_info.so
LoadModule suexec_module modules/mod_suexec.so
LoadModule vhost_alias_module modules/mod_vhost_alias.so
LoadModule negotiation_module modules/mod_negotiation.so
LoadModule dir_module modules/mod_dir.so
LoadModule imagemap_module modules/mod_imagemap.so
LoadModule actions_module modules/mod_actions.so
LoadModule speling_module modules/mod_speling.so
LoadModule userdir_module modules/mod_userdir.so
LoadModule alias_module modules/mod_alias.so
LoadModule rewrite_module modules/mod_rewrite.so

To reproduce you must use KeepAlive, and your MaxKeepAliveRequests (Default: 100) times concurrency must exceed the maximum number of FDs. Even without exceeding it, you can use "httpd -X" and look at /proc/PID/fd during the test run. You should be able to notice a huge number of fds, all pointing to the index.html.

Regards,

Rainer
Re: Problem with file descriptor handling in httpd 2.3.1
On 01/04/2009 12:49 AM, Rainer Jung wrote:
> On 04.01.2009 00:36, Paul Querna wrote:
>> Rainer Jung wrote:
>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>>> many open files". I used strace and the problem looks like this:
>>>
>>> - The test case is using ab with HTTP keep alive, concurrency 20 and a
>>> small file, so doing about 2000 requests per second.

What is the exact size of the file?

Regards

Rüdiger
Re: Problem with file descriptor handling in httpd 2.3.1
On 04.01.2009 00:36, Paul Querna wrote:
> Rainer Jung wrote:
>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>> many open files". I used strace and the problem looks like this:
>>
>> - The test case is using ab with HTTP keep alive, concurrency 20 and a
>> small file, so doing about 2000 requests per second.
>> MaxKeepAliveRequests=100 (Default)
>>
>> - the file leading to EMFILE is the static content file, which can be
>> observed to be open more than 1000 times in parallel although ab
>> concurrency is only 20
>>
>> - From looking at the code it seems the file is closed during a cleanup
>> function associated to the request pool, which is triggered by an EOR
>> bucket
>>
>> Now what happens under KeepAlive is that the content files are kept
>> open longer than the handling of the request, more precisely until the
>> closing of the connection. So when MaxKeepAliveRequests*Concurrency >
>> MaxNumberOfFDs we run out of file descriptors.
>>
>> I observed the behaviour with 2.3.1 on Linux (SLES10 64Bit) with Event,
>> Worker and Prefork. I didn't yet have the time to retest with 2.2.
> It should only happen in 2.3.x/trunk because the EOR bucket is a new
> feature to let MPMs do async writes once the handler has finished
> running.
>
> And yes, this sounds like a nasty bug.

I verified I can't reproduce with the same platform and 2.2.11.

Not sure I understand the EOR asynchronicity good enough to analyze the root cause.

Rainer
Re: Problem with file descriptor handling in httpd 2.3.1
Rainer Jung wrote:
> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too many
> open files". I used strace and the problem looks like this:
>
> - The test case is using ab with HTTP keep alive, concurrency 20 and a
> small file, so doing about 2000 requests per second.
> MaxKeepAliveRequests=100 (Default)
>
> - the file leading to EMFILE is the static content file, which can be
> observed to be open more than 1000 times in parallel although ab
> concurrency is only 20
>
> - From looking at the code it seems the file is closed during a cleanup
> function associated to the request pool, which is triggered by an EOR
> bucket
>
> Now what happens under KeepAlive is that the content files are kept open
> longer than the handling of the request, more precisely until the
> closing of the connection. So when MaxKeepAliveRequests*Concurrency >
> MaxNumberOfFDs we run out of file descriptors.
>
> I observed the behaviour with 2.3.1 on Linux (SLES10 64Bit) with Event,
> Worker and Prefork. I didn't yet have the time to retest with 2.2.

It should only happen in 2.3.x/trunk because the EOR bucket is a new feature to let MPMs do async writes once the handler has finished running.

And yes, this sounds like a nasty bug.

-Paul