Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 03:05 PM, Yann Ylavic wrote:
> Couldn't htcacheclean or the like do something like this?
> "EnableMMAP off" could definitely help here.

(Didn't mean to ignore this part of your email, but I don't have much experience with htcacheclean yet, so I can't really comment...)

--Jacob
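For reference, the workaround discussed here is a one-line core directive. A minimal sketch (EnableMMAP is a real httpd core directive; the directory path is a made-up example):

```apache
<Directory "/srv/mirror">
    # Read files via read(2)/sendfile-style I/O instead of mmap(2), so a
    # file that is truncated or replaced mid-download yields a short read
    # or an I/O error rather than a SIGBUS in the serving process.
    EnableMMAP off
</Directory>
```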
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 03:05 PM, Yann Ylavic wrote:
> Hmm, Linux raises SIGBUS if an mmap is used after the underlying file
> has been truncated (see [1]).

See also https://bz.apache.org/bugzilla/show_bug.cgi?id=46688 .

Niklas, just to clarify: you're not willfully truncating large files as they're being served, right? I *can* reproduce a SIGBUS if I start truncating files during my stress testing. But EnableMMAP's documentation calls that case out explicitly.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 2, 2017 at 11:36 PM, Jacob Champion wrote:
> On 02/02/2017 02:32 PM, Yann Ylavic wrote:
>> On Thu, Feb 2, 2017 at 11:19 PM, Jacob Champion wrote:
>>> Idle thoughts: "Cannot access memory" in this case could be a red
>>> herring, if Niklas' gdb can't peer into mmap'd memory spaces [1]. It
>>> seems reasonable that the data in question could be mmap'd, given the
>>> nice round address and 4 MiB length (equal to APR_MMAP_LIMIT).
>>>
>>> That doesn't mean we're looking in the wrong place, though, since
>>> SIGBUS can also be generated by an out-of-bounds access to an mmap'd
>>> region.
>>
>> Right, looks like the memory has been unmapped though (SIGBUS) before
>> being (re)used.
>
> Oh, I thought an access after an unmap would SIGSEGV instead of SIGBUS.
> I haven't ever tested that out; I should try it...

Hmm, Linux raises SIGBUS if an mmap is used after the underlying file has been truncated (see [1]).

Couldn't htcacheclean or the like do something like this? "EnableMMAP off" could definitely help here.

[1] http://man7.org/linux/man-pages/man2/mmap.2.html
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 02:32 PM, Yann Ylavic wrote:
> On Thu, Feb 2, 2017 at 11:19 PM, Jacob Champion wrote:
>> Idle thoughts: "Cannot access memory" in this case could be a red
>> herring, if Niklas' gdb can't peer into mmap'd memory spaces [1]. It
>> seems reasonable that the data in question could be mmap'd, given the
>> nice round address and 4 MiB length (equal to APR_MMAP_LIMIT).
>>
>> That doesn't mean we're looking in the wrong place, though, since
>> SIGBUS can also be generated by an out-of-bounds access to an mmap'd
>> region.
>
> Right, looks like the memory has been unmapped though (SIGBUS) before
> being (re)used.

Oh, I thought an access after an unmap would SIGSEGV instead of SIGBUS. I haven't ever tested that out; I should try it...

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 2, 2017 at 11:19 PM, Jacob Champion wrote:
>
> Idle thoughts: "Cannot access memory" in this case could be a red herring,
> if Niklas' gdb can't peer into mmap'd memory spaces [1]. It seems reasonable
> that the data in question could be mmap'd, given the nice round address and
> 4 MiB length (equal to APR_MMAP_LIMIT).
>
> That doesn't mean we're looking in the wrong place, though, since SIGBUS can
> also be generated by an out-of-bounds access to an mmap'd region.

Right, looks like the memory has been unmapped though (SIGBUS) before being (re)used.

Does "EnableMMAP off" help or produce another backtrace?

> Niklas, what version of APR are you using? Are you serving large (> 4 MiB)
> static files? I have not been able to reproduce so far (Ubuntu 16.04, httpd
> 2.4.25 + mod_ssl + mpm_event).

The original file bucket comes from mod_cache, and indeed looks larger than 4 MB. Even if it were (htcache)cleaned while being served, SIGBUS still shouldn't happen, since we hold an fd (and a reference) on it...

Regards,
Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 02:04 AM, Yann Ylavic wrote:
> Hi Niklas,
>
> On Wed, Feb 1, 2017 at 7:02 PM, Niklas Edmundsson wrote:
>>
>> We've started to see spurious segfaults with httpd 2.4.25, mpm_event,
>> ssl on Ubuntu 14.04 LTS. Not frequent, but nonetheless happening.
>>
>> #4  ssl_io_filter_output (f=0x7f507013cfe0, bb=0x7f4f840be168)
>>     at ssl_engine_io.c:1746
>>         data = 0x7f5075518000 <error: Cannot access memory at address 0x7f5075518000>
>>         len = 4194304
>>         bucket = 0x7f4f840b1ba8
>>         status = <optimized out>
>>         filter_ctx = 0x7f507013cf88
>>         inctx = <optimized out>
>>         outctx = 0x7f507013d008
>>         rblock = APR_NONBLOCK_READ

Idle thoughts: "Cannot access memory" in this case could be a red herring, if Niklas' gdb can't peer into mmap'd memory spaces [1]. It seems reasonable that the data in question could be mmap'd, given the nice round address and 4 MiB length (equal to APR_MMAP_LIMIT).

That doesn't mean we're looking in the wrong place, though, since SIGBUS can also be generated by an out-of-bounds access to an mmap'd region.

Niklas, what version of APR are you using? Are you serving large (> 4 MiB) static files? I have not been able to reproduce so far (Ubuntu 16.04, httpd 2.4.25 + mod_ssl + mpm_event).

--Jacob

[1] https://stackoverflow.com/questions/654393/examining-mmaped-addresses-using-gdb
Re: Autobuild Progress (was Re: Automated tests)
On 01/30/2017 12:02 PM, Jacob Champion wrote:
> - run per-commit incremental builds
> - run nightly clean builds

These two are implemented. Every commit you make to trunk (well, group of commits, within fifteen seconds of each other) is run through an incremental build, which takes about ten seconds. Every eight hours, the build tree is clobbered, resync'd, and built from scratch, which takes about a minute.

Reporting is still IRC-only. I want to keep an eye on the bots for a bit before unleashing them upon the mailing list.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 2 Feb 2017, Niklas Edmundsson wrote:
> On Thu, 2 Feb 2017, Yann Ylavic wrote:
>>> Are we hitting a corner case of process cleanup that plays merry hell
>>> with https/ssl, or are we just having bad luck? Ideas? Suggestions?
>>
>> 2.4.25 is eager to terminate/shutdown keepalive connections more
>> quickly (than previous versions) on graceful shutdown (e.g.
>> MaxConnectionsPerChild reached).
>>
>> What might happen in ssl_io_filter_output() is that buffered output
>> data (already deleted but not cleared) end up being reused on shutdown.
>>
>> Could you please try the attached patch?
>
> Built and deployed, waiting for the most affected host to drain in
> order to restart it. I'll also lower MaxConnectionsPerChild a bit more
> in order to stress this.

Still seems to SIGBUS. Backtraces attached.

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      | ni...@acc.umu.se
---
 New Borg Software Package : Locutus 1-2-3
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

[Attachments: bt.2201.bz2, bt.6429.bz2, bt.8071.bz2]
Re: [Bug 56188] mod_proxy_fcgi does not send FCGI_ABORT_REQUEST on client disconnect
On Thu, Feb 2, 2017 at 5:16 PM, Yann Ylavic wrote:
> On Thu, Feb 2, 2017 at 4:51 PM, Luca Toscano wrote:
>>
>> 2017-01-30 21:58 GMT+01:00 Yann Ylavic:
>>>
>>> Maybe what is missing is nonblocking reads on the backend side, and on
>>> EAGAIN flush on the client side to detect potential socket errors?
>>
>> Interesting.. So the idea would be to be non blocking while reading from the
>> FCGI backend, and in case of EAGAIN flush to the client in order to force
>> ap_pass_brigade to detect if the client aborted the connection.
>
> Yes, poll() => POLLIN => read() while data available => EAGAIN =>
> flush => poll().

The "read() while data available => EAGAIN" may be replaced by "read() while read buffer is full", so as to avoid a (likely) useless last non-blocking read.

> The flush is necessary because output filters down the road might have
> buffered; it ensures that everything so far reaches the socket
> (with potential error).
>
>> The flushing
>> part might be tricky to implement (at least for me, but I'll try to work on
>> it :).
>
> It should be as easy as pushing a FLUSH bucket at the end of the
> brigade passed to the client ;)
>
> The hard part may be more the non-blocking read (and handling of
> incomplete headers/data).
>
> Regards,
> Yann.
Re: [Bug 56188] mod_proxy_fcgi does not send FCGI_ABORT_REQUEST on client disconnect
On Thu, Feb 2, 2017 at 4:51 PM, Luca Toscano wrote:
>
> 2017-01-30 21:58 GMT+01:00 Yann Ylavic:
>>
>> Maybe what is missing is nonblocking reads on the backend side, and on
>> EAGAIN flush on the client side to detect potential socket errors?
>
> Interesting.. So the idea would be to be non blocking while reading from the
> FCGI backend, and in case of EAGAIN flush to the client in order to force
> ap_pass_brigade to detect if the client aborted the connection.

Yes, poll() => POLLIN => read() while data available => EAGAIN => flush => poll().

The flush is necessary because output filters down the road might have buffered; it ensures that everything so far reaches the socket (with potential error).

> The flushing
> part might be tricky to implement (at least for me, but I'll try to work on
> it :).

It should be as easy as pushing a FLUSH bucket at the end of the brigade passed to the client ;)

The hard part may be more the non-blocking read (and handling of incomplete headers/data).

Regards,
Yann.
Re: [Bug 56188] mod_proxy_fcgi does not send FCGI_ABORT_REQUEST on client disconnect
2017-01-30 21:58 GMT+01:00 Yann Ylavic:
> On Mon, Jan 30, 2017 at 7:17 PM, Luca Toscano wrote:
>>
>> The use case that I had (the one that caused me to check the original
>> bugzilla task/patch and work on it) was a long-running PHP script (running
>> on HHVM) that wasn't returning anything until the end of the job (minutes),
>> causing mod-proxy-fcgi to keep waiting even if the client disconnected. In
>> the beginning I also thought that there was a way to "signal" the FCGI
>> backend to stop whatever it was doing (FCGI_ABORT), but it turned out to be
>> not widely implemented (at least from what I have checked; more info in
>> bugzilla). Timeout and ProxyTimeout could be a good compromise for this
>> particular issue, but it is a "one size fits all" solution that some users
>> didn't like (providing a patch with the pollset solution).
>
> I think that's what CGIDScriptTimeout or ProxyTimeout are for: avoid
> waiting too much for the script or backend (above some reasonably
> configured limit).
>
>> If we remove the above use case (I fixed it in my production environment
>> with a long Timeout value), would it be fine to rely on something like the
>> conn->aborted flag (afaics set by the output filter's chain while writing
>> to the client's conn)? It would simplify the problem a lot :)
>
> AFAICT, if forwarding data to the client fails (e.g. socket
> disconnected), ap_pass_brigade() returns an error and the loop exits.

Yes, confirmed with a test (that flushes data to output periodically): the connection->aborted flag is set and mod_proxy_fcgi behaves correctly.

> Maybe what is missing is nonblocking reads on the backend side, and on
> EAGAIN flush on the client side to detect potential socket errors?

Interesting.. So the idea would be to be non blocking while reading from the FCGI backend, and in case of EAGAIN flush to the client in order to force ap_pass_brigade to detect if the client aborted the connection. The flushing part might be tricky to implement (at least for me, but I'll try to work on it :).

In any case, it seems like we are in agreement to discard the client connection polling idea (if anybody has more thoughts please speak up). I also believe that this use case would need to be documented properly in the docs, to warn users assuming that FCGI_ABORT is implemented.

Thanks for the review!
Luca
Re: Underscores in hostnames
Am 02.02.2017 um 14:22 schrieb Reindl Harald:
> Am 02.02.2017 um 13:53 schrieb Joe Orton:
>> Another 2.4.25 regression reported from a Fedora user is that
>> underscores in hostnames are rejected by default now. I couldn't see a
>> specific discussion of this, was it deliberate?
>
> underscores are not allowed in host names by RFC, and many things will
> break with them because in different layers of client software things
> just break by using them. We had that more than once in devel
> environments, where strange bugs turned out to be another case where
> somebody used an underscore in /etc/hosts and his local webserver

Here you go: https://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_hostnames

"The Internet standards (Requests for Comments) for protocols mandate that component hostname labels may contain only the ASCII letters 'a' through 'z' (in a case-insensitive manner), the digits '0' through '9', and the hyphen ('-'). The original specification of hostnames in RFC 952 mandated that labels could not start with a digit or with a hyphen, and must not end with a hyphen. However, a subsequent specification (RFC 1123) permitted hostname labels to start with digits. No other symbols, punctuation characters, or white space are permitted.

While a hostname may not contain other characters, such as the underscore character (_), other DNS names may contain the underscore.[4] Systems such as DomainKeys and service records use the underscore as a means to assure that their special character is not confused with hostnames. For example, _http._sctp.www.example.com specifies a service pointer for an SCTP-capable webserver host (www) in the domain example.com. Note that some applications (e.g. Microsoft Internet Explorer) won't work correctly if any part of the hostname contains an underscore character.[5]"
Re: Underscores in hostnames
Am 02.02.2017 um 13:53 schrieb Joe Orton:
> Another 2.4.25 regression reported from a Fedora user is that
> underscores in hostnames are rejected by default now. I couldn't see a
> specific discussion of this, was it deliberate?

underscores are not allowed in host names by RFC, and many things will break with them because in different layers of client software things just break by using them. We had that more than once in devel environments, where strange bugs turned out to be another case where somebody used an underscore in /etc/hosts and his local webserver
Re: Underscores in hostnames
AFAIK, underscores are forbidden from being part of a host name as per RFC 1123 Sec 2.1 / RFC 952 ("Assumptions", Sec 1).

It's also spelled out in RFC 3986:
"A registered name intended for lookup in the DNS (...) consists of a sequence of domain labels separated by ".", each domain label starting and ending with an alphanumeric character and possibly also containing "-" characters."

Issac

On 2/2/2017 2:53 PM, Joe Orton wrote:
> Another 2.4.25 regression reported from a Fedora user is that
> underscores in hostnames are rejected by default now. I couldn't see a
> specific discussion of this, was it deliberate?
>
> Following breadcrumbs...
>
> https://tools.ietf.org/html/rfc7230#section-5.4
>   Host = uri-host [ ":" port ] ; Section 2.7.1
>
> https://tools.ietf.org/html/rfc7230#section-2.7
>   uri-host = <host, see [RFC3986], Section 3.2.2>
>
> https://tools.ietf.org/html/rfc3986#section-3.2.2
>   host = IP-literal / IPv4address / reg-name
>   ...
>   reg-name = *( unreserved / pct-encoded / sub-delims )
>
> https://tools.ietf.org/html/rfc3986#section-2.3
>   unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
>
> From google I can see that _ in hostnames has various issues, so I'm
> not sure what is right here. It's simple enough to relax the check; it
> seems unlikely allowing ~ is a good idea.
>
> Index: server/vhost.c
> ===================================================================
> --- server/vhost.c	(revision 1781359)
> +++ server/vhost.c	(working copy)
> @@ -757,7 +757,10 @@
>      int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;
>  
>      for (ch = host; *ch; ch++) {
> -        if (apr_isalpha(*ch) || *ch == '-') {
> +        /* This should allow any character in 'uri-host' per RFC
> +         * 7230 section 5.4, which is 'host' by RFC 3986, which
> +         * matches any 'unreserved' character. */
> +        if (apr_isalpha(*ch) || *ch == '-' || *ch == '_') {
>              is_dotted_decimal = 0;
>          }
>          else if (ch[0] == '.') {
Underscores in hostnames
Another 2.4.25 regression reported from a Fedora user is that underscores in hostnames are rejected by default now. I couldn't see a specific discussion of this, was it deliberate?

Following breadcrumbs...

https://tools.ietf.org/html/rfc7230#section-5.4
  Host = uri-host [ ":" port ] ; Section 2.7.1

https://tools.ietf.org/html/rfc7230#section-2.7
  uri-host = <host, see [RFC3986], Section 3.2.2>

https://tools.ietf.org/html/rfc3986#section-3.2.2
  host = IP-literal / IPv4address / reg-name
  ...
  reg-name = *( unreserved / pct-encoded / sub-delims )

https://tools.ietf.org/html/rfc3986#section-2.3
  unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

From google I can see that _ in hostnames has various issues, so I'm not sure what is right here. It's simple enough to relax the check; it seems unlikely allowing ~ is a good idea.

Index: server/vhost.c
===================================================================
--- server/vhost.c	(revision 1781359)
+++ server/vhost.c	(working copy)
@@ -757,7 +757,10 @@
     int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;
 
     for (ch = host; *ch; ch++) {
-        if (apr_isalpha(*ch) || *ch == '-') {
+        /* This should allow any character in 'uri-host' per RFC
+         * 7230 section 5.4, which is 'host' by RFC 3986, which
+         * matches any 'unreserved' character. */
+        if (apr_isalpha(*ch) || *ch == '-' || *ch == '_') {
             is_dotted_decimal = 0;
         }
         else if (ch[0] == '.') {
AW: httpd 2.4.25, mpm_event, ssl: Status of async write completion?
> -----Original Message-----
> From: Niklas Edmundsson [mailto:ni...@acc.umu.se]
> Sent: Thursday, 2 February 2017, 13:31
> To: httpd-dev
> Subject: httpd 2.4.25, mpm_event, ssl: Status of async write completion?
>
> Hi all!
>
> As we're seeing more and more https on ftp.acc.umu.se I've noticed
> that the number of threads listed as state W in server-status has
> skyrocketed.
>
> From 2-4 threads busy using http we're talking 70-100 threads for the
> same bandwidth, when the machine is pushing a mighty 2.7% average CPU
> load.
>
> This is on a large-file workload, serving plain files.
>
> I anticipated a few more threads due to ssl, but for slow downloaders
> I would expect connections doing async write and mostly waiting for
> the bits to dribble through the Internet Tubes in between the
> occasional wakeup to do ssl.
>
> Are we not doing async write completion at all on https/ssl and
> falling back to the old worker behaviour of one thread per connection?

We do async with SSL as well, but in a worse way than http with static resources. For static resources in the form of files, http can deliver the whole thing async without switching back to sync. With SSL we are in sync mode each time we need to encrypt; only encrypted data that cannot be sent immediately is sent async. So the increase you see can be said to be expected.

Trunk has approaches to overcome this, though it has been reported that there are still some bugs with it.

Regards,
Rüdiger
httpd 2.4.25, mpm_event, ssl: Status of async write completion?
Hi all!

As we're seeing more and more https on ftp.acc.umu.se I've noticed that the number of threads listed as state W in server-status has skyrocketed.

From 2-4 threads busy using http we're talking 70-100 threads for the same bandwidth, when the machine is pushing a mighty 2.7% average CPU load.

This is on a large-file workload, serving plain files.

I anticipated a few more threads due to ssl, but for slow downloaders I would expect connections doing async write and mostly waiting for the bits to dribble through the Internet Tubes in between the occasional wakeup to do ssl.

Are we not doing async write completion at all on https/ssl and falling back to the old worker behaviour of one thread per connection?

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      | ni...@acc.umu.se
---
 "At last I'm organized", he sighed, and died.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 2 Feb 2017, Yann Ylavic wrote:
>> Are we hitting a corner case of process cleanup that plays merry hell
>> with https/ssl, or are we just having bad luck? Ideas? Suggestions?
>
> 2.4.25 is eager to terminate/shutdown keepalive connections more
> quickly (than previous versions) on graceful shutdown (e.g.
> MaxConnectionsPerChild reached).
>
> What might happen in ssl_io_filter_output() is that buffered output
> data (already deleted but not cleared) end up being reused on shutdown.
>
> Could you please try the attached patch?

Built and deployed, waiting for the most affected host to drain in order to restart it. I'll also lower MaxConnectionsPerChild a bit more in order to stress this.

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      | ni...@acc.umu.se
---
 "At last I'm organized", he sighed, and died.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: mod_http2 and Frequent wake-ups for mpm_event
Am 02.02.2017 um 11:09 schrieb Yann Ylavic:
> Hi Stefan,
>
> On Tue, Jan 31, 2017 at 4:01 PM, Stefan Priebe - Profihost AG wrote:
>>
>> any ideas?
>
> I wonder if the attached patch (related to mod_ssl and proposed for
> another segfault report) could help in your case.
>
> Would you mind giving it a try?

Up and running. It will take some time until I can report back, as those crashes are very rare.

Stefan
Re: mod_http2 and Frequent wake-ups for mpm_event
Hi Stefan,

On Tue, Jan 31, 2017 at 4:01 PM, Stefan Priebe - Profihost AG wrote:
>
> any ideas?

I wonder if the attached patch (related to mod_ssl and proposed for another segfault report) could help in your case.

Would you mind giving it a try?

Thanks,
Yann.

Index: modules/ssl/ssl_engine_io.c
===================================================================
--- modules/ssl/ssl_engine_io.c	(revision 1781324)
+++ modules/ssl/ssl_engine_io.c	(working copy)
@@ -138,6 +138,7 @@ static int bio_filter_out_pass(bio_filter_out_ctx_
     outctx->rc = ap_pass_brigade(outctx->filter_ctx->pOutputFilter->next,
                                  outctx->bb);
+    apr_brigade_cleanup(outctx->bb);
     /* Fail if the connection was reset: */
     if (outctx->rc == APR_SUCCESS && outctx->c->aborted) {
         outctx->rc = APR_ECONNRESET;
     }
@@ -1699,13 +1700,12 @@ static apr_status_t ssl_io_filter_output(ap_filter
     while (!APR_BRIGADE_EMPTY(bb) && status == APR_SUCCESS) {
         apr_bucket *bucket = APR_BRIGADE_FIRST(bb);
 
-        if (APR_BUCKET_IS_METADATA(bucket)) {
+        if (APR_BUCKET_IS_METADATA(bucket) || !filter_ctx->pssl) {
             /* Pass through metadata buckets untouched.  EOC is
              * special; terminate the SSL layer first. */
             if (AP_BUCKET_IS_EOC(bucket)) {
                 ssl_filter_io_shutdown(filter_ctx, f->c, 0);
             }
-            AP_DEBUG_ASSERT(APR_BRIGADE_EMPTY(outctx->bb));
 
             /* Metadata buckets are passed one per brigade; it might
              * be more efficient (but also more complex) to use
@@ -1712,11 +1712,10 @@ static apr_status_t ssl_io_filter_output(ap_filter
              * outctx->bb as a true buffer and interleave these with
              * data buckets. */
             APR_BUCKET_REMOVE(bucket);
-            APR_BRIGADE_INSERT_HEAD(outctx->bb, bucket);
-            status = ap_pass_brigade(f->next, outctx->bb);
-            if (status == APR_SUCCESS && f->c->aborted)
-                status = APR_ECONNRESET;
-            apr_brigade_cleanup(outctx->bb);
+            APR_BRIGADE_INSERT_TAIL(outctx->bb, bucket);
+            if (bio_filter_out_pass(outctx) < 0) {
+                status = outctx->rc;
+            }
         }
         else {
             /* Filter a data bucket. */
Re: httpd 2.4.25, mpm_event, ssl: segfaults
Hi Niklas,

On Wed, Feb 1, 2017 at 7:02 PM, Niklas Edmundsson wrote:
>
> We've started to see spurious segfaults with httpd 2.4.25, mpm_event,
> ssl on Ubuntu 14.04 LTS. Not frequent, but nonetheless happening.
>
> #4  ssl_io_filter_output (f=0x7f507013cfe0, bb=0x7f4f840be168)
>     at ssl_engine_io.c:1746
>         data = 0x7f5075518000 <error: Cannot access memory at address 0x7f5075518000>
>         len = 4194304
>         bucket = 0x7f4f840b1ba8
>         status = <optimized out>
>         filter_ctx = 0x7f507013cf88
>         inctx = <optimized out>
>         outctx = 0x7f507013d008
>         rblock = APR_NONBLOCK_READ

I suspect some cleanup ordering issue happening in ssl_io_filter_output(), when the EOC bucket is found.

> Are we hitting a corner case of process cleanup that plays merry hell
> with https/ssl, or are we just having bad luck? Ideas? Suggestions?

2.4.25 is eager to terminate/shutdown keepalive connections more quickly (than previous versions) on graceful shutdown (e.g. MaxConnectionsPerChild reached).

What might happen in ssl_io_filter_output() is that buffered output data (already deleted but not cleared) end up being reused on shutdown.

Could you please try the attached patch?

Regards,
Yann.

Index: modules/ssl/ssl_engine_io.c
===================================================================
--- modules/ssl/ssl_engine_io.c	(revision 1781324)
+++ modules/ssl/ssl_engine_io.c	(working copy)
@@ -138,6 +138,7 @@ static int bio_filter_out_pass(bio_filter_out_ctx_
     outctx->rc = ap_pass_brigade(outctx->filter_ctx->pOutputFilter->next,
                                  outctx->bb);
+    apr_brigade_cleanup(outctx->bb);
     /* Fail if the connection was reset: */
     if (outctx->rc == APR_SUCCESS && outctx->c->aborted) {
         outctx->rc = APR_ECONNRESET;
     }
@@ -1699,13 +1700,12 @@ static apr_status_t ssl_io_filter_output(ap_filter
     while (!APR_BRIGADE_EMPTY(bb) && status == APR_SUCCESS) {
         apr_bucket *bucket = APR_BRIGADE_FIRST(bb);
 
-        if (APR_BUCKET_IS_METADATA(bucket)) {
+        if (APR_BUCKET_IS_METADATA(bucket) || !filter_ctx->pssl) {
             /* Pass through metadata buckets untouched.  EOC is
              * special; terminate the SSL layer first. */
             if (AP_BUCKET_IS_EOC(bucket)) {
                 ssl_filter_io_shutdown(filter_ctx, f->c, 0);
             }
-            AP_DEBUG_ASSERT(APR_BRIGADE_EMPTY(outctx->bb));
 
             /* Metadata buckets are passed one per brigade; it might
              * be more efficient (but also more complex) to use
@@ -1712,11 +1712,10 @@ static apr_status_t ssl_io_filter_output(ap_filter
              * outctx->bb as a true buffer and interleave these with
              * data buckets. */
             APR_BUCKET_REMOVE(bucket);
-            APR_BRIGADE_INSERT_HEAD(outctx->bb, bucket);
-            status = ap_pass_brigade(f->next, outctx->bb);
-            if (status == APR_SUCCESS && f->c->aborted)
-                status = APR_ECONNRESET;
-            apr_brigade_cleanup(outctx->bb);
+            APR_BRIGADE_INSERT_TAIL(outctx->bb, bucket);
+            if (bio_filter_out_pass(outctx) < 0) {
+                status = outctx->rc;
+            }
         }
         else {
             /* Filter a data bucket. */