On 23.03.2010 15:30, Jeff Trawick wrote:
On Tue, Mar 23, 2010 at 10:04 AM, Rainer Jung <rainer.j...@kippdata.de> wrote:
On 23.03.2010 13:34, Jeff Trawick wrote:

On Tue, Mar 23, 2010 at 7:19 AM, Rainer Jung <rainer.j...@kippdata.de> wrote:

I can currently reproduce the following problem with the 2.2.15 event MPM under high load:

When an httpd child process gets closed due to the max spare threads rule and it holds established client connections for which it has fully received a keep-alive request, but not yet sent any part of the response, it will simply close that connection.

Is that expected behaviour? It doesn't seem reproducible with the worker MPM. The behaviour was observed using extreme spare-thread settings in order to make processes shut down often, but it still doesn't seem right.

Is this the currently-unhandled situation discussed in this thread?


http://mail-archives.apache.org/mod_mbox/httpd-dev/200711.mbox/%3ccc67648e0711130530h45c2a28ctcd743b2160e22...@mail.gmail.com%3e

Perhaps Event's special handling for keepalive connections results in
the window being encountered more often?

I'd say yes. I know from the packet trace that the previous response on the same connection got "Connection: Keep-Alive". But from the time gap of about 0.5 seconds between receiving the next request and sending the FIN, I guess that the child was not yet in the process of shutting down when the previous "Connection: Keep-Alive" response was sent.

So for me the question is: if the web server has already acknowledged the next request (in our case it's a GET request, acknowledged with a TCP ACK), should it delay shutting down the child until the request has been processed and the response has been sent (which in this case would then include "Connection: Close")?

Since the ACK is out of our control, that situation is potentially
within the race condition.


For the connections which do not have another request pending, I see no problem in closing them - although there could be a race condition. When there's a race (the client sends the next request while the server sends the FIN), the client doesn't expect the server to handle the request (the same can always happen when a keep-alive connection times out). In the situation observed it is annoying that the server already accepted the next request and nevertheless closes the connection without handling it.

All we can know is whether or not the socket is readable at the point where we want to gracefully exit the process. In keepalive state we'd wait for {timeout, readability, shutdown-event}, and if readable at wakeup then try to process it unless !c->base_server->keep_alive_while_exiting && ap_graceful_stop_signalled().
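
In rough code that check might look like the following - a sketch only, where keep_alive_while_exiting is the per-vhost flag the patch would add (it is not an existing server_rec field):

    /* Sketch only, not the actual patch: after the keepalive wait wakes
     * up with the socket readable, decide whether to serve the pending
     * request or to give up because the process is exiting. */
    if (ap_graceful_stop_signalled()
        && !c->base_server->keep_alive_while_exiting) {
        /* exiting and not configured to drain: close the connection
         * without reading the pending request */
    }
    else {
        /* read and process the pending request as usual */
    }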

I will do some testing around your patch

http://people.apache.org/~trawick/keepalive.txt

I don't think the patch will cover Event.  It modifies
ap_process_http_connection(); ap_process_http_async_connection() is
used with Event unless there are "clogging input filters."  I guess
the analogous point of processing is inside Event itself.

I guess if KeepAliveWhileExiting is enabled (whoops, that's
vhost-specific) then Event would have substantially different shutdown
logic.

I could now take a second look at it. Directly porting your patch to trunk and the event MPM is straightforward. There remains a hard problem though: the listener thread has a big loop of the form

    while (!listener_may_exit) {
        apr_pollset_poll(...)
        while (HANDLE_EVENTS) {
            if (READABLE_SOCKET)
                ...
            else if (ACCEPT)
                ...
        }
        HANDLE_KEEPALIVE_TIMEOUTS
        HANDLE_WRITE_COMPLETION_TIMEOUTS
    }

Obviously, if we want to respect any previously returned "Connection: Keep-Alive" headers, we can't terminate the loop on listener_may_exit. As a first try, I switched to:

    while (1) {
        if (listener_may_exit)
            ap_close_listeners();
        apr_pollset_poll(...);
        REMOVE_LISTENERS_FROM_POLLSET
        while (HANDLE_EVENTS) {
            if (READABLE_SOCKET)
                ...
            else if (ACCEPT)
                ...
        }
        HANDLE_KEEPALIVE_TIMEOUTS
        HANDLE_WRITE_COMPLETION_TIMEOUTS
    }

Now the listeners get closed, and in combination with your patch the connections will not be dropped, but instead will receive a "Connection: close" response to the next request.

Now the while loop lacks a correct break criterion. It would need to stop when the pollset is empty (the listeners were removed, and the remaining connections were closed at the end of their keep-alive or due to a timeout). Unfortunately there is no API function to check whether there are still sockets in the pollset, and it isn't straightforward to determine that otherwise.
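
Just to illustrate the bookkeeping such a check would need - an untested sketch, where event_pollset stands for the listener's pollset and the pollset_population counter is made up:

    /* Untested sketch: keep an own count of the descriptors in the
     * pollset, because APR offers no call to ask whether it is empty. */
    static volatile apr_uint32_t pollset_population = 0;

    /* wherever a descriptor is added: */
    rv = apr_pollset_add(event_pollset, &pfd);
    if (rv == APR_SUCCESS) {
        apr_atomic_inc32(&pollset_population);
    }

    /* wherever a descriptor is removed: */
    rv = apr_pollset_remove(event_pollset, &pfd);
    if (rv == APR_SUCCESS) {
        apr_atomic_dec32(&pollset_population);
    }

    /* and as a break criterion at the top of the listener loop: */
    if (listener_may_exit && apr_atomic_read32(&pollset_population) == 0) {
        break;
    }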

Another possibility would be to wait for the maximum of the vhost keepalive timeouts. But that seems to be a bit too much.
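
For illustration, that maximum could be computed roughly like this (a sketch; keep_alive_timeout and next are the usual server_rec fields):

    /* Sketch: the longest KeepAliveTimeout over all configured
     * (virtual) servers, as an upper bound for draining the pollset. */
    static apr_interval_time_t max_keep_alive_timeout(server_rec *main_server)
    {
        apr_interval_time_t max = 0;
        server_rec *s;

        for (s = main_server; s != NULL; s = s->next) {
            if (s->keep_alive_timeout > max) {
                max = s->keep_alive_timeout;
            }
        }
        return max;
    }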

Any ideas or comments?

Regards,

Rainer
