On Tue, Jul 6, 2021 at 4:22 PM Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:
>
> Coming back to this discussion, starting at the head because this has become
> a bit nested, I'll try to summarise:
>
> Fact: when the client of a proxied HTTP request aborts the connection (as in
> causing a c->aborted somehow), mod_proxy_http only reacts to this when writing
> (parts of) a response. The time this takes depends on the responsiveness of
> the backend. Long delays can commonly come from invoking an expensive request
> (long time to compute) or from a long-running request (as in
> server-sent events, SSEs).
Right.

>
> Even disregarding SSEs and opinions about its design, we could reduce
> resource waste in our server when we can eliminate those delays.

We could let idle connections be handled by the MPM's listener thread instead of holding a worker thread.

>
> In mod_proxy_wstunnel we have to monitor frontend and backend connection
> simultaneously due to the nature of the protocol and there the delay will not
> happen. This uses a pollset in 2.4.x and in trunk Yann wrapped that into
> ap_proxy_tunnel_* functions.

Both trunk and 2.4.x use the same proxy tunneling mechanism/functions since 2.4.48; mod_proxy_wstunnel is now an "empty shell" falling back to mod_proxy_http (which is the one creating and starting the tunnel, only for Upgrade(d) protocols so far).

In 2.4.x the tunnel still holds a worker thread for the lifetime of the connections, while trunk goes one step further by allowing the configuration of an AsyncDelay above which, when both the client and backend connections are idle, their polling is deferred to MPM event using the mpm_register_poll_callback_timeout hook mechanism (from then on the connections are handled by the registered callback). The handler then returns SUSPENDED and the worker thread is given back to the MPM (to take on other work, like handling a new incoming connection or running the callback that resumes the tunneling loop once the connections are ready, and so on).

>
> It seems that ap_proxy_tunnel_* should be usable in mod_proxy_http as well.

It is already, see above.

> Especially when waiting for the status line and when streaming the response
> body.

When the frontend connection is HTTP/1.1... That part is indeed missing; for that we'd need to move the HTTP state tracking and response parsing into a hook/callback run by the tunneling loop for each chunk of request/response data or connection event (possibly different hooks depending on the type of event).

>
> Alas, for HTTP/2, this will not do the trick since H2 secondary connections
> do not really have their own socket for polling. But that is a long-open
> issue which needs addressing. Not that easy, but certainly beneficial to have
> a PIPE socket pair where H2 main connection and workers can at least notify
> each other, if not transfer the actual data over.

The tunneling loop needs an fd on both sides for polling, but once any fd triggers (is ready) the loop uses the usual input/output filter chains to read/write the buckets brigade. So possibly two pipes per stream could do it for h2<->h1:
- the main connection writes incoming data to istream[1] to make them available for the secondary connection on istream[0];
- the secondary connection writes outgoing data to ostream[1] to make them available for the main connection on ostream[0].

The advantage of pipes (over a socketpair or the like) is that they are available on all platforms and already compatible with apr_pollset. That's two fds per pipe, but with the above they could be reused for successive streams, provided all the data have been consumed.

With this, most of the missing work would be on the main connection side I suppose: writing incoming data as they arrive is probably easy, but some kind of listener is needed for the responses on the multiple ostream[0], to multiplex them on the main connection. For istream at least, I don't think we can write raw incoming data as is though, more a tuple like (type, length, data) for each chunk so as to be able to pass meta events like RST_STREAM or the like.
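Just to illustrate the idea (this is not actual httpd/mod_http2 code, all the h2_* names below are made up for the sketch, and plenty of details like flow control or pipe reuse are left out), a per-stream pipe pair plus the (type, length, data) framing could look something like this, using plain APR pipes and pollfds:

  #include <string.h>
  #include <apr_file_io.h>
  #include <apr_poll.h>

  /* Per-stream pipe pair as discussed above (hypothetical). */
  typedef struct {
      /* istream: the main connection writes incoming stream data to
       * istream[1], the secondary (h1) connection reads/polls istream[0].
       */
      apr_file_t *istream[2];
      /* ostream: the secondary connection writes outgoing data to
       * ostream[1], the main connection reads/polls ostream[0] to
       * multiplex the responses onto the h2 connection.
       */
      apr_file_t *ostream[2];
  } h2_stream_pipes;

  /* Hypothetical framing for istream, so that meta events like
   * RST_STREAM can be passed along with DATA chunks.
   */
  typedef struct {
      apr_uint16_t type;   /* e.g. DATA, RST_STREAM, ... */
      apr_uint32_t length; /* length of the payload that follows */
  } h2_frame_hdr;

  static apr_status_t h2_stream_pipes_create(h2_stream_pipes *sp,
                                             apr_pool_t *pool)
  {
      apr_status_t rv;

      /* Blocking write ends, non-blocking read ends so that the read
       * ends can be polled (whether the writers should really block is
       * a flow control question, left open here).
       */
      rv = apr_file_pipe_create_ex(&sp->istream[0], &sp->istream[1],
                                   APR_WRITE_BLOCK, pool);
      if (rv != APR_SUCCESS) {
          return rv;
      }
      return apr_file_pipe_create_ex(&sp->ostream[0], &sp->ostream[1],
                                     APR_WRITE_BLOCK, pool);
  }

  /* What the secondary connection side would hand to the tunneling loop
   * for polling: the read end of istream for POLLIN, the write end of
   * ostream for POLLOUT.
   */
  static void h2_stream_pipes_pollfds(h2_stream_pipes *sp, apr_pool_t *pool,
                                      apr_pollfd_t *pfd_in,
                                      apr_pollfd_t *pfd_out)
  {
      memset(pfd_in, 0, sizeof(*pfd_in));
      pfd_in->p = pool;
      pfd_in->desc_type = APR_POLL_FILE;
      pfd_in->desc.f = sp->istream[0];
      pfd_in->reqevents = APR_POLLIN;

      memset(pfd_out, 0, sizeof(*pfd_out));
      pfd_out->p = pool;
      pfd_out->desc_type = APR_POLL_FILE;
      pfd_out->desc.f = sp->ostream[1];
      pfd_out->reqevents = APR_POLLOUT;
  }

  /* Main connection side: write one (type, length, data) tuple to
   * istream[1] for the secondary connection to consume.
   */
  static apr_status_t h2_istream_write(h2_stream_pipes *sp, apr_uint16_t type,
                                       const char *data, apr_uint32_t len)
  {
      h2_frame_hdr hdr;
      apr_status_t rv;

      hdr.type = type;
      hdr.length = len;
      rv = apr_file_write_full(sp->istream[1], &hdr, sizeof(hdr), NULL);
      if (rv == APR_SUCCESS && len) {
          rv = apr_file_write_full(sp->istream[1], data, len, NULL);
      }
      return rv;
  }

The tunneling loop (or whatever ends up behind ap_get_conn_socket() on the secondary connection) would then poll pfd_in/pfd_out much like it polls the two sockets today.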
On the h1 / secondary connection side all this becomes kind of transparent: all we need is a pair of core secondary input/output filters able to read from / write to the pipes (extracting and interpreting the tuples as needed), plus small changes here and there to accommodate ap_get_conn_socket(), which could now actually be a pipe.

>
> Another approach could be to "stutter" the blocking backend reads in
> mod_proxy_http with a 5 sec socket timeout or so, only to check
> frontend->aborted and read again. That might be a minimum effort approach for
> the short term.

Like Eric, I don't think this can work; we are unlikely to have ->aborted set on read unless the connection is really reset by the peer. So with a half-closed connection we still owe a response to the frontend..

>
> Yann, did I get this right?

Hopefully I made sense above with the picture of what could be / remains to be done ;)

> Eric had also the suggestion to only do this on certain content types, but
> that would not solve slow responsiveness.

The current proxy tunneling mechanism in mod_proxy_http is triggered by an "Upgrade: <proto>" header requested by the client and a "101 Switching Protocols" accept/reply from the backend. I think what Eric proposed is that we also initiate the tunnel based on some configured Content-Type(s), or maybe some r->notes set by mod_h2 (if it can determine from the start that it's relevant).

The issue I see here is that we probably won't have anything to rely on from the backend meaning that it is OK for tunneling (like the 101 above), and thus a client forging a Content-Type could open a tunnel and do whatever it wants until the connection is closed (no HTTP parsing/checking anymore). Plus, with h2 it wouldn't be enough; the h2<->h1 pipes from above would be needed for the tunneling loop to work.

Regards;
Yann.