On Tue, Jul 6, 2021 at 4:22 PM Stefan Eissing
<stefan.eiss...@greenbytes.de> wrote:
>
> Coming back to this discussion, starting at the head because this has become 
> a bit nested, I try to summarise:
>
> Fact: when the client of a proxied http request aborts the connection (as in
> causing a c->aborted somehow), mod_proxy_http only reacts to this when writing
> (parts of) a response. The time this takes depends on the responsiveness of
> the backend. Long delays commonly come from invoking an expensive request
> (long time to compute) or from a long-running request (as in
> server-sent events, SSEs).

Right.

>
> Even disregarding SSEs and opinions about their design, we could reduce
> resource waste in our server if we can eliminate those delays.

We could let idle connections be handled by the MPM's listener thread
instead of holding a worker thread.

>
> In mod_proxy_wstunnel we have to monitor the frontend and backend connections
> simultaneously due to the nature of the protocol, so the delay will not happen
> there. This uses a pollset in 2.4.x, and in trunk Yann wrapped that into
> ap_proxy_tunnel_* functions.

Both trunk and 2.4.x have used the same proxy tunneling mechanism/functions
since 2.4.48; mod_proxy_wstunnel is now an "empty shell" falling back
to mod_proxy_http (which is the one creating and starting the tunnel,
only for Upgrade(d) protocols so far).
In 2.4.x the tunnel still holds a worker thread for the lifetime of the
connections, while trunk goes one step further by allowing the
configuration of an AsyncDelay above which, when both the client and
backend connections are idle, their polling is deferred to MPM event by
using the mpm_register_poll_callback_timeout hook mechanism (the
connections are handled by the registered callback from then on). The
handler then returns SUSPENDED and the worker thread is given back to
the MPM (to take on other work like handling a new incoming connection
or running the callback that resumes the tunneling loop once the
connections are ready, and so on..).
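
For illustration, here is roughly what that pattern looks like in a
handler. This is a sketch only: suspend_tunnel and the callbacks are
made-up names, and the ap_mpm_register_poll_callback_timeout()
prototype is from memory, so check ap_mpm.h for the real one.

/* Sketch: deferring an idle tunnel to the MPM (event) listener.
 * All names here are made up; the hook's prototype is approximate. */
#include "httpd.h"
#include "ap_mpm.h"
#include "apr_poll.h"
#include "apr_tables.h"

static void tunnel_resume_cb(void *baton)
{
    /* One of the fds is ready: re-schedule the tunneling loop on a
     * worker thread and continue from where it was suspended. */
}

static void tunnel_timeout_cb(void *baton)
{
    /* Neither side became ready in time: clean up / abort the tunnel. */
}

static int suspend_tunnel(request_rec *r, const apr_pollfd_t *client_pfd,
                          const apr_pollfd_t *backend_pfd, void *baton)
{
    apr_array_header_t *pfds;

    pfds = apr_array_make(r->pool, 2, sizeof(apr_pollfd_t));
    APR_ARRAY_PUSH(pfds, apr_pollfd_t) = *client_pfd;
    APR_ARRAY_PUSH(pfds, apr_pollfd_t) = *backend_pfd;

    /* Hand both fds to the MPM; the resume callback runs on some worker
     * thread once one of them triggers (or the timeout callback runs if
     * nothing happens before the timeout elapses). */
    if (ap_mpm_register_poll_callback_timeout(r->pool, pfds,
                                              tunnel_resume_cb,
                                              tunnel_timeout_cb,
                                              baton,
                                              apr_time_from_sec(30))
            != APR_SUCCESS) {
        return HTTP_INTERNAL_SERVER_ERROR;
    }

    /* Give the worker thread back to the MPM. */
    return SUSPENDED;
}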

>
> It seems that ap_proxy_tunnel_* should be usable in mod_proxy_http as well.

It is already, see above.

> Especially when waiting for the status line and when streaming the response 
> body. When the frontend connection is HTTP/1.1...

That part is indeed missing; for that we'd need to move the HTTP state
tracking and response parsing into a hook/callback run by the tunneling
loop for each chunk of request/response data or connection event
(possibly different hooks depending on the type of event).
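
Purely hypothetical (no such hook exists today), but it could look
something like:

/* Hypothetical hook for illustration only: the tunneling loop would run
 * it for each brigade read from the backend, so mod_proxy_http could
 * keep parsing/tracking the HTTP response state while both sides are
 * being polled. */
#include "httpd.h"
#include "ap_config.h"
#include "apr_buckets.h"

AP_DECLARE_HOOK(int, proxy_tunnel_backend_data,
                (request_rec *r, apr_bucket_brigade *bb))

/* mod_proxy_http would then register a function like this on it: */
static int http_track_response(request_rec *r, apr_bucket_brigade *bb)
{
    /* Parse status line / headers / body framing from bb, update the
     * response state, decide when the exchange is complete, etc. */
    return DECLINED;
}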

>
> Alas, for HTTP/2, this will not do the trick since H2 secondary connections 
> do not really have their own socket for polling. But that is a long-open 
> issue which needs addressing. Not that easy, but certainly beneficial to have 
> a PIPE socket pair where the H2 main connection and workers can at least
> notify each other, if not transfer the actual data over.

The tunneling loop needs an fd on both sides for polling, but once any
fd triggers (is ready) the loop uses the usual input/output filter
chains to read/write the bucket brigades.

So possibly two pipes per stream could do it for h2<->h1:
- the main connection writes incoming data to istream[1] to make them
available for the secondary connection on istream[0]
- the secondary connection writes outgoing data to ostream[1] to make
them available for the main connection on ostream[0]

The advantage of pipes (over a socketpair or the like) is that they are
available on all platforms and already apr_pollset compatible.
That's two fds per pipe, but with the above they could be reused for
successive streams, provided all the data have been consumed.
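
Something like this on the APR side (a sketch, with the istream/ostream
naming from above; all the struct/function names are invented):

/* Sketch: one pipe pair per h2 stream, both read ends pollable. */
#include <string.h>
#include "apr_file_io.h"
#include "apr_poll.h"

typedef struct {
    apr_file_t *istream[2];  /* [0] read (secondary), [1] write (main) */
    apr_file_t *ostream[2];  /* [0] read (main), [1] write (secondary) */
} h2_stream_pipes;

static apr_status_t create_stream_pipes(h2_stream_pipes *sp, apr_pool_t *p)
{
    apr_status_t rv;

    /* Non-blocking ends so that neither connection's loop can stall on
     * a full/empty pipe. */
    rv = apr_file_pipe_create_ex(&sp->istream[0], &sp->istream[1],
                                 APR_FULL_NONBLOCK, p);
    if (rv != APR_SUCCESS) {
        return rv;
    }
    return apr_file_pipe_create_ex(&sp->ostream[0], &sp->ostream[1],
                                   APR_FULL_NONBLOCK, p);
}

static apr_status_t watch_stream_output(apr_pollset_t *pollset,
                                        h2_stream_pipes *sp, apr_pool_t *p)
{
    /* The main connection polls the read end of each stream's ostream
     * to know when responses are ready to be multiplexed out. */
    apr_pollfd_t pfd;

    memset(&pfd, 0, sizeof(pfd));
    pfd.p = p;
    pfd.desc_type = APR_POLL_FILE;
    pfd.reqevents = APR_POLLIN;
    pfd.desc.f = sp->ostream[0];
    pfd.client_data = sp;
    return apr_pollset_add(pollset, &pfd);
}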

With this, most of the missing work would be on the main connection
side I suppose: writing incoming data as they arrive is probably easy,
but some kind of listener is needed for the responses on the multiple
ostream[0]s, to multiplex them onto the main connection.
For istream at least, I don't think we can write raw incoming data as
is though; more likely a tuple like (type, length, data) for each chunk,
so as to be able to pass meta like RST_STREAM or the like.
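
Say, a small framing like this (everything here is invented, just to
give an idea):

/* Sketch of a (type, length, data) framing on the pipes, so that meta
 * events like RST_STREAM can travel alongside the data chunks. */
#include "apr_file_io.h"

typedef enum {
    PCHUNK_DATA = 0,    /* header/body bytes follow */
    PCHUNK_RST_STREAM,  /* stream reset, length == 4 (error code) */
    PCHUNK_EOS          /* end of stream, length == 0 */
} pchunk_type;

typedef struct {
    apr_uint32_t type;    /* a pchunk_type */
    apr_uint32_t length;  /* number of data bytes following the header */
} pchunk_header;

static apr_status_t pchunk_write(apr_file_t *pipe_w, pchunk_type type,
                                 const void *data, apr_size_t len)
{
    pchunk_header hdr;
    apr_size_t n;
    apr_status_t rv;

    hdr.type = (apr_uint32_t)type;
    hdr.length = (apr_uint32_t)len;

    /* Same process on both ends of the pipe, so no endianness concern;
     * with non-blocking pipes real code would also handle EAGAIN. */
    rv = apr_file_write_full(pipe_w, &hdr, sizeof(hdr), &n);
    if (rv == APR_SUCCESS && len) {
        rv = apr_file_write_full(pipe_w, data, len, &n);
    }
    return rv;
}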

On the h1 / secondary connection side all this becomes kind of
transparent: all we need is a pair of core secondary input/output
filters able to read from / write to the pipes (extracting and
interpreting the tuples as needed), plus small changes here and there
to accommodate ap_get_conn_socket(), which could now actually be a
pipe.
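
For the input side I picture something like this (sketch only; the
tuple handling, blocking modes and error paths are left out, and the
names are invented):

/* Sketch of a core secondary-connection input filter whose "socket" is
 * actually the istream[0] pipe end. A real filter would first read and
 * strip the (type, length) header and handle meta chunks like
 * RST_STREAM before passing data up. */
#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"
#include "apr_file_io.h"

typedef struct {
    apr_file_t *pipe_in;   /* istream[0], set up by the h2 side */
} secondary_in_ctx;

static apr_status_t secondary_pipe_in_filter(ap_filter_t *f,
                                             apr_bucket_brigade *bb,
                                             ap_input_mode_t mode,
                                             apr_read_type_e block,
                                             apr_off_t readbytes)
{
    secondary_in_ctx *ctx = f->ctx;
    char buf[8192];
    apr_size_t len = sizeof(buf);
    apr_status_t rv;
    apr_bucket *b;

    /* mode/block/readbytes handling omitted; just pass raw bytes up. */
    rv = apr_file_read(ctx->pipe_in, buf, &len);
    if (rv != APR_SUCCESS) {
        return rv;  /* APR_EAGAIN when non-blocking, APR_EOF, ... */
    }

    /* NULL free function => the heap bucket copies the data. */
    b = apr_bucket_heap_create(buf, len, NULL, f->c->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, b);
    return APR_SUCCESS;
}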

>
> Another approach could be to "stutter" the blocking backend reads in 
> mod_proxy_http with a 5 sec socket timeout or so, only to check 
> frontend->aborted and read again. That might be a minimum effort approach for 
> the short term.

Like Eric, I don't think this can work: we are unlikely to have
->aborted set on read, unless the connection is really reset by the peer.
So with a half-closed connection we would still owe a response to the
frontend..
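
For reference, the loop Stefan describes would be something like the
following (made-up names); the catch is precisely that the
frontend->aborted check in there almost never fires for a half-closed
client connection:

/* Sketch of the "stutter" idea: short timeouts on the backend read,
 * re-checking the frontend connection in between. */
#include "httpd.h"
#include "apr_network_io.h"

static apr_status_t stutter_recv(apr_socket_t *backend_sock,
                                 conn_rec *frontend,
                                 char *buf, apr_size_t *len)
{
    apr_size_t want = *len;
    apr_status_t rv;

    apr_socket_timeout_set(backend_sock, apr_time_from_sec(5));
    do {
        if (frontend->aborted) {
            /* In practice ->aborted is only set once some filter fails
             * to write to (or read from) the client, so a half-closed
             * frontend connection never flips it here. */
            return APR_ECONNABORTED;
        }
        *len = want;
        rv = apr_socket_recv(backend_sock, buf, len);
    } while (APR_STATUS_IS_TIMEUP(rv));

    return rv;
}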

>
> Yann, did I get this right?

Hopefully I made sense above with the picture of what could be done /
remains to be done ;)

> Eric also suggested only doing this for certain content types, but
> that would not solve slow responsiveness.

The current proxy tunneling mechanism in mod_proxy_http is triggered
by an "Upgrade: <proto>" header requested by the client and a "101
Upgrade" accept/reply from the backend. I think what Eric proposed is
that we also initiate the tunnel based on some configured
Content-Type(s) or maybe some r->notes set by mod_h2 (if it can
determine from the start that it's relevant).
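
Sketched out (hypothetically: the helper and the "tunnelable
content-type" setting are invented, and the exact
ap_proxy_tunnel_create() arguments should be double-checked against
proxy_util.c), that trigger would look something like this once the
backend response header has been read:

/* Hypothetical sketch: switching to the tunneling loop based on the
 * backend response's Content-Type rather than a 101 Upgrade reply. */
#include <string.h>
#include <strings.h>
#include "httpd.h"
#include "mod_proxy.h"

static int maybe_tunnel_by_content_type(request_rec *r, conn_rec *backend_c,
                                        const char *tunnel_ct /* configured */)
{
    proxy_tunnel_rec *tunnel;
    const char *ct = apr_table_get(r->headers_out, "Content-Type");

    if (!tunnel_ct || !ct
            || strncasecmp(ct, tunnel_ct, strlen(tunnel_ct)) != 0) {
        return DECLINED;  /* keep the normal HTTP proxying path */
    }

    /* From here on there is no HTTP parsing anymore, which is exactly
     * the concern below: nothing from the backend said "tunnel me". */
    if (ap_proxy_tunnel_create(&tunnel, r, backend_c, "CT") != APR_SUCCESS) {
        return HTTP_INTERNAL_SERVER_ERROR;
    }
    return ap_proxy_tunnel_run(tunnel);
}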
The issue I see here is that we probably won't have anything from the
backend to rely on saying that tunneling is OK (like the "101
Upgrade"), and thus a client forging a Content-Type could open a tunnel
and do whatever it wants until the connection is closed (no HTTP
parsing/checks anymore). Plus with h2 it wouldn't be enough; the
h2<->h1 pipes from above are needed for the tunneling loop to work.


Regards;
Yann.
