Re: [haproxy/haproxy] OPTIM/MINOR: h2_settings_initial_window_size default 64k (PR #1732)

2022-06-08 Thread Glenn Strauss
On Wed, Jun 08, 2022 at 02:42:10PM +0200, Willy Tarreau wrote:
> On Wed, Jun 08, 2022 at 08:29:48AM -0400, Glenn Strauss wrote:
> > > > Another approach might be to preemptively double the effective
> > > > window size by sending a WINDOW_UPDATE for the entire window size
> > > > (65535) back to the client upon receipt of the first DATA frame,
> > > > and then keeping track on the server side until that 65535 is
> > > > exhausted before sending another WINDOW_UPDATE.
> > > 
> > > Sending too large stream window sizes is problematic when some streams
> > > can be processed slowly, as it easily causes head-of-line blocking. If
> > > the advertised window is larger than the available storage, a paused
> > > stream will block other ones.
> > 
> > We know that poorly written code knows no bounds.  Since the
> > server is permitted to increase the window size, inability of the
> > client to handle an increased window size is a bug in the client.  The
> > client need only track the window size -- the client need not actually
> > use the extra window size if it does not wish to do so.
> 
> I was not speaking about the client but the server. If the server
> advertises more than it can buffer itself, you can end up with blocked
> streams: there's an elephant in the pipe that you can't pull completely
> because you have nowhere to store it, and because of this you can't
> access other frames that follow (the funniest ones being window updates
> from the client that are expected to release some room on the response
> side, but that basically only happens with echo servers). But you can
> leave headers frames hanging making the client think the server is taking
> time to process the requests while in fact the requests are not yet parsed,
> they're just blocked.

The general statement you make is true, but it does not apply here
unless the server would not have sent the WINDOW_UPDATE to replenish
the window; in that case, the answer is to defer sending the
WINDOW_UPDATE.

Since DATA frames might be in flight on the network, the server may want
to be able to buffer twice the advertised window size, deferring
WINDOW_UPDATE once a full advertised window is already buffered.  Doing
so gives the elephant in the pipe just enough space for a well-behaved
client adhering to the window size restrictions, without the undesirable
alternative of delaying WINDOW_UPDATE frames by an RTT.
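A minimal sketch of that bookkeeping, with hypothetical names (this is
not haproxy's or lighttpd's actual code):

    /* Per-stream receive accounting for a server that buffers up to
     * twice the advertised window and withholds WINDOW_UPDATE while a
     * full advertised window is already sitting in its buffer. */
    #include <stdint.h>

    struct h2_stream_rx {
        uint32_t window;    /* advertised stream window size */
        uint32_t buffered;  /* bytes received but not yet consumed */
        uint32_t unacked;   /* bytes consumed but not yet acked */
    };

    /* called when the application consumes n buffered bytes;
     * returns the WINDOW_UPDATE increment to send (0 = send nothing) */
    static uint32_t h2_rx_consumed(struct h2_stream_rx *s, uint32_t n)
    {
        uint32_t wu;
        s->buffered -= n;
        s->unacked  += n;
        /* defer the WINDOW_UPDATE while a full advertised window is
         * still buffered; that much more data may already be in flight */
        if (s->buffered >= s->window)
            return 0;
        wu = s->unacked;
        s->unacked = 0;
        return wu;
    }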

Cheers, Glenn



Re: [haproxy/haproxy] OPTIM/MINOR: h2_settings_initial_window_size default 64k (PR #1732)

2022-06-08 Thread Glenn Strauss
On Wed, Jun 08, 2022 at 08:28:18AM +0200, Willy Tarreau wrote:
> On Tue, Jun 07, 2022 at 05:24:09PM -0400, Glenn Strauss wrote:
> > Yes, having the server merge the sizes of small DATA frames into a
> > larger WINDOW_UPDATE might be useful, too, but is independent and
> > complimentary to my patch proposed in
> > https://github.com/haproxy/haproxy/pull/1732

*complementary  [correcting myself]

> I agree that it's independent but it's the one that is not expected to
> cause any regression with any possible client. That's why I'd like to
> have the two. First that one because it should be durable. Second, your
> patch as an optimization, and if anyone complains about trouble with it
> we can safely revert it or decide what to do without worrying about all
> deployed curl clients that will not be updated.

Were this to cause a regression in practice, the default initial window
size can be configured in haproxy, so the workaround is to set the
value back to 65535.
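For example, using haproxy's existing global tuning directive:

    global
        # restore the previous default per-stream window size
        tune.h2.initial-window-size 65535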

> > One approach might be to collect small DATA frames and send a
> > WINDOW_UPDATE once a threshold is reached.
> 
> I thought about that one, focusing on 1/16 of a max frame size at once,
> but I don't like it much because it means we'd delay some WU, and the
> H2 spec was very clear about the fact that we ought to do our best to
> timely ack data in order to avoid deadlocks. You can have clients that
> stop sending until they're able to read a new block of a preconfigured
> window size for example (and that's where 65535 was an absurd choice
> in the first place). For example, a client that is hard-coded to use
> a 65535 buffer and needs to wait for the WU to replace that buffer
> would not receive an update for the block that spans from 61440 to
> 65535 and would not update the buffer. Thus I really want to be certain
> that we do not leave processed DATA frames without the equivalent WU sent
> back.
> 
> > The upcoming lighttpd 1.4.65
> > does just that, collecting updates until 16384 is reached, and then
> > sending a WINDOW_UPDATE for 16384.  Cost: a single 1-RTT delay in
> > sending a WINDOW_UPDATE to continue uploads larger than 112k-1.
> 
> It can definitely cause some inefficient clients to hang if they need
> their buffer to be fully acked and their size is not a multiple of the
> frame size. Be careful about this.

I ended up adjusting this before releasing lighttpd 1.4.65.

For the cost of the same few instructions, lighttpd 1.4.65 now sends a
WINDOW_UPDATE for 16384 when lighttpd receives the first DATA frame
containing data (1-16384 bytes), and then does not send another
WINDOW_UPDATE until the next 16384-byte block is started.  This avoids
any delay in sending WINDOW_UPDATE, though it does (temporarily)
increase the window size allowed to the client if the client has not
already sent data consuming the window.
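A rough sketch of that accounting, with hypothetical names (not
lighttpd's actual code):

    /* Ack each 16384-byte block as soon as the block is started, so the
     * WINDOW_UPDATE is already on the wire before the block completes. */
    #include <stdint.h>

    struct h2_rx_blocks {
        uint64_t recvd;  /* total DATA payload bytes received on stream */
    };

    /* called per DATA frame; returns WINDOW_UPDATE increment to send */
    static uint32_t h2_rx_block_data(struct h2_rx_blocks *s, uint32_t len)
    {
        /* 16384-byte blocks started before and after this frame */
        uint64_t before = (s->recvd + 16383) / 16384;
        uint64_t after;
        s->recvd += len;
        after = (s->recvd + 16383) / 16384;
        return (uint32_t)((after - before) * 16384);
    }

The first DATA frame containing data starts block 1 and immediately
yields a WINDOW_UPDATE of 16384; nothing more is sent until a byte of
the next block arrives.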

> > Another approach might be to preemptively double the effective window
> > size by sending a WINDOW_UPDATE for the entire window size (65535)
> > back to the client upon receipt of the first DATA frame, and then
> > keeping track on the server side until that 65535 is exhausted before
> > sending another WINDOW_UPDATE.
> 
> Sending too large stream window sizes is problematic when some streams
> can be processed slowly, as it easily causes head-of-line blocking. If
> the advertised window is larger than the available storage, a paused
> stream will block other ones.

We know that poorly written code knows no bounds.  Since the
server is permitted to increase the window size, inability of the
client to handle an increased window size is a bug in the client.  The
client need only track the window size -- the client need not actually
use the extra window size if it does not wish to do so.

Cheers, Glenn



Re: [haproxy/haproxy] OPTIM/MINOR: h2_settings_initial_window_size default 64k (PR #1732)

2022-06-07 Thread Glenn Strauss
On Tue, Jun 07, 2022 at 09:27:43AM -0700, Willy Tarreau wrote:
> Hello Glenn,
> 
> Thanks for your report, I understand the problem, that's very interesting. I 
> would say it's not even an optim, rather an approach trying to help the whole 
> ecosystem at a very low cost by preventing known degenerative cases from 
> happening (or helping them recover).

References (for the list):
https://github.com/haproxy/haproxy/pull/1732
https://github.com/nghttp2/nghttp2/issues/1722
https://github.com/curl/curl/pull/8965

Summary: degenerative HTTP/2 client behavior sending mostly 1-byte DATA
frames, triggered when a fast client continually exhausts the HTTP/2
send window (e.g. for a large file upload) *and* the client empties and
reuses a single power-of-2-sized buffer.  A simple server sending a
WINDOW_UPDATE for each DATA frame to replenish the window may reflect
back the exact size of the DATA frame.
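A simplified trace of how the degenerate case arises (65536-byte client
buffer, default 65535 stream window, server reflecting each DATA frame
size):

    - the client fills its 65536-byte buffer and sends 65535 bytes
      (window now 0); 1 byte remains stuck in the buffer
    - the server reflects WINDOW_UPDATEs matching the received frame
      sizes
    - the client sends the leftover byte as a 1-byte DATA frame, then
      empties and refills the buffer; the off-by-one recurs every cycle
    - because the server reflects exact frame sizes, each 1-byte
      WINDOW_UPDATE later licenses another 1-byte DATA frame, so the
      tiny frames persist instead of being absorbed into larger ones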

> However I don't fully agree with the solution, it might just push the issue a 
> bit sidewards.

Changing the default SETTINGS_INITIAL_WINDOW_SIZE does push the issue a
bit sideways, but in doing so mitigates a demonstrated (not theoretical)
issue which might befall 6+ years of curl installations, a sizable
number of which might not be upgraded for a very long time.

> For example a client might already have worked around this issue by using a 
> 65537-bytes buffer and will now face issues again.

I do not follow.  How might that be?
Also, do you know of any such implementations?

For a slightly more intelligent client actively trying to avoid the
degenerative behavior, a simple approach is to avoid injecting tiny DATA
frames when the send window is larger than the data currently buffered
and there is more data to be sent.  I have proposed a patch to curl in
https://github.com/curl/curl/pull/8965
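A sketch of such a client-side heuristic, with a hypothetical name and
threshold (the actual curl patch differs in detail):

    #include <stdbool.h>
    #include <stdint.h>

    /* Hold back a tiny trailing chunk when the send window could carry
     * more and further application data is on the way, so the chunk can
     * be coalesced into a larger DATA frame on the next pass. */
    static bool h2_should_defer_send(uint32_t ready,   /* bytes buffered */
                                     uint32_t window,  /* send window left */
                                     bool more_coming) /* more app data? */
    {
        return more_coming && ready < window && ready < 1024;
    }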

The default SETTINGS_MAX_FRAME_SIZE is 16384.

My proposed change in https://github.com/haproxy/haproxy/pull/1732
merely changes the default SETTINGS_INITIAL_WINDOW_SIZE in haproxy from
65535 to 64k (65536), a multiple of the default SETTINGS_MAX_FRAME_SIZE.
This provides a general benefit and avoids the degenerative behavior
described in the links at top.

> Thus I'd first prefer to solve it at the core, then possibly merge this patch 
> later as an optimization but not for the problem, rather to make sure clients
> can make 4 full 16kB frames, and improve bandwidth and CPU efficiency.

Yes, having the server merge the sizes of small DATA frames into a
larger WINDOW_UPDATE might be useful, too, but is independent and
complimentary to my patch proposed in
https://github.com/haproxy/haproxy/pull/1732

> Given the nature of the problem, I don't imagine we'd meet this issue with 
> interleaved 1-byte frames from multiple streams over the same connection.

Actually, haproxy sets the connection (session) window size to a very
large number, effectively removing HTTP/2 application-level flow control
at the connection level.  That leaves independent streams with the
default SETTINGS_INITIAL_WINDOW_SIZE of 65535, so multiple uploads can
independently exhaust their stream send windows and run into this
degenerative behavior, provided the client is fast enough, the
bandwidth-delay product is large enough, and the client fully empties a
power-of-2-sized buffer before reusing it.

> I'm thinking how we could improve the situation on our side by sending less 
> common window updates. Right now we send the stream WU after reading a frame 
> because of the likelihood that the next frame is not for the same stream and 
> that we lose track of which stream needs to be updated. We can change this to 
> only send a single frame for a number of subsequent frames from the same 
> stream, it will solve the problem for the case you're facing and will be more 
> friendly to the client. Given the nature of the problem, I don't imagine we'd 
> meet this issue with interleaved 1-byte frames from multiple streams over the 
> same connection. If that were the case we'd then need to send less updates, 
> waiting for a sub-divider to be crossed (e.g. check when 1/16th of the max 
> frame size is crossed). But here I really don't think we'd need to go that 
> far and I think that sending grouped updates will be sufficient.

One approach might be to collect small DATA frames and send a
WINDOW_UPDATE once a threshold is reached.  The upcoming lighttpd 1.4.65
does just that, collecting updates until 16384 is reached, and then
sending a WINDOW_UPDATE for 16384.  Cost: a single 1-RTT delay in
sending a WINDOW_UPDATE to continue uploads larger than 112k-1.
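A minimal sketch of that original threshold scheme, with a hypothetical
name (the released lighttpd 1.4.65 behaves differently, as described in
the follow-up above):

    #include <stdint.h>

    /* accumulate consumed-but-unannounced DATA bytes; announce a full
     * 16384 block at once */
    static uint32_t h2_rx_collect(uint32_t *pending, uint32_t len)
    {
        *pending += len;
        if (*pending < 16384)
            return 0;
        *pending -= 16384;
        return 16384;  /* send a WINDOW_UPDATE for 16384 */
    }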

Another approach might be to preemptively double the effective window
size by sending a WINDOW_UPDATE for the entire window size (65535) back
to the client upon receipt of the first DATA frame, and then keeping
track on the server side until that 65535 is exhausted before sending
another WINDOW_UPDATE.  Again, instead of 65535, I would recommend 65536
or another multiple of SETTINGS_MAX_FRAME_SIZE.
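A sketch of that double-window idea, with hypothetical names and using
65536 per the recommendation above:

    #include <stdbool.h>
    #include <stdint.h>

    struct h2_rx_double {
        bool     primed;   /* extra window already granted up front? */
        uint32_t consumed; /* bytes received since the last grant */
    };

    /* called per DATA frame; returns WINDOW_UPDATE increment to send */
    static uint32_t h2_rx_double_data(struct h2_rx_double *s, uint32_t len)
    {
        s->consumed += len;
        if (!s->primed) {
            s->primed = true;
            return 65536;  /* double the effective window up front */
        }
        if (s->consumed < 65536)
            return 0;
        s->consumed -= 65536;
        return 65536;  /* replenish once a full grant has been consumed */
    }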

Cheers, Glenn