Hi Christopher,

Thanks for the response.

> Sorry, I don't understand: was the response successfully sent to the client
> when this happens or not? Is it "just" an issue with the termination state,
> or is there also an issue with the response itself?

It's also an issue with the response. The chain is:

Varnish (status: 503) -> HAProxy (status: 200; termination: SD--) -> HAProxy Upstream (status: 200; termination: ----)

> At first glance, there are not many fixes that could explain that. Maybe the
> following one, not sure:

I had the same thought… nothing really made sense to me either.

I'll try with `-dZ` and report back!

Best,
Luke

--
Luke Seelenbinder
Stadia Maps | Founder & CEO
stadiamaps.com

> On Sep 26, 2024, at 16:28, Christopher Faulet <cfau...@haproxy.com> wrote:
>
> Hi Luke,
>
> On 26/09/2024 at 12:28, Luke Seelenbinder wrote:
>> On upgrading to 3.0.5, we began to see a lot of failed backend requests.
>> They have successful status codes but fail with connection state `SD--`. On
>> the upstream side, the request succeeds (the upstream is also HAProxy, its
>> state is `----`).
>> The data appears to be fully transferred without error, but something goes
>> wrong towards the end of the request. This happens on a rather small
>> percentage of requests, but I'm struggling to determine how to isolate the
>> problem further. Timing and bytes transferred on both sides match up.
>> Varnish is in the loop for most of these requests (but not all), and it ends
>> up returning an error response, so it's not a spurious log line where the
>> client doesn't register an error. To make matters worse, the response status
>> code from the backend is successful, so the requests can't be retried using
>> L7.
>
> Sorry, I don't understand: was the response successfully sent to the client
> when this happens or not? Is it "just" an issue with the termination state,
> or is there also an issue with the response itself?
>
>> The only change should be the upgrade from 3.0.4 to 3.0.5.
>> Our settings are pretty standard. TLS on both sides; a mix of H3, H2, and
>> H1.1 for the frontend; exclusively client-cert TLS + H1.1 for the backend.
>> Errors happen on all FE protocols.
>> Any tips on how to debug this further? Possibly relevant config below.
>
> Well, if it is an issue with the termination state while the response is
> fully sent to the client, it may be a server shutdown that is caught too
> early, when it is received with the last bytes of data.
>
> At first glance, there are not many fixes that could explain that. Maybe the
> following one, not sure:
>
> commit e2a93b649286b30245333eec5851acd3991fda47
> Author: Christopher Faulet <cfau...@haproxy.com>
> Date:   Mon Jul 29 17:48:16 2024 +0200
>
>     BUG/MEDIUM: stconn: Report error on SC on send if a previous SE error
>     was set
>
>     When a send on a connection is performed, if a SE error (or a pending
>     error) was already reported earlier, we leave immediately. No send is
>     performed. However, we must be sure to report the error at the SC level
>     if necessary. Indeed, the SE error may have been reported during the
>     zero-copy data forwarding. So during receive on the opposite side. In
>     that case, we may have missed the opportunity to report it at the SC
>     level.
>
>     The patch must be backported as far as 2.8.
>
>     (cherry picked from commit 5dc45445ff18207dbacebf1f777e1f1abcd5065d)
>     Signed-off-by: Christopher Faulet <cfau...@haproxy.com>
>
> You may try to disable the zero-copy data forwarding with the -dZ command
> line option.
>
> --
> Christopher Faulet