Thank you Maxim for that comprehensive explanation.

I will think about non_idempotent then, and wait for an eventual release of freenginx that natively solves that issue :)

Have a great day,
Sébastien

On Mon, Mar 25, 2024 at 16:20, Maxim Dounin <mdou...@mdounin.ru> wrote:

> Hello!
>
> On Mon, Mar 25, 2024 at 01:31:26PM +0100, Sébastien Rebecchi wrote:
>
> > I have an issue with nginx closing prematurely connections when reload is
> > performed.
> >
> > I have some nginx servers configured to proxy_pass requests to an upstream
> > group. This group itself is composed of several servers which are nginx
> > themselves, and is configured to use keepalive connections.
> >
> > When I trigger a reload (-s reload) on an nginx of one of the servers which
> > is target of the upstream, I see in error logs of all servers in front that
> > connection was reset by the nginx which was reloaded.
>
> [...]
>
> > And here the kind of error messages I get when I reload nginx of "IP_1":
> >
> > --- BEGIN ---
> >
> > 2024/03/25 11:24:25 [error] 3758170#0: *1795895162 recv() failed (104:
> > Connection reset by peer) while reading response header from upstream,
> > client: CLIENT_IP_HIDDEN, server: SERVER_HIDDEN, request: "POST
> > /REQUEST_LOCATION_HIDDEN HTTP/2.0", upstream:
> > "http://IP_1:80/REQUEST_LOCATION_HIDDEN", host: "HOST_HIDDEN", referrer:
> > "REFERRER_HIDDEN"
> >
> > --- END ---
> >
> >
> > I thought -s reload was doing graceful shutdown of connections. Is it due
> > to the fact that nginx can not handle that when using keepalive
> > connections? Is it a bug?
> >
> > I am using nginx 1.24.0 everywhere, no particular
>
> This looks like a well known race condition when closing HTTP
> connections. In RFC 2616, it is documented as follows
> (https://datatracker.ietf.org/doc/html/rfc2616#section-8.1.4):
>
>    A client, server, or proxy MAY close the transport connection at any
>    time. For example, a client might have started to send a new request
>    at the same time that the server has decided to close the "idle"
>    connection.
>    From the server's point of view, the connection is being
>    closed while it was idle, but from the client's point of view, a
>    request is in progress.
>
>    This means that clients, servers, and proxies MUST be able to recover
>    from asynchronous close events. Client software SHOULD reopen the
>    transport connection and retransmit the aborted sequence of requests
>    without user interaction so long as the request sequence is
>    idempotent (see section 9.1.2). Non-idempotent methods or sequences
>    MUST NOT be automatically retried, although user agents MAY offer a
>    human operator the choice of retrying the request(s). Confirmation by
>    user-agent software with semantic understanding of the application
>    MAY substitute for user confirmation. The automatic retry SHOULD NOT
>    be repeated if the second sequence of requests fails.
>
> That is, when you shutdown your backend server, it closes the
> keepalive connection - which is expected to be perfectly safe from
> the server point of view. But if at the same time a request is
> being sent to this connection by the client (frontend nginx server
> in your case) - this might result in an error.
>
> Note that the race is generally unavoidable and such errors can
> happen at any time, during any connection close by the server.
> Closing multiple keepalive connections during shutdown makes such
> errors more likely though, since connections are closed right
> away, and not after keepalive timeout expires. Further, since in
> your case there are just a few loaded keepalive connections, this
> also makes errors during shutdown more likely.
>
> Typical solution is to retry such requests, as RFC 2616
> recommends. In particular, nginx does so based on the
> "proxy_next_upstream" setting. Note that to retry POST requests
> you will need "proxy_next_upstream ... non_idempotent;" (which
> implies that non-idempotent requests will be retried on errors,
> and might not be the desired behaviour).
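
For reference, the retry setup described above could look roughly like the following on the front-end servers. This is only a sketch under assumptions: the original configuration was not posted, so the upstream name "backend_nginx", the "IP_2" placeholder, the "keepalive 16" value and the "proxy_next_upstream_tries 2" limit are all hypothetical. "IP_1" and "IP_2" stand in for real addresses and must be replaced before nginx will accept the file.

```nginx
# Hypothetical front-end configuration; names and values are placeholders,
# not the poster's actual setup.
upstream backend_nginx {
    server IP_1:80;
    server IP_2:80;

    # Number of idle keepalive connections cached per worker (assumed value).
    keepalive 16;
}

server {
    location / {
        proxy_pass http://backend_nginx;

        # Needed for keepalive connections to the upstream servers.
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Retry on connection errors and timeouts. "non_idempotent" also
        # allows POST, LOCK and PATCH requests to be passed to the next server.
        proxy_next_upstream error timeout non_idempotent;

        # Limit the number of attempts per request (assumed value).
        proxy_next_upstream_tries 2;
    }
}
```

With "non_idempotent" set, a POST whose first attempt actually reached the backend before the connection was reset may be processed twice, so this probably only makes sense where the application tolerates duplicate requests.
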
>
> Another possible approach is to try to minimize the race window by
> waiting some time after the shutdown before closing keepalive
> connections. There were several attempts in the past to implement
> this, the last one can be found here:
>
> https://mailman.nginx.org/pipermail/nginx-devel/2024-January/YSJATQMPXDIBETCDS46OTKUZNOJK6Q22.html
>
> While there are some questions to the particular patch, something
> like this should probably be implemented.
>
> This is my TODO list, so a proper solution should be eventually
> available out of the box in upcoming freenginx releases.
>
> Hope this helps.
>
> --
> Maxim Dounin
> http://mdounin.ru/
> --
> nginx mailing list
> nginx@freenginx.org
> https://freenginx.org/mailman/listinfo/nginx