Hi, Sending a patch in previous email, I fully understand that similar proposals have already been discussed and rejected (those were implemented in a different approach though): https://trac.nginx.org/nginx/ticket/1022 https://mailman.nginx.org/pipermail/nginx-devel/2019-January/011804.html https://mailman.nginx.org/pipermail/nginx-devel/2021-December/014644.html
Thus I understand that my patch might be rejected for similar reasons, so I'll try to be specific and clarify what problems I'm trying to solve. Problems: While it's perfectly fine to close idle HTTP/1.1 connections at any time according to RFC 9112, it can still cause unnecessary and potentially avoidable problems during configuration reloads, such as: - Retries for requests that are problematic for automatic retries (non-idempotent requests). Unfortunately, it's not always possible to make all requests idempotent and safe for automatic retries. As a result, some of the closed idle connections will cause errors in client applications that can only be resolved manually. This, in turn, leads to a poor user experience. - Unnecessary retries for all other types of requests. While the client should be prepared for this, it will still result in additional latency for retried requests that could be avoided (the more nginx reloads, the more unnecessary retries). - According to rfc9110 [1] it's not allowed to retry non-idempotent requests by proxy. So it's not possible to retry them by some intermediate proxy before nginx, and all those errors should be returned to the client - this will lead to an additional latency even if the client supports automatic retries for non-idempotent requests. - According to rfc9110 [1] a client should not automatically retry a failed automatic retry. There is a high probability that during nginx reloads, a client will retry some requests via closing connections because they all close simultaneously. This, again, will lead to a portion of errors that can only be resolved manually. The main idea: while clients should be prepared for unexpected errors (such as server crashes, lack of free connections, etc.), there is no point in deliberately creating such situations (especially in environments with frequent nginx reloads). With properly configured clients: client_idle_timeout < min(server_shutdown_delay, server_idle_timeout) this patch will allow to minimize the number of unnecessary retries and errors for environments where it's critical. Specifics of the environment: - Nginx hosts as frontend L7 balancers for all incoming external L7 traffic in the datacenter, and in some cases for east-west traffic between microservices. - Almost all of http connections are keep-alive HTTP/1.1 with 300s idle_client_timeout. - Frequent nginx configuration reloads. - Large amount of requests with non-idempotent methods. - Large amount of long-lived "in-progress" connections such as websockets (so in this situation there is no point to immediately shutdown idle HTTP/1.1 connections, because old nginx workers will continue to work anyway). - High diversity of client applications (different mobile platforms, different desktop platforms, different client libraries, different dev teams for those client applications). - We're currently using nginx-plus, so we can't just patch it, that's why it's important for us to propagate this or a similar change in the official release (we've also opened a ticket through the nginx-plus support channel, in case it matters). Additional comments: This patch doesn't change the default behavior and tries to be as unobtrusive as possible. With the option enabled, the configured shutdown delay will only be applied to idle keepalive HTTP/1.1 connections - all other protocols will continue to work as usual (for example, there is no point in delaying shutdown for HTTP/2 or HTTP/3 protocols - they have built-in graceful shutdown mechanisms). However, if needed, the same behavior could easily be added for other protocols in the future. If you don't mind adding the proposed behavior, but don't like the way it was done in the patch, then please advise how it could be reworked. I also understand that additional tests should be added to the nginx-tests repo. If you accept the patch, I will prepare the tests a bit later. Also, let me know if you need more details about use cases where this patch might be useful. [1] https://www.rfc-editor.org/rfc/rfc9110#section-9.2.2-7 -- Artem Pogartsev _______________________________________________ nginx-devel mailing list nginx-devel@nginx.org https://mailman.nginx.org/mailman/listinfo/nginx-devel