Hello. I noticed that this patch has not been merged yet. Are there any other issues? Can I help with anything?
On Tue, May 21, 2024 at 5:26 PM Maxim Dounin <mdou...@mdounin.ru> wrote: > > Hello! > > On Fri, May 17, 2024 at 04:35:57PM +0800, Kasei Wang wrote: > > > Hello! > > > > On Fri, May 17, 2024 at 2:06 AM Maxim Dounin <mdou...@mdounin.ru> wrote: > > > > > > Hello! > > > > > > On Wed, May 15, 2024 at 11:46:29AM +0800, Kasei Wang wrote: > > > > > > > Hello, > > > > > > > > I found that there is a slight probability of HTTP/2 connections not > > > > properly closing during graceful shutdown, leading to worker processes > > > > in shutting down state remaining stuck for an extended period. After > > > > investigation, the issue appears to stem from the following: > > > > > > > > 1. worker processes in shutting down state use > > > > ngx_close_idle_connections to close all idle connections, including > > > > HTTP/2 connections. > > > > 2. For HTTP/2 connections, c->idle is set to true in ngx_http_v2_init. > > > > According to the explanation in > > > > <https://hg.nginx.org/nginx/rev/5e95b9fb33b7>, GOAWAY should be sent to > > > > all HTTP/2 connections. > > > > 3. There might be a time gap between ngx_event_accept and > > > > ngx_http_v2_init. For TLS connections, ngx_http_v2_init will be > > > > executed after ALPN received, and for plaintext http2 connections, > > > > ngx_http_v2_init will be executed after parsing the http2 preface. If > > > > ngx_close_idle_connections is executed between ngx_event_accept and > > > > ngx_http_v2_init, there's a possibility that c->idle of some > > > > connections is set to true AFTER ngx_close_idle_connections, causing > > > > those connections to not enter the GOAWAY process and leading to the > > > > aforementioned problem. > > > > > > > > To verify this, I've written a simple HTTP/2 client. This program will > > > > wait 15 seconds after TCP connection establishment before starting to > > > > send data. The purpose of sleep is to to raise the probability of > > > > encountering the issue. You can reproduce the problem by executing > > > > "nginx -s reload" during this 15-second wait. If you're interested, you > > > > can try my test program > > > > (<https://github.com/kaseiwang/http2-grace-shutdown-test>) to reproduce > > > > the issue. > > > > > > > > The following patch would call ngx_http_v2_finalize_connection to close > > > > http2 connections which is initialized after > > > > ngx_close_idle_connections. > > > > > > > > And here are some previous discussion on the another maillist: > > > > <https://mailman.nginx.org/pipermail/nginx-devel/2024-April/TSBIREWXEN7B4U7SRSR4LRJDPOR3XH4Q.html > > > > > > > > > > > > > Please confirm if this issue exists, review my analysis and the patch > > > > if possible. Thank you very much. > > > > > > I think your analysis is correct. Thanks for catching this. > > > > > > > # HG changeset patch > > > > # User Kasei Wang <ka...@kasei.im> > > > > # Date 1715744317 -28800 > > > > # Wed May 15 11:38:37 2024 +0800 > > > > # Node ID 7f18358cc012039f6bb4e502a6f39a7bd50d669b > > > > # Parent 4cfd64a8330dd5bacd839796732fd8c49d87e86e > > > > HTTP/2: close http2 connections initialized during graceful shutdown. > > > > > > > > In some rare cases, a HTTP/2 connections can be initialized during a > > > > graceful shutdown. Now close such an connection to avoid unexcepted > > > > delays in the graceful shutdown. > > > > > > > > diff -r 4cfd64a8330d -r 7f18358cc012 src/http/v2/ngx_http_v2.c > > > > --- a/src/http/v2/ngx_http_v2.c Tue May 14 17:47:44 2024 +0300 > > > > +++ b/src/http/v2/ngx_http_v2.c Wed May 15 11:38:37 2024 +0800 > > > > @@ -304,6 +304,11 @@ > > > > c->idle = 1; > > > > ngx_reusable_connection(c, 0); > > > > > > > > + if (ngx_exiting) { > > > > + ngx_http_v2_finalize_connection(h2c, NGX_HTTP_V2_NO_ERROR); > > > > + return; > > > > + } > > > > + > > > > if (c->buffer) { > > > > p = c->buffer->pos; > > > > end = c->buffer->last; > > > > > > The patch looks working, though in the case in question the code > > > will do various otherwise unneeded things, notably will send > > > SETTINGS and WINDOW_UPDATE frames to the client. OTOH, avoiding > > > these might be tricky and will require more complex changes, and > > > probably don't worth the effort. > > > > > > Another approach might be to allow just one request if a > > > connection is established just before the shutdown, much like > > > HTTP/1.x code does. Something as simple as the following patch > > > should do the trick: > > > > > > diff --git a/src/http/v2/ngx_http_v2.c b/src/http/v2/ngx_http_v2.c > > > --- a/src/http/v2/ngx_http_v2.c > > > +++ b/src/http/v2/ngx_http_v2.c > > > @@ -1336,7 +1336,8 @@ ngx_http_v2_state_headers(ngx_http_v2_co > > > clcf = ngx_http_get_module_loc_conf(h2c->http_connection->conf_ctx, > > > ngx_http_core_module); > > > > > > - if (clcf->keepalive_timeout == 0 > > > + if (ngx_exiting > > > + || clcf->keepalive_timeout == 0 > > > || h2c->connection->requests >= clcf->keepalive_requests > > > || ngx_current_msec - h2c->connection->start_time > > > > clcf->keepalive_time) > > > > > > > > > What do you think? > > [...] > > > Thanks for your response. > > > > Actually, I was worried that sending GOAWAY on stream 0 might cause > > compatibility issues for some clients. There are so many client > > implementations, and I can't be sure that all clients can correctly > > retry the request in this situation. > > > > The disadvantage of your approach is that the worker process may still > > need to wait for the first request to arrive. This is not a very long > > time, and it should be limited by the client_header_timeout, which is > > 60s by default. So I think the waiting is acceptable. The advantage of > > this approach is that the process to send GOAWAY frame would be more > > similar to the existing reload process, reducing the possibility of > > affecting client compatibility. > > > > Overall, I prefer your approach because it minimizes the change impact > > for the client. > > Thanks for sharing. > > For completeness, full patch with commit log below. > > # HG changeset patch > # User Maxim Dounin <mdou...@mdounin.ru> > # Date 1716283099 -10800 > # Tue May 21 12:18:19 2024 +0300 > # Node ID 429d40e8275d2606da7cb710de5bc40a905fe52f > # Parent 46ecad404a296042c0088e699f275a92758e5ab9 > HTTP/2: handling of connections initialized during shutdown. > > If an HTTP/2 connection opened before a graceful shutdown, but > ngx_http_v2_init() is called after idle connections were closed, such > a connection ended up being open till closed by the client (or up to > keepalive_time), delaying shutdown. > > With this change, such connections are allowed to serve just one request, > much like it happens in HTTP/1.x, and closed afterwards. > > Reported by Kasei Wang, > https://freenginx.org/pipermail/nginx-devel/2024-May/000277.html > > diff --git a/src/http/v2/ngx_http_v2.c b/src/http/v2/ngx_http_v2.c > --- a/src/http/v2/ngx_http_v2.c > +++ b/src/http/v2/ngx_http_v2.c > @@ -1336,7 +1336,8 @@ ngx_http_v2_state_headers(ngx_http_v2_co > clcf = ngx_http_get_module_loc_conf(h2c->http_connection->conf_ctx, > ngx_http_core_module); > > - if (clcf->keepalive_timeout == 0 > + if (ngx_exiting > + || clcf->keepalive_timeout == 0 > || h2c->connection->requests >= clcf->keepalive_requests > || ngx_current_msec - h2c->connection->start_time > > clcf->keepalive_time) > > -- > Maxim Dounin > http://mdounin.ru/ > -- > nginx-devel mailing list > nginx-devel@freenginx.org > https://freenginx.org/mailman/listinfo/nginx-devel