Hi,

> On Jan 25, 2023, at 10:17, Maxim Dounin <mdou...@mdounin.ru> wrote:
>
> Hello!
>
> On Mon, Jan 23, 2023 at 07:01:16PM +0800, Miao Wang wrote:
>
>>> On Jan 23, 2023, at 12:05, Maxim Dounin <mdou...@mdounin.ru> wrote:
>>>
>>> On Wed, Jan 18, 2023 at 11:28:52PM +0800, Miao Wang wrote:
>>>
>>>> # HG changeset patch
>>>> # User Miao Wang <shankerwangm...@gmail.com>
>>>> # Date 1674055068 -28800
>>>> #      Wed Jan 18 23:17:48 2023 +0800
>>>> # Node ID 73aa64bd29f3dec9e43e97560d6b5a07cdf40063
>>>> # Parent  07b0bee87f32be91a33210bc06973e07c4c1dac9
>>>> HTTP: trigger lingering close when keepalive connection will be closed
>>>>
>>>> When finalizing a request, if the request is not keepalive but
>>>> its connection has served more than one request, then the connection
>>>> was previously a keepalive connection and will be closed after this
>>>> response. In this situation it is likely that pipelined requests
>>>> follow, which we should ignore. As a result, lingering close is
>>>> necessary in this case.
>>>>
>>>> Without this patch, nginx (with its default configuration) will send
>>>> out a TCP RST when there are more pipelined requests. The symptom is
>>>> obvious when nginx is serving a Debian repository and apt is
>>>> downloading a massive number of packages. See [1]. It becomes more
>>>> obvious when `keepalive_requests` is lower or nginx is under a
>>>> relatively higher load, and it disappears when specifying
>>>> `lingering_close always`.
>>>>
>>>> [1]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973861#10
>>>>
>>>> diff -r 07b0bee87f32 -r 73aa64bd29f3 src/http/ngx_http_request.c
>>>> --- a/src/http/ngx_http_request.c	Wed Dec 21 14:53:27 2022 +0300
>>>> +++ b/src/http/ngx_http_request.c	Wed Jan 18 23:17:48 2023 +0800
>>>> @@ -2749,6 +2749,10 @@
>>>>          return;
>>>>      }
>>>>
>>>> +    if (!r->keepalive && r->connection->requests > 1) {
>>>> +        r->lingering_close = 1;
>>>> +    }
>>>> +
>>>>      if (clcf->lingering_close == NGX_HTTP_LINGERING_ALWAYS
>>>>          || (clcf->lingering_close == NGX_HTTP_LINGERING_ON
>>>>              && (r->lingering_close
>>>
>>> Thanks for the patch and the link to the Debian bug report.
>>>
>>> Lingering close implies noticeable additional resource usage: even
>>> if nothing happens on the connection, it will be kept open for
>>> lingering_timeout, which is 5 seconds by default. Given that
>>> pipelining is not used by most of the clients, forcing lingering
>>> close for all clients which are using keepalive does not look like
>>> a good solution to me.
>>>
>>> In general, nginx tries hard to determine if any additional data
>>> are expected on the connection, and uses lingering close if there
>>> is a good chance there will be some, but avoids lingering close by
>>> default if additional data are unlikely. If this logic does not
>>> work for some reason, lingering close can be explicitly requested
>>> with "lingering_close always;".
>>
>> That's true, since the symptom I described can be worked around with
>> that option.
>>
>>> In particular, note the "r->header_in->pos < r->header_in->last"
>>> and "r->connection->read->ready" checks - these are expected to
>>> catch connections with additional pipelined requests (see revision
>>> 3981:77604e9a1ed8). And from the log provided in the report it
>>> looks like it works most of the time - there are more than 6k HTTP
>>> requests, and 60+ connections.
>>> But sometimes it fails - there are
>>> two RST errors logged (and one "Undetermined Error", which looks
>>> like a bug in apt, but might be related).
>>>
>>> It looks like when apt is downloading many resources, it does not
>>> send all the requests at once (or in batches), but instead tries
>>> to maintain a constant "depth", a number of pipelined requests in
>>> flight. This essentially means that after reading a response
>>> it sends an additional request.
>>
>> That's right. From a traffic dump, I can see apt first sends one
>> request, and after receiving the response, it will send out 10
>> more requests and maintain a depth of 10, since by default
>> Acquire::http::Pipeline-Depth is 10.
>>
>>> I see at least two possible cases which can result in nginx not
>>> using lingering close with such a load:
>>>
>>> 1. The response where keepalive_requests is reached may happen to
>>> be for the last request in the r->header_in buffer (so the
>>> "r->header_in->pos < r->header_in->last" check won't be effective),
>>> while nginx hasn't yet got an event from the kernel about additional
>>> data (so "r->connection->read->ready" will not be set). As such,
>>> nginx won't use lingering close, and might close the connection
>>> with unread data in the socket buffer, resulting in an RST.
>>>
>>> 2. Similarly, if nginx happens to be faster than apt, and socket
>>> buffers are large enough, it might send all the responses,
>>> including the last one with "Connection: close", and close the
>>> connection (since there are no pending pipelined requests at the
>>> moment) even before an additional request is sent by apt. When
>>> apt later sends an additional request after reading some of
>>> the responses, it will send the request to an already closed
>>> connection, again resulting in an RST.
>>
>> Actually, comparing the debug log and the pcap, nginx calls
>> close() after writing the last response.
>> However, at that time,
>> that response has not been fully transmitted to the client, and
>> there seem to be more requests unprocessed in the kernel buffer.
>> Thus close() triggers an immediate RST.
>
> Thanks for the details. This looks more like the first case, and
> probably can be addressed by improving the likelihood of detecting
> the read event.
>
> Could you please test if the patch below fixes the particular
> issue you are seeing? It is somewhat unrelated, but it might be
> a good enough solution (and is more or less equivalent to
> checking r->pipeline).
>
>>> It would be interesting to see more details, such as a tcpdump
>>> and/or nginx debug logs, to find out what actually goes on here.
>>
>> The tcpdump and debug logs are too large to send to this mailing
>> list. I wonder if I can directly email them to you.
>
> Feel free to, my email should accept up to 100M messages.
> Alternatively, a good solution might be to make the files
> available for download and post a link here.
>
>>> Overall, given how apt uses pipelining, I tend to think that at
>>> least (2) is unavoidable and can happen with certain sizes of the
>>> responses.
>>>
>>> A good enough solution might be to check for r->pipeline, which is
>>> set by nginx as long as it reads a pipelined request. It might
>>> not be enough though, since r->pipeline is only set for requests
>>> seen by nginx as pipelined, which might not be true for the last
>>> request.
>>>
>>> A more complete solution might be to introduce something like a
>>> c->pipeline flag and use lingering close if any pipelined requests
>>> were seen on the connection.
>
> The following patch reworks handling of pipelined requests by
> postponing them to the next event loop iteration.
> It is expected to
> make it more likely for nginx to know whether there are additional
> unread data in the socket buffer (and right now it is mostly
> equivalent to checking r->pipeline, since c->read->ready is always
> set for pipelined requests):
>
> # HG changeset patch
> # User Maxim Dounin <mdou...@mdounin.ru>
> # Date 1674610218 -10800
> #      Wed Jan 25 04:30:18 2023 +0300
> # Node ID 8cfd22c325a3db370b9e45aa6f897ff7bc8222f3
> # Parent  c7e103acb409f0352cb73997c053b3bdbb8dd5db
> Reworked pipelined requests to use posted next events.
>
> This is expected to improve handling of pipelined requests in a number
> of ways, including:
>
> 1) It will make room for additional requests from other clients,
> reducing worker monopolization by a single client.
>
> 2) The c->read->ready flag will be set, so nginx will either read the
> additional data, or will use lingering close. This is expected to help
> with clients using pipelining with some constant depth, such as
> apt [1][2].
>
> The ngx_event_move_posted_next() function was modified to make it
> possible to post read events on connections with kqueue. Previously,
> it used to set ev->available to -1, potentially overwriting a valid
> positive value provided by kqueue, so ngx_unix_recv() and
> ngx_readv_chain() would stop reading from the socket before reading
> all the data available.
>
> Note that currently ngx_event_move_posted_next() will always set
> the ev->ready flag. While this is expected behaviour for the
> ev->available use case (where ev->ready is explicitly cleared), it is
> not needed for pipelining. For pipelining, it will result in an extra
> unneeded read() syscall after processing of all pipelined requests,
> and there might be room for improvement here.
>
> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973861#10
> [2] https://mailman.nginx.org/pipermail/nginx-devel/2023-January/ZA2SP5SJU55LHEBCJMFDB2AZVELRLTHI.html
>
> diff --git a/src/event/ngx_event_posted.c b/src/event/ngx_event_posted.c
> --- a/src/event/ngx_event_posted.c
> +++ b/src/event/ngx_event_posted.c
> @@ -51,8 +51,10 @@ ngx_event_move_posted_next(ngx_cycle_t *
>          ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
>                         "posted next event %p", ev);
>
> -        ev->ready = 1;
> -        ev->available = -1;
> +        if (!ev->ready) {
> +            ev->ready = 1;
> +            ev->available = -1;
> +        }
>      }
>
>      ngx_queue_add(&ngx_posted_events, &ngx_posted_next_events);
>
> diff --git a/src/http/ngx_http_request.c b/src/http/ngx_http_request.c
> --- a/src/http/ngx_http_request.c
> +++ b/src/http/ngx_http_request.c
> @@ -3129,7 +3129,7 @@ ngx_http_set_keepalive(ngx_http_request_
>      }
>
>      rev->handler = ngx_http_process_request_line;
> -    ngx_post_event(rev, &ngx_posted_events);
> +    ngx_post_event(rev, &ngx_posted_next_events);
>      return;
>  }
>
> --
> Maxim Dounin
> http://mdounin.ru/
> _______________________________________________
> nginx-devel mailing list
> nginx-devel@nginx.org
> https://mailman.nginx.org/mailman/listinfo/nginx-devel
I can confirm that the symptom disappears after applying this patch.

Cheers,

Miao Wang
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel