> On 27 Oct 2021, at 22:19, Maxim Dounin <[email protected]> wrote: > > Hello! > > On Wed, Oct 27, 2021 at 05:19:19PM +0300, Sergey Kandaurov wrote: > >>> On 11 Oct 2021, at 21:58, Maxim Dounin <[email protected]> wrote: >>> >>> # HG changeset patch >>> # User Maxim Dounin <[email protected]> >>> # Date 1633978587 -10800 >>> # Mon Oct 11 21:56:27 2021 +0300 >>> # Node ID 489323e194e4c3b1a7937c51bd4e1671c70f52f8 >>> # Parent d175cd09ac9d2bab7f7226eac3bfce196a296cc0 >>> Simplified sendfile_max_chunk handling. >>> >>> Previously, it was checked that sendfile_max_chunk was enabled and >>> almost whole sendfile_max_chunk was sent (see e67ef50c3176), to avoid >>> delaying connections where sendfile_max_chunk wasn't reached (for example, >>> when sending responses smaller than sendfile_max_chunk). Now we instead >>> check if there are unsent data, and the connection is still ready for >>> writing. >>> Additionally we also check c->write->delayed to ignore connections already >>> delayed by limit_rate. >>> >>> This approach is believed to be more robust, and correctly handles >>> not only sendfile_max_chunk, but also internal limits of c->send_chain(), >>> such as sendfile() maximum supported length (ticket #1870). >>> >>> diff --git a/src/http/ngx_http_write_filter_module.c >>> b/src/http/ngx_http_write_filter_module.c >>> --- a/src/http/ngx_http_write_filter_module.c >>> +++ b/src/http/ngx_http_write_filter_module.c >>> @@ -321,16 +321,12 @@ ngx_http_write_filter(ngx_http_request_t >>> delay = (ngx_msec_t) ((nsent - sent) * 1000 / r->limit_rate); >>> >>> if (delay > 0) { >>> - limit = 0; >>> c->write->delayed = 1; >>> ngx_add_timer(c->write, delay); >>> } >>> } >>> >>> - if (limit >>> - && c->write->ready >>> - && c->sent - sent >= limit - (off_t) (2 * ngx_pagesize)) >>> - { >>> + if (chain && c->write->ready && !c->write->delayed) { >>> ngx_post_event(c->write, &ngx_posted_next_events); >>> } >>> >> >> Looks good. >> >> Not strictly related to this change, so FYI. I noticed a stray writev() >> after Linux sendfile(), when it writes more than its internal limits. >> >> 2021/10/27 12:44:34 [debug] 1462058#0: *1 write old buf t:0 f:1 >> 0000000000000000, >> pos 0000000000000000, size: 0 file: 416072437, size: 3878894859 >> 2021/10/27 12:44:34 [debug] 1462058#0: *1 http write filter: l:1 f:0 >> s:3878894859 >> 2021/10/27 12:44:34 [debug] 1462058#0: *1 http write filter limit 0 >> 2021/10/27 12:44:34 [debug] 1462058#0: *1 sendfile: @416072437 2147482891 >> 2021/10/27 12:44:34 [debug] 1462058#0: *1 sendfile: 2147479552 of 2147482891 >> @416072437 >> 2021/10/27 12:44:34 [debug] 1462058#0: *1 writev: 0 of 0 >> 2021/10/27 12:44:34 [debug] 1462058#0: *1 http write filter 0000561528695820 >> 2021/10/27 12:44:34 [debug] 1462058#0: *1 post event 00005615289C2310 >> >> Here sendfile() partially sent 2147479552, which is above its internal >> limit NGX_SENDFILE_MAXSIZE - ngx_pagesize. On the second iteration, >> due to this, it falls back to writev() with zero-size headers. >> Then, with the patch applied, it posts the next write event, as designed >> (previously, it would seemingly stuck instead, such as in ticket #1870). > > Interesting. > > Overall it looks harmless, but we may want to look further why > sendfile() only sent 2147479552 instead of 2147482891. It seems > that 2147479552 is in pages (524287 x 4096) despite the fact that > the initial offset is not page-aligned. We expect sendfile() to > send page-aligned ranges instead (416072437 + 2147482891 == 625868 x 4096). > > Looking into Linux sendfile() manpage suggests that 2,147,479,552 > is a documented limit: > > sendfile() will transfer at most 0x7ffff000 (2,147,479,552) > bytes, returning the number of bytes actually transferred. > (This is true on both 32-bit and 64-bit systems.) > > This seems to be mostly arbitrary limitation appeared in Linux > kernel 2.6.16 > (https://github.com/torvalds/linux/commit/e28cc71572da38a5a12c1cfe4d7032017adccf69). > > Interesting enough, the actual limitation is not 0x7ffff000 as > documented, but instead MAX_RW_COUNT, which is defined as > (INT_MAX & PAGE_MASK). This suggests that the behaviour will be > actually different on platforms with larger pages. > > Something as simple as: > > diff --git a/src/os/unix/ngx_linux_sendfile_chain.c > b/src/os/unix/ngx_linux_sendfile_chain.c > --- a/src/os/unix/ngx_linux_sendfile_chain.c > +++ b/src/os/unix/ngx_linux_sendfile_chain.c > @@ -216,7 +216,6 @@ ngx_linux_sendfile_chain(ngx_connection_ > */ > > send = prev_send + sent; > - continue; > } > > if (send >= limit || in == NULL) { > > should be enough to resolve this additional 0-sized writev(). > Untested though, as I don't have a test playground on hand where > 2G sendfile() can be reached. It would be great if you'll test > it. >
That seems to help: 2021/10/27 20:36:31 [debug] 1498568#0: *1 write old buf t:1 f:0 000055D8D328FDB0, pos 000055D8D328FDB0, size: 252 file: 0, size: 0 2021/10/27 20:36:31 [debug] 1498568#0: *1 write new buf t:0 f:1 0000000000000000, pos 0000000000000000, size: 0 file: 0, size: 4294967296 2021/10/27 20:36:31 [debug] 1498568#0: *1 http write filter: l:1 f:0 s:4294967548 2021/10/27 20:36:31 [debug] 1498568#0: *1 http write filter limit 0 2021/10/27 20:36:31 [debug] 1498568#0: *1 writev: 252 of 252 [.. next ngx_linux_sendfile_chain() loop iteration ..] 2021/10/27 20:36:31 [debug] 1498568#0: *1 sendfile: @0 2147479552 2021/10/27 20:36:31 [debug] 1498568#0: *1 sendfile: 2147479552 of 2147479552 @0 [.. return from ngx_linux_sendfile_chain() on exceeded limit ..] 2021/10/27 20:36:31 [debug] 1498568#0: *1 http write filter 000055D8D329C8D0 2021/10/27 20:36:31 [debug] 1498568#0: *1 post event 000055D8D35CC660 > Full patch: > > # HG changeset patch > # User Maxim Dounin <[email protected]> > # Date 1635361800 -10800 > # Wed Oct 27 22:10:00 2021 +0300 > # Node ID 859447c7b7076b676a3421597514b324b708658d > # Parent 2a7155733855d1c2ea1c1ded8d1a4ba654b533cb > Fixed sendfile() limit handling on Linux. > > On Linux starting with 2.6.16, sendfile() silently limits all operations > to MAX_RW_COUNT, defined as (INT_MAX & PAGE_MASK). This incorrectly > triggered the interrupt check, and resulted in 0-sized writev() on the > next loop iteration. > > Fix is to make sure the limit is always checked, so we will return from > the loop if the limit is already reached even if number of bytes sent is > not exactly equal to the number of bytes we've tried to send. > > diff --git a/src/os/unix/ngx_linux_sendfile_chain.c > b/src/os/unix/ngx_linux_sendfile_chain.c > --- a/src/os/unix/ngx_linux_sendfile_chain.c > +++ b/src/os/unix/ngx_linux_sendfile_chain.c > @@ -216,7 +216,6 @@ ngx_linux_sendfile_chain(ngx_connection_ > */ > > send = prev_send + sent; > - continue; > } > > if (send >= limit || in == NULL) { > The change looks good to me. Btw, this should also stop exceeding the limit after several sendfile() calls each interrupted, on Linux 4.3+ (which is rather theoretical). It probably deserves updating comments in this file about the count parameter constraints. -- Sergey Kandaurov _______________________________________________ nginx-devel mailing list [email protected] http://mailman.nginx.org/mailman/listinfo/nginx-devel
