On Fri, Jan 10, 2025 at 6:09 PM Rob Landley <r...@landley.net> wrote:
>
> On 1/10/25 15:40, enh wrote:
> > On Fri, May 26, 2023 at 10:26 AM Rob Landley <r...@landley.net> wrote:
> >>
> >> On 5/25/23 19:08, enh via Toybox wrote:
> >>> so i finally enabled copy_file_range for the _host_ toybox because
> >>> someone pointed out that we were copying 16GiB zip files around in the
> >>> build, and even though obviously we should stop doing that, 30s seemed
> >>> unreasonable, and coreutils cp "only" took 20s because of
> >>> copy_file_range.
> >>
> >> Hardlinking them is not an option? :)
> >>
> >>> but toybox cp with copy_file_range still takes 25s. why?
> >>>
> >>>   if (bytes<0 || bytes>(1<<30)) len = (1<<30);
> >>>
> >>> the checkin comment being:
> >>>
> >>>   Update comments and add "sanity check" from kernel commit f16acc9d9b376.
> >>>   (The kernel's been doing this since 2019, but older kernels may not, so...)
> >>
> >> The problem being that _before_ that commit, too big a sendfile didn't
> >> work right (returned an error from the kernel?). I suspect my range
> >> check was just the largest power of 2 that fit in the constraint...
> >
> > is that true? the diff for that commit makes it look like it
> > internally silently used `min(MAX_RW_COUNT, len)`, which should be fine
> > with the usual "subtract what was actually written" logic?
> >
> > (libc++ just started to use copy_file_range(), and i asked whether
> > they knew about this limit, and then couldn't explain why toybox has a
> > special case...)
>
> Let's see... grep for '1<<30' in lib, git annotate the file, search for
> the 1<<30... cut and paste the hash and do a ^1 to peel off that commit
> (sigh, I want a GUI/IDE tool for this where I could just click)... it
> was introduced in toybox commit 9b368059deec, which says "Update comments
> and add "sanity check" from kernel commit f16acc9d9b376.
> (The kernel's been doing this since 2019, but older kernels may not, so...)"
>
> The check from that kernel commit was:
>
>   + return do_splice_direct(file_in, &pos_in, file_out, &pos_out,
>   +                         len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
>
> And checking out that commit and searching for MAX_RW_COUNT:
>
>   include/linux/fs.h:#define MAX_RW_COUNT (INT_MAX & PAGE_MASK)
>
> And of course:
>
>   include/asm-generic/page.h:#define PAGE_MASK (~(PAGE_SIZE-1))
>
> So they're using 2 gigs minus 4k, so yes, 1<<30 (1 gig) is the largest
> power of 2 that fits within that constraint.
>
> Is there a big performance difference between queueing up 1 gigabyte at
> a time and 2 gigabytes at a time?
that's not really my concern here, so much as "was this an actual bug, and
which range of kernel versions did it affect?", so that i can pass the
advice on to other projects using copy_file_range() [and add it to the man
page], or even just add the clamping in libc itself so no-one has to think
about this.

> We can be exact, I just didn't want it
> to start going boing again on systems where PAGE_SIZE isn't 4k or some
> such. For a 16 gigabyte file it returns back into our code 16 times (vs
> 8 times), which either way you'd THINK would take a grand total of like a
> millisecond even with switching overhead to deal with. A 5 second
> difference seems... odd? (If you've got zero readahead and zero
> writeback caching and you're on rotating media where a slight hitch
> costs a rotational delay, then maybe? Even so, it should just be 16 extra
> rotations vs 8 extra rotations. A dog slow ancient hard drive was 1200
> rpm, meaning 20 rps, meaning 8 extra was less than half a second...)
>
> I mean, IDEALLY the kernel would have a "repeat until spanked" -1 value
> where it just copies until EOF and stops bothering us, but the kernel
> didn't seem to offer that as an option last I checked...
>
> >>> what the kernel _actually_ does though is clamp to MAX_RW_COUNT. which
> >>> is actually (INT_MAX & PAGE_MASK). which i'm assuming changes for a
> >>> non-4KiB page kernel?
>
> Yeah, that. ^^^
>
> The constraint is there because older kernels would error out on too big
> a value, rather than clamping, and I didn't want to query the kernel
> version and have two codepaths.

that's what i was trying to find, and i still haven't. `git log -p
read_write.c` makes it look like the first commit that started calling
do_splice_direct() had the clamping.
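The "clamp and subtract what was actually written" pattern under discussion looks roughly like this (a hypothetical helper, not toybox's actual code; copy_file_range() needs Linux 4.5+ and glibc 2.27+):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch of the defensive loop: clamp each request to 1<<30 so kernels
 * that error out on oversized lengths (instead of clamping to
 * MAX_RW_COUNT themselves) stay happy, then subtract what was actually
 * written and go around again until the requested bytes are done. */
static int copy_all(int infd, int outfd, off_t bytes)
{
    while (bytes > 0) {
        size_t len = (bytes > (1<<30)) ? (size_t)(1<<30) : (size_t)bytes;
        /* NULL offsets: use (and advance) each fd's own file position. */
        ssize_t done = copy_file_range(infd, NULL, outfd, NULL, len, 0);

        if (done < 0) return -1; /* e.g. EXDEV: fall back to read/write */
        if (!done) break;        /* hit EOF early */
        bytes -= done;           /* subtract what was actually written */
    }
    return 0;
}
```

With the clamp in place the syscall returns to userspace at most once per gigabyte, which is the "16 times vs 8 times" round-trip count mentioned above for a 16 GiB file.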
  commit eac70053a141998c40907747d6cea1d53a9414be
  Author: Anna Schumaker <anna.schuma...@netapp.com>
  Date:   Tue Nov 10 16:53:33 2015 -0500

      vfs: Add vfs_copy_file_range() support for pagecache copies

      This allows us to have an in-kernel copy mechanism that avoids
      frequent switches between kernel and user space. This is especially
      useful so NFSD can support server-side copies.

      The default (flags=0) means to first attempt copy acceleration, but
      use the pagecache if that fails.

      Signed-off-by: Anna Schumaker <anna.schuma...@netapp.com>
      Reviewed-by: Darrick J. Wong <darrick.w...@oracle.com>
      Reviewed-by: Padraig Brady <p...@draigbrady.com>
      Reviewed-by: Christoph Hellwig <h...@lst.de>
      Signed-off-by: Al Viro <v...@zeniv.linux.org.uk>

but in particular, i don't think the kernel commit you quote --
f16acc9d9b376 -- _adds_ a sanity check; it just refactors it into a new
function?

> Rob

_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net