Re: cat: adjust the maximum data copied by copy_file_range

Matteo Croce Mon, 22 Dec 2025 17:30:16 -0800

On Mon, 22 Dec 2025 at 19:41, Paul Eggert
<[email protected]> wrote:
>
> [cc'ing [email protected]; this coreutils thread can be
> found in <https://lists.gnu.org/r/coreutils/2025-12/threads.html#00055>.]
>
> On 2025-12-20 00:51, Matteo Croce wrote:
> > This can be triggered with a huge file:
> >
> > $ truncate -s $((2**63 - 1)) file1
> >
> > $ ( dd bs=1M skip=$((2**43 - 2)) count=0 && cat ) < file1
> > 0+0 records in
> > 0+0 records out
> > 0 bytes copied, 2,825e-05 s, 0,0 kB/s
> > cat: -: Invalid argument
> >
> > $ dd if=file1 bs=1M skip=$((2**43 - 2))
> > dd: error reading 'file1': Invalid argument
> > 1+0 records in
> > 1+0 records out
> > 1048576 bytes (1,0 MB, 1,0 MiB) copied, 0,103536 s, 10,1 MB/s
>
> OK, but in bleeding-edge coreutils neither of these examples call
> copy_file_range. The diagnostics result from plain 'read' syscalls near
> TYPE_MAXIMUM (off_t). (dd never calls copy_file_range, and ironically
> the code in 'cat' that does call copy_file_range avoids the overflow
> itself, before invoking copy_file_range, and relies on plain 'read' to
> do the right thing near TYPE_MAXIMUM (off_t).) So these examples have
> nothing to do with copy_file_range.
>


Yes, I know that copy_file_range is unrelated; my commands are just
simple reproducers for the kernel issue.
Where in cat.c does the code avoid the overflow? I see:

ssize_t copy_max = MIN (SSIZE_MAX, SIZE_MAX) >> 30 << 30;

which should evaluate to 0x7FFFFFFFC0000000 (9223372035781033984),
and strace agrees:

$ strace -e copy_file_range cat /etc/fstab >fstab
copy_file_range(3, NULL, 1, NULL, 9223372035781033984, 0) = 568
copy_file_range(3, NULL, 1, NULL, 9223372035781033984, 0) = 0
+++ exited with 0 +++

> You've found a Linux kernel bug that affects countless apps, and we
> can't reasonably expect app developers to patch all the apps to work
> around the bug. So the fix should be done in the kernel.
>
> I looked at the kernel patch you suggested in
> <https://lore.kernel.org/linux-fsdevel/[email protected]/T/>.
> Unfortunately, I see two problems with it, the first minor, the second
> less so.
>
> The minor problem is that the unpatched kernel code is merely
> incorrectly checking whether pos + count fits into loff_t. MAX_RW_COUNT
> should not be involved with the fix, as MAX_RW_COUNT is irrelevant to
> file offset range. Better would be to do correct overflow checks, with
> something like the attached patch (which I have not compiled or tested).
>
> Second and more important, the patch doesn't fix the real bug which is
> that read(FD, BUF, SIZE) fails with -EINVAL if adding SIZE to the
> current file position would overflow off_t. That's wrong: the syscall
> should read whatever bytes are present (up to EOF), and then report the
> number of bytes read. We cannot fix this bug merely via something like
> the attached patch.
>
> One possible fix for the second problem would be to change
> rw_verify_area's API to return the possibly-smaller number of bytes that
> can be read, and then modify its callers to do the right thing.
> ("correct" in the sense of "don't try to read past TYPE_MAXIMUM
> (off_t)".) Alternatively, we could fix rw_verify_area's callers to not
> try to read past TYPE_MAXIMUM (off_t), without changing the API.

Yes, the kernel bug has to be fixed, of course.
Your patch doesn't compile because of an unmatched curly brace. I fixed
that, but the resulting kernel panics at boot. Can you check whether I
preserved the intended logic?

Regards,
-- 
Matteo Croce

perl -e 'for($t=0;;$t++){print chr($t*($t>>8|$t>>13)&255)}' |aplay

rw_verify_area-overflow.diff
Description: Binary data
