On 30/10/2025 02:18, Collin Funk wrote:
Pádraig Brady <[email protected]> writes:
I was debugging a change in coreutils 9.8 and noticed that fadvise was
being passed OFF_T_MAX to indicate to apply advice to end of file.
This struck me as a bit unusual and coreutils <= 9.7 used 0 to do the same
thing.
It seems to me like 0 may be treated specially or more efficiently in more
cases?
How about the following to go back to 0 meaning, to end of file?
cheers,
Padraig
diff --git a/src/copy-file-data.c b/src/copy-file-data.c
index 1eefd3071..c3abe41e2 100644
--- a/src/copy-file-data.c
+++ b/src/copy-file-data.c
@@ -538,7 +538,7 @@ copy_file_data (int ifd, struct stat const *ist, off_t
ipos, char const *iname,
/* Don't bother calling fadvise for small copies, as it is not
likely to help performance and might even hurt it. */
if (IO_BUFSIZE < ibytes)
- fdadvise (ifd, ipos, ibytes <= OFF_T_MAX - ipos ? ibytes : 0,
+ fdadvise (ifd, ipos, ibytes < OFF_T_MAX - ipos ? ibytes : 0,
FADVISE_SEQUENTIAL);
/* If not making a sparse file, try to use a more-efficient
Your change seems to match the expectations of POSIX [1]:
If len is zero, all data from offset to the largest possible value
of the file offset for that file shall be specified.
And the Linux man pages [2]:
or until the end of the file if size is 0
Whether or not it causes issues, I am not sure.
Collin
[1]
https://pubs.opengroup.org/onlinepubs/9799919799/functions/posix_fadvise.html
[2] https://www.man7.org/linux/man-pages/man2/posix_fadvise.2.html
This patch is in fact required with OpenZFS as my hunch was correct
that a zero length would be treated specially (even perhaps because
cp has used 0 for so long). With a specific length OpenZFS 2.2.2
synchronously (decompresses and) populates the cache, which
defeats the subsequent copy offload. Details at:
https://github.com/coreutils/coreutils/issues/122
I'll apply the above, but also the attached more defensive patch,
to avoid fdadvise unless we really need it.
cheers,
Padraig
From 11ae77d3da29865337a8f0b7addfce6b73f778b2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]>
Date: Fri, 31 Oct 2025 15:37:55 +0000
Subject: [PATCH] copy: be more defensive/restrictive with posix_fadvise
* src/copy-file-data.c (copy_file_data): Only give the
POSIX_FADV_SEQUENTIAL hint when we _know_ we'll definitely
use a read/write loop to copy the data. Also only apply
the hint to the whole file, as we've seen OpenZFS at least
special case that.
(sparse_copy): Update stale comment.
---
src/copy-file-data.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/src/copy-file-data.c b/src/copy-file-data.c
index 8fd25fee9..204c6f173 100644
--- a/src/copy-file-data.c
+++ b/src/copy-file-data.c
@@ -105,8 +105,6 @@ is_CLONENOTSUP (int err)
If HOLE_SIZE, look for holes in the input; *HOLE_SIZE contains
the size of the current hole so far, and update *HOLE_SIZE
at end to be the size of the hole at the end of the copy.
- Set *TOTAL_N_READ to the number of bytes read; this counts
- the trailing hole, which has not yet been output.
Read and update *DEBUG as needed.
If successful, return the number of bytes copied,
otherwise diagnose the failure and return -1. */
@@ -542,14 +540,17 @@ copy_file_data (int ifd, struct stat const *ist, off_t ipos, char const *iname,
|| (x->sparse_mode == SPARSE_AUTO
&& scantype != PLAIN_SCANTYPE)));
- /* Don't bother calling fadvise for small copies, as it is not
- likely to help performance and might even hurt it.
- Note it's important to use a 0 length to indicate the whole file
+ /* If we _know_ we're going to read data sequentially into the process,
+ i.e. --reflink or --sparse are not in auto mode,
+ give that hint to the kernel so it can tune caching behavior.
+ Also we don't bother calling fadvise for small copies,
+ as it is not likely to help performance and might even hurt it.
+ Also we only apply this hint for the whole file (0 length)
as OpenZFS 2.2.2 at least will otherwise synchronously
(decompress and) populate the cache when given a specific length. */
- if (IO_BUFSIZE < ibytes)
- fdadvise (ifd, ipos, ibytes < OFF_T_MAX - ipos ? ibytes : 0,
- FADVISE_SEQUENTIAL);
+ if (IO_BUFSIZE < ibytes && ipos == 0
+ && (x->reflink_mode != REFLINK_AUTO || x->sparse_mode != SPARSE_AUTO))
+ fdadvise (ifd, 0, 0, FADVISE_SEQUENTIAL);
/* If not making a sparse file, try to use a more-efficient
buffer size. */
--
2.51.0