bug#63850: cp fails for files > 2 GB if copy offload is unsupported

2023-06-02 Thread Paul Eggert

On 2023-06-02 09:31, Pádraig Brady wrote:

I'm not sure it was working correctly before 9.3 either.
Before 9.3 we would have switched from copy_file_range() to read()/write()


Actually, cp shouldn't have been using copy_file_range at all, as the 
code is supposed to never use copy_file_range unless the Linux kernel 
version is 5.3 or later. See m4/copy-file-range.m4 and 
lib/copy-file-range.c.


Since the bug is being reported against kernel 4.19, someone needs to 
investigate why the Gentoo build is using the copy_file_range syscall on 
that kernel. Either the Gentoo build isn't properly compiling the 
replacement function in coreutils/lib/copy-file-range.c, or the 
replacement function is incorrectly deciding that the kernel is new 
enough, or something like that.
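
As a rough illustration of the kind of run-time check being described here (this is not the code in lib/copy-file-range.c, and the helper names are hypothetical), a replacement function could parse the release string reported by uname() and decline to use the syscall on kernels older than 5.3:

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: true if RELEASE (e.g. "4.19.282-gentoo") names a
   kernel at least as new as MAJOR.MINOR.  Unparsable strings yield false.  */
static bool
release_at_least (char const *release, int major, int minor)
{
  int maj, min;
  if (sscanf (release, "%d.%d", &maj, &min) != 2)
    return false;
  return maj > major || (maj == major && min >= minor);
}

/* Hypothetical gate: only trust copy_file_range on Linux 5.3 or later.
   Any failure to determine the version counts as "too old".  */
static bool
kernel_copy_file_range_ok (void)
{
  struct utsname name;
  if (uname (&name) != 0)
    return false;
  return release_at_least (name.release, 5, 3);
}
```

With a check of this shape, a 4.19 kernel would be rejected, so seeing the raw copy_file_range syscall in strace on that kernel suggests the check was either not compiled in or not reached.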


We shouldn't need to fiddle with src/copy.c on this.





bug#63850: cp fails for files > 2 GB if copy offload is unsupported

2023-06-02 Thread Pádraig Brady

On 03/06/2023 02:02, Mike Gilbert wrote:

On Fri, Jun 02, 2023 at 05:31:50PM +0100, Pádraig Brady wrote:

I'm not sure it was working correctly before 9.3 either.
Before 9.3 we would have switched from copy_file_range() to read()/write()
upon receiving the EINVAL, which might have worked, but also I'm not sure
the file offsets would be correct in that case. Could you show the output with:

diff --git a/src/copy.c b/src/copy.c
index 0dd059d2e..35c54b905 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -363,7 +363,16 @@ sparse_copy (int src_fd, int dest_fd, char **abuf, size_t buf_size,
                    edge case where the file is made immutable after creating,
                    in which case the (more accurate) error is still shown.  */
                 if (*total_n_read == 0 && is_CLONENOTSUP (errno))
-                  break;
+                  {
+                    if (*total_n_read != 0)
+                      {
+                        off_t clone_read_offset = lseek (src_fd, 0, SEEK_CUR);
+                        off_t clone_write_offset = lseek (dest_fd, 0, SEEK_CUR);
+                        printf ("switching to standard copy at :%"PRIdMAX" read=%"PRIdMAX" write=%"PRIdMAX"\n",
+                                *total_n_read, clone_read_offset, clone_write_offset);
+                      }
+                    break;
+                  }

                 /* ENOENT was seen sometimes across CIFS shares, resulting in
                    no data being copied, but subsequent standard copies succeed.  */


I don't think this patch will do anything useful: *total_n_read cannot be 0
and not 0 simultaneously, so the new block of code will never be
executed. Maybe you meant to insert this block somewhere else?


Sorry was rushing.

Yes I meant to remove the first total_n_read check
as done in the following.

In any case, an external check would be just as useful.
I suppose we would have heard by now if copies were being corrupted,
but it would be good to verify with md5sum or similar,
on the source and destination files, that the copy worked fine
for such a file >2G.
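
If md5sum is not convenient, a byte-for-byte comparison serves the same purpose. The following is only a sketch of such an external check; the helper names are hypothetical and it takes already-open descriptors:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to LEN bytes, retrying on short reads and EINTR.
   Return the byte count (less than LEN only at EOF), or -1 on error.  */
static ssize_t
read_full (int fd, char *buf, size_t len)
{
  size_t got = 0;
  while (got < len)
    {
      ssize_t n = read (fd, buf + got, len - got);
      if (n == 0)
        break;
      if (n < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      got += n;
    }
  return got;
}

/* Hypothetical helper: return 1 if both descriptors yield identical
   bytes up to EOF, 0 if they differ, -1 on a read error.  */
static int
same_content (int fd_a, int fd_b)
{
  char a[64 * 1024], b[64 * 1024];
  for (;;)
    {
      ssize_t na = read_full (fd_a, a, sizeof a);
      ssize_t nb = read_full (fd_b, b, sizeof b);
      if (na < 0 || nb < 0)
        return -1;
      if (na != nb || memcmp (a, b, (size_t) na) != 0)
        return 0;
      if (na == 0)
        return 1;
    }
}
```

Comparing in fixed-size chunks keeps memory use constant, which matters for the >2G files in question.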

cheers,
Pádraig.

diff --git a/src/copy.c b/src/copy.c
index 0dd059d2e..296707c39 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -362,8 +362,17 @@ sparse_copy (int src_fd, int dest_fd, char **abuf, size_t buf_size,
                    also occur for immutable files, but that would only be in the
                    edge case where the file is made immutable after creating,
                    in which case the (more accurate) error is still shown.  */
-                if (*total_n_read == 0 && is_CLONENOTSUP (errno))
-                  break;
+                if (is_CLONENOTSUP (errno))
+                  {
+                    if (*total_n_read != 0)
+                      {
+                        off_t clone_read_offset = lseek (src_fd, 0, SEEK_CUR);
+                        off_t clone_write_offset = lseek (dest_fd, 0, SEEK_CUR);
+                        printf ("switching to standard copy at :%"PRIdMAX" read=%"PRIdMAX" write=%"PRIdMAX"\n",
+                                *total_n_read, clone_read_offset, clone_write_offset);
+                      }
+                    break;
+                  }

                 /* ENOENT was seen sometimes across CIFS shares, resulting in
                    no data being copied, but subsequent standard copies succeed.  */






bug#63858: GNU "shuf" on Linux calls getrandom without GRND_NONBLOCK, hangs indefinitely (9.x regression)

2023-06-02 Thread Nick Bowler
Hi,

I installed a new version of GNU coreutils (9.3), and now "shuf" appears
to be blocking on Linux's cryptographic RNG init.  On this particular
machine, there are not many entropy sources, so Linux's RNG init takes
an unbounded and potentially very long time.

I hope nobody expects "shuf" to provide cryptographically-secure
randomness (I certainly don't), and it would be nice for shuf to
not hang indefinitely.

Running with strace, I see that shuf makes two calls to getrandom.  The
first passes the GRND_NONBLOCK flag, and shuf appears to fall back to
using clock_gettime in this case.  The second call passes 0 for flags,
which means "block until RNG init", and this is the one that hangs.
shuf only makes the second getrandom call if there is actually input
(so shuf [...] 2>&1 | grep getrandom
  % getrandom("\xd5\x55\xc9\x73", 4, GRND_NONBLOCK) = 4
  % getrandom("\x76", 1, 0) = 1

note the 0 flags in the second call.  When it is not working (before RNG
init), the second call hangs indefinitely (killed by a signal in this
case):

  % echo x | strace ./shuf 2>&1
  [...]
  getrandom(0xb6f148c8, 4, GRND_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
  clock_gettime64(CLOCK_MONOTONIC, 0xbed4e810) = -1 ENOSYS (Function not implemented)
  clock_gettime(CLOCK_MONOTONIC, {tv_sec=270, tv_nsec=854650394}) = 0
  [...]
  getrandom(0x1c40228, 1, 0)  = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
  +++ killed by SIGINT +++

I do not have this problem with coreutils 8.28.  Using the GNU-specific
--random-source=/dev/urandom option is a possible workaround (since
/dev/urandom reads do not block).

Let me know if you need any more info.

Thanks,
  Nick





bug#63856: >=coreutils-9.2 cp: preserving permissions: Operation not supported when copying from no_root_squash nfs export

2023-06-02 Thread Peter Robertson
I have an nfs export: /mnt/Backup Hostname(no_root_squash,ro,mp)
It exports this drive:
LABEL=Backup /mnt/Backup btrfs noatime,compress-force=zstd:99

I have a locally mounted drive, /mnt/Mirror (LABEL=Mirror
/mnt/Mirror btrfs   compress-force=zstd:99,noatime,nofail)

I mount the nfs share: mount Hostname:/mnt/Backup /mnt/Backup.

I cp a file from one to the other:

  cp -av /mnt/Backup/snaps/elden/90/info.xml /mnt/Mirror/snaps/elden/90/info.xml

Under coreutils-9.1 I get no error. Under coreutils-9.2 I get:

  cp: preserving permissions for ‘/mnt/Mirror/snaps/elden/90/info.xml’: Operation not supported

In both cases the resulting copy looks like this:

# ls -l /mnt/*/snaps/elden/90/info.xml
-rw--- 1 root root 184 Apr 17 10:52 /mnt/Backup/snaps/elden/90/info.xml
-rw--- 1 root root 184 Apr 17 10:52 /mnt/Mirror/snaps/elden/90/info.xml

I git bisected with gentoo's live ebuild coreutils-
# first bad commit: [28a85116feef1f9a6f31c5ab8cfe50d7aa8d6fc4] build:
update gnulib submodule to latest

# cp --version
cp (GNU coreutils) 9.2
Packaged by Gentoo (9.2-r2 (p0))
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Torbjorn Granlund, David MacKenzie, and Jim Meyering.

Reproducible: Always





bug#63850: cp fails for files > 2 GB if copy offload is unsupported

2023-06-02 Thread Pádraig Brady

On 02/06/2023 16:44, Sam James wrote:

Hello,

Forwarding a downstream report of a behaviour change between
coreutils-9.1 and coreutils-9.3 from https://bugs.gentoo.org/907474.

The reporter bisected it to 093a8b4bfaba60005f14493ce7ef11ed665a0176
("copy: fix --reflink=auto to fallback in more cases", see bug#62404)
and gave strace output showing:
```
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = 2147479552
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = -1 EINVAL (Invalid argument)
```

"""
When I try to copy a large file (> 2 GB) like so:

cp --debug file_a file_b

output looks like this:

'file_a' -> 'file_b'
cp: error copying 'file_a' to 'file_b': Invalid argument
copy offload: unsupported, reflink: unsupported, sparse detection: no

Afterwards file_b has a size of 2147479552 bytes (= 2G - 4K).

On another system (with newer kernel version) it looks like this:

cp --debug file_a file_b
'file_a' -> 'file_b'
copy offload: yes, reflink: unsupported, sparse detection: no
"""

Let me know if you need further information, although there's
some more on the downstream Gentoo bug I linked.

Apparently this is only happening w/ the 4.19.x kernels.


I'm not sure it was working correctly before 9.3 either.
Before 9.3 we would have switched from copy_file_range() to read()/write()
upon receiving the EINVAL, which might have worked, but also I'm not sure
the file offsets would be correct in that case. Could you show the output with:

diff --git a/src/copy.c b/src/copy.c
index 0dd059d2e..35c54b905 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -363,7 +363,16 @@ sparse_copy (int src_fd, int dest_fd, char **abuf, size_t buf_size,
                    edge case where the file is made immutable after creating,
                    in which case the (more accurate) error is still shown.  */
                 if (*total_n_read == 0 && is_CLONENOTSUP (errno))
-                  break;
+                  {
+                    if (*total_n_read != 0)
+                      {
+                        off_t clone_read_offset = lseek (src_fd, 0, SEEK_CUR);
+                        off_t clone_write_offset = lseek (dest_fd, 0, SEEK_CUR);
+                        printf ("switching to standard copy at :%"PRIdMAX" read=%"PRIdMAX" write=%"PRIdMAX"\n",
+                                *total_n_read, clone_read_offset, clone_write_offset);
+                      }
+                    break;
+                  }

                 /* ENOENT was seen sometimes across CIFS shares, resulting in
                    no data being copied, but subsequent standard copies succeed.  */
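
For reference, the fallback under discussion can be sketched as follows. This is not cp's sparse_copy: the helper names and the set of errno values treated as "unsupported" are assumptions. The sketch invokes the raw syscall, as seen in the strace output, so that no libc or gnulib wrapper is involved. Because NULL offset pointers are passed, the kernel advances both descriptors' file offsets itself, so a later read()/write() loop should resume where the offload stopped.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed set of errors meaning "offload is not usable for this pair of
   descriptors"; cp's real is_CLONENOTSUP test may differ.  */
static bool
offload_unsupported (int err)
{
  return (err == EINVAL || err == ENOSYS || err == EOPNOTSUPP
          || err == EXDEV || err == EPERM);
}

/* Hypothetical helper: copy from SRC_FD to DEST_FD, starting at their
   current file offsets.  Return the number of bytes copied, or -1.  */
static off_t
copy_with_fallback (int src_fd, int dest_fd)
{
  off_t total = 0;

  /* Phase 1: kernel offload.  NULL offset pointers make the kernel
     advance both descriptors' file offsets as it copies.  */
  for (;;)
    {
      ssize_t n = syscall (SYS_copy_file_range, src_fd, (void *) 0,
                           dest_fd, (void *) 0, (size_t) 1 << 30, 0u);
      if (n == 0)
        return total;           /* end of input */
      if (n < 0)
        {
          if (errno == EINTR)
            continue;
          if (offload_unsupported (errno))
            break;              /* fall back, keeping what was copied */
          return -1;
        }
      total += n;
    }

  /* Phase 2: plain read/write, resuming at the current offsets.  */
  char buf[64 * 1024];
  for (;;)
    {
      ssize_t nr = read (src_fd, buf, sizeof buf);
      if (nr == 0)
        return total;
      if (nr < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      for (ssize_t off = 0; off < nr; )
        {
          ssize_t nw = write (dest_fd, buf + off, (size_t) (nr - off));
          if (nw < 0)
            {
              if (errno == EINTR)
                continue;
              return -1;
            }
          off += nw;
        }
      total += nr;
    }
}
```

Unlike the current check in src/copy.c, this sketch falls back even after some data has been copied, which is the case the debug patch above is meant to observe.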







bug#63850: cp fails for files > 2 GB if copy offload is unsupported

2023-06-02 Thread Sam James
Hello,

Forwarding a downstream report of a behaviour change between
coreutils-9.1 and coreutils-9.3 from https://bugs.gentoo.org/907474.

The reporter bisected it to 093a8b4bfaba60005f14493ce7ef11ed665a0176
("copy: fix --reflink=auto to fallback in more cases", see bug#62404)
and gave strace output showing:
```
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = 2147479552
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = -1 EINVAL (Invalid argument)
```

"""
When I try to copy a large file (> 2 GB) like so:

cp --debug file_a file_b

output looks like this:

'file_a' -> 'file_b'
cp: error copying 'file_a' to 'file_b': Invalid argument
copy offload: unsupported, reflink: unsupported, sparse detection: no

Afterwards file_b has a size of 2147479552 bytes (= 2G - 4K).

On another system (with newer kernel version) it looks like this:

cp --debug file_a file_b
'file_a' -> 'file_b'
copy offload: yes, reflink: unsupported, sparse detection: no
"""
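
A note on the two numbers in the strace output. 2147479552 is 0x7ffff000, i.e. the largest 32-bit signed value rounded down to a 4 KiB page boundary, which matches the per-call transfer cap that Linux documents for read(2) and write(2). The requested length, 9223372035781033984, is 2^63 - 2^30, i.e. the largest 64-bit signed value rounded down to a multiple of 1 GiB. The sketch below just reproduces that arithmetic; the function names and the 4 KiB page size are assumptions, and how cp derives its request length is not shown here.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86-64.  */
enum { ASSUMED_PAGE_SIZE = 4096 };

/* Largest page-aligned byte count that fits in a signed 32-bit int.  */
static int64_t
max_single_transfer (void)
{
  return INT32_MAX & ~(int64_t) (ASSUMED_PAGE_SIZE - 1);
}

/* Largest signed 64-bit value, rounded down to a multiple of 1 GiB.  */
static int64_t
requested_length (void)
{
  return INT64_MAX >> 30 << 30;
}
```

So the first copy_file_range call copied exactly one maximum-sized chunk (2G - 4K), and the failure came on the second call for the remainder.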

Let me know if you need further information, although there's
some more on the downstream Gentoo bug I linked.

Apparently this is only happening w/ the 4.19.x kernels.

Thanks!


