bug#51482: dd with status=progress does not update its output when the main writing is finished

2021-10-29 Thread Sworddragon
This was originally found as an additional minor bug in ticket #51345 (Knoppix
9.1 with GNU Coreutils 8.32); I think it is a good idea to create a dedicated
report for it so it does not get lost.

When dd is used with status=progress it updates the status every second, but
it does not do a final update once dd has finished its main writing task
(e.g. when dd starts flushing via conv=fsync and then blocks for over a
minute), so the output is incorrect and inconsistent across multiple tries.
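The behavior can be observed with something like the following sketch (a plain file stands in for the real device, and the filenames are hypothetical; the point is only where the progress lines stop):

```shell
# Hypothetical repro sketch: dd prints progress lines to stderr roughly
# once per second, but the last progress line is emitted before any
# final flushing happens.
out=$(mktemp)
dd if=/dev/zero of="$out" bs=1M count=64 conv=fsync status=progress 2>progress.log
# The final summary ("records in/out") is printed, but no progress
# update covers the time spent inside the closing fsync().
tail -n 2 progress.log
rm -f "$out"
```

On a slow USB device the gap between the last progress line and the final summary is exactly the fsync time the report describes.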


bug#51345:

2021-10-29 Thread Pádraig Brady

On 28/10/2021 22:59, Sworddragon wrote:

Although I'm no longer using Linux as my main system and wanted to avoid
getting into too much work, I found some time to do some tests, as this issue
bugs me just too much.


You could try running the following immediately after,
to see if it also returns quickly:

   blockdev --flushbufs /dev/sdb


Yes, this command also blocks for a bit over 1 minute when this issue
occurs.


Right, that suggests the conv=fsync to dd was ineffective.


Here is the output (I had to freely translate the strings since this Knoppix
instance is only in German, so they might be slightly inaccurate; also I had
to retype all the text since it was executed on a different system, but I
carefully checked it to avoid introducing typos):

root@Microknoppix:~# dd if=/dev/random of=/dev/sdb bs=1M conv=fsync
status=progress
1039138816 Bytes (1,0 GB, 991 MiB) copied, 56 s, 18,5 MB/s
dd: Error on writing '/dev/sdb': The device has not enough free space
999+0 records in
998+0 records out


Ah right. What's happening is dd is not doing the fsync()
as it's exiting early due to write(2) getting ENOSPC.

As you've seen, you can avoid the need for fsync()
to flush buffers with oflag=direct.
Another reason that might be faster is that, depending on your free memory,
you avoid churning the kernel caches.

Another way to at least ensure the conv=fsync was effective
would be to not write too much.  I.e. you could determine
the exact size of the disk (with `blockdev --getsize64 /dev/sdb`, for example)
and then use an appropriate bs= and count=. That's awkward though,
and difficult to do with good performance, since larger block sizes
generally don't align with the device size.
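The sizing arithmetic above can be sketched like this (a plain file stands in for /dev/sdb and the sizes are hypothetical, so it runs without a real device or root):

```shell
# Determine the exact target size; on a real block device you would use:
#   size=$(blockdev --getsize64 /dev/sdb)
img=$(mktemp)
truncate -s $((1000 * 1048576 + 512)) "$img"   # 1000 MiB plus one 512-byte sector
size=$(stat -c %s "$img")

bs=$((1 * 1048576))        # 1 MiB blocks for throughput
count=$((size / bs))       # whole blocks that fit
rem=$((size % bs))         # leftover bytes a 1 MiB block size cannot cover

echo "bs=$bs count=$count remainder=$rem"
# dd if=/dev/zero of="$img" bs=$bs count=$count conv=fsync
# A non-zero remainder needs a second dd with a smaller block size,
# which is exactly the awkwardness described above.
```

Any device whose size is not a multiple of the chosen bs leaves such a remainder, so a single dd with a large block size cannot fill it exactly.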

So this is a gotcha that should at least be documented.
Though I'm leaning towards improving this by always
doing an fsync on exit if we get a read or write error
and have successfully written any data, so that
previously written data is sync'd as requested.
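Until dd itself behaves that way, a caller can approximate the proposed behavior by syncing the output explicitly even when dd fails (a sketch with a file target; on a device you would sync /dev/sdb instead):

```shell
out=$(mktemp)
# Run dd; record its exit status instead of aborting on failure.
dd if=/dev/zero of="$out" bs=1M count=16 status=none
dd_status=$?
# Flush the written data regardless of how dd exited, mirroring the
# proposed "fsync on exit even after a write error" behavior.
# 'sync FILE' syncs just that file (coreutils >= 8.24).
sync "$out"
echo "dd exited with $dd_status; data flushed"
```

This way an ENOSPC from write(2) still leaves the previously written data synced, which is the property the proposal wants dd to guarantee itself.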

cheers,
Pádraig





bug#51345:

2021-10-29 Thread Sworddragon
I got an email that somebody replied to this report, but it looks like they
sent it to the wrong mail address (maybe the threads will be merged
eventually); I will still reply to it:

> Suggestion as a possible workaround: Have a look at random(4) and random(7),
> and ask yourself whether your use of /dev/random, rather than /dev/urandom,
> is really necessary for your application. If not, you might try /dev/urandom
> instead and report what you observe.
>
> As documented in those man pages, there are good reasons to avoid using
> /dev/random, not the least of which is that it blocks frequently (every time
> the entropy pool is exhausted), whereas /dev/urandom does not. That alone may
> explain the execution time inconsistencies you're reporting.

I'm aware of the differences between the two, and I don't see how the use of
/dev/random here could explain any of the issues:

- The drastically increased writing times occur after dd has reported that
there is no free space left (assuming this means dd has finished its explicit
writing task, i.e. writing to the transparent underlying cache), and dd's
timings up to that point were fine, implying that either significant blocking
never occurred or dd correctly handles blocking input files by asynchronously
continuing to write to the output file to avoid unnecessary delays.
- That dd returns while the buffers are not flushed also can't be explained
by the use of /dev/random, unless there is some very crazy, out-of-mind bug
somewhere.

But I still did some tests with /dev/urandom and conv=fsync, and I saw all 3
cases there too: normal writing times that are slightly longer (133.247s);
drastically increased writing times (235.906s); and dd returning early
(56.4327s) while an immediately executed sync still blocked for over a
minute.

Also, regarding the slightly longer writing times (~133s with conv=fsync
compared to ~129s with oflag=direct): I noticed that with conv=fsync the LED
of the USB thumb drive starts to blink a few seconds after dd starts showing
progress, so I assume Knoppix's kernel/settings cause the cache to be flushed
with a slight delay, which seems more or less normal and would explain this
specific case being ~4s slower.


And while I was writing this, the next message came in:


> Ah right. What's happening is dd is not doing the fsync()
> as it's exiting early due to write(2) getting ENOSPC.

But that would not make much sense, for 2 reasons:

1. The error message that the device ran out of space is the expected
behavior here, and there is no rationale for dd to exit at that point instead
of still calling fsync().
2. In most attempts, after dd threw the error message about no free space it
still blocked for at least a minute, and an immediately executed sync always
returned quickly.

So it looks like dd sometimes does call fsync() on ENOSPC and sometimes it
doesn't. And why a successful call to fsync() is then sometimes ~2.5 times
slower is also a mystery.


> So this is a gotcha that should at least be documented.
> Though I'm leaning towards improving this by always
> doing an fsync on exit if we get a read or write error
> and have successfully written any data, so that
> previously written data is sync'd as requested.

Yes, ensuring fsync() gets called seems to be the better option here. But
this still leaves the question from above: why does dd seemingly already do
this sometimes on ENOSPC? Maybe a race between ENOSPC and fsync() in dd is
causing this bug? Perhaps the occasionally very delayed writes (e.g. ~235s
instead of ~133s) play a role here?




bug#51345:

2021-10-29 Thread Paul Eggert

On 10/29/21 05:38, Pádraig Brady wrote:

I'm leaning towards improving this by always
doing an fsync on exit if we get a read or write error
and have successfully written any data, so that
previously written data is sync'd as requested.


Sounds like a good idea to me too.





bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

2021-10-29 Thread Paul Eggert

On 10/28/21 12:11, Paul Eggert wrote:
Wait - lseek returns a number less than -1?! We could easily check for 
that FreeBSD bug, perhaps as an independent patch; this shouldn't 
require any extra syscalls.


I installed the attached patch to do this. This doesn't fix coreutils 
bug#51433; it merely makes 'cp' and similar programs more likely to 
detect and report the FreeBSD 9.1 bug you mentioned.

From 674912c6f7c45b5f0b1f77da89663cac02c118f7 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 29 Oct 2021 18:01:34 -0700
Subject: [PATCH] cp: defend better against FreeBSD 9.1 zfs bug
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Pádraig Brady (Bug#51433#14).
* src/copy.c (lseek_copy, infer_scantype): Report an error if
lseek with SEEK_DATA or SEEK_HOLE returns less than -1,
as this is an lseek bug.
---
 src/copy.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index cb9018f93..1cbc9480c 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -530,7 +530,7 @@ lseek_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
   off_t ext_end = lseek (src_fd, ext_start, SEEK_HOLE);
   if (ext_end < 0)
 {
-  if (errno != ENXIO)
+  if (! (ext_end == -1 && errno == ENXIO))
 goto cannot_lseek;
   ext_end = src_total_size;
   if (ext_end <= ext_start)
@@ -607,12 +607,8 @@ lseek_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
 }
 
   ext_start = lseek (src_fd, dest_pos, SEEK_DATA);
-  if (ext_start < 0)
-{
-  if (errno != ENXIO)
-goto cannot_lseek;
-  break;
-}
+  if (ext_start < 0 && ! (ext_start == -1 && errno == ENXIO))
+goto cannot_lseek;
 }
 
   /* When the source file ends with a hole, we have to do a little more work,
@@ -1097,10 +1093,11 @@ infer_scantype (int fd, struct stat const *sb,
 
 #ifdef SEEK_HOLE
   scan_inference->ext_start = lseek (fd, 0, SEEK_DATA);
-  if (0 <= scan_inference->ext_start)
+  if (0 <= scan_inference->ext_start
+  || (scan_inference->ext_start == -1 && errno == ENXIO))
 return LSEEK_SCANTYPE;
   else if (errno != EINVAL && !is_ENOTSUP (errno))
-return errno == ENXIO ? LSEEK_SCANTYPE : ERROR_SCANTYPE;
+return ERROR_SCANTYPE;
 #endif
 
   return ZERO_SCANTYPE;
-- 
2.31.1



bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

2021-10-29 Thread Paul Eggert

On 10/28/21 13:59, Pádraig Brady wrote:


I wonder after getting a SEEK_DATA ENXIO, might we correlate
we really are at end of file by checking SEEK_HOLE also returns ENXIO?


Wouldn't SEEK_HOLE return the current offset, instead of ENXIO? That is, 
if the underlying data structure is wrong and claims that the rest of 
the file is a hole, wouldn't SEEK_HOLE merely repeat that information?



Perhaps zfs is updating st_blocks async, or perhaps there are
runs of zeros in these files that a linker or whatever is sparsifying?


Even if st_blocks was out-of-date, that's just a heuristic and the later 
lseeks should still work. I don't think the files contain actual holes.


I don't see an easy workaround for the ZFS bug, unless we want to slow 
down 'cp' for everybody. This really needs to be fixed on the ZFS side.


The attached patch to OpenZFS might work around the bug (I haven't 
tested it; perhaps someone who uses ZFS could give it a try).

diff --git a/module/zfs/zfs_vnops.c b/module/zfs/zfs_vnops.c
index 9bd75c011..1d1bef079 100644
--- a/module/zfs/zfs_vnops.c
+++ b/module/zfs/zfs_vnops.c
@@ -100,7 +100,13 @@ zfs_holey_common(znode_t *zp, ulong_t cmd, loff_t *off)
 	else
 		hole = B_FALSE;
 
+	/* Work around OpenZFS bug 11900
+	   .  */
+# if 0
 	error = dmu_offset_next(ZTOZSB(zp)->z_os, zp->z_id, hole, &noff);
+# else
+	error = EBUSY;
+# endif
 
 	if (error == ESRCH)
 		return (SET_ERROR(ENXIO));