bug#51345: dd with conv=fsync sometimes returns when its writes are still cached

2022-01-28 Thread Paul Eggert
I found a bit of time to work on this and installed the attached patch, 
which should address the issue. Thanks for reporting it.From 3368b8745046aeaa89f418f560e714b374f1a560 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 28 Jan 2022 00:01:07 -0800
Subject: [PATCH] dd: synchronize output after write errors

Problem reported by Sworddragon (Bug#51345).
* src/dd.c (cleanup): Synchronize output unless dd has been interrupted.
(synchronize_output): New function, split out from dd_copy.
Update conversions_mask so synchronization is done at most once.
(main): Do not die with the output file open, since we want to be
able to synchronize it before exiting.  Synchronize output before
exiting.
---
 NEWS   |  3 ++
 doc/coreutils.texi | 10 +++---
 src/dd.c   | 76 +-
 3 files changed, 64 insertions(+), 25 deletions(-)

diff --git a/NEWS b/NEWS
index 15c9428bd..757abee15 100644
--- a/NEWS
+++ b/NEWS
@@ -41,6 +41,9 @@ GNU coreutils NEWS-*- outline -*-
   padding them with zeros to 9 digits.  It uses clock_getres and
   clock_gettime to infer the clock resolution.
 
+  dd conv=fsync now synchronizes output even after a write error,
+  and similarly for dd conv=fdatasync.
+
   timeout --foreground --kill-after=... will now exit with status 137
   if the kill signal was sent, which is consistent with the behavior
   when the --foreground option is not specified.  This allows users to
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index c17406550..088d1764c 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9397,14 +9397,16 @@ Continue after read errors.
 @item fdatasync
 @opindex fdatasync
 @cindex synchronized data writes, before finishing
-Synchronize output data just before finishing.  This forces a physical
-write of output data.
+Synchronize output data just before finishing,
+even if there were write errors.
+This forces a physical write of output data.
 
 @item fsync
 @opindex fsync
 @cindex synchronized data and metadata writes, before finishing
-Synchronize output data and metadata just before finishing.  This
-forces a physical write of output data and metadata.
+Synchronize output data and metadata just before finishing,
+even if there were write errors.
+This forces a physical write of output data and metadata.
 
 @end table
 
diff --git a/src/dd.c b/src/dd.c
index 957ad129e..4ddc6db12 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -939,6 +939,8 @@ iclose (int fd)
   return 0;
 }
 
+static int synchronize_output (void);
+
 static void
 cleanup (void)
 {
@@ -948,6 +950,13 @@ cleanup (void)
   alignfree (obuf);
 #endif
 
+  if (!interrupt_signal)
+{
+  int sync_status = synchronize_output ();
+  if (sync_status)
+exit (sync_status);
+}
+
   if (iclose (STDIN_FILENO) != 0)
 die (EXIT_FAILURE, errno, _("closing input file %s"), quoteaf (input_file));
 
@@ -2377,17 +2386,33 @@ dd_copy (void)
   && 0 <= reported_w_bytes && reported_w_bytes < w_bytes)
 print_xfer_stats (0);
 
-  if ((conversions_mask & C_FDATASYNC) && ifdatasync (STDOUT_FILENO) != 0)
+  return exit_status;
+}
+
+/* Synchronize output according to conversions_mask.
+   Do this even if w_bytes is zero, as fsync and fdatasync
+   flush out write requests from other processes too.
+   Clear bits in conversions_mask so that synchronization is done only once.
+   Return zero if successful, an exit status otherwise.  */
+
+static int
+synchronize_output (void)
+{
+  int exit_status = 0;
+  int mask = conversions_mask;
+  conversions_mask &= ~ (C_FDATASYNC | C_FSYNC);
+
+  if ((mask & C_FDATASYNC) && ifdatasync (STDOUT_FILENO) != 0)
 {
   if (errno != ENOSYS && errno != EINVAL)
 {
   error (0, errno, _("fdatasync failed for %s"), quoteaf (output_file));
   exit_status = EXIT_FAILURE;
 }
-  conversions_mask |= C_FSYNC;
+  mask |= C_FSYNC;
 }
 
-  if ((conversions_mask & C_FSYNC) && ifsync (STDOUT_FILENO) != 0)
+  if ((mask & C_FSYNC) && ifsync (STDOUT_FILENO) != 0)
 {
   error (0, errno, _("fsync failed for %s"), quoteaf (output_file));
   return EXIT_FAILURE;
@@ -2460,6 +2485,16 @@ main (int argc, char **argv)
| (conversions_mask & C_EXCL ? O_EXCL : 0)
| (seek_records || (conversions_mask & C_NOTRUNC) ? 0 : O_TRUNC));
 
+  off_t size;
+  if ((INT_MULTIPLY_WRAPV (seek_records, output_blocksize, )
+   || INT_ADD_WRAPV (seek_bytes, size, ))
+  && !(conversions_mask & C_NOTRUNC))
+die (EXIT_FAILURE, 0,
+ _("offset too large: "
+   "cannot truncate to a length of seek=%"PRIdMAX""
+   " (%td-byte) blocks"),
+ seek_records, output_blocksize);
+
   /* Open the output file with *read* access only if we might
  need to read to satisfy a 'seek=' request.  If we can't read
  the file, go ahead with write-only access; it might work.  */

bug#51345: dd with conv=fsync sometimes returns when its writes are still cached

2021-10-25 Thread Bob Proulx
Sworddragon wrote:
> On Knoppix 9.1 with the Linux Kernel 5.10.10-64 x86_64 and GNU Coreutils
> 8.32 I wanted to overwrite my USB Thumb Drive a few times with random data
> via "dd if=/dev/random of=/dev/sdb bs=1M conv=fsync". While it usually
> takes ~2+ minutes to perform this action dd returned once after less than
> 60 seconds which made me a bit curious.

I suggest another try using oflag=direct instead of conv=fsync.

dd if=/dev/random of=/dev/sdb bs=1M oflag=direct

Also with rates status.

dd if=/dev/random of=/dev/sdb bs=1M oflag=direct status=progress

Here is the documentation for it.

  ‘oflag=FLAG[,FLAG]...’

 ‘direct’
  Use direct I/O for data, avoiding the buffer cache.  Note that
  the kernel may impose restrictions on read or write buffer
  sizes.  For example, with an ext4 destination file system and
  a Linux-based kernel, using ‘oflag=direct’ will cause writes
  to fail with ‘EINVAL’ if the output buffer size is not a
  multiple of 512.

Bob






bug#51345: dd with conv=fsync sometimes returns when its writes are still cached

2021-10-23 Thread Pádraig Brady

On 23/10/2021 09:18, Sworddragon wrote:

On Knoppix 9.1 with the Linux Kernel 5.10.10-64 x86_64 and GNU Coreutils
8.32 I wanted to overwrite my USB Thumb Drive a few times with random data
via "dd if=/dev/random of=/dev/sdb bs=1M conv=fsync". While it usually
takes ~2+ minutes to perform this action dd returned once after less than
60 seconds which made me a bit curious. On a later attempt dd returned with
this command after ~1 minute again but the LED on my USB Thumb Drive was
still blinking and an immediated executed sync on the terminal blocked for
~1 minute and once it returned the LED on my USB Thumb Drive also stopped
blinking.

Knoppix 9.1 was booted as a live system from a DVD and the only another
persistent storage attached to it was an internal HDD which was already
overwritten the same way via a past booting session.


Unfortunately Linux is not my main operating system anymore so I randomly
encountered this issue on a nearly 1 year old distribution which might be
slightly outdated. But the issue is pretty severe as it can lead to
significant data loss so it is worth at least reporting it (and having the
"bad" behavior to always call sync after dd nonetheless will probably now
continue to stay for quite a while).



Well we're relying on the kernel here to not return from fync()
until appropriate.  You could try running the following immediately after,
to see if it also returns quickly:

  blockdev --flushbufs /dev/sdb

Note a subsequent `sync /dev/sdb` would probably not be effective,
since that would only consider data written by theh sync process
(which would be nothing).

cheers,
Pádraig





bug#51345: dd with conv=fsync sometimes returns when its writes are still cached

2021-10-23 Thread Sworddragon
On Knoppix 9.1 with the Linux Kernel 5.10.10-64 x86_64 and GNU Coreutils
8.32 I wanted to overwrite my USB Thumb Drive a few times with random data
via "dd if=/dev/random of=/dev/sdb bs=1M conv=fsync". While it usually
takes ~2+ minutes to perform this action dd returned once after less than
60 seconds which made me a bit curious. On a later attempt dd returned with
this command after ~1 minute again but the LED on my USB Thumb Drive was
still blinking and an immediated executed sync on the terminal blocked for
~1 minute and once it returned the LED on my USB Thumb Drive also stopped
blinking.

Knoppix 9.1 was booted as a live system from a DVD and the only another
persistent storage attached to it was an internal HDD which was already
overwritten the same way via a past booting session.


Unfortunately Linux is not my main operating system anymore so I randomly
encountered this issue on a nearly 1 year old distribution which might be
slightly outdated. But the issue is pretty severe as it can lead to
significant data loss so it is worth at least reporting it (and having the
"bad" behavior to always call sync after dd nonetheless will probably now
continue to stay for quite a while).