bug#81269: dd (coreutils 9.7): incorrect elapsed time reporting causes impossible throughput (7.9 GB/s)

Paul Eggert Fri, 19 Jun 2026 00:55:25 -0700

On 2026-06-18 22:57, Collin Funk wrote:

I believe the behavior is expected, at least after the following commit:

That commit is about writing a progress report before the fsync, but the bug report is about the final status report, after the fsync. So I doubt whether the commit is related to the bug.

I suspect that what we're seeing is a bug in the Linux driver (ahci, most likely) or in your drive's firmware.


Sick Pigs, is the problem reproducible?

If you look at the system log (run "dmesg -T" or look in /var/log/syslog) do you see any messages from the kernel? Something containing "timeout" or "watchdog", or "reset", or "block", or "ata", or "sda"?
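One way to search for those keywords in a single pass is sketched below. The helper name filter_kernel_log is made up for illustration, and the pattern is deliberately broad ("ata" also matches words like "data"), so expect some noise in the output.

```shell
# Keep only log lines containing one of the keywords listed above
# (case-insensitive). filter_kernel_log is a hypothetical helper name.
filter_kernel_log() {
  grep -Ei 'timeout|watchdog|reset|block|ata|sda'
}

# Typical use (reading the kernel log may require root on some systems):
#   dmesg -T | filter_kernel_log
#   filter_kernel_log < /var/log/syslog
```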


The scenario I'm thinking is like this:

* dd issues an fsync system call.
* The Linux kernel issues an ATA FLUSH CACHE EXT.

* You have a DRAM-less drive, and your drive's firmware kinda panics. It focuses on moving data out of its SLC write buffer and into its much slower TLC main storage. While doing this, it stops responding to the SATA bus, and holds the hardware line busy for many seconds.
* Meanwhile, the Linux ahci driver is waiting for a hardware interrupt from the motherboard's SATA controller, using an uninterruptible polling loop inside kernel space.

* The Linux system timer cannot fire during an uninterruptible polling loop.
* The CLOCK_MONOTONIC clock does not advance.
* 'dd' therefore thinks no time has passed.

Whether this is a bug in your drive's firmware or the ahci driver I leave up to you. But from dd's point of view, it asked for the time and got the wrong time from the kernel.

If the problem is reproducible, can you run the following shell commands and let us know the output? They use 'grep' to filter out lengthy (and possibly private) read/write traces. If the output of 'grep' is really long, please compress it and attach the compressed file. This might help us confirm the diagnosis.
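To see why a clock that fails to advance produces an impossible rate, here is a minimal sketch of the arithmetic: the reported throughput is the bytes copied divided by the elapsed time that was measured. The helper rate_mb_s and the byte and second figures are hypothetical, chosen only for illustration and not taken from the bug report.

```shell
# rate_mb_s BYTES SECONDS: print throughput in MB/s (1 MB = 1000000 bytes).
# Hypothetical helper, only meant to illustrate the division.
rate_mb_s() {
  awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# Made-up figures: 4 GB copied.
rate_mb_s 4000000000 40    # clock saw all 40 s        -> prints 100.0
rate_mb_s 4000000000 0.5   # clock advanced only 0.5 s -> prints 8000.0
```

The same byte count with a much smaller measured interval yields a rate far beyond what a SATA link can carry, which is the kind of symptom reported here.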


  LC_ALL=C time strace --relative-timestamps=ns -o /tmp/tr \
    dd if=image.iso of=/dev/sda bs=4M conv=fsync status=progress

  grep -Ev ' (read\(0|write\(1), .* = [^0]' /tmp/tr

If my guess is correct, you might be able to work around the problem by disabling native command queueing (NCQ) for the drive ("libata.force=noncq" when booting Linux), or by telling the Linux kernel to bypass the CPU's timers ("clocksource=hpet" or "clocksource=acpi_pm"), or maybe even switch your I/O scheduler to be less aggressive. These all have performance implications of course and should be done only with some expertise.

Or you could buy a better drive, or switch to NVMe which shouldn't have this problem. I know, expensive.
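Before changing any of these, it may help to record the current settings. The following read-only sketch prints them from sysfs; it assumes the drive is sda (as in the dd command above) and that your kernel exposes these files. The helper name show_setting is made up for illustration.

```shell
# show_setting FILE: print the contents of a sysfs file, or a note if it
# is missing or unreadable. Hypothetical helper; it changes nothing.
show_setting() {
  if [ -r "$1" ]; then
    cat "$1"
  else
    echo "(not available: $1)"
  fi
}

# Clock source currently in use, and the alternatives the kernel offers.
show_setting /sys/devices/system/clocksource/clocksource0/current_clocksource
show_setting /sys/devices/system/clocksource/clocksource0/available_clocksource

# I/O scheduler for the drive (the active one is shown in brackets).
show_setting /sys/block/sda/queue/scheduler

# Queue depth for the drive; a value of 1 suggests NCQ is not in use.
show_setting /sys/block/sda/device/queue_depth
```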
