On 09/26/2014 07:27 AM, Bernhard Voelker wrote: > On 09/25/2014 03:12 PM, Pádraig Brady wrote: >> On 09/25/2014 01:26 PM, Federico Simoncelli wrote: >>> ----- Original Message ----- >>>> From: "Pádraig Brady" <[email protected]> >>>> To: "Coreutils" <[email protected]> >>>> Cc: "Federico Simoncelli" <[email protected]> >>>> Sent: Monday, September 22, 2014 5:25:58 PM >>>> Subject: dd SIGUSR1 race >>>> >>>> There is a race with dd setting up the signal handler for SIGUSR1, >>>> and some async process sending it. If dd loses the race it's killed. > > Just for the record: even the little shell example in dd's manpage seems > to make the program loose the race quite frequently. The following is just > changed a bit to demonstrate that dd(1) didn't even have the chance to > create the output file: > > $ dd if=/dev/zero of=file & pid=$! ; kill -USR1 $pid > [1] 21271 > [1]+ User defined signal 1 dd if=/dev/zero of=file > > $ test -e file || echo 'dd killed' > dd killed > >>>> Though doing the above would still need existing apps to change >>>> (to ignore SIGUSR1), so instead it might be better to just support >>>> a simpler mechanism through something like a "status=progress" option, >>>> which would avoid the general issues and awkwardness of signals as >>>> mentioned >>>> here: >>>> >>>> http://lists.gnu.org/archive/html/bug-coreutils/2004-04/threads.html#00127 >>>> >>>> I'm leaning towards the latter option here. > >>> - how often should it be reported/updated? (every x seconds or x byte >>> transferred? configurable?) >> >> We could support status=none,xfer to only output the above transfer >> statistics line. >> I suppose we could do this every second but that would be approximate due >> to blocking reads() and writes(). > > The current implementation has the same minor issue, so I don't see a problem > here. > > I only see a problem with removing the existing behavior: if we remove the > SIGURS1 handler, then dd(1) would be killed if an existing script relies > on dd(1) to work as before. > So should we leave the racy stuff in for compatibility? Well, we could enable > the new status=xfer option in the signal handler, but this would maybe also > change the behavior the caller would expect.
We'd definitely leave the existing SIGUSR1 handling in place but we could augment it to allow avoiding races. Option1: Unblock SIGUSR1 explicitly, allowing the parent to mask that and avoid races, Option2: Set the handler even if SIGUSR1 was set to ignored. Note the default behavior for SIGINFO on BSD is to discard it, so that doesn't have the "kill dd by mistake" issue, but does it seem have the possibility to lose SIGINFO, so that for robustness they'd have to be resent after a timeout. Now that doesn't seem too onerous since you'd have to be dealing with timeouts anyway, so one could take the same approach with existing GNU dd implementations. I.E. just set SIGUSR1 to ignore in the parent before the fork and then reset. Another reason to favor Option2 is that can be easily controlled from the shell using: trap '' USR1 Note we only print stats on receiving the SIGUSR1 so I don't see much issue in requisitioning it (even though not strictly POSIX compliant). Also we can even suppress that stats output if problematic with status=none. Therefore I'm leaning towards option2. The attached patch supports option2 by fixing up the various races. Note this doesn't preclude also having an explicit option to periodically output the statistics, though I'd be slightly in favor of avoiding that if possible. cheers, Pádraig.
>From 04d208de1a2951969f6cc40ed879a9bdff7e62de Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]> Date: Fri, 26 Sep 2014 15:46:28 +0100 Subject: [PATCH] dd: use more robust SIGUSR handling * src/dd.c (ifd_reopen): A new wrapper to ensure we don't exit upon receiving a SIGUSR1 in a blocking open() on a fifo for example. (iread): Process signals also after a short read. (install_signal_handlers): Install SIGINFO/SIGUSR1 handler even if set to SIG_IGN, as this is what the parent can easily set from a shell script that can send SIGUSR1 without the possiblity of inadvertently killing the dd process. * doc/corutils.texi (dd invocation): Improve the example to show robust usage wrt signal races and short reads. * tests/dd/stats.sh: A new test for various signal races. * tests/local.mk: Reference the new test. * NEWS: Mention the fix. --- NEWS | 3 ++ doc/coreutils.texi | 39 ++++++++++++++++++++++++----------- src/dd.c | 33 +++++++++++++++++++++-------- tests/dd/stats.sh | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++ tests/local.mk | 1 + 5 files changed, 112 insertions(+), 21 deletions(-) create mode 100755 tests/dd/stats.sh diff --git a/NEWS b/NEWS index 077c5fc..3e10ac4 100644 --- a/NEWS +++ b/NEWS @@ -4,6 +4,9 @@ GNU coreutils NEWS -*- outline -*- ** Bug fixes + dd supports more robust SIGINFO/SIGUSR1 handling for outputting statistics. + Previously those signals may have inadvertently terminated the process. + cp no longer issues an incorrect warning about directory hardlinks when a source directory is specified multiple times. Now, consistent with other file types, a warning is issued for source directories with duplicate names, diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 1519fcb..6e1a2fb 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -9001,23 +9001,38 @@ occur on disk based devices): dd conv=noerror,sync iflag=fullblock </dev/sda1 > /mnt/rescue.img @end example -Sending an @samp{INFO} signal to a running @command{dd} -process makes it print I/O statistics to standard error -and then resume copying. In the example below, -@command{dd} is run in the background to copy 10 million blocks. +Sending an @samp{INFO} signal (or @samp{USR1} signal where that is unavailable) +to a running @command{dd} process makes it print I/O statistics to +standard error and then resume copying. In the example below, +@command{dd} is run in the background to copy 5GB of data. The @command{kill} command makes it output intermediate I/O statistics, and when @command{dd} completes normally or is killed by the @code{SIGINT} signal, it outputs the final statistics. @example -$ dd if=/dev/zero of=/dev/null count=10MB & pid=$! -$ kill -s INFO $pid; wait $pid -3385223+0 records in -3385223+0 records out -1733234176 bytes (1.7 GB) copied, 6.42173 seconds, 270 MB/s -10000000+0 records in -10000000+0 records out -5120000000 bytes (5.1 GB) copied, 18.913 seconds, 271 MB/s +# Ignore signal so we never inadvertently terminate the dd child. +# Note this is not needed when SIGINFO is available. +trap '' USR1 + +# Run dd with the fullblock iflag to avoid short reads +# which can be triggered by reception of signals. +dd iflag=fullblock if=/dev/zero of=/dev/null count=1000 bs=5000000 & pid=$! +until ! kill -0 $pid 2>/dev/null; do + # sleep first to give time for work and enablement of signal handler + sleep .5 + kill -s USR1 $pid 2>/dev/null || break +done +@end example + +The above script will output in the following format + +@example +859+0 records in +859+0 records out +4295000000 bytes (4.3 GB) copied, 0.539934 s, 8.0 GB/s +1000+0 records in +1000+0 records out +5000000000 bytes (5.0 GB) copied, 0.630785 s, 7.9 GB/s @end example @vindex POSIXLY_CORRECT diff --git a/src/dd.c b/src/dd.c index c17f7d8..a0c5bb1 100644 --- a/src/dd.c +++ b/src/dd.c @@ -830,11 +830,7 @@ install_signal_handlers (void) struct sigaction act; sigemptyset (&caught_signals); if (catch_siginfo) - { - sigaction (SIGINFO, NULL, &act); - if (act.sa_handler != SIG_IGN) - sigaddset (&caught_signals, SIGINFO); - } + sigaddset (&caught_signals, SIGINFO); sigaction (SIGINT, NULL, &act); if (act.sa_handler != SIG_IGN) sigaddset (&caught_signals, SIGINT); @@ -856,7 +852,7 @@ install_signal_handlers (void) #else - if (catch_siginfo && signal (SIGINFO, SIG_IGN) != SIG_IGN) + if (catch_siginfo) { signal (SIGINFO, siginfo_handler); siginterrupt (SIGINFO, 1); @@ -1033,6 +1029,10 @@ iread (int fd, char *buf, size_t size) } while (nread < 0 && errno == EINTR); + /* Short read may be due to received signal. */ + if (0 < nread && nread < size) + process_signals (); + if (0 < nread && warn_partial_read) { static ssize_t prev_nread; @@ -2205,6 +2205,21 @@ dd_copy (void) return exit_status; } +static int +ifd_reopen (int desired_fd, char const *file, int flag, mode_t mode) +{ + int ret; + + do + { + process_signals (); + ret = fd_reopen (desired_fd, file, flag, mode); + } + while (ret < 0 && errno == EINTR); + + return ret; +} + int main (int argc, char **argv) { @@ -2248,7 +2263,7 @@ main (int argc, char **argv) } else { - if (fd_reopen (STDIN_FILENO, input_file, O_RDONLY | input_flags, 0) < 0) + if (ifd_reopen (STDIN_FILENO, input_file, O_RDONLY | input_flags, 0) < 0) error (EXIT_FAILURE, errno, _("failed to open %s"), quote (input_file)); } @@ -2275,8 +2290,8 @@ main (int argc, char **argv) need to read to satisfy a 'seek=' request. If we can't read the file, go ahead with write-only access; it might work. */ if ((! seek_records - || fd_reopen (STDOUT_FILENO, output_file, O_RDWR | opts, perms) < 0) - && (fd_reopen (STDOUT_FILENO, output_file, O_WRONLY | opts, perms) + || ifd_reopen (STDOUT_FILENO, output_file, O_RDWR | opts, perms) < 0) + && (ifd_reopen (STDOUT_FILENO, output_file, O_WRONLY | opts, perms) < 0)) error (EXIT_FAILURE, errno, _("failed to open %s"), quote (output_file)); diff --git a/tests/dd/stats.sh b/tests/dd/stats.sh new file mode 100755 index 0000000..386752e --- /dev/null +++ b/tests/dd/stats.sh @@ -0,0 +1,57 @@ +#!/bin/sh +# Check robust handling of SIG{INFO,USR1} + +# Copyright (C) 2014 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src +print_ver_ dd + +env kill -l | grep '^INFO$' && SIGINFO='INFO' || SIGINFO='USR1' + +# This to avoid races in the USR1 case +# as the dd process will terminate by default until +# it has its handler enabled. +trap '' $SIGINFO + +mkfifo_or_skip_ fifo + +for open in '' '1'; do + # Run dd with the fullblock iflag to avoid short reads + # which can be triggered by reception of signals + dd iflag=fullblock if=/dev/zero of=fifo count=100 bs=5000000 2>err & pid=$! + + # Note if we sleep here we give dd a chance to exec and block on open. + # Otherwise we're probably testing SIG_IGN in the forked shell or early dd. + test "$open" && sleep .1 + + # dd will block on open until fifo is opened for reading. + # Timeout in case dd goes away erroneously which we check for below. + timeout 10 sh -c 'wc -c < fifo > nwritten' & + + # Send lots of signals immediately to ensure dd not killed due + # to race setting handler, or blocking on open of fifo. + # Many signals also check that short reads are handled. + until ! kill -s $SIGINFO $pid 2>/dev/null; do + sleep .01 + done + + wait + + # Ensure all data processed and at least last status written + grep '500000000 bytes .* copied' err || { cat err; fail=1; } +done + +Exit $fail diff --git a/tests/local.mk b/tests/local.mk index 97bf5ed..8498acb 100644 --- a/tests/local.mk +++ b/tests/local.mk @@ -490,6 +490,7 @@ all_tests = \ tests/dd/stderr.sh \ tests/dd/unblock.pl \ tests/dd/unblock-sync.sh \ + tests/dd/stats.sh \ tests/df/total-verify.sh \ tests/du/2g.sh \ tests/du/8gb.sh \ -- 1.7.7.6
