bug#34524: wc: word count incorrect when words separated only by no-break space
$ wc --version
wc (GNU coreutils) 8.29
Packaged by Gentoo (8.29-r1 (p1.0))

The man page for wc states: "A word is a... sequence of characters delimited by white space." But its concept of white space only seems to include ASCII white space. U+00A0 NO-BREAK SPACE, for instance, is not recognized.

If your terminal displays UTF-8 encoding:

printf 'how are\xC2\xA0you\n'

or if your terminal displays ISO 8859-1 encoding:

printf 'how are\xA0you\n'

the visible output of this printf is "how are you".

In either case, wc does not recognize the second space as white space, resulting in an incorrect word count:

$ printf 'how are\xC2\xA0you\n' | LC_ALL=en_US.utf8 wc -w
2
$ printf 'how are\xA0you\n' | LC_ALL=en_US.iso88591 wc -w
2
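Until wc itself treats U+00A0 as white space, one possible workaround is to translate the character to an ASCII space before counting. The following is only a sketch and not part of the report: the helper name count_words_nbsp is hypothetical, and it assumes UTF-8 input, where NO-BREAK SPACE is the byte pair 0xC2 0xA0.

```shell
# count_words_nbsp is a hypothetical helper, not part of coreutils.
# It assumes UTF-8 input, where U+00A0 is encoded as the bytes 0xC2 0xA0.
count_words_nbsp() {
    # Octal escapes are used because \x escapes are not portable in printf.
    nbsp=$(printf '\302\240')
    # LC_ALL=C makes sed match the two raw bytes regardless of the locale.
    LC_ALL=C sed "s/$nbsp/ /g" | wc -w
}

# The example from the report, now counted as three words.
printf 'how are\302\240you\n' | count_words_nbsp
```

For ISO 8859-1 input the same idea would apply with the single byte 0xA0 (octal 240) in place of the two-byte sequence.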
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
Hello,

Thanks for all comments (on and off list).

Attached an updated patch with documentation.

The supported options are:

  --default-signal[=SIG]
      reset signal SIG to its default signal handler.
      without SIG, all known signals are included.
      multiple signals can be comma-separated.

  --ignore-signal[=SIG]
      set signal SIG to be IGNORED.
      without SIG, all known signals are included.
      multiple signals can be comma-separated.

  -p  same as --default-signal=PIPE
      (lower-case "-p" as to not conflict with BSD,
      but of course can be changed to another letter).

The new 'env-signal-handler.sh' test passes on GNU/linux, non-gnu/linux (alpine), and Free/Open/Net BSD.

Comments very welcomed,
 - assaf

>From 3542f1762c9f14e2275fe5e61d5d7f6275b420a9 Mon Sep 17 00:00:00 2001
From: Assaf Gordon
Date: Fri, 15 Feb 2019 12:31:48 -0700
Subject: [PATCH] env: new options -p/--default-signal=SIG/--ignore-signal=SIG

New options to set signal handlers to default (SIG_DFL) or ignore (SIG_IGN)
This is useful to overcome POSIX limitation that shell must not override
inherited signal state, e.g. the second 'trap' here is a no-op:

  trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1'

Instead use:

  trap '' PIPE && sh -c 'env -p seq inf | head -n1'

Similarly, the following will prevent CTRL-C from terminating the program:

  env --ignore-signal=INT seq inf > /dev/null

See https://bugs.gnu.org/34488#8 .

* NEWS: Mention new options.
* doc/coreutils.texi (env invocation): Document new options.
* man/env.x: Add example of --default-signal=SIG usage.
* src/env.c (signals): New global variable.
(shortopts,longopts): Add new options.
(usage): Print new options.
(parse_signal_params): Parse comma-separated list of signals, store in
signals variable.
(reset_signal_handlers): Set each signal to SIG_DFL/SIG_IGN.
(main): Process new options.
* src/local.mk (src_env_SOURCES): Add operand2sig.c.
* tests/misc/env-signal-handler.sh: New test.
* tests/local.mk (all_tests): Add new test.
---
 NEWS                             |   3 +
 doc/coreutils.texi               |  43
 man/env.x                        |  35 ++
 src/env.c                        | 127 +-
 src/local.mk                     |   1 +
 tests/local.mk                   |   1 +
 tests/misc/env-signal-handler.sh | 146 +++
 7 files changed, 355 insertions(+), 1 deletion(-)
 create mode 100755 tests/misc/env-signal-handler.sh

diff --git a/NEWS b/NEWS
index fdde47593..5a8e8a3de 100644
--- a/NEWS
+++ b/NEWS
@@ -67,6 +67,9 @@ GNU coreutils NEWS -*- outline -*-
   test now supports the '-N FILE' unary operator (like e.g. bash) to check
   whether FILE exists and has been modified since it was last read.
 
+  env now supports '--default-signal[=SIG]' and '--ignore-signal[=SIG]'
+  options to set signal handlers before executing a program.
+
 ** New commands
 
   basenc is added to complement existing base64,base32 commands,
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index be35de490..57b209e07 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -17227,6 +17227,49 @@ chroot /chroot env --chdir=/srv true
 env --chdir=/build FOO=bar timeout 5 true
 @end example
 
+@item --default-signal[=@var{sig}]
+Reset signal @var{sig} to its default signal handler. Without @var{sig} all
+known signals are reset to their defaults. Multiple signals can be
+comma-separated. The following command runs @command{seq} with SIGINT and
+SIGPIPE set to their default (which is to terminate the program):
+
+@example
+env --default-signal=PIPE,INT seq 1000 | head -n1
+@end example
+
+In the following example:
+
+@example
+trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1'
+@end example
+
+The first trap command sets SIGPIPE to ignore. The second trap command
+ostensibly sets it back to its default, but POSIX mandates that the shell
+must not change inherited state of the signal - so it is a no-op.
+
+Use @option{--default-signal=PIPE} (or its shortcut @option{-p}) to
+force the signal to its default behavior:
+
+@example
+trap '' PIPE && sh -c 'env -p seq inf | head -n1'
+@end example
+
+
+@item --ignore-signal[=@var{sig}]
+Ignore signal @var{sig} when running a program. Without @var{sig} all
+known signals are set to ignore. Multiple signals can be
+comma-separated. The following command runs @command{seq} with SIGINT set
+to be ignored - pressing @kbd{Ctrl-C} will not terminate it:
+
+@example
+env --ignore-signal=INT seq inf > /dev/null
+@end example
+
+
+@item -p
+Equivalent to @option{--default-signal=PIPE} - sets SIGPIPE to its default
+behavior (terminate a program upon SIGPIPE).
+
 @item -v
 @itemx --debug
 @opindex -v
diff --git a/man/env.x b/man/env.x
index 8eea79655..5e0ef975e 100644
--- a/man/env.x
+++ b/man/env.x
@@
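The ignore half of the proposal relies on the fact that an ignored signal is inherited across exec. A rough way to see that effect with nothing but a POSIX shell is sketched below; run_ignoring_int is a hypothetical stand-in written for this illustration, not the patch's implementation, and it only covers SIGINT.

```shell
# run_ignoring_int is a hypothetical stand-in for what
# `env --ignore-signal=INT COMMAND...` is described as doing:
# the command starts with SIGINT already set to "ignore".
run_ignoring_int() {
    # The subshell keeps the trap from leaking into the caller.
    ( trap '' INT; "$@" )
}

# The child sends SIGINT to itself; because the signal was inherited
# as ignored, the child keeps running and reaches the echo.
run_ignoring_int sh -c 'kill -INT $$; echo survived'
```

The reverse direction (forcing a signal back to its default) cannot be done this way from sh, which is the gap the proposed --default-signal option is meant to fill.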
bug#33468: A bug with yes and --help
Hello,

On 2019-02-15 1:19 p.m., Eric Blake wrote:
> On 2/15/19 12:32 PM, Assaf Gordon wrote:
>> There is at least one change in behavior,
>> not sure if this is bad enough to be a regression or doesn't really matter:
>>
>>   $ yes-OLD me -- --help | head -n1
>>   me -- --help
>>
>>   $ yes-NEW me -- --help | head -n1
>>   me --help
>
> I would argue bug-fix.
> [...]
> So, I would suspect (although I have not yet tested) that as patched, you would get:
>
>   $ yes-NEW me -- --help | head -n1
>   me --help
>   $ POSIXLY_CORRECT=1 yes-NEW me -- --help | head -n1
>   me -- --help
>   $ yes-NEW -- me -- --help
>   me -- --help

Indeed - that's how it behaves with the patch.
Thanks for explaining.

> In the gnulib patch: s/optional/option/
>
> In the coreutils patch: s/non-options/non-option/

Attached updates with your suggested fixes.

> Also, all coreutils callers pass reset_optind==false; does the gnulib interface still need to provide a reset_optind parameter, given that setting the parameter true forces reliance on the getopt-gnu module as currently coded?

The "getopt-gnu" was already a dependency before this patch, not sure if removing this parameter will save much hassle - what do you think ?

-assaf

>From 08d0505683cebed0fc10cff082255fd79da2d989 Mon Sep 17 00:00:00 2001
From: Bernhard Voelker
Date: Thu, 29 Nov 2018 09:06:26 +0100
Subject: [PATCH] long-options: add parse_gnu_standard_options_only

Discussed in https://bugs.gnu.org/33468 .

* lib/long-options.c (parse_long_options): Use EXIT_SUCCESS instead of 0.
(parse_gnu_standard_options_only): Add function to process the GNU
default options --help and --version and fail for any other unknown
long or short option. See
https://gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html .
* lib/long-options.h (parse_gnu_standard_options_only): Declare it.
* modules/long-options (depends-on): Add stdbool, exitfail.
* top/maint.mk (sc_prohibit_long_options_without_use): Update
syntax-check rule, add new function name.
---
 lib/long-options.c   | 68 +++-
 lib/long-options.h   | 17 +
 modules/long-options |  2 ++
 top/maint.mk         |  2 +-
 4 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/lib/long-options.c b/lib/long-options.c
index 037f74b3a..b7acdb040 100644
--- a/lib/long-options.c
+++ b/lib/long-options.c
@@ -29,6 +29,7 @@
 #include
 
 #include "version-etc.h"
+#include "exitfail.h"
 
 static struct option const long_options[] =
 {
@@ -71,7 +72,7 @@ parse_long_options (int argc,
           va_list authors;
           va_start (authors, usage_func);
           version_etc_va (stdout, command_name, package, version, authors);
-          exit (0);
+          exit (EXIT_SUCCESS);
         }
 
       default:
@@ -87,3 +88,68 @@ parse_long_options (int argc,
      the probably-new parameters when/if getopt is called later. */
   optind = 0;
 }
+
+/* Process the GNU default long options --help and --version (see also
+   https://gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html),
+   and fail for any other unknown long or short option.
+   Use with SCAN_ALL=true to scan until "--", or with SCAN_ALL=false to stop
+   at the first non-option argument (or "--", whichever comes first).
+
+   if RESET_OPTIND=true, the global optind variable will be reset to zero,
+   preparing (and requiring) a follow-up gnu-compatible getopt() call
+   (non-gnu getopt functions use optreset=optind=1 instead of 0 for reset).
+
+   if RESET_OPTIND=false, optind is left as-is (suitable for programs
+   which do not process further option parameters (but could still
+   process parameters directly by examining argv[optind]). */
+void
+parse_gnu_standard_options_only (int argc,
+                                 char **argv,
+                                 const char *command_name,
+                                 const char *package,
+                                 const char *version,
+                                 bool scan_all,
+                                 bool reset_optind,
+                                 void (*usage_func) (int),
+                                 /* const char *author1, ...*/ ...)
+{
+  int c;
+  int saved_opterr;
+
+  saved_opterr = opterr;
+
+  /* Print an error message for unrecognized options. */
+  opterr = 1;
+
+  const char *optstring = scan_all ? "" : "+";
+
+  if ((c = getopt_long (argc, argv, optstring, long_options, NULL)) != -1)
+    {
+      switch (c)
+        {
+        case 'h':
+          (*usage_func) (EXIT_SUCCESS);
+          break;
+
+        case 'v':
+          {
+            va_list authors;
+            va_start (authors, usage_func);
+            version_etc_va (stdout, command_name, package, version, authors);
+            exit (EXIT_SUCCESS);
+          }
+
+        default:
+          (*usage_func) (exit_failure);
+          break;
+        }
+    }
+
+  /* Restore previous value. */
+  opterr = saved_opterr;
+
+  /* Reset this to zero so
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
On 2/17/19 8:20 PM, Pádraig Brady wrote:
> On 15/02/19 07:20, Eric Blake wrote:
>> Except that POSIX has the nasty requirement that sh started with an
>> inherited ignored SIGPIPE must silently ignore all attempts from within
>> the shell to restore SIGPIPE handling to child processes of the shell:
>>
>> $ (trap '' PIPE; bash -c 'trap - PIPE; \
>>    seq | sort -n | sed 5q | wc -l')
>> 5
>> sort: write failed: 'standard output': Broken pipe
>> sort: write error
>
>> You HAVE to use some other intermediate program if you want to override
>> an inherited ignored SIGPIPE in sh into an inherited default-behavior
>> SIGPIPE in sort.
>
> Should we also propose to POSIX to allow trap to specify default?

That's what "trap - PIPE" is already supposed to do, except that POSIX
has the odd requirement that a signal that was inherited ignored cannot
be reset to default.

> Maybe `trap 0 PIPE` or similar?

Alas, bash has already defined that to mean the same as 'trap - EXIT PIPE'.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
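The inherited-ignore behavior discussed above can be observed without sort by looking at how the producer in a pipeline exits. This is only a sketch: producer_status_when_ignored is a hypothetical helper, and it assumes seq and head are available and that the shell reports death by signal as 128 plus the signal number (141 for SIGPIPE on Linux).

```shell
# producer_status_when_ignored is a hypothetical helper for illustration.
# It prints the exit status of seq when SIGPIPE was inherited as ignored
# and the inner shell tries to restore the default disposition.
producer_status_when_ignored() {
    (
        trap '' PIPE    # the outer shell ignores SIGPIPE
        sh -c '
            trap - PIPE    # no effect: the signal was ignored on entry
            { seq 1 100000; echo "$?" >&2; } | head -n1 >/dev/null
        '
    ) 2>&1 | tail -n1      # keep only the status line, drop any error text
}

producer_status_when_ignored
```

If the inner trap had taken effect, seq would be killed by SIGPIPE and the reported status would be 141; with the ignore inherited, seq instead sees a write error and exits with an ordinary failure status.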