bug#49741: basenc --base64url decoding bug
tag 49741 fixed close 49741 stop On 2021-08-22 4:15 p.m., Assaf Gordon wrote: Attached a suggested fix. pushed in: https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=709d1f8253072804cc27189a6f2b873d8d563399
bug#50151: Coreutils, aarch64 and chroot
tag 50151 notabug close 50151 stop On 2021-08-25 12:54 p.m., Frans de Boer wrote: On 8/25/21 10:16 AM, Assaf Gordon wrote: qemu-aarch64 -strace -L /newroot \ /newroot/usr/sbin/chroot /newroot /usr/bin/env --version 2>&1 \ | tee log.txt @assaf: your suggestions no. 1 and 2, had the predicted results. Thus, suggestion no. 3 failed because of suggestion no.2. I followed then suggestion 4 and attached the strace output to this message. It seems that chroot is working as expected, only env seems to fail with an error. Not exactly: The 'chroot' system-call *seems* to succeed, followed by a failed "execve(2)" system call to execute another binary. That "execve" system call fails - so it is not 'env' per-se, it is any program that will try to execute another aarch64 binary. Learning that, searching for "qemu-user", "chroot" and "architecture" leads to several web pages detailing similar errors (and a few suggested solutions): https://wiki.gentoo.org/wiki/Crossdev_qemu-static-user-chroot https://newbedev.com/how-can-i-chroot-into-a-filesystem-with-a-different-architechture https://ownyourbits.com/2018/06/13/transparently-running-binaries-from-any-architecture-in-linux-with-qemu-and-binfmt_misc/ I hope you have some clue of what is going wrong. With the above information, we can conclude this is not a bug in coreutils - it is a limitation of the linux+qemu-user setup. So I'm closing this item and marking it as "not a bug", but discussion can continue by replying to this thread. regards, - assaf
bug#50151: Coreutils, aarch64 and chroot
Hello, On 2021-08-24 2:39 a.m., Paul Eggert wrote: However, I think it'll be a better use of our time for you to debug this one yourself. It doesn't sound like a Coreutils problem; it sounds like a problem in your virtual machine setup, and you're the best expert on that setup. A few suggestions to check, that might help you and us to troubleshoot: 1. ensure the binaries are indeed for aarch64: file /newroot/usr/sbin/chroot file /newroot/usr/bin/env file /newroot/usr/bin/bash it should say something like "ELF 64-bit LSB pie executable, ARM aarch64" for all of them. 2. ensure each binary works by itself: qemu-aarch64 -L /newroot /newroot/usr/sbin/chroot --version qemu-aarch64 -L /newroot /newroot/usr/bin/env --version qemu-aarch64 -L /newroot /newroot/usr/bin/bash --version (the actual version doesn't matter here, the main thing is that the qemu user-mode emulator was able to run the binaries.) On 2021-08-21 4:33 a.m., Frans de Boer wrote: Running 'qemu-aarch64 -L /newroot /newroot/usr/bin/bash -c /usr/bin/env --help' does show the env help text. So, I guess chroot is to blame? Note that the above command runs your *host's* /usr/bin/env because chroot is not used - the binary under qemu (/newroot/usr/bin/bash) sees your host's file system. Observe with: qemu-aarch64 -L /newroot /newroot/usr/bin/bash -c /bin/uname -m qemu-aarch64 -L /newroot /newroot/usr/bin/env /bin/uname -m I'm guessing you will see "x86_64", not "aarch64". 3. What you should try is: qemu-aarch64 -L /newroot \ /newroot/usr/bin/bash -c /newroot/usr/bin/env --version and: qemu-aarch64 -L /newroot \ /newroot/usr/bin/env /newroot/usr/bin/bash --version In both cases, one aarch64 binary will try to execute another aarch64 binary. Do these work for you, or are you seeing an error? 4. Use qemu's "-strace" to see the syscalls, hopefully that will help pinpoint the cause: qemu-aarch64 -strace -L /newroot \ /newroot/usr/sbin/chroot /newroot /usr/bin/env --version 2>&1 \ | tee log.txt If the command results in an error, the "log.txt" file will show more details about what failed. If you're not familiar with 'strace' output, post it here as an email attachment. Hope this helps, - assaf P.S. On 2021-08-24 2:39 a.m., Paul Eggert wrote: A complete set of instructions for an outsider to reproduce the problem from scratch. Assume the outsider is running Fedora 34 x86-64 (since that's what I'm running :-). I'm not familiar with Fedora, but on Debian/x86_64 the following works: apt-get install qemu-user apt-get install crossbuild-essential-arm64 libc6-arm64-cross cd coreutils ./configure --host=aarch64-linux-gnu make then: $ qemu-aarch64 -L /usr/aarch64-linux-gnu/ ./src/uname -m aarch64 Somewhat related: $ qemu-aarch64 -L /usr/aarch64-linux-gnu/ ./src/env ./src/uname -m /lib/ld-linux-aarch64.so.1: No such file or directory This fails because once "inside" qemu, the aarch64 binary searches for "/lib/ld-linux-aarch64.so.1" but the file is in "/usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1". One possible work-around is to build static binaries. I don't want to assume that is the culprit for Frans, so we'll wait for the logs...
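The architecture check in suggestion 1 can also be done without the 'file' utility. The following is only a sketch: the helper name elf_machine is made up for this example, and it assumes a little-endian ELF file (which holds for typical x86-64 and aarch64 Linux binaries). It prints the e_machine field of the ELF header, which is 183 for AArch64 and 62 for x86-64.

```shell
# elf_machine FILE: print the ELF e_machine value (hypothetical helper).
# Assumes a little-endian ELF file; e_machine is the 16-bit field at
# byte offset 18 of the header. 183 = AArch64, 62 = x86-64.
elf_machine() {
    # Read bytes 18 and 19 as two unsigned decimal numbers.
    set -- $(od -An -tu1 -j18 -N2 "$1")
    # Combine them as a little-endian 16-bit integer.
    echo $(( $1 + 256 * $2 ))
}
```

For example, "elf_machine /newroot/usr/bin/env" should print 183 if the binary under /newroot really is an aarch64 build.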
bug#49741: basenc --base64url decoding bug
On 2021-08-17 3:37 a.m., Jim Meyering wrote:
> On Tue, Aug 17, 2021 at 2:02 AM Pádraig Brady wrote:
>> On 16/08/2021 22:17, Assaf Gordon wrote:
>>> Attached a suggested fix.
>> minor nit in NEWS:
> a nit in the commit log:

Thanks, attached updated patch.
Will push this week if there are no other comments.

-assaf

From 090663068a23662b36ddc0603fc1c2c752b6aff1 Mon Sep 17 00:00:00 2001
From: Assaf Gordon
Date: Mon, 16 Aug 2021 15:03:36 -0600
Subject: [PATCH] basenc: fix bug49741: using wrong decoding buffer length

Emil Lundberg reports in https://bugs.gnu.org/49741 about a 'basenc --base64 -d' decoding bug. The input buffer length was not divisible by 3, resulting in decoding errors.

* NEWS: Mention fix.
* src/basenc.c (DEC_BLOCKSIZE): Change from 1024*5 to 4200 (35*3*5*8) which is divisible by 3,4,5,8 - satisfying both base32 and base64; Use compile-time verify() macro to enforce the above.
* tests/misc/basenc.pl: Add test.
---
 NEWS                 | 4 ++++
 src/basenc.c         | 4 +++-
 tests/misc/basenc.pl | 9 +++++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index ddec56bdf..efdb1450e 100644
--- a/NEWS
+++ b/NEWS
@@ -60,6 +60,10 @@ GNU coreutils NEWS    -*- outline -*-
   invalid combinations of case character classes.
   [bug introduced in coreutils-8.6]
 
+  basenc --base64 --decode no longer silently discards decoded characters
+  on (1024*5) buffer boundaries
+  [bug introduced in coreutils-8.31]
+
 ** Changes in behavior
 
   cp and install now default to copy-on-write (COW) if available.
diff --git a/src/basenc.c b/src/basenc.c
index 5c97a3652..2ffdb2d27 100644
--- a/src/basenc.c
+++ b/src/basenc.c
@@ -213,7 +213,9 @@ verify (DEC_BLOCKSIZE % 12 == 0); /* So complete encoded blocks are used. */
 
 /* Note that increasing this may decrease performance if --ignore-garbage
    is used, because of the memmove operation below. */
-# define DEC_BLOCKSIZE (1024*5)
+# define DEC_BLOCKSIZE (4200)
+verify (DEC_BLOCKSIZE % 40 == 0); /* complete encoded blocks for base32 */
+verify (DEC_BLOCKSIZE % 12 == 0); /* complete encoded blocks for base64 */
 
 static int (*base_length) (int i);
 static bool (*isbase) (char ch);
diff --git a/tests/misc/basenc.pl b/tests/misc/basenc.pl
index 3383aaeef..ac5394731 100755
--- a/tests/misc/basenc.pl
+++ b/tests/misc/basenc.pl
@@ -37,6 +37,13 @@ my $base64url_out_nl = $base64url_out;
 $base64url_out_nl =~ s/(..)/\1\n/g; # add newline every two characters
 
 
+# Bug 49741:
+# The input is 'abc' in base64, in an 8K buffer (larger than 1024*5,
+# the buffer size which caused the bug).
+my $base64_bug49741_in = "YWJj" x 2000 ;
+my $base64_bug49741_out = "abc" x 2000 ;
+
+
 my $base32_in = "\xfd\xd8\x07\xd1\xa5";
 my $base32_out = "7XMAPUNF";
 my $x = $base32_out;
@@ -111,6 +118,8 @@ my @Tests =
      ['b64u_7', '--base64url -d', {IN=>$base64_out},
       {EXIT=>1}, {ERR=>"$prog: invalid input\n"}],
 
+     ['b64_bug49741', '--base64 -d', {IN=>$base64_bug49741_in},
+      {OUT=>$base64_bug49741_out}],
 
-- 
2.20.1
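The divisibility argument in the commit message can be checked with plain shell arithmetic. This is only an illustrative sketch (the function name blocksize_report is invented here, it is not part of coreutils): it prints the remainders of a candidate buffer size modulo 3, 12 and 40, the values named in the commit message and in the verify() lines of the patch.

```shell
# blocksize_report N...: print N modulo 3, 12 and 40 for each argument
# (hypothetical helper, used only to illustrate the commit message).
blocksize_report() {
    for n in "$@"; do
        echo "$n: mod 3 = $((n % 3)), mod 12 = $((n % 12)), mod 40 = $((n % 40))"
    done
}

# Old value (1024*5) versus the new value from the patch.
blocksize_report $((1024*5)) 4200
```

The old size leaves a non-zero remainder for 3 and 12, while 4200 divides evenly by all three, which is what the two verify() lines enforce at compile time.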
bug#49741: basenc --base64url decoding bug
Hello Emil and all,

Thanks for the clear and easily reproducible bug report.

Attached a suggested fix.

Comments very welcomed,
 - Assaf

From 11330058443e7cc92b4a53322d810725d42b4e34 Mon Sep 17 00:00:00 2001
From: Assaf Gordon
Date: Mon, 16 Aug 2021 15:03:36 -0600
Subject: [PATCH] basenc: fix bug49741: using wrong decoding buffer length

Emil Lundberg reports in https://bugs.gnu.org/49741 about a 'basenc --base64 -d' decoding bug. The input buffer was not divisible by 3, resulting in decoding errors.

* NEWS: Mention fix.
* src/basenc.c (DEC_BLOCKSIZE): Change from 1024*5 to 4200 (35*3*5*8) which is divisible by 3,4,5,8 - satisfying both base32 and base64; Use compile-time verify() macro to enforce the above.
* tests/misc/basenc.pl: Add test.
---
 NEWS                 | 4 ++++
 src/basenc.c         | 4 +++-
 tests/misc/basenc.pl | 9 +++++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index ddec56bdf..d490ed101 100644
--- a/NEWS
+++ b/NEWS
@@ -60,6 +60,10 @@ GNU coreutils NEWS    -*- outline -*-
   invalid combinations of case character classes.
   [bug introduced in coreutils-8.6]
 
+  basenc --base64 --decode no longer silently discard decoded characters
+  on (1024*5) buffer boundaries
+  [bug introduced in coreutils-8.31]
+
 ** Changes in behavior
 
   cp and install now default to copy-on-write (COW) if available.
diff --git a/src/basenc.c b/src/basenc.c
index 5c97a3652..2ffdb2d27 100644
--- a/src/basenc.c
+++ b/src/basenc.c
@@ -213,7 +213,9 @@ verify (DEC_BLOCKSIZE % 12 == 0); /* So complete encoded blocks are used. */
 
 /* Note that increasing this may decrease performance if --ignore-garbage
    is used, because of the memmove operation below. */
-# define DEC_BLOCKSIZE (1024*5)
+# define DEC_BLOCKSIZE (4200)
+verify (DEC_BLOCKSIZE % 40 == 0); /* complete encoded blocks for base32 */
+verify (DEC_BLOCKSIZE % 12 == 0); /* complete encoded blocks for base64 */
 
 static int (*base_length) (int i);
 static bool (*isbase) (char ch);
diff --git a/tests/misc/basenc.pl b/tests/misc/basenc.pl
index 3383aaeef..ac5394731 100755
--- a/tests/misc/basenc.pl
+++ b/tests/misc/basenc.pl
@@ -37,6 +37,13 @@ my $base64url_out_nl = $base64url_out;
 $base64url_out_nl =~ s/(..)/\1\n/g; # add newline every two characters
 
 
+# Bug 49741:
+# The input is 'abc' in base64, in an 8K buffer (larger than 1024*5,
+# the buffer size which caused the bug).
+my $base64_bug49741_in = "YWJj" x 2000 ;
+my $base64_bug49741_out = "abc" x 2000 ;
+
+
 my $base32_in = "\xfd\xd8\x07\xd1\xa5";
 my $base32_out = "7XMAPUNF";
 my $x = $base32_out;
@@ -111,6 +118,8 @@ my @Tests =
      ['b64u_7', '--base64url -d', {IN=>$base64_out},
       {EXIT=>1}, {ERR=>"$prog: invalid input\n"}],
 
+     ['b64_bug49741', '--base64 -d', {IN=>$base64_bug49741_in},
+      {OUT=>$base64_bug49741_out}],
 
-- 
2.20.1
bug#49741: basenc --base64url decoding bug
Hi, I will also work on it this weekend. -assaf On 2021-08-12 7:37 p.m., Paul Eggert wrote: Simon, this looks like some sort of minor buffering problem in 'basenc --base64', since plain 'base64' works correctly. Is this something you have time to look into? https://bugs.gnu.org/49741
bug#44704: uniq: replace repeated lines with a message about how many repeated lines
tag 44704 notabug
severity 44704 wishlist
stop

Hello,

On 2020-11-17 6:32 a.m., Brian J. Murrell wrote:
> It would be a useful enhancement to uniq to replace all lines considered non-uniq (i.e. those that would be removed from the output) with a message about how many times the previous line was repeated. I.e.
> $ cat < [...]

uniq supports the "--group" option, which adds a blank line after each group of identical lines - this can be used down-stream to process groups in any way you want.

Example:

$ cat <<EOF > in
first line
second line
repeated line
repeated line
repeated line
repeated line
repeated line
third line
EOF

$ cat in | uniq --group=append
first line

second line

repeated line
repeated line
repeated line
repeated line
repeated line

third line

$ cat in | uniq --group=append \
    | awk '$0=="" { print "do something after group" ; next } ; 1 { print }'
first line
do something after group
second line
do something after group
repeated line
repeated line
repeated line
repeated line
repeated line
do something after group
third line
do something after group

And with counting:

$ cat in | uniq --group=append \
    | awk 'BEGIN { c = 0 } ; $0=="" { print "Group has " c " lines" ; c=0 ; next } ; 1 { print ; c++ }'
first line
Group has 1 lines
second line
Group has 1 lines
repeated line
repeated line
repeated line
repeated line
repeated line
Group has 5 lines
third line
Group has 1 lines

Hope this helps.

More information about "uniq --group=X" is here:
https://www.gnu.org/software/coreutils/manual/html_node/uniq-invocation.html

I'm marking this as "notabug/wishlist", but will likely close soon as "wontfix" unless we come up with a convincing argument why "--group" is not sufficient for your use case.

Regardless of the status, discussion can continue by replying to this thread.

regards,
 - assaf
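For the specific output asked for in the report (print a line once, then a note about how often it was repeated), "uniq -c" can serve as another starting point. The following is a minimal sketch, not an existing uniq feature: the function name collapse_repeats and the wording of the message are invented for this example.

```shell
# collapse_repeats: read lines on stdin, print each run of identical
# lines once, followed by a note when the line was repeated
# (hypothetical helper built on 'uniq -c' and awk).
collapse_repeats() {
    uniq -c | awk '{
        count = $1                # repetition count added by uniq -c
        sub(/^ *[0-9]+ /, "")     # strip the count prefix from the line
        print
        if (count > 1)
            print "[previous line repeated " (count - 1) " more times]"
    }'
}
```

With the "in" file from the example above, "collapse_repeats < in" would print "repeated line" once, followed by the note saying it was repeated 4 more times.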
bug#43684: Problem with numerical splitting with files > 90*l
On 29/09/2020 02:18, ned haughton wrote: When splitting with -d, the numbering screws up after 89: In addition to Pádraig explanation, please see previous similar discussion here: https://lists.gnu.org/archive/html/bug-coreutils/2017-02/msg00050.html http://bugs.gnu.org/25832 regards, - assaf
bug#42340: Fwd: bug#42340: "join" reports that "sort"ed input is not sorted
Hello,

On 2020-07-15 2:12 p.m., Beth Andres-Beck wrote:
> If that is the intended behavior, the bug is that:
>   printf '12,\n1,\n' | sort -t, -k1 -s
>   1,
>   12,
> does _not_ take the remainder of the line into account, and only sorts on the initial field, prioritizing length. It is at the very least unexpected that adding an `a` to the end of both lines would change the sort order of those lines:
>   printf '12,a\n1,a\n' | sort -t, -k1 -s
>   12,a
>   1,a

Not a bug, just an incomplete usage :)

sort's -k/--key parameter takes two values (the second being optional): the first and last column to use as the key. If the second value is omitted (as in your case), then the key is taken from the first field to the end of the line.

And so:
"sort -k1,1" means take the first *and only the first* field as the key.
"sort -k1" means take the first field until the end of the line as the key.
"sort -k1,3" means take the first, second and third fields as the single key.
"sort -k1,1 -k2,2 -k3,3" means take the first field as the first key, second field as the second key, and third field as the third key.

---

The "--debug" option can help illustrate what sort is doing, by adding underscore characters to show which characters are being used as keys in each line.

Consider the following:

$ printf '12,\n1,\n' | sort -t, -k1 -s --debug
sort: using ‘en_CA.utf8’ sorting rules
1,
__
12,
___

$ printf '12,\n1,\n' | sort -t, -k1,1 -s --debug
sort: using ‘en_CA.utf8’ sorting rules
1,
_
12,
__

In the first example, "-k1" means from the first field until the end of the line, so the underscores include the "," characters. In the second example, "-k1,1" means only the first field, and the comma is not used.

Now consider your second case of adding an "a" at the end of each line:

$ printf '12,a\n1,a\n' | sort -t, -k1 -s --debug
sort: using ‘en_CA.utf8’ sorting rules
12,a
____
1,a
___

$ printf '12,a\n1,a\n' | sort -t, -k1,1 -s --debug
sort: using ‘en_CA.utf8’ sorting rules
1,a
_
12,a
__

In the first example, "-k1" means: from the first field until the end of the line, and so the entire string "12,a" is compared against "1,a". **AND**, because the locale is a "utf-8" locale, punctuation characters are ignored (as mentioned in the previous email in this thread). So effectively the compared strings are "12a" vs "1a". The ASCII value of "2" is smaller than the ASCII value of "a", and therefore "12a" appears before "1a".

If we force C locale, then the order is reversed:

$ printf '12,a\n1,a\n' | LC_ALL=C sort -t, -k1 -s --debug
sort: using simple byte comparison
1,a
___
12,a
____

Because now punctuation characters are used, and the ASCII value of "," is smaller than the ASCII value of "2".

**HOWEVER**, this result of using "LC_ALL=C" together with "-k1" is only correct by a happy accident :) it is still very likely that "-k1" is not what you wanted - you probably meant to do "-k1,1".

---

Lastly, the "-s/--stable" option in the above contrived examples is superfluous - it doesn't affect the output order because there are no equal field values (i.e. "1" vs "12"). A slightly better example to illustrate how "-s" affects ordering is this:

$ printf "2,x\n1,a\n2,b\n" | sort -t, -k1,1
1,a
2,b
2,x

$ printf "2,x\n1,a\n2,b\n" | sort -t, -k1,1 -s
1,a
2,x
2,b

Here, "1" comes before "2" - that's obvious. But should "2,b" come before "2,x" ?

If we do not use "-s/--stable", then "sort" ALSO does one additional comparison of the entire line as a last step (hence "sort --help" says "[disable] last-resort comparison" about "-s/--stable"). The substring ",b" comes before ",x" - therefore "2,b" appears first.

If we add "-s/--stable", the last comparison step of the entire line is skipped, and the lines of "2" appear in the order they were in the input (hence - "stable").

By using "--debug" we can see the additional comparison step (indicated by additional underscore lines):

$ printf "2,x\n1,a\n2,b\n" | sort -t, -k1,1 --debug
sort: using ‘en_CA.utf8’ sorting rules
1,a
_
___
2,b
_
___
2,x
_
___

$ printf "2,x\n1,a\n2,b\n" | sort -t, -k1,1 -s --debug
sort: using ‘en_CA.utf8’ sorting rules
1,a
_
2,x
_
2,b
_

---

Hope this helps.

regards,
 - assaf
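The advice above (limit the key with "-k1,1", force the C locale, and use "-s" when input order of equal keys matters) can be wrapped in a small helper. This is just a sketch and the function name sort_first_field is invented for this example.

```shell
# sort_first_field: sort comma-separated lines on the first field only
# (hypothetical helper).
#   LC_ALL=C  forces plain byte comparison, so punctuation is not ignored
#   -k1,1     limits the key to the first field
#   -s        keeps input order for lines whose first fields are equal
sort_first_field() {
    LC_ALL=C sort -t, -k1,1 -s "$@"
}

printf '12,a\n1,a\n' | sort_first_field
```

Dropping "-s" from the helper re-enables the last-resort whole-line comparison described above.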
bug#42340: "join" reports that "sort"ed input is not sorted
tags 42340 notabug close 42340 stop Hello, On 2020-07-12 5:57 p.m., Beth Andres-Beck wrote: In trying to use `join` with `sort` I discovered odd behavior: even after running a file through `sort` using the same delimiter, `join` would still complain that it was out of order. [...] Here is a way to reproduce the problem: printf '1.1.1,2\n1.1.12,2\n1.1.2,1' | sort -t, > a.txt printf '1.1.12,a\n1.1.1,b\n1.1.21,c' | sort -t, > b.txt join -t, a.txt b.txt join: b.txt:2: is not sorted: 1.1.1,b The expected behavior would be that if a file has been sorted by "sort" it will also be considered sorted by join. [...] I traced this back to what I believe to be a bug in sort.c This is not a bug in sort or join, just a side-effect of the locale on your system on the sorting results. By forcing a C locale with "LC_ALL=C" (meaning simple ASCII order), the files are ordered in the same way 'join' expected them to be: $ printf '1.1.1,2\n1.1.12,2\n1.1.2,1' | LC_ALL=C sort -t, > a.txt $ printf '1.1.12,a\n1.1.1,b\n1.1.21,c' | LC_ALL=C sort -t, > b.txt $ join -t, a.txt b.txt 1.1.1,2,b 1.1.12,2,a --- More details: I'm going to assume your system uses some locale based on UTF-8. You can check it by running 'locale', e.g. on my system: $ locale LANG=en_CA.utf8 LANGUAGE=en_CA:en LC_CTYPE="en_CA.utf8" .. .. Under most UTF-8 locales, punctuation characters are *ignored* in the compared input lines. This might be confusing and non-intuitive, but that's the way most systems have been working for many years (locale ordering is defined in the GNU C Library, and coreutils has no way to change it). Observe the following: $ printf '12,a\n1,b\n' | LC_ALL=en_CA.utf8 sort 12,a 1,b $ printf '12,a\n1,b\n' | LC_ALL=C sort 1,b 12,a With a UTF-8 locale, the comma character is ignored, and then "12a" appears before "1b" (since the character '2' comes before the character 'b'). 
With "C" locale, forcing ASCII or "byte comparison", punctuation characters are not ignored, and "1,b" appears before "12,a" (because the ASCII value of the comma ',' is 44, which is smaller than the ASCII value of the digit '2', which is 50). --- Somewhat related: Your sort command defines the delimiter ("-t,") but does not define which columns to sort by; sort then uses the entire input line - and there's no need to specify a delimiter at all. --- As such, I'm closing this as "not a bug", but discussion can continue by replying to this thread. regards, - assaf
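One way to keep 'sort' and 'join' in agreement is to run every step under the same locale and on the same key. A minimal sketch, assuming comma-separated input joined on the first field; the function name join_sorted is made up for this example.

```shell
# join_sorted FILE1 FILE2: sort both files on the first comma-separated
# field and join them (hypothetical helper). LC_ALL=C is used for every
# step so that sort and join agree on the ordering.
join_sorted() {
    a=$(mktemp) && b=$(mktemp) || return
    LC_ALL=C sort -t, -k1,1 "$1" > "$a"
    LC_ALL=C sort -t, -k1,1 "$2" > "$b"
    LC_ALL=C join -t, "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

Running it on unsorted copies of the a.txt and b.txt inputs from the report should give the same two joined lines as the LC_ALL=C example above.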
bug#40530: feature proposal: coreutils -> sort: adding sorting ability for Hebrew numerals
Hello, > On Apr 9, 2020, at 3:23 PM, Zeev Pekar wrote: > > it would be nice to be able to sort (coreutils -> sort) Hebrew numerals: An interesting idea, but I think it is a bit too niche to be included in the coreutils “sort” program (tradeoff of usefulness vs bloat). However, such functionality is very suitable to an old idea of an auxiliary “decorate” program that will allow many more sorting options when used in tandem with “sort”. I’ve started writing such a program some time ago, based on Pádraig's idea (never completed, but perhaps these days are a perfect opportunity to complete it): https://lists.gnu.org/archive/html/coreutils/2019-03/msg00056.html Would you like to try your hand at coding the sorting rules for such a Hebrew-numerals sort? regards, - Assaf
bug#38003: date --date=-1month gives same month today
tag 38003 notabug
close 38003
stop

Hello,

On 2019-10-31 2:34 a.m., Ilja Honkonen wrote:
> Please CC me as I'm not on this list. Running date (GNU coreutils) 8.26 on fedora 30 today (date --utc -I: 2019-10-31) with --date=-1month gives the same month which doesn't make sense:
> $ date --utc -I --date=-1month
> 2019-10-01

date gained a "--debug" option that helps in diagnosing the issue:

$ date --utc -I --debug --date=-1month
date: parsed relative part: -1 month(s)
[...]
date: using current date as starting value: '(Y-M-D) 2019-10-31'
[...]
date: warning: when adding relative months/years, it is recommended to specify the 15th of the months
date: after date adjustment (+0 years, -1 months, +0 days),
date: new date/time = '(Y-M-D) 2019-10-01 17:29:20'
date: warning: month/year adjustment resulted in shifted dates:
date: adjusted Y M D: 2019 09 31
date: normalized Y M D: 2019 10 01
[...]
date: final: (Y-M-D) 2019-10-01 17:29:20 (UTC)
2019-10-01

--

Subtracting 1 month from October 31st results in September 31st. Since the date doesn't exist, it is normalized: September 31st is "one day after September 30th", which results in October 1st.

The "--debug" option also warns: when subtracting months, it is recommended to specify the 15th (middle) of the month, exactly to avoid such issues.

$ date --utc -I --date="2019-10-15 -1month"
2019-09-15

regards,
 - assaf
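The "use the 15th" recommendation can be turned into a small helper for scripts that only need the previous month. This is a sketch assuming GNU date; the function name prev_month is invented for this example.

```shell
# prev_month DATE: print the month before DATE as YYYY-MM
# (hypothetical helper, assumes GNU date).
prev_month() {
    # Pin the day to the 15th first, so that subtracting a month can
    # never land on a non-existent day such as September 31st.
    anchor=$(date --utc --date="$1" +%Y-%m-15) || return
    date --utc --date="$anchor -1month" +%Y-%m
}

prev_month 2019-10-31
```

Passing "today" instead of a fixed date should give the month before the current one, regardless of the day of the month on which it is run.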
bug#37702: Suggestion for 'df' utility
Hello Bernhard, On 2019-10-13 3:57 p.m., Bernhard Voelker wrote: On 2019-10-13 23:28, Paul Eggert wrote: In any sane system there would be only four lines of non-header output (for tmpfs etc, /, /home, and /media/eggert/B827-D456), but df is outputting 28 lines. What is so special about tmpfs so that you would like to see it? As an interesting use-case (though not common), I recently configured a Raspberry Pi device, and wanted to mount as many locations on tmpfs as possible, e.g. "/tmp", "/var/tmp", "/var/log" etc. It was very useful in those cases to be able to see separate tmpfs file systems listed, with information about how big they are and how much space was used. Also in other systems where "/tmp" is a "tmpfs", users might want to see how much space is available. If we hide it by default, they can of course use "df /tmp" or "df --all" - it's not about removing this option, it is just about making users' life harder or easier, and making unexpected changes. I recently also encountered a change in a default behavior of a program which I've been using for a very long time - and it is *very* frustrating to have something that worked "just fine" for so long being changed. Here on my openSUSE:Tumbleweed system, I see the following: $ df -T Filesystem Type 1K-blocks Used Available Use% Mounted on [...] /dev/loop0 ext2 31729 31729 0 100% /FULL_PARTITION_TMPDIR [...] (The /FULL_PARTITION_TMPDIR is used by a special coreutils test.) That's an interesting case, where I would think you'd want to see it, because you explicitly mounted it. I think I could well live with adding 'devtmpfs' and 'tmpfs' to the pseudo file systems in gnulib's "mountlist.c". I agree, but think this needs to be communicated very well, and in advance - perhaps announce this change ahead of time to the respective package maintainers of each distribution - just so they'll know it's coming (and also have a way to revert it if they don't like it).
This seems to be a small change, and not satisfying the snap case. Possibly hiding "squashfs" of readonly-mounts could get rid of those snaps? regards, -assaf
bug#37702: Suggestion for 'df' utility
On 2019-10-13 3:28 p.m., Paul Eggert wrote: [..] I mean c'mon, here's the output of 'df' on the Ubuntu 18.04.3 LTS workstation I'm typing this particular message on. In any sane system there would be only four lines of non-header output (for tmpfs etc, /, /home, and /media/eggert/B827-D456), but df is outputting 28 lines. This is ridiculous. It is certainly inconvenient if that's not what you are looking for (and certainly most desktop users aren't). But I'm not sure if it's easy to find a set of criteria that would work well while having minimal unexpected side effects of hiding entries people in other systems do expect to see. Out of curiosity, can you share the output of the following commands on the same system? lsblk df -x tmpfs -x devtmpfs -x squashfs Thanks, - assaf
bug#37702: Suggestion for 'df' utility
Hi all, On 2019-10-13 2:27 p.m., Paul Eggert wrote: On 10/13/19 2:41 AM, Pádraig Brady wrote: I wonder could we key (also) on used==0||available==0. Yes, looking at the sample output I gave earlier, I'd say we could by default drop filesystems where usage is 1% or less. That would solve the problem for my workstation. This is roughly akin to the "used==0" test you're suggesting. I would humbly suggest caution with such unexpected user-facing changes to the default output of 'df' - learning the lessons from changing the quotes in 'ls'. Countless users have been using 'df' in their own ways, and have gotten used to certain outputs. This thread originated by a request to "clean up" the output on newer ubuntu machines which use "snap" packages as /dev/loopN . Let's not turn that into a drastic change that will affect many other existing systems - the users on other systems did not ask for any changes. --- Specifically for "default drop filesystems where usage is 1% or less" - I can think of a few cases off the top of my head where this would be extremely confusing: - I recently installed a 33TB RAID file system. The usage on that system is at 1% and will stay like so for at least several days. - Amazon cloud services (AWS) offers an NFS4 service (they call it "EFS") that has a reported size of 8 exabytes. There too usage could be at 1% for a long long time. --- For cases where I want to list only the "real" storage, I typically use an alias such as: alias dff='df -h -x tmpfs -x devtmpfs' And it would be very easy and least disruptive to recommend to ubuntu users to add "-x squashfs" or another file system to ignore. Perhaps we can come up with a recommended list of "lesser" file systems to ignore (or conditions such as read-only file systems) and add it as a new option, but please let's not make it the default. My two cents, - assaf
bug#37093: wc runs 100% cpu when in pipeline or tee >(wc)
tag 37093 notabug close 37093 stop Hello, On 2019-08-19 10:44 p.m., Edward Huff wrote: In the demo below, dd uses 0.665s to write 1GiB of zeros. sha256sum uses 4.285s to calculate the sha256 of 1GiB of zeros. wc uses 32.160s to count 1GiB of zeros. [...] baseline results: $ dd if=/dev/zero count=$((1024*1024)) bs=1024 | tee >(sha256sum>&2) | wc 1048576+0 records in 1048576+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.5007 s, 33.0 MB/s 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - 0 0 1073741824 $ First, Try to avoid UTF8 locales (i.e., force a C/POSIX locale with LC_ALL=C) which makes 'wc' much faster. On my computer: With UTF8 locale: $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \ | tee >(sha256sum>&2) | time --portability wc 1048576+0 records in 1048576+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 46.5928 s, 23.0 MB/s 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - 0 0 1073741824 real 46.59 user 46.37 sys 0.19 With C locale: $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \ | tee >(sha256sum>&2) | LC_ALL=C time --portability wc 1048576+0 records in 1048576+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.60285 s, 125 MB/s 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - 0 0 1073741824 real 8.60 user 5.22 sys 0.26 Second, The "word counting" feature in 'wc' is the main cpu-hog. If you avoid that (i.e. counting only lines, or only characters), 'wc' is even faster (and it automatically ignores UTF8 issues): $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \ | tee >(sha256sum>&2) \ | \time --portability wc -c 1048576+0 records in 1048576+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.59429 s, 141 MB/s 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - 1073741824 real 7.59 user 0.10 sys 0.71 Notice that the "real time" wasn't changed much (from 8.6s to 7.59s), but the actual work performed by 'wc' (measured in "user time") is down drastically. 
Third, If you are comfortable with compiling Coreutils from source, you can build it using the optimized hashing functions from OpenSSL, like so: ./configure --with-openssl make Then, "sha256sum" will be faster (about 2x faster on my computer). If you don't want to re-compile it, consider using "openssl" directly to calculate the checksum, like so: dd if=/dev/zero count=1K bs=1M | tee >(openssl sha256>&2) | wc -c Fourth, To save a few more microseconds, consider using dd with a larger block size (bs=) and fewer blocks (count=), e.g.: $ time dd if=/dev/zero of=/dev/null count=1M bs=1K 1048576+0 records in 1048576+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.865853 s, 1.2 GB/s real 0m0.868s user 0m0.288s sys 0m0.579s $ time dd if=/dev/zero of=/dev/null count=1K bs=1M 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.0998688 s, 10.8 GB/s real 0m0.102s user 0m0.000s sys 0m0.102s This won't reduce the total time by much, but will result in fewer system calls, and less CPU kernel time (at least by a tiny bit). The effect is more noticeable when reading or writing to a physical disk. Lastly, If you use GNU time instead of the shell's built-in 'time' function, you can specify a custom output format, and easily show the timing of each program in the pipeline.
Example: $ FMT="\n=== CMD: %C ===\nreal %e\tuser %U\tsys %S\n" $ \time -f "$FMT" dd if=/dev/zero count=1M bs=1K \ | \time -f "$FMT" tee >(\time -f "$FMT" sha256sum>&2) \ | \time -f "$FMT" wc -c 1048576+0 records in 1048576+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.77339 s, 138 MB/s === CMD: dd if=/dev/zero count=1048576 bs=1024 === real 7.77 user 0.36 sys 1.65 === CMD: tee /dev/fd/63 === real 7.77 user 0.10 sys 1.30 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - === CMD: sha256sum === real 7.77 user 7.47 sys 0.27 1073741824 === CMD: wc -c === real 7.77 user 0.05 sys 0.76 As such, I'm closing this as "not a bug", but discussion can continue by replying to this thread. regards, - assaf
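The first two tips (force the C locale, and avoid word counting when only lines or bytes are needed) can be captured in two tiny wrappers. This is only a sketch; the names count_bytes and count_lines are invented for this example.

```shell
# count_bytes / count_lines: read stdin and print only the byte or line
# count (hypothetical wrappers). Both skip wc's word-counting code path
# and force the C locale, following the first two tips above.
count_bytes() { LC_ALL=C wc -c; }
count_lines() { LC_ALL=C wc -l; }

printf 'one two\nthree\n' | count_lines
```

In the pipeline from the report, "| wc" would then become "| count_bytes".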
bug#37058: Error message with local deployment of Galaxy-k8s
tag 37058 notabug close 37058 stop Hello, Two issues are mixed here. First: On 2019-08-16 2:17 p.m., Gao, Jianliang wrote: I followed https://github.com/phnmnl/phenomenal-h2020/wiki/QuickStart-Installation-for-Local-PhenoMeNal-Workflow with Older Galaxy chart to deploy local galaxy-k8s instance with minikube on Windows 10. The following message came from the logs of my pod. I can't connect to my local instance. [...] kubectl logs galaxy-k8s-tr6fc [ run_galaxy_config.sh ] -- Galaxy sqlite directory created since we are not using postgresql [ run_galaxy_config.sh ] -- Replaced galaxy ini for the user's injected one [...] dpkg-preconfigure: unable to re-open stdin: [WARNING]: It is unneccessary to use '{{' in loops, leave variables in loop expressions bare. [...] galaxy.tools.deps WARNING 2019-08-16 19:20:48,175 Path './database/dependencies' does not exist, ignoring galaxy.tools.deps WARNING 2019-08-16 19:20:48,175 Path './database/dependencies' is not directory, ignoring galaxy.tools.deps.installable WARNING 2019-08-16 19:20:48,190 Conda not installed and auto-installation disabled. galaxy.tools.deps.installable WARNING 2019-08-16 19:20:48,190 Conda not installed and auto-installation disabled. These are issues related to your Galaxy setup. (for other readers: "Galaxy" in this context is a web-based framework for bioinformatics analysis, see https://galaxyproject.org/ and https://usegalaxy.org ). Such issues are best asked in their support forums: https://galaxyproject.org/support/ https://help.galaxyproject.org This includes problems in underlying layers, such as the 'dpkg' errors above that result from deploying Galaxy VMs or instances or kubernetes or containers etc. Second: tail: unrecognized file system type 0x794c7630 for 'paster.log'. please report this to bug-coreutils@gnu.org. reverting to polling This warning indeed comes from the coreutils program 'tail', however it is harmless in your situation.
For more details, see here: https://www.gnu.org/software/coreutils/filesystems.html --- A cursory look at the error logs makes it seem like "bug-coreutils@gnu.org" is the place to ask General questions about "Galaxy" server (because it is the last thing mentioned), but that is not the case. We can only help with coreutils programs (e.g. 'tail'). Please contact the Galaxy team for galaxy-related issues. Hope this helps. regards, - assaf
bug#36985: tail
close 36985 stop Hello, On 2019-08-09 12:55 a.m., Rob Hearne wrote: root@kafka-robh-vmdub-04:/kafka/bin# tail -f Control tail: unrecognized file system type 0x794c7630 for ‘Control’. please report this to bug-coreutils@gnu.org. reverting to polling This has been fixed in version 8.25 (released in 2016). For more details, see https://www.gnu.org/software/coreutils/filesystems.html -assaf
bug#36901: Enhance directory and file moves where target already exists
Hello, On Fri, Aug 02, 2019 at 10:47:18PM -0700, L A Walsh wrote: > It's not a wish list that 'mv' doesn't work as documented. The "wishlist" refers to the topic: You are asking to add new functionality to 'mv'. That is a "wishlist" item. (answering out of order:) > > On 2019-08-02 9:56 p.m., L A Walsh wrote: > >> But you say posix wants it to perform as a rename? [...] > >> > >> So if I have: > >> mkdir A B > >> touch A/foo B/fee > >> So when I look at the system call on linux for rename: > >> oldpath can specify a directory. In this case, newpath must > >> either not > >> exist, or it must specify an empty directory. > >> (complying with POSIX_C_SOURCE >= 200809L) > >> > >> So move should give an error: Nope: > >> > >> mv A B > >>> tree B > >> B > >> ├── A > >> │ └── foo > >> └── fee > >> > >> 1 directory, 2 files > >> > >> So mv is violating POSIX - it didn't do the rename, but moved > >> A under B and neither dir had to be empty. > >> > >> Saying it has to follow POSIX when it doesn't appear to, seems > >> a bit contradictory? I previously quoted one small part of the entire "mv" POSIX specification (item #3, regarding using the 'rename(2)' function). It would be wise to read the entire specification before making claims about violating POSIX. Specifically, at the top of the page: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html SYNOPSIS mv [-if] source_file target_file mv [-if] source_file... target_dir DESCRIPTION [...] In the second synopsis form, mv shall move each file named by a source_file operand to a destination file in the existing directory named by the target_dir operand [...] This second form is assumed when the final operand names an existing directory In this regard GNU 'mv' is compliant with POSIX. > > On 2019-08-02 9:56 p.m., L A Walsh wrote: > >> On 2019/08/02 19:47, Assaf Gordon wrote: > >>> Can new merging features be added to 'mv'? yes. 
> >>> But it seems to me these would be better suited for 'higher level' > >>> programs (e.g. a GUI file manager). > >> --- > >> If the command was named 'ren', then I'd expect it to be dummer, > >> but 'mv'/move seem like it should be able to move files from > >> one dir into another. > >> > >> But you say posix wants it to perform as a rename? > >> I know, create a 're' command (or 'rn') for rename, and have > >> it do what 'mv' would do. Maybe posix would realize it would > >> be better to have re/rn behave like rename, and 'mv' to > >> behave it was moving something. The Austin Group (https://www.opengroup.org/austin/), which is in charge of developing and maintaining the POSIX standard, is the place to go when wanting to change things in POSIX (or add new things). You can write to them, suggest a modification, and if they change the standard, GNU coreutils will surely follow. As for renaming 'mv' or creating a new 'rn' command - part of POSIX is to codify existing behavior (that is - programs which were in common use *before* POSIX). It's not always logical, it's not always ideal, but that's what has been in use for many years. Based on mv's wiki page (https://en.wikipedia.org/wiki/Mv), 'mv' was first introduced in 1971, 48 years ago. With hindsight of nearly 5 decades it's easy to point to faults in a program. If we were designing 'mv' today from scratch, I'm sure we would improve many of its aspects. But given that it is a long-standing program and its usage and quirks are well established, I'm inclined to say it is highly unlikely we will change mv's default behaviour or replace it with a different name. Adding new functionality (e.g. a new '--merge-directory' option) is possible, and concrete patches are always welcome. However, given all the above, there is no guarantee that such a new option will be accepted. I still think that such specific features are better suited for more sophisticated programs (whether GUI or command line). regards, - assaf
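For readers who want to check the second synopsis form quoted in this thread, here is a minimal sketch that recreates the A/B layout from the quoted message in a throwaway directory. The function name demo_mv_into_dir is made up for this illustration, and 'mktemp -d' is assumed to be available (it is on GNU systems).

```shell
#!/bin/sh
# demo_mv_into_dir: hypothetical helper, only for this illustration.
# Recreates the layout from the quoted message and lists the result.
demo_mv_into_dir() {
    workdir=$(mktemp -d) || return 1
    (
        cd "$workdir" || exit 1
        mkdir A B
        touch A/foo B/fee
        # B names an existing directory, so the second synopsis form
        # ('mv source_file... target_dir') applies: A ends up as B/A,
        # and B itself is not replaced.
        mv A B
        find B | LC_ALL=C sort
    )
    status=$?
    rm -rf "$workdir"
    return "$status"
}

demo_mv_into_dir
# B
# B/A
# B/A/foo
# B/fee
```

The listing matches the 'tree B' output in the quoted message: neither directory had to be empty, because no directory was overwritten.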
bug#36901: Enhance directory and file moves where target already exists
severity 36901 wishlist retitle 36901 mv: merge directories where target already exists stop Hello, (for context: this is a new topic, diverged at https://bugs.gnu.org/36831#38 ) For completeness, quoting your second message ( from https://bugs.gnu.org/36831#50 ): On 2019-08-02 9:56 p.m., L A Walsh wrote: > > On 2019/08/02 19:47, Assaf Gordon wrote: >> Can new merging features be added to 'mv'? yes. >> But it seems to me these would be better suited for 'higher level' >> programs (e.g. a GUI file manager). > --- > But neither the person who posted the original bug on this > nor I are using a GUI, we are running 'mv' GUI, we use the cmd line on > linux, so that wouldn't > be of any use. > > If the command was named 'ren', then I'd expect it to be dummer, > but 'mv'/move seem like it should be able to move files from > one dir into another. > > But you say posix wants it to perform as a rename? > I know, create a 're' command (or 'rn') for rename, and have > it do what 'mv' would do. Maybe posix would realize it would > be better to have re/rn behave like rename, and 'mv' to > behave it was moving something. > > So if I have: > mkdir A B > touch A/foo B/fee > > So when I look at the system call on linux for rename: > oldpath can specify a directory. In this case, newpath must > either not > exist, or it must specify an empty directory. > (complying with POSIX_C_SOURCE >= 200809L) > > So move should give an error: Nope: > > mv A B >> tree B > B > ├── A > │ └── foo > └── fee > > 1 directory, 2 files > > So mv is violating POSIX - it didn't do the rename, but moved > A under B and neither dir had to be empty. > > Saying it has to follow POSIX when it doesn't appear to, seems > a bit contradictory? >
bug#36831: Enhance directory move. (was Re: bug#36831: enhance 'directory not empty' message)
Hello, On 2019-08-02 9:56 p.m., L A Walsh wrote: On 2019/08/02 19:47, Assaf Gordon wrote: Can new merging features be added to 'mv'? yes. But it seems to me these would be better suited for 'higher level' programs (e.g. a GUI file manager). --- But neither the person who posted the original bug on this nor I are using a GUI, we are running 'mv' GUI, we use the cmd line on linux, so that wouldn't be of any use. The original post was about the error *message*, asking to make it clearer. That is the topic of this thread (and the previous patch) - so let's leave them at that. I see you started a new thread ( https://bugs.gnu.org/36901 ), so I'll reply there.
bug#36831: Enhance directory move. (was Re: bug#36831: enhance 'directory not empty' message)
Hello, On Fri, Aug 02, 2019 at 02:41:31AM -0700, L A Walsh wrote: > On 2019/07/28 23:28, Assaf Gordon wrote: > > > > > > $ mkdir A B B/A > > $ touch A/bar B/A/foo > > $ mv A B > > mv: cannot move 'A' to 'B/A': Directory not empty > > > > And the reason (as you've found out) is that the target directory 'B/A' > > is not empty (has the 'foo' file in it). > > Had this been allowed, moving 'A' to 'B/A' would result in the 'foo' > > file disappearing. > > > --- > Why must foo disappear? > > Microsoft Windows handles this situation by telling the user that > the target directory already exists and giving the option to *MERGE* > the directories. > > If you attempt to move a file into a directory that already contains > a file by the same name, it pops up another notice asking [...] Certainly, GUI programs (and more 'feature-rich' programs than 'mv') offer many "merging" options. I'm sure Midnight-Commander, KDE/Dolphin, XFCE/Thunar, Gnome/Nautilus and many other free software GUI file managers have some "merging" capabilities. But 'mv' is more basic and does not have this capability. Partly that is because it adheres to the POSIX standard, which mandates: "3. The mv utility shall perform actions equivalent to the rename() function [...]" https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html Some rsync options (--remove-source-files) can mimic 'mv' with merging, but then they are more like "copy+delete" than actual "rename/move". Can new merging features be added to 'mv'? yes. But it seems to me these would be better suited for 'higher level' programs (e.g. a GUI file manager). regards, - assaf
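To illustrate the "copy+delete" point above, here is a rough sketch of what a merge looks like when built from 'cp' and 'rm' instead of rename(2). The helper name merge_dir is hypothetical and is not part of coreutils; this is not how 'mv' works, and it is not a proposal for it.

```shell
#!/bin/sh
# merge_dir SRC DST: hypothetical helper, not part of coreutils.
# Copies the contents of SRC into the existing directory DST, then
# removes SRC. Unlike rename(2) this is not atomic, and a file in DST
# with the same name as one in SRC is overwritten.
merge_dir() {
    src=$1
    dst=$2
    [ -d "$src" ] && [ -d "$dst" ] || return 1
    # 'SRC/.' makes cp copy the directory's contents, not the directory.
    cp -R -p "$src"/. "$dst"/ && rm -r "$src"
}

# Usage, with the layout from this thread:
workdir=$(mktemp -d)
mkdir "$workdir/A" "$workdir/B" "$workdir/B/A"
touch "$workdir/A/bar" "$workdir/B/A/foo"
merge_dir "$workdir/A" "$workdir/B/A"
ls "$workdir/B/A"    # lists both 'bar' and 'foo'
rm -rf "$workdir"
```

Note the design cost: the data is copied and then deleted, so an interruption can leave both trees partially populated, which is exactly why this differs from a real rename/move.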
bug#36831: enhance 'directory not empty' message
On Thu, Aug 01, 2019 at 03:58:51PM -0700, Paul Eggert wrote: > Thanks, that's better, but we're still missing some opportunities for > improvement. > > > mv: cannot move 'A' to 'B/A': Target directory not empty > > This should be "Destination" not "Target". [...] > You meant "mv" not "rm". [...] > > +static char* > Space before "*". [...] > > +strerror_target (int e) > Change name to "strerror_dest" [...] > This function should return NULL instead of aborting when the errno value is > inapplicable. That way, its callers need not hardcode which errno values it > handles. Thanks for the review and suggestions - attached an updated patch. > Come to think of it, the same improvement should be made to ln, cp, install > and shred. Basically, to any program that uses 'rename' or 'link' or similar > syscalls, and which reports an error if the syscall fails. OK, I will work on that next. -assaf >From 8dc6158a6fde668e55312b5fb69384f438b7e55a Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Mon, 29 Jul 2019 00:23:20 -0600 Subject: [PATCH] mv: improve error messages when destination directory is at fault Suggested by Alex Mantel in https://bugs.gnu.org/36831 . $ mkdir A B B/A $ touch A/bar B/A/foo Before: $ mv A B mv: cannot move 'A' to 'B/A': Directory not empty After: $ mv A B mv: cannot move 'A' to 'B/A': Destination directory not empty The following errors are handled: EDQUOT, EEXIST, ENOTEMPTY, EISDIR, ENOSPC, ETXTBSY. * src/copy.c (copy_internal): Print custom messages for errors that explicitly fault the destination directory. (strerror_dest): New function, return custom, translatable error messages for errors relating to 'destination' component. * tests/mv/dir2dir.sh: Adjust expected error message. * NEWS: Mention change. 
--- NEWS| 6 + src/copy.c | 53 ++--- tests/mv/dir2dir.sh | 8 --- 3 files changed, 61 insertions(+), 6 deletions(-) diff --git a/NEWS b/NEWS index fd0543351..3d80665ae 100644 --- a/NEWS +++ b/NEWS @@ -44,6 +44,12 @@ GNU coreutils NEWS-*- outline -*- stat(1) also supports a new --cached= option to control cache coherency of file system attributes, useful on network file systems. +** Improvements + + mv now prints clearer error messages when a failure relates to the + destination directory (e.g., "Destination directory is not empty" instead + of "Directory not empty"). + * Noteworthy changes in release 8.31 (2019-03-10) [stable] diff --git a/src/copy.c b/src/copy.c index 65cf65895..602c8307b 100644 --- a/src/copy.c +++ b/src/copy.c @@ -1867,6 +1867,44 @@ source_is_dst_backup (char const *srcbase, struct stat const *src_st, return dst_back_status == 0 && SAME_INODE (*src_st, dst_back_sb); } +/* Return custom error messages replacing the default libc's + messages. These messages explicity fault the destination component + in the error. + + Return NULL if E (errno value) is not handled (and by implication + should use the system's default text for the error message). */ +static char * +strerror_dest (int e) +{ + /* TRANSLATORS: These strings should mimick libc's standard + error messages (from strerror(3)), but explicitly mention + the fault is with the destination directory. */ + switch (errno) +{ +case EDQUOT: + return _("Disk quota exceeded on destination device"); +case EEXIST: +case ENOTEMPTY: + return _("Destination directory not empty"); +case EISDIR: + return _("Tried to overwrite a directory with a file"); +case ENOSPC: + return _("No space left on destination device"); +case ETXTBSY: + /* NOTE: The error is "Text file busy" - but "text" in that context + refers to "text segment" of an executable file (as opposed to + "data segment" and "BSS segment"). 
+ + This error message is meant for users, and 'text file' can be easily + confused with an actual text file (i.e., one containing only ASCII + characters. Thus, say 'executable' instead of 'text'.*/ + return _("Destination executable file is busy"); +default: + return NULL; +} +} + + /* Copy the file SRC_NAME to the file DST_NAME. The files may be of any type. NEW_DST should be true if the file DST_NAME cannot exist because its parent directory was just created; NEW_DST should @@ -2477,9 +2515,18 @@ copy_internal (char const *src_name, char const *dst_name, If the permissions on the directory containi
bug#36831: enhance 'directory not empty' message
Hello, On Wed, Jul 31, 2019 at 08:03:45PM -0700, Paul Eggert wrote: > Assaf Gordon wrote: > > An explicit error explicitly saying "cannot move", and mention the source > > and > > destination, and also "blames" the target directory seems the most > > user-friendly and least ambiguous. > > Sure, but that handles only the ENOTEMPTY/EEXIST case. How would you handle > the EDQUOT, EISDIR, and ENOSPC cases? Will you invent a separate diagnostic > for each case, or just treat them as in my proposed patch? I assume the > latter, but either way I'd like to see a patch that handles these properly > too. Also, please handle ETXTBUSY while you're at it (sorry, I missed that > one). > > > For the second and third cases, > > "No space" and "Quota exceeded" seem to me to always relate to the > > destination, and I don't think users get confused about those > > (other opinions of course welcomed). > > What's obvious to experts like us is not always obvious to users. If users > get confused by the current diagnostic for ENOTEMPTY/EEXIST, I don't see why > they wouldn't also get confused for ETXTBUSY etc. > > > Your patch also added "EISDIR", for which rename(2) says: > > "newpath is an existing directory, but oldpath is not a directory." > > > > But I don't think this error can happen with gnu mv. > > It can, as a result of a race condition if some other process is mutating > the file system while 'mv' is running. Admittedly unlikely, but we might as > well improve this errno value while we're improving the others. All good points. Please see attached updated version. It does add explicit error string for each error code, but I hope the implementation is reasonable and easy to maintain and translate. -assaf >From 8ee71b24d74d7cfe81f151de430d38935cf04675 Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Mon, 29 Jul 2019 00:23:20 -0600 Subject: [PATCH] mv: improve error messages when target directory is at fault Suggested by Alex Mantel in https://bugs.gnu.org/36831 . 
$ mkdir A B B/A $ touch A/bar B/A/foo Before: $ mv A B mv: cannot move 'A' to 'B/A': Directory not empty After: $ mv A B mv: cannot move 'A' to 'B/A': Target directory not empty The following errors are handled: EDQUOT, EEXIST, ENOTEMPTY, EISDIR, ENOSPC, ETXTBSY. * src/copy.c (copy_internal): Print custom messages for errors that explicitly fault the target directory. (strerror_target): New function, return custom and translatable error messages. * tests/mv/dir2dir.sh: Adjust expected error message. * NEWS: Mention change. --- NEWS| 6 + src/copy.c | 56 ++--- tests/mv/dir2dir.sh | 6 ++--- 3 files changed, 62 insertions(+), 6 deletions(-) diff --git a/NEWS b/NEWS index fd0543351..4ec4d0df0 100644 --- a/NEWS +++ b/NEWS @@ -44,6 +44,12 @@ GNU coreutils NEWS-*- outline -*- stat(1) also supports a new --cached= option to control cache coherency of file system attributes, useful on network file systems. +** Improvements + + rm now prints clearer error messages when a failure relates to the + target directory (e.g., "Target directory is not empty" instead of + "Directory not empty"). + * Noteworthy changes in release 8.31 (2019-03-10) [stable] diff --git a/src/copy.c b/src/copy.c index 65cf65895..9cf02ad9c 100644 --- a/src/copy.c +++ b/src/copy.c @@ -1867,6 +1867,38 @@ source_is_dst_backup (char const *srcbase, struct stat const *src_st, return dst_back_status == 0 && SAME_INODE (*src_st, dst_back_sb); } +static char* +strerror_target (int e) +{ + /* TRANSLATORS: These strings should mimick libc's standard + error messages (from strerror(3)), but explicitly mention + the fault is with the target directory. 
*/ + switch (errno) +{ +case EDQUOT: + return _("Disk quota exceeded on target device"); +case EEXIST: +case ENOTEMPTY: + return _("Target directory not empty"); +case EISDIR: + return _("Tried to overwrite a directory with a file"); +case ENOSPC: + return _("No space left on target device"); +case ETXTBSY: + /* NOTE: The error is "Text file busy" - but "text" in that context + refers to "text segment" of an executable file (as opposed to + "data segment" and "BSS segment"). + + This error message is meant for users, and 'text file' can be easily + confused with an actual text file (i.e., one containing only ASCII + characters. Thus, say 'executable' instead of 'tex
bug#36831: enhance 'directory not empty' message
Hello Paul, On Mon, Jul 29, 2019 at 06:50:46PM -0500, Paul Eggert wrote: > On 7/29/19 1:28 AM, Assaf Gordon wrote: > > + if (rename_errno == ENOTEMPTY || rename_errno == EEXIST) > > +{ > > + error (0, 0, _("cannot move %s to %s: Target directory not > > empty"), > > + quoteaf_n (0, src_name), quoteaf_n (1, dst_name)); > > Although this is an improvement, it is not general enough, as other errno > values are relevant only for the destination. Better would be to have a > special case for errno values that matter only for the destination, and use > the existing code for errno values where we don't know whether the problem > is the source or the destination. Something like the attached, say. > +case EDQUOT: case EEXIST: case EISDIR: case ENOSPC: case > ENOTEMPTY: > + error (0, rename_errno, "%s", quotearg_colon (dst_name)); > + break; > + Thanks for the review. At the risk of bikeshedding, I'd like to argue for the prior method. While it is not general enough, I think it provides a clearer error message. For example, with the more general implementation the errors would be: $ mv A B mv: B/A: Directory not empty $ mv A B mv: B/A: No space left on device $ mv A B mv: B/A: Quota exceeded In the first case, I think this error is potentially more confusing than before: while it doesn't mention the source directory, it also doesn't say "cannot move" - so it is only implied that it is an error (an inexperienced user might dismiss this as a warning). Also, there could be a source directory named very similarly to the destination directory, and from a quick glance it would not be easy to understand what happened. An error that explicitly says "cannot move", mentions the source and destination, and also "blames" the target directory seems the most user-friendly and least ambiguous. 
--- For the second and third cases, "No space" and "Quota exceeded" seem to me to always relate to the destination, and I don't think users get confused about those (other opinions of course welcomed). --- Your patch also added "EISDIR", for which rename(2) says: "newpath is an existing directory, but oldpath is not a directory." But I don't think this error can happen with gnu mv. If we try to move a file onto a directory, we get: $ mkdir C C/D ; touch D $ mv D C mv: cannot overwrite directory 'C/D' with non-directory And this case is specifically handled in copy.c line 2131, before calling rename(2) (and also this is an example of a custom error message instead of using stock libc messages). --- Happy to hear your opinion, - assaf
bug#36831: enhance 'directory not empty' message
Hello, On Sun, Jul 28, 2019 at 08:58:59PM +0200, Alex Mantel wrote: [...] > Ah, the target directory does exist! Hmm... But i'd like the message to be > like: > > $ mv thing/ ../things > mv: cannot move 'thing' to '../things/things': Targetdirectory not empty > > ^ this little thing here, > it explains everyting. > > Change text from 'Directory not empty' to 'Targetdirectory not empty'. Thanks for the report. To clarify, the scenario is: $ mkdir A B B/A $ touch A/bar B/A/foo $ mv A B mv: cannot move 'A' to 'B/A': Directory not empty And the reason (as you've found out) is that the target directory 'B/A' is not empty (has the 'foo' file in it). Had this been allowed, moving 'A' to 'B/A' would result in the 'foo' file disappearing. --- How is a user expected to know this error is about the target directory? There is a bit of a trade-off here between user-friendliness (especially for non-technical users) and more technical knowledge. If we go one step 'lower' to the programming interface, almost all sources mention this is about the 'target' directory not being empty: POSIX says: https://pubs.opengroup.org/onlinepubs/009695399/functions/rename.html [EEXIST] or [ENOTEMPTY] The link named by new is a directory that is not an empty directory. Linux's rename(2) manual page says: ENOTEMPTY or EEXIST newpath is a nonempty directory, that is, contains entries other than "." and "..". FreeBSD's rename(2) manual page says: [ENOTEMPTY] The to argument is a directory and is not empty. AIX's rename(2) manual page says: ENOTEMPTY The ToPath parameter specifies an existing directory that is not empty. So there is some merit in claiming this helpful piece of information is lost when the error message is reported to the user. 
--- In GNU coreutils this error message originates from 'copy.c' line 2480: https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/copy.c#n2480 error (0, rename_errno, _("cannot move %s to %s"), quoteaf_n (0, src_name), quoteaf_n (1, dst_name)); And herein lies the (technical) problem: The actual message "Directory not empty" is not in the source code - it is a system error message that corresponds to the value of 'rename_errno' variable (ENOTEMPTY/EEXIST). It originates from GLibc (or another libc). So there is no trivial way to change the error message in coreutils. Attached a patch to add special handling for this error. --- What do others think? If this is a desired improvement, I'll finish the patch with news/tests/etc. regards, - assaf >From 430b30104234db719bf15e6fc681a62312c7124f Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Mon, 29 Jul 2019 00:23:20 -0600 Subject: [PATCH] mv: improve ENOTEMPTY/EEXIST error message Suggested by Alex Mantel in https://bugs.gnu.org/36831 . $ mkdir A B B/A $ touch A/bar B/A/foo Before: $ mv A B mv: cannot move 'A' to 'B/A': Directory not empty After: $ mv A B mv: cannot move 'A' to 'B/A': Target directory not empty * src/copy.c (copy_internal): Add special handling for ENOTEMPTY/EEXIST. TODO: NEWS, tests. --- src/copy.c | 8 1 file changed, 8 insertions(+) diff --git a/src/copy.c b/src/copy.c index 65cf65895..a5af570bf 100644 --- a/src/copy.c +++ b/src/copy.c @@ -2450,6 +2450,14 @@ copy_internal (char const *src_name, char const *dst_name, return true; } + if (rename_errno == ENOTEMPTY || rename_errno == EEXIST) +{ + error (0, 0, _("cannot move %s to %s: Target directory not empty"), + quoteaf_n (0, src_name), quoteaf_n (1, dst_name)); + forget_created (src_sb.st_ino, src_sb.st_dev); + return false; +} + /* WARNING: there probably exist systems for which an inter-device rename fails with a value of errno not handled here. If/as those are reported, add them to the condition below. -- 2.11.0
bug#36674: Sort Suggestion
tag 36674 notabug close 36674 stop Hello, On Mon, Jul 15, 2019 at 11:42:01AM -0700, Marshall Lake wrote: > Even though this isn't a bug, I was asked to send the following to this > email address. (General suggestions and discussions are better suited for coreut...@gnu.org mailing list, that way the system won't open a new bug item.) > > Re: SORT Command from GNU coreutils 8.25 > > A suggestion for an additional option to the SORT command is to ignore > non-alphanumeric characters. > > As an example, in attempting to sort an index ... > > Abbott, William259 > > sorts before: > > Abbot, William 099 > > If non-alphanumeric characters were ignored then the same two records > would sort as: > > Abbot, William 099 > Abbott, William259 > > There's actually something else at play here: In your case, sort does ignore non-alphanumeric characters, but it ALSO ignores white space. That happens because your locale is set to some language (for example, en_US.UTF8). Using such a locale makes sort ignore all non-alphanumeric characters, whitespace, and upper/lower case. In essence, you are comparing "AbbottWilliam" (two 't's) to 'AbbotWilliam' (one 't') - and then the second 't' is compared to a 'w', and is determined to come first. If you force a POSIX/C locale, then all characters are considered, and the result will be as you requested. Observe the following: $ printf "%s\n" AbbottWilliam AbbotWilliam | LC_ALL=en_CA.utf8 sort AbbottWilliam AbbotWilliam $ printf "%s\n" "Abbott William" "Abbot William" | LC_ALL=en_CA.utf8 sort Abbott William Abbot William $ printf "%s\n" "Abbott William" "Abbot William" | LC_ALL=C sort Abbot William Abbott William $ printf "%s\n" "Abbott, William" "Abbot, William" | LC_ALL=C sort Abbot, William Abbott, William Note that 'sort' already has an option for dictionary style sorting: -d, --dictionary-order: consider only blanks and alphanumeric characters. 
However, locale rules take precedence over it, so effectively it only works in "C" locale: $ printf "%s\n" "Ab,,b,,ott William" "Abbot William" | LC_ALL=C sort Ab,,b,,ott William Abbot William $ printf "%s\n" "Ab,,b,,ott William" "Abbot William" | LC_ALL=C sort -d Abbot William Ab,,b,,ott William You can read past discussion about the confusion resulting from locale sorting rules here: https://debbugs.gnu.org/11621 https://debbugs.gnu.org/12783 As such, I'm closing this as "not a bug", but discussion can continue by replying to this thread. -assaf
bug#36671: tail: unrecognized file system type 0x794c7630 for ‘/var/log/messages’. please report this to bug-coreutils@gnu.org. reverting to polling
tag 36671 notabug close 36671 stop Hello, On Mon, Jul 15, 2019 at 06:22:47PM +0200, John Koppolu wrote: > tail: unrecognized file system type 0x794c7630 for ‘/var/log/messages’. > please report this to bug-coreutils@gnu.org. reverting to polling You've previously reported this 4 days ago, please see the reply there: https://bugs.gnu.org/36600#8 -assaf
bug#36600: unrecognized file system type 0x794c7630 for ‘/var/log/messages’. please report this to bug-coreutils@gnu.org. reverting to polling
tag 36600 notabug close 36600 stop Hello, On Thu, Jul 11, 2019 at 05:53:16PM +0200, John Koppolu wrote: > unrecognized file system type 0x794c7630 for ‘/var/log/messages’. please > report this to bug-coreutils@gnu.org. reverting to polling > Support for this file system (overlayfs, commonly used with Docker containers) was added in version 8.25. Consider upgrading Coreutils if possible. See https://www.gnu.org/software/coreutils/filesystems.html for more details. regards, - assaf
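For anyone hitting the same warning, a quick way to see what is going on is to check the installed coreutils version and the file system type that 'stat' reports for the file. This is only a sketch: it assumes GNU 'stat' (for the -f and -c options), the helper name fs_type is made up here, and the exact text printed depends on the coreutils version and the file system in use.

```shell
#!/bin/sh
# fs_type FILE: hypothetical helper, prints the file system type name
# that GNU stat reports for the file system holding FILE.
fs_type() {
    stat -f -c '%T' -- "$1"
}

# First line of the version banner, to compare against 8.25.
tail --version | head -n 1

# The current directory is used here; substitute the file that 'tail'
# complained about (e.g. /var/log/messages).
fs_type .
```

If the reported version is older than 8.25 and the file lives on overlayfs, the warning is expected and, as noted above, harmless apart from the fallback to polling.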
bug#35939: version sort is incorrect with hyphen-minus
Hello Paul, On Wed, Jun 26, 2019 at 12:57:14PM -0700, Paul Eggert wrote: > GNU sort uses the same algorithm as glibc strverscmp, I think that both sort and ls use 'filevercmp' - a simplified version that does not support locales (and doesn't fail). The change (from 'strverscmp') was made in: commit e505736f8211a608b00dfe75fb186a5211e1a183 Author: Kamil Dudka Date: Fri Oct 3 11:03:40 2008 +0200 ls and sort: use filevercmp instead of strverscmp https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=e505736f8211a608b00dfe75fb186a5211e1a183 > Has the Debian version-comparison algorithm changed since 1997? If so, could > you give details about the changes to the Debian algorithm? I don't think the algorithm changed in Debian, and also in gnulib there are only a handful of relevant commits, all 10 years old: 9121662f1 2008-10-03 filevercmp: new module 0443c2f39 2009-03-05 filevercmp: Move hidden files up in ordering. 1721cf06d 2009-03-24 filevercmp: handle simple~ and numbered.~3~ backup suffixes 4fd008794 2009-04-09 filevercmp: fix regression cc96df30d 2009-04-09 filevercmp: correct today's change I think (also based on Ian's confirmation) that this discrepancy has been there from the beginning. I now notice that there's an additional difference: coreutils/gnulib has special handling for extensions, hidden files and backup files. As Ian wrote, a documentation improvement is probably the best fix. I'll try to come up with a suggested change. -assaf P.S. For completeness, here are a few other threads with details/explanations about 'version-sort': https://bugs.gnu.org/18168 https://bugs.gnu.org/22275 https://bugs.gnu.org/22455 https://bugs.gnu.org/33786
bug#35939: version sort is incorrect with hyphen-minus
(Adding Ian Jackson for dpkg/debian-version details) Hello, On Tue, May 28, 2019 at 02:53:39AM +0200, Vincent Lefevre wrote: > With GNU coreutils 8.30 under Debian/unstable, I get: > > $ LC_ALL=C ls > ab-cd abb abe > $ LC_ALL=C ls -v > abb abe ab-cd > > The hyphen-minus character should still be regarded as being less > than the letters (there are no digits, so both are expected to be > equivalent). The GNU coreutils manual says: > [...] Thanks for the report and the clear details. To summarize, "ls -v" and "sort -V" (coreutils' version sort) behave differently from other implementations with regard to the minus character: $ printf "%s\n" abb ab-cd | sort -V abb ab-cd $ v1="abb" $ v2="ab-cd" $ dpkg --compare-versions "$v1" lt "$v2" && printf "$v1\n$v2\n" || printf "$v2\n$v1\n" ab-cd abb If I understand correctly, the reason is that in Debian's version comparison algorithm [1], the minus character has a special meaning: it separates the "upstream version" part from the "debian revision" part. In Debian's implementation [2], a version string is first split into three parts (epoch, upstream version, debian revision) using ":" for epoch delimiter and "-" for revision delimiter. Only then are the three parts compared, separately [3]. [1] https://www.debian.org/doc/debian-policy/ch-controlfields.html#version [2] https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/lib/dpkg/parsehelp.c#n191 [3] https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/lib/dpkg/version.c#n140 On the other hand, coreutils' implementation (from gnulib [4]) does not break the version string into three parts - it treats the entire string as a single "upstream version" part. The rules for sorting the "upstream version" string say: "... 
The lexical comparison is a comparison of ASCII values modified so that all the letters sort earlier than all the non-letters and so that a tilde sorts before anything" (from [1]) [4] https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/filevercmp.c Therefore, dpkg first separates "ab" from "cd", then compares "ab" to "abb" - and 'ab' comes first; coreutils compares "ab-cd" to "abb" (or technically, just "ab-" to "abb"), and because "letters sort earlier than all non-letters", "abb" comes first. I hope this helps explain the differences (I also hope this explanation is correct, and I invite others to chime in). regards, - assaf
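The "letters sort earlier than all the non-letters" rule can be seen side by side with plain byte ordering using the same two strings. This is a small sketch assuming GNU sort (for the -V option); the second result is the one already shown earlier in this message.

```shell
#!/bin/sh
# Plain byte order in the C locale: '-' (0x2d) sorts before 'b' (0x62),
# so "ab-cd" comes first.
printf '%s\n' abb ab-cd | LC_ALL=C sort
# ab-cd
# abb

# Version order (filevercmp): letters sort before non-letters,
# so "abb" comes first.
printf '%s\n' abb ab-cd | LC_ALL=C sort -V
# abb
# ab-cd
```

So the original reporter's expectation matches the first command, while "ls -v" and "sort -V" follow the second.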
bug#35654: We've found a vulnerability of gnu chown, please check it and request a cve id for us.
tag 35654 close 35654 stop Hello, On Thu, May 09, 2019 at 11:53:11PM +0800, st0n3 ss wrote: > Hello! we have found a vulnerability of command chown, please check it.If > it is a vulnerability. please request a cve id for use, thank you!chown -h > bypass Given Paul's and Bob's detailed answers, I'm closing this as "not a bug". Discussion can continue by replying to this thread. regards, - assaf
bug#36130: split bug
tag 36130 notabug close 36130 stop Hello, On Mon, Jun 10, 2019 at 04:50:20PM -0600, Assaf Gordon wrote: > On 2019-06-10 12:28 p.m., Heather Wick wrote: > > Verbose: This seems to have made the same number of files this time; not > > sure why the other 3-4 times I ran it it did not. They appear to be the > > same size, with paired last reads > [...] > > Glad to hear it worked. > > Could it be that in previous times the queued job ran out of disk space? > > That would be my first guess, as such things are common in shared > grid/cluster environments, particularly if your job runs in a temporary > and limited storage location (e.g. "/tmp/job-"). With no further comments, I'm closing this ticket. If more issues arise (or this was not an adequate solution) we can always re-open this ticket. regards, -assaf
bug#35632: date Parse of '13:00 + 2 hours' Broken.
tag 35632 notabug close 35632 stop Hello, (sorry for the delayed reply) On Wed, May 08, 2019 at 12:57:10PM +0100, Ralph Corderoy wrote: > > Using date from coreutils 8.31-1 on Arch Linux. > This surprised me. > > $ TZ=UTC0 /bin/date -d '1pm + 2 hours' > Wed 8 May 15:00:00 UTC 2019 > $ TZ=UTC0 /bin/date -d '13:00 + 2 hours' > Wed 8 May 12:00:00 UTC 2019 > > The documentation doesn't suggest `1pm' and `13:00' are treated > differently. `--debug' helps. > > $ TZ=UTC0 /bin/date --debug -d '1pm + 2 hours' > date: parsed time part: 01:00:00pm > date: parsed relative part: +2 hour(s) > ... > $ TZ=UTC0 /bin/date --debug -d '13:00 + 2 hours' > date: parsed time part: 13:00:00 UTC+02 > date: parsed relative part: +1 hour(s) > date: input timezone: parsed date/time string (+02) > ... > > It looks like parsing is broken in the second case. Thank you for providing detailed output with "--debug"; it makes things easier to troubleshoot. When encountering a time string (HH:MM or HH:MM:SS) followed by a plus sign and a number, date's parser *always* treats it as a timezone (giving timezones higher priority than time adjustments). > The result I wanted can also be obtained my omitting the `+'. > > $ TZ=UTC0 /bin/date -d '1pm 2 hours' > Wed 8 May 15:00:00 UTC 2019 > $ TZ=UTC0 /bin/date -d '13:00 2 hours' > Wed 8 May 15:00:00 UTC 2019 And this is indeed one possible solution. Other similar issues are detailed here: https://lists.gnu.org/archive/html/bug-coreutils/2018-10/msg00126.html As such, I'm closing this ticket, but discussion can continue by replying to this thread. regards, - assaf
bug#36383: date command processes timezone differently when doing math
tag 36383 notabug close 36383 stop Hello, On Tue, Jun 25, 2019 at 04:10:07PM -0700, Brian Woods wrote: > When doing a math operation to a date command it appear to process the > timezone differently. [...] > > #echo $datNow > 2019-06-25 15:21:34 > > #date -d "$datNow + 1 minute" "+%Y-%m-%d %H:%M:%S" --debug > date: parsed date part: (Y-M-D) 2019-06-25 > date: parsed time part: 15:21:34 UTC+01 > date: parsed relative part: +1 minutes > date: input timezone: parsed date/time string (+01) Thank you for providing detailed examples with "--debug", makes things much easier to troubleshoot. The issue is that a time string (HH:MM:SS) followed by a plus sign and a number is *always* taken to be a time zone. Using a value other than 1 will show it more clearly: $ date -d "$datNow + 8 minutes" "+%Y-%m-%d %H:%M:%S" --debug date: parsed date part: (Y-M-D) 2019-06-25 date: parsed time part: 15:21:34 UTC+08 date: parsed relative part: +1 minutes date: input timezone: parsed date/time string (+08) The "+8" part is treated as timezone, and the remaining text ("minutes") is taken as a one-minute time adjustment. One solution is to just remove the plus sign: $ date -d "$datNow 8 minutes" "+%Y-%m-%d %H:%M:%S" --debug date: parsed date part: (Y-M-D) 2019-06-25 date: parsed time part: 15:21:34 date: parsed relative part: +8 minutes date: input timezone: system default [...] 2019-06-25 15:29:34 Another is to specify the time zone: $ date -d "$datNow +00:00 +8 minutes" "+%Y-%m-%d %H:%M:%S" --debug date: parsed date part: (Y-M-D) 2019-06-25 date: parsed time part: 15:21:34 UTC+00 date: parsed relative part: +8 minutes date: input timezone: parsed date/time string (+00) [...] 2019-06-25 09:29:34 More examples of adjusting time strings are here (your example is similar to case #1): https://lists.gnu.org/archive/html/bug-coreutils/2018-10/msg00126.html As such, I'm closing this ticket but discussion can continue by replying to this thread. regards, - assaf
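The two workarounds can be put side by side in a script. This is a minimal sketch assuming GNU date; the variable name `start` and the use of TZ=UTC0 (to keep the printed time independent of the machine's local zone) are choices made for this illustration only.

```shell
#!/bin/sh
# Assumes GNU date. TZ=UTC0 pins both input and output to UTC so the
# result does not depend on the system's default time zone.
start='2019-06-25 15:21:34'

# Workaround 1: no plus sign, so "8 minutes" can only be a relative item.
TZ=UTC0 date -d "$start 8 minutes" '+%Y-%m-%d %H:%M:%S'

# Workaround 2: give the zone explicitly, then a signed relative item.
TZ=UTC0 date -d "$start +00:00 +8 minutes" '+%Y-%m-%d %H:%M:%S'
```

Both commands are expected to print the start time shifted by eight minutes. When in doubt about how a string was interpreted, adding "--debug" as in the messages above shows which part was taken as the zone and which as the adjustment.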
bug#36130: split bug
Hello, On 2019-06-10 12:28 p.m., Heather Wick wrote: Thank you so much for your response. Here are the results of the tests you sent: Verbose: This seems to have made the same number of files this time; not sure why the other 3-4 times I ran it it did not. They appear to be the same size, with paired last reads [...] Glad to hear it worked. Could it be that in previous times the queued job ran out of disk space? That would be my first guess, as such things are common in shared grid/cluster environments, particularly if your job runs in a temporary and limited storage location (e.g. "/tmp/job-"). I would suspect that the exit-code you are seeing is the exit code of the entire job (that is - of the shell script that is being qsub'd), and not necessarily that of 'split' (then again, this might not be correct if you explicitly checked the exit code of 'split'). Given that your grid environment already has configuration issues (the bash and "module" related errors), I would not be surprised if the exit code is not reliable. I would strongly encourage to always look into the STDERR file of the job to verify no other errors occurred. Or, perhaps write shell scripts more defensively, like so: [...] zcat MH1_R1.fastq.gz | split -l 4000 - DHT_R1_ \ && echo split MH1_R1 OK \ || echo split MH1_R1 FAILED [...] Then checking the STDOUT for positive confirmation each program succeeded. Or perhaps: # define a shell function "die" to print an error and terminate die() { base=$(basename "$0") echo "$base: error: $*" >&2 exit 1 } zcat MH1_R1.fastq.gz | split -l 4000 - DHT_R1_ \ || die "split MH1_R1 failed" And then run at least one job that will fail on purpose, and ensure you see the error message in the STDERR log, and you get a non-zero exit code (and then ensure you use 'die' on every command). It is sometimes recommended to use "set -e" for "easy" error handling in shell scripts- but I would recommend against it. 
Many reasons are detailed here: https://mywiki.wooledge.org/BashFAQ/105 It might be more frustrating to add such extra checks on every program, but from my humble experience, grid environments bring on so many more intermittent and transient problems that it is definitely worth it. STDERR: The only thing in the stderr file is an odd duck of: -sh: module: line 1: syntax error: unexpected end of file -sh: error importing function definition for `BASH_FUNC_module' Python 3.6.8 :: Anaconda, Inc. /bin/sh: module: line 1: syntax error: unexpected end of file /bin/sh: error importing function definition for `BASH_FUNC_module' but this prints for every job I run with this particular flavor of conda/bash and doesn't seem to affect anything else (as far as I know) These errors are specific to your grid/cluster environment, and the best place to ask is the I.T or bioinformatics department in your institute (whoever is in charge of the cluster). Broadly speaking, "module" is a mechanism that eases the use of various software packages. It is usually set up by your IT administrators. A typical use-case is to have different versions of programs in non-standard locations, e.g. samtools version 1.6 in /opt/it/programs/samtools-1.6 and samtools version 1.9 in /opt/bioinfo/tools/new/samtools/ and then cluster users (e.g. you) just need to add: "module load samtools-1.8" and have the command "samtools" just work without knowing the gritty details of where the program is. It seems that in your case, something relating to the "module" setup is broken. More information here: https://en.wikipedia.org/wiki/Environment_Modules_(software) All jobs finished well below allotted memory and with exit status 0, even when split didn't make the right number of output files. > > Do you know any reason why the behavior would be inconsistent? The "allotted memory" is a non-issue for this "split" command; it will always use a very small amount of memory regardless of how big the input files are. 
As for "exit status 0" - I can't be sure, but I suspect the exit status you see is the one of the entire job (i.e. the shell script), and perhaps it does not represent the exit code of the "split" program. If you have the STDERR files of the jobs which failed, it's worth checking them for any additional error messages. Pairing check: unfortunately my server's version of bash doesn't support paste in this way, I've run into this issue before but I forget what the workaround is. I can't run this command interactively because my server times out (these files are > 3 billion lines each, so it takes a long time to zcat them) Ah yes, the construct: program <(other program) is a "bash" feature that is not available in simple shell scripts (interactive use vs non-interactive and other things). One work-around is to run (from inside your script): bash -c "paste <(zcat MH1_R1.fastq.gz) <(zcat MH1_R2.fastq.gz)" \ | awk 'NR%4!=1 {
bug#36130: split bug
Hello, On Fri, Jun 07, 2019 at 09:48:44PM -0400, Heather Wick wrote: > Yes, sorry, I should have specified that I already checked that the > original fastq files are indeed paired and sorted with the same number of > lines and same starting/ending IDs, narrowing down the issue to a problem > with split. It could be a problem with "split", but we'll need to dig a bit deeper to be able to pinpoint the exact issue. Could you please try the following commands and post the results? zcat MH1_R1.fastq.gz \ | split --verbose -l 4000 - DHT_R1_ > DHT_R1.log ; echo DHT_R1 exit code: $? zcat MH1_R2.fastq.gz \ | split --verbose -l 4000 - DHT_R2_ > DHT_R2.log ; echo DHT_R2 exit code: $? wc -l DHT_R1.log DHT_R2.log Two more questions: 1. can you post the result of "split --version" ? 2. You mentioned "jobs" - if you are running these as submitted jobs on a cluster (e.g. with "qsub"), can you double-check the STDERR log files to ensure no errors were encountered ? If we still can't pinpoint the issue, the next steps would be to check the DHT_R{1,2}.log files, and then try to compare the content of the split files. I assume the input files are indeed correctly paired, but just to check, if you could try the following command, it should not print anything to the screen (indicating all sequence IDs are paired): paste <(zcat MH1_R1.fastq.gz) <(zcat MH1_R2.fastq.gz) \ | awk 'NR%4!=1 { next } $1!=$3 { print "Error in line " NR ":" $1 " vs " $3 }' regards, - assaf
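The pairing check can also be written without process substitution, for shells where the "<(...)" construct is not available. The following is a rough sketch under a few assumptions: `check_pairs` is a hypothetical helper name, the files are gzip-compressed FASTQ with four lines per record, the sequence ID is the first whitespace-separated field of each header line, and the file names contain no single quotes.

```shell
#!/bin/sh
# check_pairs R1.gz R2.gz: hypothetical helper. Reads both files in lock-step
# inside awk and prints one line for every record whose ID differs.
# Prints nothing when all sequence IDs are paired.
check_pairs() {
    awk -v f1="$1" -v f2="$2" '
    BEGIN {
        c1 = "gzip -dc '\''" f1 "'\''"      # decompress first file
        c2 = "gzip -dc '\''" f2 "'\''"      # decompress second file
        while ((c1 | getline a) > 0) {
            n++
            if ((c2 | getline b) <= 0) {
                print "Error: " f2 " ends before line " n
                exit 1
            }
            if (n % 4 != 1) continue        # only header lines carry the ID
            split(a, x, " "); split(b, y, " ")
            if (x[1] != y[1])
                print "Error in line " n ": " x[1] " vs " y[1]
        }
        if ((c2 | getline b) > 0) {
            print "Error: " f1 " ends before " f2
            exit 1
        }
    }'
}

# Example call (file names taken from this thread):
#   check_pairs MH1_R1.fastq.gz MH1_R2.fastq.gz
```

Because both files are streamed once and nothing is held in memory, this should behave the same for very large inputs, though it will take as long as decompressing both files.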
bug#36130: split bug
Hello, On Fri, Jun 07, 2019 at 02:23:15PM -0400, Heather Wick wrote: > I am using split to split up some large, paired fastq files [...]: > > zcat MH1_R1.fastq.gz | split - -l 4000 DHT_R1_ > zcat MH1_R2.fastq.gz | split - -l 4000 DHT_R2_ > > This creates 96 chunks for the R1 and 95 chunks for R2, even though the > orignal fastq files have the same number of reads. > > Do you have any suggestions for how to proceed? Perhaps zcatting and piping > the files is not the best way to call split? To help diagnose the issue better, please run the following commands and tell us what the results are: 1. number of lines in each file: zcat MH1_R1.fastq.gz | wc -l zcat MH1_R2.fastq.gz | wc -l 2. The first two sequence IDs: zcat MH1_R1.fastq.gz | head -n8 | grep ^@ zcat MH1_R2.fastq.gz | head -n8 | grep ^@ 3. Last two sequence IDs: zcat MH1_R1.fastq.gz | tail -n8 | grep ^@ zcat MH1_R2.fastq.gz | tail -n8 | grep ^@ These will just verify the FASTQ files are indeed paired with no surprises. The files should have the same number of lines, and matching sequence IDs in the first and last lines. regards, - assaf
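Once the line counts are known, the number of chunks "split -l" should produce follows from simple arithmetic, which gives a quick cross-check against the number of files actually created. A small sketch; `expected_chunks` is a hypothetical helper and the line count used below is made up for illustration, not taken from the reported files.

```shell
#!/bin/sh
# expected_chunks LINES PER_CHUNK: hypothetical helper printing how many
# output files "split -l PER_CHUNK" should create for LINES input lines
# (integer division, rounded up).
expected_chunks() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Illustrative numbers only: 384000 lines in chunks of 4000 lines.
expected_chunks 384000 4000
```

If both inputs report the same "wc -l" value but the number of output files differs from this figure for one of them, that suggests split did not receive (or could not write) the complete input in that run.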
bug#35587: sort order wrt lower/upper case
tags 35587 notabug close 35587 stop Hello, On 2019-05-05 1:01 p.m., Toralf Förster wrote: I'd expect "B" being the first line here: echo a B c d | xargs -n 1 | sort using sys-apps/coreutils-8.30 at a stable hardened Gentoo Linux, but it is "a". Is this a bug or a feature? This is just a matter of your locale (e.g. "de_DE.UTF8" ?) that sorts letters without regard to case. If you force C locale you'll get "B" first: $ echo a B c d | xargs -n 1 | LC_ALL=C sort B a c d Adding "--debug" will show a warning and help diagnose such issues in the future: $ sort --debug sort: using ‘ca_EN.utf8’ sorting rules ... ... As such, I'm closing this as not-a-bug, but discussion can continue by replying to this thread. -assaf
bug#34825: New fails in tests/{misc,cp} in v8.31 on OpenIndiana
tags 34825 fixed close 34825 stop Hello, On 2019-04-10 5:05 a.m., Michal Nowak wrote: the patch worked on OpenIndiana as well. Thanks for confirming, I'm closing this bug. regards, -assaf
bug#35289: closed (Re: bug#35289: date+%-Y -d "- N years" errors when N > 111)
Hello, On 2019-04-15 5:10 p.m., O. Emmerson wrote: For me it gives: $ ./inv-year time() = 1555369320 localtime() = 2019-04-16 00:02:00 (mday=16 wday=2, isdst=1) struct tm (after adjustment) = 0009-04-16 00:02:00 (mday=16 wday=2, isdst=1) inv-year: mktime() failed: Value too large for defined data type > On 2019-04-15 6:50 p.m., C de-Avillez wrote: [...] root@u1904:~# gcc -o inv-year inv-year.c root@u1904:~# ./inv-year time() = 1555375408 localtime() = 2019-04-16 00:43:28 (mday=16 wday=2, isdst=0) struct tm (after adjustment) = 0009-04-16 00:43:28 (mday=16 wday=2, isdst=0) mktime() after date adjustment = -61874061392 So: a pristine 19.04 runs it. My laptop (which is my work machine, full of other packages & programs), does not. Thank you both for testing. So, to summarize: whenever "inv-year" fails - it is a problem with glibc on your setup, *not* a problem in coreutils' date(1) program. If there is a setup where "inv-year" succeeds but date(1) still fails, then it is a problem in coreutils. I'm glad to hear latest Ubuntu 19.04 is working fine (though the reason for the earlier failure is still a mystery). As Paul suggested, trying 'strace' on the failing system might reveal more details. regards, - assaf
bug#35289: closed (Re: bug#35289: date+%-Y -d "- N years" errors when N > 111)
Thanks Bernhard, On 2019-04-15 2:14 p.m., Bernhard Voelker wrote: I can easily reproduce here on my regular openSUSE:Tumbleweed from latest git: $ src/date --debug '+%-Y' -d '- 2010 years' [] date: error: adding relative date resulted in an invalid date: '(Y-M-D) 0009-04-15 22:10:37' This makes it easy to pinpoint (hooray for "--debug" :) ). This error is given if gnulib's "mktime_z" fails to convert the adjusted "struct tm" to "time_t" (adjusted because its tm_year was decremented by 2010). https://opengrok.housegordon.com/source/xref/gnulib/lib/parse-datetime.y#2177 To see if this is a glibc issue, or perhaps a gnulib/mktime_z wrapper issue, can you (and/or others) try the attached C program? It calls time(2)+localtime(3)+mktime(3) to emulate the date adjustment. Because the adjustment is to year 9 (about 1961 years before epoch), the time_t value is negative. Perhaps that's the issue? Or perhaps combined with a specific timezone it becomes problematic? On my system it gives: $ gcc -o inv-year inv-year.c $ ./inv-year time() = 1555361050 localtime() = 2019-04-15 14:44:10 (mday=15 wday=1, isdst=1) struct tm (after adjustment) = 0009-04-15 14:44:10 (mday=15 wday=1, isdst=1) mktime() after date adjustment = -61874070118 regards, - assaf /* A test program to help with https://bugs.gnu.org/35289#28 compile with: gcc -o inv-year inv-year.c written by Assaf Gordon (assafgor...@gmail.com). placed under public domain. 
*/
#include <err.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

int main()
{
  time_t now = time(NULL);
  if ( now == ((time_t) -1))
    err(1, "time() failed");
  printf("time() = %ld\n", (signed long)now);

  struct tm *a = localtime(&now);
  if (!a)
    err(1, "localtime(now) failed");
  printf("localtime() = %04d-%02d-%02d %02d:%02d:%02d\n"
         " (mday=%d wday=%d, isdst=%d)\n",
         a->tm_year+1900,a->tm_mon+1,a->tm_mday,
         a->tm_hour, a->tm_min, a->tm_sec,
         a->tm_mday, a->tm_wday, a->tm_isdst);

  struct tm b;
  memcpy (&b, a, sizeof b);

  /**
   * Test date adjustment by changing 'b' members
   **/
  b.tm_year -= 2010;

  printf("struct tm (after adjustment) = %04d-%02d-%02d %02d:%02d:%02d\n"
         " (mday=%d wday=%d, isdst=%d)\n",
         b.tm_year+1900,b.tm_mon+1,b.tm_mday,
         b.tm_hour, b.tm_min, b.tm_sec,
         b.tm_mday, b.tm_wday, b.tm_isdst);

  time_t notnow = mktime (&b);
  if ( notnow == ((time_t) -1) )
    err(1, "mktime() failed");
  printf("mktime() after date adjustment = %ld\n", (signed long)notnow);

  return 0;
}
bug#35289: closed (Re: bug#35289: date+%-Y -d "- N years" errors when N > 111)
Hello, On 2019-04-15 11:55 a.m., C de-Avillez wrote: 19.04: It is worth noting that Ubuntu 19.04 has not been officially released yet, so you are testing on a development branch (or a release-candidate, or a special built infrastructure as hinted by your path). cerdea@piatam:/data/buildd/coreutils$ date +%-Y -d '- 2010 years' date: invalid date ‘- 2010 years’ 1 cerdea@piatam:/data/buildd/coreutils$ date --version date (GNU coreutils) 8.30 [...] On Mon, Apr 15, 2019 at 12:16 PM O. Emmerson wrote: $ file /bin/date /bin/date: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=26fa7f6c43c354d8c5647ebf946255a2b8e3c53d, stripped I just downloaded a recent daily snapshot of ubuntu desktop live CD for amd64 from http://cdimage.ubuntu.com/daily-live/current/ . The file "disco-desktop-amd64.iso" dated 2019-04-13 22:28 size 1.9GB, with the following checksum: $ sha1sum disco-desktop-amd64.iso b89fb143b51e17482a3882abe2f5f4e3b69942fe disco-desktop-amd64.iso Booting with QEMU as live-cd, I tested the same command and got "9" as the (correct) result. So this can't be easily reproduced. An interesting benefit of reproducible builds is that I see on the live-cd image the sha1 checksum of "/bin/date" is the same as you listed above. This hints to me the problem is somewhere else in your setup. As this is not an official release, we really can't support it. You'll have to dig further and see what is the issue. A good starting point is adding the "--debug" option to date(1) and examining its output. regards, - assaf
bug#35109: date 'tomorrow' bug
tags 35109 notabug close 35109 stop Hello, On 2019-04-02 7:23 a.m., Maximilian Gleißner wrote: I have encountered a possible bug with the date function using both SuSE LEAP 15.0 and SuSE 10.2. This bug occurs when asking date for 'tomorrow' when there is a daylight saving timechange. This is not a bug, just a usage issue. Note: The machine is located in the GMT+1 timezone, and daylight savings time changed on 31.03.2019 02:00 jumping to 03:00 Exactly - and 'date' adjusts the time accordingly by adding an hour if the timezone was crossed. (technically it's not date(1) but glibc, if that matters). To replicate the bug: date -s "2019-03-30 23:XX" #where XX is any valid minute, e.g. 23:35 date -d 'tomorrow' #expected output: 2019-03-31 23:XX actual output: 2019-04-01 00:XX Note that 'date' printed one more critical piece of information: $ date Sat Mar 30 23:10:41 GMT 2019 $ date -d tomorrow Mon Apr 1 00:10:43 BST 2019 The timezone shifted from GMT to BST - and the time was adjusted accordingly by adding an hour, and crossing into April 1st. Similarly, if you waited 5 hours from 2019-03-30 23:35 it would be 5am, not 4am - and date needs to account for that: $ date Sat Mar 30 23:18:47 GMT 2019 $ date -d "+5 hours" Sun Mar 31 05:18:49 BST 2019 I am aware you recommend not using local timezones and daylight savings time, but I still think this should/could be implemented better. The GNU coreutils team does not recommend such a thing at all. In fact, team member Prof. Paul Eggert is the editor and maintainer of the Time Zone database ( https://en.wikipedia.org/wiki/Tz_database ) which is used by almost every operating system and many programming languages ( https://en.wikipedia.org/wiki/Tz_database#Use_in_software_systems ). There is a strong recommendation however, to specify "noon" (12pm) whenever doing date arithmetic, exactly to avoid DST issues. 
See: https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-date-command-is-not-working-right_002e $ date Sat Mar 30 23:24:08 GMT 2019 $ date -d "12pm tomorrow" Sun Mar 31 12:00:00 BST 2019 On the other hand, it is the European Union that wants to do away with daylight saving time: https://www.bbc.com/news/world-europe-45366390 To learn more about the inner-working of GNU date and similar issues with DST, please see past discussions here: https://bugs.gnu.org/8357 https://bugs.gnu.org/11101 https://bugs.gnu.org/18159 https://bugs.gnu.org/30795 As such, I'm marking this as "not a bug", but discussion can continue by replying to this thread. regards, - assaf
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
tags 34488 fixed close 34488 stop Hello, The original request of "sort --limit" resulted in an improved "env" with new options, which was included in the recent version 8.31. I'm therefore closing this item. -assaf
bug#34700: rm refuses to remove files owned by the user, even in force mode
tags 34700 notabug severity 34700 wishlist retitle: rm: add new --force option deal with read-only directories stop Hello, As explained by several people in this thread, this is not a bug in "rm -f", but the mandated behavior. Bob and others provided work-arounds ( https://bugs.gnu.org/34700#17 ). As for adding a new "--really-force" option (https://bugs.gnu.org/34700#11) - I'm marking this as a wish-list item. -assaf
bug#34825: New fails in tests/{misc,cp} in v8.31 on OpenIndiana
retitle 34825 OpenIndiana: tests/{misc,cp} fail in v8.31 stop Hello, On 2019-03-11 1:10 a.m., Michal Nowak wrote: on OpenIndiana 2018.10 (illumos kernel) the test suite has three new fails in v8.31 (amd64) compared to v8.30: FAIL tests/misc/timeout-parameters.sh (exit status: 1) FAIL tests/cp/no-deref-link1.sh (exit status: 1) FAIL tests/cp/no-deref-link2.sh (exit status: 1) The two 'ln' related bugs might be the same as this item: https://bugs.gnu.org/34894 with the fix committed here: https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=3e0dff3925b5e521cae468087950e85b60002d1c Can you check whether it solved the issue on OpenIndiana as well? -assaf
bug#34894: Another solaris 10 ln issue on 8.31
tags 34894 fixed close 34894 stop On 2019-03-17 3:17 p.m., John Marino wrote: On 3/17/2019 15:28, Paul Eggert wrote: John Marino wrote: After applying the recent patch to 8.31 ln to fix functionality on solaris 10, I saw some improvement but I think there's something else wrong. Thanks. Could you please try the attached patch, which I installed on master? Installed here: https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=3e0dff3925b5e521cae468087950e85b60002d1c Okay, that seems to fix the regression on ln. Thanks for confirming, I'm closing this bug. -assaf
bug#34923: Message/race bug in 'dd'
tags 34923 notabug severity 34923 wishlist retitle 34923 dd: add messages about IO errors stop On 2019-03-19 9:44 p.m., Paul Eggert wrote: Daniel A. Gauthier wrote: NOTICE that the "+nn" value on the line is always one off. It says +0 after the first error, +1 after the second, etc. until the correct count of error/short blocks is given at the end. The count is supposed to just count short blocks, not errors. This is a POSIX requirement. I installed the attached documentation patch to try to make this clearer. The documentation improvement was added here: https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=59e01d13e600be0d2c7f08f3aff8cf11936b3ea1 Perhaps dd should output a separate line to stderr that summaries I/O errors; it could do that without violating POSIX. Marking this as a wishlist item. -assaf
bug#34968: (no subject)
tags 34968 notabug close 34968 stop On 2019-03-24 9:12 a.m., Bernhard Voelker wrote: I don't know Dutch, but this looks to me like the regular output of "sha256sum --help" from an older version of coreutils (<8.25, because the --ignore-missing option is not yet there). What is wrong with it? With no further replies, I'm closing this bug. -assaf
bug#34988: mv: check before asking users useless questions
tags 34988 notabug severity 34988 wishlist retitle 34988 mv: omit useless 'overwrite?' question stop On 2019-03-25 4:47 p.m., Paul Eggert wrote: On 3/24/19 11:05 PM, 積丹尼 Dan Jacobson wrote: $ mv a b mv: overwrite 'b'? y mv: cannot overwrite non-directory 'b' with directory 'a' User thinks well why didn't you check before uselessly asking me? POSIX requires the useless question. That being said, the question could be omitted in the case you describe, if POSIXLY_CORRECT is not specified. Marking this as a wishlist item. -assaf
bug#34905: uname: -i/-p returns "unknown"
tags 34905 moreinfo retitle 34905 uname: -i/-p returns "unknown" stop On 2019-03-19 9:48 p.m., Paul Eggert wrote: Wellington Almeida wrote: When using the -p and -i functions in the uname command I noticed that it returned an unknown result, can this be a bug? It could be a bug in the uname command, but more likely it's a kernel bug. Try running "strace uname -pi".
bug#35032: date ISO 8601 / RFC 3339 formats
severity 35032 wishlist retitle 35032 date: adjust rfc8601/3339 formats to W3C standard stop On 2019-03-28 11:20 a.m., Nicolas Mailhot wrote: Would it be possible to make them both optional in --rfc-3339, and both mandatory in --iso-8601 ? Or add a --w3c option that conforms to the W3C profile? This is all so sad… Some languages like Go do no understand neither of date's output, because they follow the W3C profile. I'm marking this as a "wishlist" item. For reference, here are previous similar requests: https://bugs.gnu.org/6132 - date: --rfc-3339=TIMESPEC option doesn't print 'T' https://bugs.gnu.org/6453 - date -- Add new options for ISO 8601 date formats (-O) https://bugs.gnu.org/14097 - date: add parsing support for ISO 8601 basic format -assaf
bug#33646: [PATCH] doc: improve wording of the --kibibytes option description
tags 33646 fixed close 33646 stop On 2019-03-15 8:38 a.m., Kamil Dudka wrote: Bug: https://bugzilla.redhat.com/1527391 --- doc/coreutils.texi | 8 +--- I can see no more comments on this. Could you please proceed to push it? Thanks for the reminder. Pushed here: https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=6bd78f27fdc2df89b1219921c6f5735885f15e37 -assaf
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
Hello, Thanks for all comments. On 2019-02-24 11:33 a.m., Paul Eggert wrote: Thanks for doing all that. Although Pádraig is not enthusiastic about a shortcut like -p, I'm a bit warmer to it, as it's an important special case to fix a wart in POSIX. No big deal either way. For now I kept "-p", can be removed later of course. The first patch includes Pádraig's recent suggestions (slightly modified). The documentation should mention that SIGCHLD is special [...] The documentation should say what happens if mutually-contradictory options are specified, [...] The documentation should echo this suggestion in <http://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html>: I've added those, and I welcome all improvements suggestion to grammar/phrasing/etc. > There should be options --block-signal[=SIG], --unblock-signal[=SIG], > and --setmask-signal[=SIG] that affect the signal mask, which is also > inherited by the child. These can be implemented via pthread_sigmask. The second patch adds these new options (separated to ease review). As for documentation - I'm not sure what to add beyond the basic option description. When should these be used? A third small patch adds "env ---list-signal-actions" and "env --list-blocked-signals" - to ease diagnostics. Might be worth adding for completeness (e.g., for users who need to somehow know if SIGPIPE is being ignored by the shell or not): $ ( trap '' PIPE && src/env --list-signal-actions ) PIPE (13): ignore Comments very welcomed, - assaf >From 02cba657e2f63c05f859daf18a7d1032fdc32c6f Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Fri, 15 Feb 2019 12:31:48 -0700 Subject: [PATCH 1/3] env: new options -p/--default-signal=SIG/--ignore-signal=SIG New options to set signal handlers to default (SIG_DFL) or ignore (SIG_IGN) This is useful to overcome POSIX limitation that shell must not override inherited signal state, e.g. 
the second 'trap' here is a no-op: trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1' Instead use: trap '' PIPE && sh -c 'env -p seq inf | head -n1' Similarly, the following will prevent CTRL-C from terminating the program: env --ignore-signal=INT seq inf > /dev/null See https://bugs.gnu.org/34488#8 . * NEWS: Mention new options. * doc/coreutils.texi (env invocation): Document new options. * man/env.x: Add example of --default-signal=SIG usage. * src/env.c (signals): New global variable. (shortopts,longopts): Add new options. (usage): Print new options. (parse_signal_params): Parse comma-separated list of signals, store in signals variable. (reset_signal_handlers): Set each signal to SIG_DFL/SIG_IGN. (main): Process new options. * src/local.mk (src_env_SOURCES): Add operand2sig.c. * tests/misc/env-signal-handler.sh: New test. * tests/local.mk (all_tests): Add new test. --- NEWS | 3 + doc/coreutils.texi | 58 man/env.x| 69 ++ src/env.c| 138 +++- src/local.mk | 1 + tests/local.mk | 1 + tests/misc/env-signal-handler.sh | 146 +++ 7 files changed, 415 insertions(+), 1 deletion(-) create mode 100755 tests/misc/env-signal-handler.sh diff --git a/NEWS b/NEWS index e73cb52b8..ddbbaf138 100644 --- a/NEWS +++ b/NEWS @@ -81,6 +81,9 @@ GNU coreutils NEWS-*- outline -*- test now supports the '-N FILE' unary operator (like e.g. bash) to check whether FILE exists and has been modified since it was last read. + env now supports '--default-singal[=SIG]' and '--ignore-signal[=SIG]' + options to set signal handlers before executing a program. + ** New commands basenc is added to complement existing base64,base32 commands, diff --git a/doc/coreutils.texi b/doc/coreutils.texi index eb1848882..c2c202b28 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -17246,6 +17246,64 @@ chroot /chroot env --chdir=/srv true env --chdir=/build FOO=bar timeout 5 true @end example +@item --default-signal[=@var{sig}] +Reset signal @var{sig} to its default signal handler. 
Without @var{sig} all +known signals are reset to their defaults. Multiple signals can be +comma-separated. The following command runs @command{seq} with SIGINT and +SIGPIPE set to their default (which is to terminate the program): + +@example +env --default-signal=PIPE,INT seq 1000 | head -n1 +@end example + +In the following example: + +@example +trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1' +@end example + +The first trap command sets SIGPIPE to ignore. The second trap command +ostensibly sets it back to its default, bu
bug#33468: A bug with yes and --help
Hello, On 2019-02-19 1:24 a.m., Bernhard Voelker wrote: On 2/18/19 11:20 AM, Assaf Gordon wrote: [...] what do you think ? To Eric's suggestion, I'd remove the RESET_OPTIND function argument, because it's never used. +1 Re. OPTIND: what about resetting the values of all involved externals to their previous value? + int saved_optind = optind; ... + /* Restore previous values. */ + optind = saved_optind; I believe restoring optind is incorrect here - did the tests pass after this change ? For example, I get the following: --- $ ./src/dd -- ./src/dd: unrecognized operand ‘--’ Try './src/dd --help' for more information. $ ./src/nohup -- ./src/nohup: ignoring input and appending output to 'nohup.out' ./src/nohup: failed to run command '--': No such file or directory $ ./src/yes -- | head -n1 -- --- All these programs expect 'optind' to point to the first non-option argument (because they all called "getopt_long" directly before your patch, and parse_gnu_standard_options_only() now calls getopt_long() for them, indirectly). So restoring it to its initial value of 1 is going to confuse the programs when they look into argv[optind] . Unless I got confused (it's rather late here). I'll double-check in the morning. regards, - assaf
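The role of 'optind' can be illustrated with the shell's own getopts builtin, which keeps a similar index in OPTIND. This is only an analogy and not the C code under discussion: the helper name `operands` and its single "-v" flag are made up for this sketch, and unlike GNU getopt_long the shell builtin never reorders its arguments.

```shell
#!/bin/sh
# operands ARGS...: hypothetical helper. Accepts a single flag (-v) and
# prints whatever is left once option parsing has finished.
operands() {
    OPTIND=1                    # start scanning at the first argument
    while getopts 'v' opt; do
        :                       # real option handling would go here
    done
    # After the loop OPTIND indexes the first non-option argument;
    # resetting it to 1 at this point would make "-v" and "--" look
    # like operands again.
    shift $((OPTIND - 1))
    printf '%s\n' "$*"
}

operands -v -- --help           # the text after "--" is an operand
operands -v file1 file2         # operands follow the last option
```

A caller that inspects its arguments after parsing relies on the index having been advanced, which is the same expectation the programs above have of argv[optind].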
bug#33468: A bug with yes and --help
Hello, On 2019-02-15 1:19 p.m., Eric Blake wrote: On 2/15/19 12:32 PM, Assaf Gordon wrote: There is at least one change in behavior, not sure if this is bad enough to be a regression or doesn't really matter: $ yes-OLD me -- --help | head -n1 me -- --help $ yes-NEW me -- --help | head -n1 me --help I would argue bug-fix. [...] So, I would suspect (although I have not yet tesed) that as patched, you would get: $ yes-NEW me -- --help | head -n1 me --help $ POSIXLY_CORRECT=1 yes-NEW me -- --help | head -n1 me -- --help $ yes-NEW -- me -- --help me -- --help Indeed - that's how it behaves with the patch. Thanks for explaining. In the gnulib patch: s/optional/option/ In the coreutils patch: s/non-options/non-option/ Attached updates with your suggested fixes. Also, all coreutils callers pass reset_optind==false; does the gnulib interface still need to provide a reset_optind parameter, given that setting the parameter true forces reliance on the getopt-gnu module as currently coded? The "getopt-gnu" was already a dependency before this patch, not sure if removing this parameter will save much hassle - what do you think ? -assaf >From 08d0505683cebed0fc10cff082255fd79da2d989 Mon Sep 17 00:00:00 2001 From: Bernhard Voelker Date: Thu, 29 Nov 2018 09:06:26 +0100 Subject: [PATCH] long-options: add parse_gnu_standard_options_only Discussed in https://bugs.gnu.org/33468 . * lib/long-options.c (parse_long_options): Use EXIT_SUCCESS instead of 0. (parse_gnu_standard_options_only): Add function to process the GNU default options --help and --version and fail for any other unknown long or short option. See https://gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html . * lib/long-options.h (parse_gnu_standard_options_only): Declare it. * modules/long-options (depends-on): Add stdbool, exitfail. * top/maint.mk (sc_prohibit_long_options_without_use): Update syntax-check rule, add new function name. 
--- lib/long-options.c | 68 +++- lib/long-options.h | 17 + modules/long-options | 2 ++ top/maint.mk | 2 +- 4 files changed, 87 insertions(+), 2 deletions(-) diff --git a/lib/long-options.c b/lib/long-options.c index 037f74b3a..b7acdb040 100644 --- a/lib/long-options.c +++ b/lib/long-options.c @@ -29,6 +29,7 @@ #include #include "version-etc.h" +#include "exitfail.h" static struct option const long_options[] = { @@ -71,7 +72,7 @@ parse_long_options (int argc, va_list authors; va_start (authors, usage_func); version_etc_va (stdout, command_name, package, version, authors); -exit (0); +exit (EXIT_SUCCESS); } default: @@ -87,3 +88,68 @@ parse_long_options (int argc, the probably-new parameters when/if getopt is called later. */ optind = 0; } + +/* Process the GNU default long options --help and --version (see also + https://gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html), + and fail for any other unknown long or short option. + Use with SCAN_ALL=true to scan until "--", or with SCAN_ALL=false to stop + at the first non-option argument (or "--", whichever comes first). + + if RESET_OPTIND=true, the global optind variable will be reset to zero, + preparing (and requiring) a follow-up gnu-compatible getopt() call + (non-gnu getopt functions use optreset=optind=1 instead of 0 for reset). + + if RESET_OPTIND=false, optind is left as-is (suitable for programs + which do not process further option parameters (but could still + process parameters directly by examining argv[optind]). */ +void +parse_gnu_standard_options_only (int argc, + char **argv, + const char *command_name, + const char *package, + const char *version, + bool scan_all, + bool reset_optind, + void (*usage_func) (int), + /* const char *author1, ...*/ ...) +{ + int c; + int saved_opterr; + + saved_opterr = opterr; + + /* Print an error message for unrecognized options. */ + opterr = 1; + + const char *optstring = scan_all ? 
"" : "+"; + + if ((c = getopt_long (argc, argv, optstring, long_options, NULL)) != -1) +{ + switch (c) +{ +case 'h': + (*usage_func) (EXIT_SUCCESS); + break; + +case 'v': + { +va_list authors; +va_start (authors, usage_func); +version_etc_va (stdout, command_name, package, version, authors); +exit (EXIT_SUCCESS); + } + +default: + (*usage_func) (exit_failure); + break; +
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
Hello, Thanks for all comments (on and off list). Attached an updated patch with documentation. The supported options are: --default-signal[=SIG] reset signal SIG to its default signal handler. without SIG, all known signals are included. multiple signals can be comma-separated. --ignore-signal[=SIG] set signal SIG to be IGNORED. without SIG, all known signals are included. multiple signals can be comma-separated. -p same as --default-signal=PIPE (lower-case "-p" as to not conflict with BSD, but of course can be changed to another letter). The new 'env-signal-handler.sh' test passes on GNU/linux, non-gnu/linux (alpine), and Free/Open/Net BSD. Comments very welcomed, - assaf >From 3542f1762c9f14e2275fe5e61d5d7f6275b420a9 Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Fri, 15 Feb 2019 12:31:48 -0700 Subject: [PATCH] env: new options -p/--default-signal=SIG/--ignore-signal=SIG New options to set signal handlers to default (SIG_DFL) or ignore (SIG_IGN) This is useful to overcome POSIX limitation that shell must not override inherited signal state, e.g. the second 'trap' here is a no-op: trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1' Instead use: trap '' PIPE && sh -c 'env -p seq inf | head -n1' Similarly, the following will prevent CTRL-C from terminating the program: env --ignore-signal=INT seq inf > /dev/null See https://bugs.gnu.org/34488#8 . * NEWS: Mention new options. * doc/coreutils.texi (env invocation): Document new options. * man/env.x: Add example of --default-signal=SIG usage. * src/env.c (signals): New global variable. (shortopts,longopts): Add new options. (usage): Print new options. (parse_signal_params): Parse comma-separated list of signals, store in signals variable. (reset_signal_handlers): Set each signal to SIG_DFL/SIG_IGN. (main): Process new options. * src/local.mk (src_env_SOURCES): Add operand2sig.c. * tests/misc/env-signal-handler.sh: New test. * tests/local.mk (all_tests): Add new test. 
--- NEWS | 3 + doc/coreutils.texi | 43 man/env.x| 35 ++ src/env.c| 127 +- src/local.mk | 1 + tests/local.mk | 1 + tests/misc/env-signal-handler.sh | 146 +++ 7 files changed, 355 insertions(+), 1 deletion(-) create mode 100755 tests/misc/env-signal-handler.sh diff --git a/NEWS b/NEWS index fdde47593..5a8e8a3de 100644 --- a/NEWS +++ b/NEWS @@ -67,6 +67,9 @@ GNU coreutils NEWS-*- outline -*- test now supports the '-N FILE' unary operator (like e.g. bash) to check whether FILE exists and has been modified since it was last read. + env now supports '--default-singal[=SIG]' and '--ignore-signal[=SIG]' + options to set signal handlers before executing a program. + ** New commands basenc is added to complement existing base64,base32 commands, diff --git a/doc/coreutils.texi b/doc/coreutils.texi index be35de490..57b209e07 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -17227,6 +17227,49 @@ chroot /chroot env --chdir=/srv true env --chdir=/build FOO=bar timeout 5 true @end example +@item --default-signal[=@var{sig}] +Reset signal @var{sig} to its default signal handler. Without @var{sig} all +known signals are reset to their defaults. Multiple signals can be +comma-separated. The following command runs @command{seq} with SIGINT and +SIGPIPE set to their default (which is to terminate the program): + +@example +env --default-signal=PIPE,INT seq 1000 | head -n1 +@end example + +In the following example: + +@example +trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1' +@end example + +The first trap command sets SIGPIPE to ignore. The second trap command +ostensibly sets it back to its default, but POSIX mandates that the shell +must not change inherited state of the signal - so it is a no-op. 
+ +Using @option{--default-signal=PIPE} (or its shortcut @option{-p}) can be +used to force the signal to its default behavior: + +@example +trap '' PIPE && sh -c 'env -p seq inf | head -n1' +@end example + + +@item --ignore-signal[=@var{sig}] +Ignore signal @var{sig} when running a program. Without @var{sig} all +known signals are set to ignore. Multiple signals can be +comma-separated. The following command runs @command{seq} with SIGINT set +to be ignored - pressing @kbd{Ctrl-C} will not terminate it: + +@example +env --ignore-signal=INT seq inf > /dev/null +@end example + + +@item -p +Equivalent to @option{--default-signal=PIPE} - sets SIGPIPE to its default +behav
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
Hello, On 2019-02-17 1:12 p.m., Paul Eggert wrote: Assaf Gordon wrote: I don't mind either way (env feature or new program). This should be a new feature of 'nohup' not 'env', as 'nohup' is already about signal handling. I don't see a need for a new program. With 'nohup' I don't think there will be an easy (or at least intuitive way) to 'untrap' SIGPIPE without affecting the output: STDOUT will be redirected to 'nohup.out' automatically (unless we add more options like "--no-redirect"). Example: env -C /foo/bar PROGRAM## only change directory env --default-signal=PIPE PROGRAM ## only untrap SIGPIPE env -i PROGRAM ## only empty environment but nohup --default-signal=PIPE PROGRAM Will untrap SIGPIPE *and* SIGHUP *and* redirect stdout to a file. So we'll need to add: nohup --no-redirect-stdout --default-signal=PIPE PROGRAM Also, nohup's manual pages warns: "NOTE: your shell may have its own version of nohup, which usually supersedes the version described here. Please refer to your shell's documentation for details about the options it supports." And if there is a built-in "nohup", it will confuse users who want to use our new feature (and then more support questions, and we have to explain how to use "env nohup" or "\nohup". What do you think? -assaf
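A minimal sketch of the "env does one thing at a time" argument, assuming the installed env already understands --default-signal (older coreutils and non-GNU env do not, hence the probe first). The variable name 'first' is just for illustration:

```shell
# Probe for the option before relying on it.
if env --default-signal=PIPE true 2>/dev/null; then
    # Only SIGPIPE is reset; stdout is untouched, so the pipe still works
    # (nohup would have redirected a terminal stdout to nohup.out).
    first=$(env --default-signal=PIPE seq 1000000 | head -n1)
    echo "first line: $first"
else
    echo "this env has no --default-signal option" >&2
fi
```

Nothing else about the child changes: no SIGHUP handling and no output redirection.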
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
On 2019-02-16 4:56 p.m., Bernhard Voelker wrote: On 2/15/19 10:40 PM, Assaf Gordon wrote: $ seq | env --default-signal PIPE sort -n | sed 5q | wc -l src/env.c| 90 +++- That's quite a lot of new code. What about a new program ... quick shot (and maybe an unlucky name): 'trap' ? I don't mind either way (env feature or new program). "trap" will get mixed-up with the shell's built-in command. How about "untrap" (because the goal is to undo the 'trap' command), and also there's no "untrap" executable name in debian, so no name conflicts? will send an updated patch later today. -assaf
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
Hello,

On 2019-02-15 8:20 a.m., Eric Blake wrote:

On 2/15/19 8:43 AM, 積丹尼 Dan Jacobson wrote:

sort: write failed: 'standard output': Broken pipe
sort: write error
[...]

Perhaps coreutils should teach 'env' a command-line option to forcefully reset SIGPIPE back to default behavior [...] If we did that, then even if your sh is started with SIGPIPE ignored (so that the shell itself can't restore default behavior), you could do this theoretical invocation:

$ seq | env --default-signal PIPE sort -n | sed 5q | wc -l
5

That is a nice idea, I could've used it myself a couple of times. Attached a suggested patch. If this seems like a good direction, I'll complete it with NEWS/docs/etc.

Usage is:

env --default-signal=PIPE
env -P    ## shortcut to reset SIGPIPE
env --default-signal=PIPE,INT,FOO

This also works nicely with the recent 'env -S' option, so a script like so can always start with default SIGPIPE handler:

#!/usr/bin/env -S -P sh
seq inf | head -n1

comments welcomed,
- assaf

>From d65ddf38cd5cf60ba6fc4f1bf60f7324a3e6bebd Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Fri, 15 Feb 2019 12:31:48 -0700 Subject: [PATCH] env: new option -D/--default-signal=SIG [FIXME] See https://bugs.gnu.org/34488#8 . --- src/env.c| 90 +++- src/local.mk | 1 + tests/local.mk | 1 + tests/misc/env-signal-handler.sh | 68 ++ 4 files changed, 159 insertions(+), 1 deletion(-) create mode 100755 tests/misc/env-signal-handler.sh diff --git a/src/env.c b/src/env.c index 3a1a3869e..ebda91589 100644 --- a/src/env.c +++ b/src/env.c @@ -21,12 +21,16 @@ #include #include #include +#include +#include #include #include "system.h" #include "die.h" #include "error.h" +#include "operand2sig.h" #include "quote.h" +#include "sig2str.h" /* The official name of this program (e.g., no 'g' prefix). */ #define PROGRAM_NAME "env" @@ -48,7 +52,15 @@ static bool dev_debug; static char *varname; static size_t vnlen; -static char const shortopts[] = "+C:iS:u:v0 \t"; +/* if true, at least one signal handler should be reset.
*/ +static bool reset_signals ; + +/* if element [SIGNUM] is true, the signal handler's should be reset + to its defaut. */ +static bool signal_handlers[SIGNUM_BOUND]; + + +static char const shortopts[] = "+C:iPS:u:v0 \t"; static struct option const longopts[] = { @@ -56,6 +68,7 @@ static struct option const longopts[] = {"null", no_argument, NULL, '0'}, {"unset", required_argument, NULL, 'u'}, {"chdir", required_argument, NULL, 'C'}, + {"default-signal", optional_argument, NULL, 'P'}, {"debug", no_argument, NULL, 'v'}, {"split-string", required_argument, NULL, 'S'}, {GETOPT_HELP_OPTION_DECL}, @@ -88,8 +101,17 @@ Set each NAME to VALUE in the environment and run COMMAND.\n\ -C, --chdir=DIR change working directory to DIR\n\ "), stdout); fputs (_("\ + --default-signal=SIG reset signal SIG to its default signal handler.\n\ +multiple signals can be comma-separated.\n\ +"), stdout); + fputs (_("\ -S, --split-string=S process and split S into separate arguments;\n\ used to pass multiple arguments on shebang lines\n\ +"), stdout); + fputs (_("\ + -P same as --default-signal=PIPE\n\ +"), stdout); + fputs (_("\ -v, --debug print verbose information for each processing step\n\ "), stdout); fputs (HELP_OPTION_DESCRIPTION, stdout); @@ -525,6 +547,63 @@ parse_split_string (const char* str, int /*out*/ *orig_optind, *orig_optind = 0; /* tell getopt to restart from first argument */ } +static void +parse_signal_params (const char* optarg) +{ + char signame[SIG2STR_MAX]; + char *opt_sig; + char *optarg_writable = xstrdup (optarg); + + opt_sig = strtok (optarg_writable, ","); + while (opt_sig) +{ + int signum = operand2sig (opt_sig, signame); + if (signum < 0) +usage (EXIT_FAILURE); + + signal_handlers[signum] = true; + + opt_sig = strtok (NULL, ","); +} + + free (optarg_writable); +} + +static void +reset_signal_handlers (void) +{ + + if (!reset_signals) +return; + + if (dev_debug) + devmsg ("Resetting signal handlers:\n"); + + for (int i=0; ihttps://www.gnu.org/licenses/>. + +. 
"${srcdir=.}/tests/init.sh"; path_prepend_ ./src +print_ver_ seq +trap_sigpipe_or_skip_ + +# Paraphrasing http://bugs.gnu.org/34488#8: +# POSIX requires that sh started with an inherit
bug#33468: A bug with yes and --help
Hello Eric and all,

Thanks for the quick and detailed review. I've amended all the issues you mentioned.

On 2019-02-13 8:20 p.m., Eric Blake wrote:

15 files changed, 46 insertions(+), 141 deletions(-)

Nice diffstat.

These are of course Bernhard's improvements, I just did the testing (and some minor things).

diff --git a/NEWS b/NEWS

Is "argument" better than "option" here? Or, maybe: now always process --help and --version options, regardless of any other arguments present before any optional -- end-of-options marker.

I've used your phrasing, and also separated "nohup" from the rest of the programs, as it does not accept --help/--version anywhere, just as first arguments.

Attached updated patches, with tests.

comments welcomed,
- assaf

P.S. There is at least one change in behavior, not sure if this is bad enough to be a regression or doesn't really matter:

$ yes-OLD me -- --help | head -n1
me -- --help

$ yes-NEW me -- --help | head -n1
me --help

gnulib-0001-long-options-add-parse_gnu_standard_options_only.patch.gz Description: application/gzip
0001-all-detect-help-and-version-more-consistently.patch.gz Description: application/gzip
bug#34488: Add sort --limit, or document workarounds for sort|head error messages
severity 34488 wishlist
retitle 34488 doc: sort: expand on "broken pipe" (SIGPIPE) behavior
stop

Hello,

On 2019-02-15 7:43 a.m., 積丹尼 Dan Jacobson wrote:

Things start out cheery, but quickly get ugly,

$ for i in 9 99 999 9; do seq $i|sort -n|sed 5q|wc -l; done
5
5
5
5
sort: write failed: 'standard output': Broken pipe
sort: write error
5
sort: write failed: 'standard output': Broken pipe
sort: write error

Therefore, kindly add a sort --limit=n,

I don't think this is wise, as "head -n5" does exactly that in a much more generic way.

and/or on (info "(coreutils) sort invocation") admit the problem, and give some workarounds, lest our scripts occasionally spew error messages seemingly randomly, just when the boss is looking.

Just to clarify: why do you think this is a "problem"? This is the intended behavior of most proper programs: upon receiving SIGPIPE they should terminate with an error, unless SIGPIPE is explicitly ignored.

The errors are not "random" - they happen because you explicitly cut short the output of a program. It is an important indication about how your pipe works, and sort is not to blame, e.g.:

$ seq 10 | head -n1
1
seq: write error: Broken pipe

$ seq 100| cat | head -n1
1
cat: write error: Broken pipe
seq: write error: Broken pipe

This is a good indication that the entire output was not consumed, and is very useful and important in some cases, e.g. when a program crashes before consuming all input. Here's a contrived example:

$ seq 100 | sort -S 200 -T /foo/bar
sort: cannot create temporary file in '/foo/bar': No such file or directory
seq: write error: Broken pipe

I force "sort" to fail (limiting its memory usage and pointing it to a non-existent temporary directory). It is then good to know that seq's output was cut short and not consumed.

If you know in advance you will trim the output of a program, either hide the stderr with "2>/dev/null", or use the shell's "trap PIPE" mechanism.
And no fair saying "just save the output" (could be big) "into a file first, and do head(1) or sed(1) on that." If you want to consume all input and just print the first 5 lines, you can use "sed -n 1,5p" instead of "sed 5q" - no need for a temporary file. I'm marking this as a documentation "wishlist" item, and patches are always welcomed. regards, - assaf
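To make the two behaviours discussed above visible, here is a small sketch that reports the exit status of the writer in a "seq | head -n1" pipeline, once with SIGPIPE left alone and once with it ignored. The helper name seq_status and the use of file descriptor 3 are choices made for this illustration, not something from the thread:

```shell
# Print the exit status of seq after head has closed the pipe early.
# fd 3 carries the status around the pipe, back to our stdout.
seq_status() {
    { { seq 1000000; echo "$?" >&3; } | head -n1 >/dev/null; } 3>&1
}

# SIGPIPE as inherited from the caller (normally the default action).
default_status=$(seq_status)

# SIGPIPE ignored: seq sees a failed write instead of being killed,
# so it prints "write error" on stderr and exits with a failure status.
ignored_status=$( (trap '' PIPE; seq_status) )

echo "inherited SIGPIPE: seq exit status $default_status"
echo "ignored SIGPIPE:   seq exit status $ignored_status"
```

When SIGPIPE has its default action, the first status is normally 128 plus the signal number (141 on Linux) and nothing is printed on stderr. If the first status is also 1, the script itself was started with SIGPIPE ignored, which is the situation the 'env --default-signal' proposal addresses.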
bug#34475: Mention even more worries for test -a
severity 34475 wishlist
retitle 34475 doc: test: expand on -a/-o usage
stop

Hello,

On 2019-02-13 6:00 p.m., 積丹尼 Dan Jacobson wrote:

First, on the test(1) man page, at [...]
Say instead [...]

I'm marking this as a "wishlist" item; patches are always welcome.

-assaf
bug#34487: dd (coreutils) 8.30 – A written ISO image cannot not be booted from BIOS
tags 34487 notabug close 34487 stop Hello, On 2019-02-15 5:02 a.m., Ricky Tigg wrote: Hi. An ISO image cannot not be booted from BIOS [...] # dd if=debian-9.7.0-amd64-DVD-3.iso of=/dev/sdc A CD/DVD image is not the same as a hard drive. The internal structure differs, and one can not be copied to the other and expected to work. Since this is not a bug in dd, I'm closing this item. For general help regarding operating system installation, please contact the relevant operating system's mailing list. In this case, likely Debian-user mailing list: https://lists.debian.org/debian-user/ See also https://wiki.debian.org/DebianMailingLists . regards, -assaf
bug#33468: A bug with yes and --help
Hello, On 2019-02-12 7:00 p.m., Eric Blake wrote: On 2/12/19 7:21 PM, Assaf Gordon wrote: + optind = 1; Why are you doing this in every caller, instead of doing it just once inside the body of parse_gnu_standard_options_only(), so that the state is left unchanged at optind==1 if there were no options parsed? That was just an ugly hack. Here are a more complete patches (both for gnulib and for coreutils). All existing tests pass (including nohup's exit code) but I did not yet write new tests for these improvements. Comments welcomed. -assaf 0001-all-detect-help-and-version-more-consistently-FIXME.patch.gz Description: application/gzip gnulib-0001-long-options-add-parse_gnu_standard_options_only.patch.gz Description: application/gzip
bug#33468: A bug with yes and --help
Hello,

A follow-up and more details:

On 2019-01-12 11:30 a.m., Assaf Gordon wrote:

On 2019-01-12 8:42 a.m., Eric Blake wrote:

On 1/11/19 6:23 PM, Assaf Gordon wrote:

- optind = 0;
+ optind = 1;

Ouch. You're hitting the portability problem of the difference between BSD and glibc.

I only tested on Debian Stretch (with Debian GLIBC 2.24-11+deb9u3), did not yet test on BSDs.

With "optind=1", I see the following:
===
$ ./src/hostid
ec68f06c
[...]

With "optind=0" I see the following:
===
$ ./src/hostid
./src/hostid: extra operand ‘./src/hostid’
Try './src/hostid --help' for more information.

Eric's suggestion was not wrong: "optind=0" was already used (and worked just fine) in parse_long_options. But there's a catch: after calling "parse_long_options" (which sets optind=0), every program called "getopt_long" again, and that call set optind to a non-zero value.

Bernhard's patch removed the (now unneeded) getopt_long call:
===
- parse_long_options (argc, argv, PROGRAM_NAME, PACKAGE, Version,
- usage, AUTHORS, (char const *) NULL);
- if (getopt_long (argc, argv, "", long_options, NULL) != -1)
-usage (EXIT_FAILURE);
+ parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE, Version,
+ true, usage, AUTHORS, (char const *) NULL);
===

And so all these programs were left with "optind=0" when they checked non-option arguments, e.g.:
===
if (optind < argc)
  {
    error (0, 0, _("extra operand %s"), quote (argv[optind]));
    usage (EXIT_FAILURE);
  }
===

which resulted in all the parsing errors I reported previously.

Perhaps "parse_gnu_standard_options_only" should use "_getopt_long_r" and avoid the need to reset anything?

_getopt_long_r was ostensibly fine, but turned out to be messy: when coreutils is built on glibc systems, none of gnulib's getopt replacement modules are used, and so _getopt_long_r is not available.

As all the programs in this patch accept only --help and --version (and non-option arguments), the attached ugly hack seems to solve the issue.
There's probably a prettier way. With this patch, the only issues left are nohup's exit code (1 instead of 125) and "dd --", see https://bugs.gnu.org/33468#29 regards, - assaf >From eb4ed1a5417a2d50941181aa1d8e06b674c661a8 Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Tue, 12 Feb 2019 17:58:47 -0700 Subject: [PATCH] all: parse_gnu_standard_options_only fixup --- src/cksum.c| 1 + src/dd.c | 1 + src/hostid.c | 1 + src/hostname.c | 1 + src/link.c | 1 + src/logname.c | 1 + src/nohup.c| 1 + src/sleep.c| 1 + src/tsort.c| 1 + src/unlink.c | 1 + src/uptime.c | 1 + src/users.c| 1 + src/whoami.c | 1 + src/yes.c | 1 + 14 files changed, 14 insertions(+) diff --git a/src/cksum.c b/src/cksum.c index cda61516a..b62249862 100644 --- a/src/cksum.c +++ b/src/cksum.c @@ -291,6 +291,7 @@ main (int argc, char **argv) parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE, Version, true, usage, AUTHORS, (char const *) NULL); + optind = 1; have_read_stdin = false; diff --git a/src/dd.c b/src/dd.c index b361e7d5a..f47e8a788 100644 --- a/src/dd.c +++ b/src/dd.c @@ -2393,6 +2393,7 @@ main (int argc, char **argv) parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE, Version, true, usage, AUTHORS, (char const *) NULL); + optind = 1; close_stdout_required = false; /* Initialize translation table to identity translation. 
*/ diff --git a/src/hostid.c b/src/hostid.c index d9ea8929b..f023a3da1 100644 --- a/src/hostid.c +++ b/src/hostid.c @@ -66,6 +66,7 @@ main (int argc, char **argv) parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE_NAME, Version, true, usage, AUTHORS, (char const *) NULL); + optind = 1; if (optind < argc) { diff --git a/src/hostname.c b/src/hostname.c index 761f775b4..3a9d1dd80 100644 --- a/src/hostname.c +++ b/src/hostname.c @@ -83,6 +83,7 @@ main (int argc, char **argv) parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE_NAME, Version, true, usage, AUTHORS, (char const *) NULL); + optind = 1; if (argc == optind + 1) { diff --git a/src/link.c b/src/link.c index d70d434d9..d21f36099 100644 --- a/src/link.c +++ b/src/link.c @@ -69,6 +69,7 @@ main (int argc, char **argv) parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE_NAME, Version, true, usage, AUTHORS, (char const *) NULL); + optind = 1; if (argc < optind + 2) { diff --git a/src/logname.c b/src/logname.c in
bug#34345:
On 2019-02-09 1:18 p.m., Ricky Tigg wrote: Covered object by values '1994 s', '2014.25 s' seems to be a unique time elapsed. Those values can therefore be expected to be identical, either '1994 s' or '2014.25 s' – 2014 s and 25 hundredths of s –. The command was: # dd if=/dev/zero of=/dev/sdc status=progress > 8003555840 bytes (8.0 GB, 7.5 GiB) copied, 1994 s, 4.0 MB/s> dd: writing to '/dev/sdc': No space left on device> 15638481+0 records in> 15638480+0 records out> 8006901760 bytes (8.0 GB, 7.5 GiB) copied, 2014.25 s, 4.0 MB/s The first status report (with 1994s) is printed due to "status=progress" and is updated periodically. The last status line (with 2014.25s) was printed about 20 seconds later, hence the time difference.
bug#34220: failure to building with CompCert, patch proposed
tags 34220 wontfix close 34220 stop Hello, On 2019-01-27 9:03 p.m., Paul Eggert wrote: DAVID MONNIAUX wrote: under CompCert, floating-point values are not simplified at compile time [...] please file a bug report for CompCert so that its maintainers can fix the bug in the compiler. Given the above, I'm closing this as "won't fix". Discussion can continue by replying to this thread. -assaf
bug#34340: cp -a doesn't copy acls
tags 34340 moreinfo stop Hello, On 2019-02-05 4:50 p.m., L A Walsh wrote: and it is not on the manpage, but tar copies acls and has them on the manpage. It guess it is an oversite that cp copies over 'xattrs' but not acls? First, Can you verify the 'cp' binary you are using was compiled with ACL support? Something like: $ ldd $(which cp) | grep acl libacl.so.1 => /lib/x86_64-linux-gnu/libacl.so.1 (0x7f0b68066000) Second, Are you using a local file-system or a remote one? There is a previous bug about ACLs on NFS4: https://bugs.gnu.org/20884 . Third, Do you have a reproducible case? e.g. on my local system: $ touch a $ setfacl -m "u:nobody:w" a $ cp -a a b $ getfacl b # file: b # owner: gordon # group: gordon user::rw- user:nobody:-w- group::r-- mask::rw- other::r-- If you have a reproducible case, please also run it with "strace" to help us troubleshoot the issue more clearly, e.g. strace -o cp-acl.log cp -a a b And attach the 'cp-acl.log' file. regards, - assaf
bug#34345: coreutils v.8.30 – Two fractional digits accuracy. Appropriate notation regarding time elapsed.
severity 34345 wishlist retitle 34345 dd: report elapsed time as HH:MM:SS.NNN stop Hello, Ricky Tigg's original message was sent to coreut...@gnu.org (not to bug-coreutils@gnu.org), and did not create a new bug report: https://lists.gnu.org/r/coreutils/2019-02/msg3.html This is of course absolutely fine - but PLEASE keep the same mailing list when replying (e.g. don't reply to bug-coreutils@gnu.org if the message was sent to coreut...@gnu.org). The original message said: On 2019-02-06 2:36 a.m., Ricky Tigg wrote: > Enhancements request > Hi. Command executed: > > # dd if=/dev/zero of=/dev/sdc status=progress > 8003555840 bytes (8.0 GB, 7.5 GiB) copied, 1994 s, 4.0 MB/s > dd: writing to '/dev/sdc': No space left on device > 15638481+0 records in > 15638480+0 records out > 8006901760 bytes (8.0 GB, 7.5 GiB) copied, 2014.25 s, 4.0 MB/s > > > - Could values reported at '(8.0 GB, 7.5 GiB)' be displayed using a two > fractional digits accuracy (model 1.23 GiB). That is not likely to change. The printing code uses a common function (human_readable) that is used in several other coreutils programs, and the convention is to print up to 3 digits (or 2 digits with one decimal point). Examples: 1.1 KB 11 KB 111 KB 11 MB 111 MB 1.1 GB 11 GB 111 GB 1.1 TB The exact number of bytes is printed at the beginning of the line, and you can use 'numfmt' to format it to your liking: $ dd if=/dev/zero of=/dev/zero bs=1M count=10 2>&1 \ | tail -n1 | numfmt --to=si --format '%2.5f' 10.48576M bytes (10 MB, 10 MiB) copied, 0.00315612 s, 3.3 GB/s > - Could values reported at '1994 s', '2014.25 s' to be displayed > according to notation <*hour*>*h*<*minutes*>*'*<*seconds*>*''*. An interesting idea. I couldn't find previous discussion about it, so I'll mark this item as a "wishlist". As always, patches are welcomed. 
> Potential issue: > Since values '1994 s', '2014.25 s' expressed here may in the present > context cover nothing but a same object, non-identical values may be > interpreted as a programming issue. Regards. I don't understand the above - can you clarify ? regards, - assaf
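Until dd prints such a format itself, the elapsed time can be converted after the fact. A rough sketch follows; the function name fmt_elapsed is hypothetical, and it drops the fractional part rather than printing the .NNN suggested in the retitle:

```shell
# Convert a number of seconds (optionally with a fraction) to HH:MM:SS.
fmt_elapsed() {
    secs=${1%.*}    # strip ".25" etc.; whole numbers pass through unchanged
    printf '%02d:%02d:%02d\n' \
        $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

fmt_elapsed 1994       # prints: 00:33:14
fmt_elapsed 2014.25    # prints: 00:33:34
```

The two values from the report differ by about 20 seconds, which is the gap between the last progress update and the final summary line.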
bug#34349: unrecognized file system type 0x794c7630
Hello, On 2019-02-06 5:16 a.m., Matt Wilder wrote: tail: unrecognized file system type 0x794c7630 for ‘/var/log/syslog’. please report this to bug-coreutils@gnu.org. reverting to polling Thank you for the report. This has been fixed in version 8.25 and later, for more details see https://www.gnu.org/software/coreutils/filesystems.html . regards, - assaf
bug#34143: [coreutils 8.28] du -x is reporting a lower disk usage for /mnt when partitions are mounted
tags 34143 notabug
close 34143
stop

Hello,

On 2019-01-19 3:11 p.m., Joseph Paul wrote:

It may not be a bug at all, but I was surprised to find out that 'du -x' is reporting a lower disk usage on /mnt when partitions are mounted.

This is not a bug. Technically, as you wrote below, du simply skips (and does not count) any directory that is not on the same filesystem.

[...]
linux$ du -x /mnt
4       /mnt/data
4       /mnt/VL1800
4       /mnt/nfs/nas
8       /mnt/nfs
20      /mnt

/mnt is now bigger. Is this a normal result, because even when mounted, physically, the directories '/mnt/VL1800' and '/mnt/data' still exist on the '/' filesystem, or not ? Shouldn't they still occupy 4Kb of disk space each on the '/' filesystem when partitions are mounted ?

They do occupy as much disk space as before, but du has no way to know how much they occupy, because the kernel reports that they are on a different device and you requested -x/--one-file-system.

We can even take it a step further, and mount a new filesystem on a non-empty directory - none of the directory's content will be counted:

As root, create the directory structure:

cd /tmp
mkdir -p a a/b a/c a/d

Now fill the "b" directory with a large file:

dd if=/dev/zero of=a/b/bigfile bs=1M count=1

Before any mounts, "b" is counted:

# du -x a
4       a/c
1028    a/b
4       a/d
1040    a

Now create a temporary file system loop file, and mount it over "b":

dd if=/dev/zero of=disk.img bs=1M count=10
mkfs.ext3 disk.img
mount -o loop disk.img a/b

Re-checking disk-usage, "b" is not even listed, and its content (1MB) is not counted:

# du -x a
4       a/c
4       a/d
12      a

---

To see why du skips it, you can check the Device-ID associated with each directory:

# stat -c "%n Device-ID: %D Mount-Point: %m" a a/b a/c a/d
a Device-ID: 812 Mount-Point: /tmp
a/b Device-ID: 700 Mount-Point: /tmp/a/b
a/c Device-ID: 812 Mount-Point: /tmp
a/d Device-ID: 812 Mount-Point: /tmp

Your device numbers will differ, but the number for "a/b" will not be the same as for the rest.
When du sees a different device number, it simply skips the directory. Once unmounted, the device-id returns to the old value, and "a/b" will be counted with its content: # umount a/b # stat -c "%n Device-ID: %D Mount-Point: %m" a a/b a/c a/d a Device-ID: 812 Mount-Point: /tmp a/b Device-ID: 812 Mount-Point: /tmp a/c Device-ID: 812 Mount-Point: /tmp a/d Device-ID: 812 Mount-Point: /tmp As such, I'm closing this as "not a bug", but discussion can continue by replying to this thread. regards, - assaf
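The device comparison can be scripted with the same stat format used above. This is only a sketch of the idea, not du's actual code; same_device is a hypothetical helper name, and 'stat -c %D' is the GNU stat syntax already shown in this message:

```shell
# Succeed if both paths live on the same device (same filesystem).
same_device() {
    [ "$(stat -c %D "$1")" = "$(stat -c %D "$2")" ]
}

# Compare a directory with itself, then with one of its entries.
if same_device /tmp /tmp; then
    echo "/tmp and /tmp: same device"
fi
same_device / /tmp || echo "/tmp is a separate filesystem; 'du -x /' skips it"
```

Run against the example above, 'same_device a a/b' would fail while the loop image is mounted and succeed again after 'umount a/b'.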
bug#32198: tail -f -F unexpected behavior
tags 32198 notabug close 32198 stop Hello, It seems your message has not been replied to in a long while. Sorry about that. On 2018-07-18 8:24 a.m., Matthew Guidry wrote: I was doing some experimentation with nano v2.9.3 and tail, watching the output of tail after saving in nano and encountered some strange behavior. This is not a bug at all (not in tail nor in nano). It is the result of updating a file in-place (i.e. changing existing bytes) which 'tail' already consumed. I had two terminals open side by side; one with nano and one with tail. I opened a file called test.txt in nano and saved with ^w in the first terminal. I went to the second terminal and ran tail -f test.txt to watch the file. I went back to the nano terminal and returned twice and saved. The tail terminal reports this change properly. With the file still open in nano, I write any number of characters and save. The tail terminal reports this change But skips the first character. To better see what happens, open a third terminal, and run the following command (after initially saving the file): watch -n1 od -tc test.txt Which will show the content of the file, updated once a second. I will use a similar but slightly different flow: 1. When you first save (in nano) the file, it is empty. The "od" terminal will show: 000 2. Type "12345" (don't press ENTER), and save (ctrl-O). The "od" terminal will show: 000 1 2 3 4 5 \n 006 The "tail" terminal will show: 12345 AND the cursor in the "tail" terminal will go to the next line (as there is a newline in the file). 3. Still in nano, on the same line, type "67890" (don't press ENTER), and save (CTRL-O). The "od" terminal will show: 000 1 2 3 4 5 6 7 8 9 0 \n 011 The "tail" terminal will show: 12345 7890 Here, the "6" character was not displayed by "tail". The reason is that that character in offset 6 of the file used to be a newline, and "tail" already consumed it. 
When the line was changed, nano went back and changed existing data in the file (or re-wrote the file completely - not sure about the implementation). "tail" has no way to detect that or "go back" in the file. This is a carefully constructed example, where the data change is small enough so that "tail" almost doesn't notice it. If you make larger changes, or delete some parts of the file, nano will rewrite the file completely and "tail" will issue a warning such as: tail: test.txt: file truncated and then re-read the file. As such I'm closing this as "not a bug", but discussion can continue by replying to this thread. regards, - assaf
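The effect can be reproduced without nano or tail. The sketch below is an assumption-laden stand-in: a file descriptor kept open by the shell plays the role of "tail -f" (it remembers its read offset), and a non-truncating rewrite via the `1<>` redirection plays the role of the editor updating the file in place.

```shell
#!/bin/sh
f=$(mktemp)

# Step 2 of the flow above: the file contains "12345" plus a newline (6 bytes).
printf '12345\n' > "$f"

# Open a reader on fd 3 and consume everything; its offset is now 6.
exec 3< "$f"
first=$(cat <&3)

# Step 3: rewrite the file in place. "1<>" opens for read/write without
# truncating, so the old newline at the 6th byte is overwritten by "6".
printf '1234567890\n' 1<> "$f"

# The reader continues from its old offset and never sees the "6".
second=$(cat <&3)

exec 3<&-
rm -f "$f"
printf 'first read:  %s\nsecond read: %s\n' "$first" "$second"
```

The second read yields "7890", matching what the "tail" terminal showed.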
bug#32455: cp gets confused by symlinks to parent directory
tags 32455 notabug close 32455 stop Hello, It seems your message has not been replied to in a long while. Sorry about that. On 2018-08-16 8:47 a.m., Mike Crowe wrote: If cp is passed the -d option and told to copy a symlink to the directory containing the symlink then it ends up removing the target directory so it is unable to create the symlink. If my understanding is correct, the "-d" flag is not relevant to the issue. The problem is that "self" is a symlink to a directory: Reproduction script: --8<-- #!/bin/sh set -x rm -rf temp mkdir -p temp/src temp/dest ln -s . temp/src/self # This one works cp -vd temp/src/self temp/dest/self This works because "temp/dest/self" does not exist. In this case, "temp/dest" is taken as the destination directory, and "self" is taken as the name of the file/dir/symlink to create. That is, you could run "cp -vd temp/src/self temp/dest/foobar" to create "foobar" as a copy of "self". # This one fails cp -vd temp/src/self temp/dest/self Here, "temp/dest/self" already exists, and it is a symlink to a directory. Meaning, the request is: copy "temp/src/self" into the directory "temp/dest/self/" (and create "temp/dest/self/self"). This would have gone well, except that because "self" is a symlink to ".", it can be resolved indefinitely: $ file temp/dest/self temp/dest/self: symbolic link to . $ file temp/dest/self/self/self/self/self/self temp/dest/self/self/self/self/self/self: symbolic link to . $ file temp/dest/self/self/self/self/self/self/self temp/dest/self/self/self/self/self/self/self: symbolic link to . "cp" first removes "temp/dest/self/self" (which is valid), but then, "temp/dest/self" is gone (since it is the same file path after resolving it). Hence, "cp" fails by saying "no such directory" on "temp/dest/self/self". When this step is done, "temp/dest/self" does not exist, and so: # This one works again cp -vd temp/src/self temp/dest/self This works as before. 
You can observe what happens on the kernel level by adding "strace -e trace=file" before the "cp" commands, this might help in deeper understanding. To illustrate this differently: When creating regular directories and files, then deleting the innermost files, it is naively expected that the parent directories still exist: mkdir -p a/b/c/d touch a/b/c/d/e rm a/b/c/d/e That is, a normal program can call "dirname("a/b/c/d/e")" to get the parent directory of "e", and expect it to still exist even after "e" is deleted. But with your case: $ mkdir a $ ln -s . a/self $ rm a/self/self/self/self/self/self All the "apparent" parent directories ("self/self/self/self/self") are gone! Expected behaviour: There should be no error message emitted by the second invocation of cp, and the target directory should be in the same state as it is after the first or third attempts to copy the symlink. Not exactly. What you want is for the DEST parameter of "cp" to always be a file, never to be considered a directory, i.e. "temp/dest/self" should always be interpreted as file "self" in directory "temp/dest", never as directory "temp/dest/self". Luckily, there is already an option for that: -T, --no-target-directory treat DEST as a normal file With "-T", repeated commands work as you expected: $ mkdir -p temp/src temp/dest $ ln -s . temp/src/self $ cp -vdT temp/src/self temp/dest/self 'temp/src/self' -> 'temp/dest/self' $ cp -vdT temp/src/self temp/dest/self removed 'temp/dest/self' 'temp/src/self' -> 'temp/dest/self' $ cp -vdT temp/src/self temp/dest/self removed 'temp/dest/self' 'temp/src/self' -> 'temp/dest/self' $ cp -vdT temp/src/self temp/dest/self removed 'temp/dest/self' 'temp/src/self' -> 'temp/dest/self' [... ad infinitum ...] I hope this addresses the issue, I'm closing this as "not a bug", but discussion can continue by replying to this thread. regards, - assaf
bug#34110: feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)
Hello, On 2019-01-18 2:56 a.m., René J.V. Bertin wrote: the code isn't the most welcoming to dive into I've ever seen ;) Two online resources that might help in exploring the code: http://www.maizure.org/projects/decoded-gnu-coreutils/ https://opengrok.housegordon.com/source/xref/coreutils/ regards, - assaf
bug#8960: stdbuf on bi-arch systems
severity 8960 wishlist stop (triaging old bugs) Hello, On 2011-07-04 10:15 a.m., Pádraig Brady wrote: On 29/06/11 21:47, Bruno Haible wrote: The program 'stdbuf' on bi-arch x86 / x86_64 systems cannot work on all kinds of programs. [...] I would like to have a single binary that works on both x86 and x86_64 programs. [...] if stdbuf sets both LD_PRELOAD_32 and LD_PRELOAD_64 to the appropriate libstdbuf.so, it should just work. It's been more than 7 years since last comments/progress on this issue. Is it still relevant / needed ? If no one replies, I'll close it as "wontfix" in a few days. regards, - assaf
bug#12339: Gnu rm, changed only recently (4-5 years), and didn't follow letter of posix...(statement follows)
close 12339 stop (triaging old bugs) Hello, This long and winding thread covers several topics relating to rm(1), historical unix and POSIX compatibility (and a bugfix or two in the mix). An enlightening read for those interested... ( https://bugs.gnu.org/12339 ) But the bottom line is: rm -rf . will not delete the content of the current directory (while keeping the directory itself) and that is not likely to change. Two suggested alternatives: find . -delete rm -rf * .[!.] .??* As such, and with no more comments in 6 years, I'm closing this bug. PLEASE do not reply to this thread. If there are other relevant issues (that have not been discussed elsewhere, and have not been previously rejected), please start a new thread by emailing coreut...@gnu.org . regards, - assaf
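A minimal sketch of the first alternative, run in a throw-away directory. It assumes a find that supports `-mindepth` and `-delete` (both are extensions, available in GNU findutils); `-mindepth 1` is added here only to make it explicit that "." itself is left alone.

```shell
#!/bin/sh
dir=$(mktemp -d)

# Populate the directory with regular and hidden entries.
mkdir -p "$dir/sub/deeper" "$dir/.hidden-dir"
touch "$dir/file" "$dir/.dotfile" "$dir/sub/deeper/x"

# Delete the content of the current directory, keeping the directory itself.
( cd "$dir" && find . -mindepth 1 -delete )

remaining=$(find "$dir" -mindepth 1 | wc -l)
if [ -d "$dir" ]; then dir_kept=yes; else dir_kept=no; fi
echo "entries left: $remaining, directory kept: $dir_kept"

rmdir "$dir"
```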
bug#12400: rmdir runs "amok", users "curse" GNU...(as rmdir has no option to stay on 1 file system)...
retitle 12400 rmdir: add --one-file-system option severity 12400 wishlist tags 12400 wontfix stop (triaging old bugs) Hello, On 2012-09-09 11:22 p.m., Bob Proulx wrote: Linda Walsh wrote: If you are going to only provide 1 mode of functionality, it should be to only rmdir dirs on the same file system as the starting args. [...] But rmdir only removes the directories you tell it to remove. [...] If you want a recursive option why not use 'rm -rf'? There is always 'find' with the -delete option. But regardless there has been the find -exec option. find /some/path -type d -delete find /some/path -depth -type d -exec rmdir {} + With no further comments in 6 years, I'm closing this request. regards, - assaf
bug#33211: coreutils.mo is in both LC_TIME and LC_MESSAGES folders
tags 33211 notabug close 33211 stop Hello, On 2018-10-30 3:33 p.m., scootergrisen wrote: I wonder if its a mistake that in Fedora i can see coreutils.mo in both: /usr/share/locale/*/LC_TIME /usr/share/locale/*/LC_MESSAGES They seem to be identical. This is not a mistake (nor a bug). Not only are they identical, one is a symlink to the other: $ cd /usr/local/share/locale/ca $ ls -log LC_*/coreutils.mo -rw-r--r-- 1 379478 Dec 27 22:47 LC_MESSAGES/coreutils.mo lrwxrwxrwx 1 27 Dec 27 22:47 LC_TIME/coreutils.mo -> ../LC_MESSAGES/coreutils.mo coreutils.mo is the only file i see in the /usr/share/locale/*/LC_TIME folder. Most programs that use gettext (https://www.gnu.org/software/gettext/) are concerned with user-visible messages, hence most translations only use the LC_MESSAGES directory, and there's no need for other files. A few coreutils programs (e.g. date, sort) do care about translation of time-related strings (e.g. day / month names). That's why coreutils also uses LC_TIME. One can ask for the date/time to use one locale, and messages to use another: $ export LC_TIME=ru_RU.UTF-8 $ export LANGUAGE=ja_JP.UTF-8 $ date Пт янв 18 01:06:10 MST 2019 $ date -d ABCD date: `ABCD' は無効な日付です Should the /usr/share/locale/*/LC_TIME/coreutils.mo files be removed so there is only the /usr/share/locale/*/LC_MESSAGES/coreutils.mo files? No, both should exist, otherwise setting LC_TIME won't work. Technically, the translated strings for both messages and time are stored in the same file - that's why when coreutils is installed, one is a symlink to the other. Even more technically, when building from source, the file "bootstrap.conf" contains the following: # Other locale categories that need message catalogs. EXTRA_LOCALE_CATEGORIES=LC_TIME The directory "./po" is populated with the available translations (e.g. "ru.po" and "ja.po"). During the build, the ".po" files are compiled into binary ".gmo" files. During installation, the files are copied/symlinked: $ make install [...] 
make[2]: Entering directory '/home/gordon/projects/coreutils/po' installing af.gmo as /usr/local/share/locale/af/LC_MESSAGES/coreutils.mo installing af.gmo link as /usr/local/share/locale/af/LC_TIME/coreutils.mo installing be.gmo as /usr/local/share/locale/be/LC_MESSAGES/coreutils.mo installing be.gmo link as /usr/local/share/locale/be/LC_TIME/coreutils.mo [...] Hope this addresses the issue. I'm closing this as "not a bug", but discussion can continue by replying to this thread. regards, - assaf
bug#33646: [PATCH] doc: improve wording of the --kibibytes option description
Hello, On 2018-12-06 6:32 a.m., Kamil Dudka wrote: Bug: https://bugzilla.redhat.com/1527391 --- doc/coreutils.texi | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/doc/coreutils.texi b/doc/coreutils.texi index f8339d73f..e93fe71a0 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -7975,9 +7975,11 @@ Append @samp{*} for executable regular files, otherwise behave as for @opindex --kibibytes Set the default block size to its normal value of 1024 bytes, overriding any contrary specification in environment variables -(@pxref{Block size}). This option is in turn overridden by the -@option{--block-size}, @option{-h} or @option{--human-readable}, and -@option{--si} options. +(@pxref{Block size}). If @option{--block-size}, @option{-h}, +@option{--human-readable}, or @option{--si} options are used, +they take precedence over @option{-k} or @option{--kibibytes} +even if @option{-k} or @option{--kibibytes} is placed after +the other options. The @option{-k} or @option{--kibibytes} option affects the per-directory block count written by the @option{-l} and similar I'm ok with this improvement - if there are no comments I'll push in the next few days. -assaf
bug#33718: Syntaxe problem? I can't find the solution :-(
tags 33718 moreinfo stop Hello, On 2018-12-13 1:54 a.m., Rudy BROSTEAUX wrote: Environment: AIX 7.2 TL3 SP1 (on IBM Power Systems) Origin of the coreutils RPM used @release 8.30 is perzl.org Installed using a yum server. *** /root> /usr/bin/time timeout 2.3 sleep 5 timeout: warning: timer_create: Invalid argument From: Bernhard Voelker No idea. For an analysis, we need more information: timeout version, OS/kernel version, and finally of course a reproducer, i.e., the exact command line you were using. timeout works well here: $ /usr/bin/time -f '%e' timeout 2.3 sleep 4.5 Command exited with non-zero status 124 2.30 I tested coreutils-8.30 built from source on AIX 7.2, both 64 bit and 32 bit, and both work fine: $ file src/timeout src/timeout: 64-bit XCOFF executable or object module not stripped $ /usr/bin/time ./src/timeout 2.3 sleep 5 Real 2.31 User 0.00 System 0.00 $ file src/timeout src/timeout: executable (RISC System/6000) or object module not stripped $ /usr/bin/time ./src/timeout 2.3 sleep 5 Real 2.30 User 0.00 System 0.00 Perhaps it is a problem in the RPM package? Please try to build from source code and see if you still experience the issue. regards, - assaf
bug#9089: pipe failure with cat and head of coreutils 6.12
close 9089 stop (triaging old bugs) Hello, On 2011-07-15 5:30 a.m., Philipp Thomas wrote: I'm trying to track down a bug in cat of coreutils 6.12. Doing cat /var/log/Xorg.0.log | head -n70 under ksh consistently fails with 'cat: write error: Connection reset by peer'. It does not fail when run under bash and it does not fail in current coreutils. I'm still able to reproduce this problem (i.e. "cat: write error" with coreutils 6.12 under ksh on Linux 4.9.0). However, given that it has been seven and a half years since the last comment, that even then it was already acknowledged that the problem does not happen in later versions, and that it happens because ksh uses socketpairs instead of pipes, I'm closing this bug. If the issue of cat(1) supporting socketpair/ECONNRESET instead of pipes/EPIPE is still relevant, we can re-open the bug. regards, - assaf
bug#34110: feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)
severity 34110 wishlist retitle 34110 du: add dual-column showing apparent-size and disk-size stop Hello, On 2019-01-17 3:13 a.m., René J.V. Bertin wrote: On Wednesday January 16 2019 16:06:50 Assaf Gordon wrote: I hope this helps to clarify "apparent-size". Yes and no :) I understand what "apparent-size" does [] My whole point is that there might be a better name. The parameter name "--apparent-size" is not likely to be changed. It has been named so for about 16 years (since 'fileutils 4.5.8' which is even before 'coreutils' was created as a unified package). Changing it would break existing scripts and user expectations. I realise that you cannot really call the content size observable "real size" when reporting from a disk-usage viewpoint, but "content size" (--content-size, -C) should be clear enough? Creating a second alias to "--apparent-size" is possible, but I'm not sure it's warranted. --- I think the discussion about "--apparent-size" is mostly concluded, but the idea to have two-columns is an interesting feature request. I'm marking this as a "wish list" item. Concrete patches are welcomed. regards, - assaf
bug#13738: Add --all option to 'users' command
tags 13738 wontfix close 13738 stop (triaging old bugs) Hello, On 2013-02-18 2:01 p.m., Bob Proulx wrote: anatoly techtonik wrote: Bob Proulx wrote: anatoly techtonik wrote: The 'users' command shows users who are currently online. It will be nice to have --all option to show all users. Do you mean the equivalent to this? $ getent passwd | awk -F: '{print$1}' [] Solving the problem in general gets messy very quickly. It is therefore one of those that is better solved locally by providing the tools needed to do what is needed on a case by case basis. So far after forty years of Unix and GNU systems this hasn't been needed and therefore the use cases must be unusual. The philosophy isn't to solve all problems but just to make all problems solvable. It would help if you could say a few words about the case in which this would be helpful? With no further comments in almost 6 years, and this item already listed under our "rejected requests" page, I'm closing this as "won't fix". regards, - assaf
bug#16282: revisit; reasoning for not using ENV vars to provide workarounds for POSIX limitations?
severity 16282 wishlist tags 16282 wontfix close 16282 stop (triaging old bugs) Hello, On 2013-12-28 1:03 p.m., Paul Eggert wrote: [...] if it makes a standard utility behave in odd ways, it'll break scripts that don't expect the odd behavior. That's the essential objection here. Yes, we've used env vars in the past for this, but we've come to regret it, and we don't want to make matters worse in this respect without a compelling justification. Given the above, and with no further comments in 5 years, I'm closing this bug. More details about the reasoning for rejecting new environment variables are summarized here: https://www.gnu.org/software/coreutils/rejected_requests.html#envvar regards, - assaf
bug#12820: FWIW, this is still happening as of gnulib 4a82904
close 12820 stop (triaging old bugs) Hello, On 2013-02-28 10:08 a.m., Paul Eggert wrote: Perhaps there's a bug in nap () but if so the bug should be fixed there. Given the above, and with no further comments in almost 6 years, I'm closing this bug. Discussion can continue by replying to this thread. - assaf
bug#33785: df: don't suppress remote mounts
tags 33785 notabug close 33785 stop Hello, On 2018-12-19 10:05 a.m., Pádraig Brady wrote: On 17/12/18 22:42, lzhong wrote: According to the following commit commit 2e81e62243409c5c574b899f52b08c000e4d99fd df: only suppress remote mounts of separate exports with --total [...] The remote mounts should not be suppressed after this change. However, it turns out it doesn't work as the message described. The remote mounts are still suppressed. And here is The intent of the patch was not to suppress _separate_ exports on the server. I.E. nas.example.com:/Photos and nas.example.com:/Download would not be suppressed (even if they have the same device id). If you want all nfs mounts you could `df -a -t nfs` With no further comments, I'm closing this as "notabug". Discussion can continue by replying to this thread. -assaf
bug#34115: coreutils v. 8.30– Document's content gets deleted using cat(1)
tags 34115 notabug close 34115 merge 34115 33823 stop Hello, On 2019-01-17 5:53 a.m., Ricky Tigg wrote: [...] $ cat > .inputrc set enable-bracketed-paste on Press *Return*, then *Ctrl D*. [...] Content of *.inputrc*, which is expected to be still present, has been This sounds very similar to the previous email you sent ( https://bugs.gnu.org/33823 ). As before, it is not a problem in "cat", but perhaps a problem in your GUI, X11, xterminal, or perhaps a problem in copy&pasting to the clipboard. regards, - assaf
bug#34110: feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)
Hello, I'll address only the "apparent-size" issue (not the two-columns, or compressed file-systems): On 2019-01-16 1:13 p.m., René J.V. Bertin wrote: According to `du --help`, the apparent-size option reports a size that is not the actual disk usage. The numbers above seem to show the opposite. If anything, I find the concept of "apparent size" more appropriate to the size a file occupies on the storage medium because ultimately that storage device will not give you more than "struct stat : st_size" bytes for uncompressed filesystems. Another way to say it: with "--apparent-size", du returns the actual file size; without, it returns how large the file appears to be (judging from its disk footprint). "apparent-size" shows how much content/data the file has. Without "apparent-size", du shows the amount of storage consumed (or "wasted"?) on the storage medium (accounting for sparse file holes, though I'm not sure about compression). To illustrate, create three files with specific sizes: $ head --bytes=1700 /dev/zero > a $ head --bytes=4097 /dev/zero > b $ truncate --size=1050000 c # will be a sparse file These are their sizes, as in the amount of bytes they contain: $ ls -log total 12 -rw-r--r-- 1 1700 Jan 16 15:36 a -rw-r--r-- 1 4097 Jan 16 15:36 b -rw-r--r-- 1 1050000 Jan 16 15:37 c These are their "apparent-sizes", rounded up to the nearest 1K block: $ du --apparent-size a b c 2 a 5 b 1026 c e.g. file "a" is 1700 bytes, rounded-up to 2K, and "du --apparent-size" shows "2". Using "--apparent-size --block-size=1" (and its equivalent, "--bytes") will show the exact sizes: $ du --apparent-size --block-size=1 a b c 1700 a 4097 b 1050000 c Without "--apparent-size", du shows how much storage space is actually used/wasted/consumed on the storage medium by the files: $ du a b c 4 a 8 b 0 c How are these numbers calculated? 
The simplest case is file "c" - it is completely sparse - so despite logically containing 1,050,000 zeros, on the actual storage medium it consumes zero data blocks (ignoring inode blocks and somesuch). File "a" has 1,700 bytes of data. On my filesystem the basic block size is 4096, as shown by "stat -f": $ stat -f / File: "/" ID: 5a2cade519bada6a Namelen: 255 Type: ext2/ext3 ->Block size: 4096 Fundamental block size: 4096<- Blocks: Total: 27559017 Free: 18845977 Available: 17435289 Inodes: Total: 7036928 Free: 6496730 Therefore, any file from size 1 to size 4096 will consume exactly one disk block. On most common filesystems, disk blocks cannot be shared between files. Meaning that this block is fully consumed. That's why for file "a" du shows "4" - meaning 4K bytes (exactly one block) is consumed on the storage medium by this file. Similarly for file "b" - its size is 4097, which is 1 byte more than one filesystem block. Hence, file "b" consumes 2 blocks, coming up to 8K. du then shows "8" for file "b". Now to your examples: %> du -hcs /Volumes/nif64/tmp/.npm/ ; du -hcs --apparent-size /Volumes/nif64/tmp/.npm/ 340M /Volumes/nif64/tmp/.npm/ 180M /Volumes/nif64/tmp/.npm/ Same folder on btrfs (mounted with compress=lzo): %> du -hcs /mnt/.npm/ ; du -hcs --apparent-size /mnt/.npm 198M /mnt/.npm/ 181M /mnt/.npm In both cases, "du --apparent-size" shows about 180MB of actual data (181MB in the second example). That is the amount of actual content (number of total bytes in these files). In the first case, these files consume 340MB of space on your disk. In the second case, these files consume 198MB of space on your disk. The reason they consume MORE than their actual data is explained above with the file-system blocks. This suggests to me that compression is not accounted for in these values. If it were, then the consumed size (without "--apparent-size") should've been less than the actual size (with "--apparent-size"). 
A quick on-line search shows that btrfs's default block size is 16K, while ZFS's default record-size is 128KB. That might explain why a similar amount of data (and I assume, a similar number of files and sizes) consumes more disk space on ZFS (I could be wrong, though; comments are welcome). I hope this helps to clarify "apparent-size". I'll leave it to others to comment on how compressed file systems come into play with du. regards, - assaf
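The rounding described in this message can be checked with plain shell arithmetic. This is a sketch of the arithmetic only, not of du's implementation: the helper name `round_up_units` is hypothetical, the 4096-byte block size is the one from the "stat -f" output above, and the disk-usage figures apply to non-sparse files only (the sparse file "c" consumes no data blocks).

```shell
#!/bin/sh
# Hypothetical helper: how many UNIT-sized units are needed to hold SIZE bytes.
round_up_units() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# "du --apparent-size": byte counts rounded up to 1K units.
apparent_a=$(round_up_units 1700 1024)
apparent_b=$(round_up_units 4097 1024)
apparent_c=$(round_up_units 1050000 1024)

# Plain "du" for non-sparse files: whole 4096-byte filesystem blocks,
# reported in 1K units.
disk_a=$(( $(round_up_units 1700 4096) * 4096 / 1024 ))
disk_b=$(( $(round_up_units 4097 4096) * 4096 / 1024 ))

echo "apparent: a=$apparent_a b=$apparent_b c=$apparent_c"
echo "on disk:  a=$disk_a b=$disk_b"
```

The results agree with the du output quoted above: 2, 5 and 1026 for the apparent sizes, 4 and 8 for the on-disk sizes of "a" and "b".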
bug#33468: A bug with yes and --help
Hello Eric, On 2019-01-12 8:42 a.m., Eric Blake wrote: On 1/11/19 6:23 PM, Assaf Gordon wrote: - optind = 0; + optind = 1; Ouch. You're hitting the portability problem of the difference between BSD and glibc. Otherwise many things fail like so: $ ./src/dd ./src/dd: unrecognized operand ‘./src/dd’ Try './src/dd --help' for more information. That's the symptoms on BSD for optind = 0 (there, you HAVE to use optreset=optind=1 for a complete reset; or plain optind=1 for a soft reset where the man page is not clear if it will always work). But on glibc, optind=1 does a soft reset (works if the optstring does not start with '-' or '+' and if you did not change POSIXLY_CORRECT), but MUST use optind = 0 if you want a hard reset. I only tested on Debian Stretch (with Debian GLIBC 2.24-11+deb9u3), did not yet test on BSDs. With "optind=1", I see the following: === $ ./src/hostid ec68f06c $ ./src/sleep ./src/sleep: missing operand Try './src/sleep --help' for more information. $ ./src/uptime 11:14:05 up 23 days 21:23, 4 users, load average: 1.16, 1.05, 0.52 $ ./src/users gordon gordon gordon gordon $ ./src/nohup ./src/nohup: missing operand Try './src/nohup --help' for more information. $ ./src/dd ## waits for CTRL-C ^C 0+0 records in 0+0 records out 0 bytes copied, 1.10243 s, 0.0 kB/s $ ./src/yes | head -n1 y === With "optind=0" I see the following: === $ ./src/hostid ./src/hostid: extra operand ‘./src/hostid’ Try './src/hostid --help' for more information. $ ./src/sleep ./src/sleep: missing operand Try './src/sleep --help' for more information. $ ./src/users $ ./src/users | od -tx1 000 02 e2 03 0a 004 $ ./src/users /var/log/wtmp ./src/users: extra operand ‘/var/log/wtmp’ Try './src/users --help' for more information. $ ./src/nohup ./src/nohup: ignoring input and appending output to 'nohup.out' ^C $ ./src/dd ./src/dd: unrecognized operand ‘./src/dd’ Try './src/dd --help' for more information. 
$ ./src/yes | head -n1 ./src/yes === Perhaps "parse_gnu_standard_options_only" should use "_getopt_long_r" and avoid the need to reset anything? regards, - assaf
bug#33468: A bug with yes and --help
Hello Berny and all, On 2018-11-29 1:48 a.m., Bernhard Voelker wrote: The attached are quite raw attempts to address this - yes, as a function instead of a macro. ;-) * [PATCH] long-options: add parse_gnu_standard_options_only gnulib patch! For the gnulib patch, I believe the following is needed: diff --git a/lib/long-options.c b/lib/long-options.c index 52ef1f2f8..9567d5135 100644 --- a/lib/long-options.c +++ b/lib/long-options.c @@ -139,7 +139,7 @@ parse_gnu_standard_options_only (int argc, /* Restore previous value. */ opterr = saved_opterr; - /* Reset this to zero so that getopt internals get initialized from + /* Reset this to one so that getopt internals get initialized from the probably-new parameters when/if getopt is called later. */ - optind = 0; + optind = 1; } Otherwise many things fail like so: $ ./src/dd ./src/dd: unrecognized operand ‘./src/dd’ Try './src/dd --help' for more information. The "1" value matches the instructions in the getopt_long(3) man page. * [PATCH] all: detect --help and --version more consistently [FIXME] FIXME: NEWS, syntax-check, tests. With the above 'optind=1' change, there are only two major differences: --- $ nohup-8.30 -/ ; echo $? nohup: invalid option -- '/' Try 'nohup --help' for more information. 125 $ ./src/nohup -/ ; echo $? src/nohup: invalid option -- '/' Try 'src/nohup --help' for more information. 1 $ dd-8.30 -- if=/dev/null 0+0 records in 0+0 records out 0 bytes copied, 3.9014e-05 s, 0.0 kB/s $ ./src/dd -- if=/dev/null ./src/dd: unrecognized operand ‘--’ Try './src/dd --help' for more information. --- These in turn cause "tests/misc/invalid-opt", "tests/misc/usage_vs_getopt", and "tests/dd/misc" to fail. All other tests pass as before (tested only on Debian Stretch). regards, - assaf P.S. https://bugs.gnu.org/29617 "seq: `seq 1 --help' doesn't give help" will also likely be fixed by your patch.
bug#25159: chown bug ? or sys glitch ?
close 25159 stop On 2018-10-28 1:35 a.m., Assaf Gordon wrote: On 2016-12-10 6:51 a.m., ahfc wrote: Maybe a system glitch or a chown bug so just fyi. [...] chown: changing ownership of ‘/run/media/rest_/of_/path_/filename ': Operation not permitted If this is still an issue for you, can you provide more details, in particular what is the file system on /run/media ? With no further replies, I'm closing this bug. regards, - assaf
bug#29285: Error building coreutils 8.28.32-a4eed under Archlinux from AUR
close 29285 stop On 2018-10-29 8:09 p.m., Assaf Gordon wrote: On 2017-11-13 7:43 a.m., timofonic timofonic wrote: As the coreutils build system reported, I'm sending the following building error from using the coreutils-git Arch User Repository package ( https://aur.archlinux.org/packages/coreutils-git/) Do you still get similar failure with more recent coreutils versions? With no further replies, I'm closing this bug. regards, - assaf
bug#33204: Failed to modify 'Access Time' for files without extension using the Touch tool ver 8.4
close 33204 stop Hello, On 2018-10-31 7:10 a.m., Eric Blake wrote: On 10/30/18 3:49 AM, the reporter wrote: Hi, Dear developer of GNU tools: I found a possible bug when using the Touch tool. Most likely, this is not a bug in coreutils, but a limitation between the operating system and file system you are using. If it is a GNU/Linux system, using strace would confirm that. But since you did not give us those details, it's hard to say if there's anything further we can do to help you. With no further replies, I'm closing this bug. regards, - assaf
bug#15328: Bug or dubious feature?
tags 15328 notabug close 15328 stop Hello, On 2013-09-10 3:01 p.m., Linda Walsh wrote: Whatever the problem is, it's not in 'mv'... Given the above, and no further comments in 5 years, I'm closing this item. regards, - assaf
bug#15727: Bug: cp <-a|-archive> (w/<-f|--remove-destination>) breaks if one of files is a dir and other not
severity 15727 wishlist retitle 15727 doc: cp: expand dirs-vs-files with -f/--remove-dest stop Hello, On 2013-10-29 12:20 p.m., Linda Walsh wrote: [...] You need to make the docs much more clear about "cp"s limitations. update isn't really update, and -T is certainly wrong at the very least. If you feel you'd rather document cp's limitations, that's fine... cp is a great tool, don't get me wrong! But when it added update and -T, --remove-destination, it started inferring or promising more than you were willing to deliver. That should be documented. Based on the above (as the result of the long discussion), I'm marking this as a documentation wish-list item: clarify that "-f" and "--remove-destination" won't replace a file with a directory (as explained by Pádraig and Bernhard in the thread). Similarly for related limitations of "-T". regards, - assaf
bug#22022: ls - error making symbolic links with relative paths
retitle 22022 ln: error making symbolic links with relative paths tags 22022 notabug close 22022 stop Hello, On 2015-11-26 9:13 p.m., Eric Blake wrote: [...] You may be interested in trying 'ln --relative -sv b/* c/' instead, which creates 'c/a' as a symlink to '../b/a', and therefore resolves rather than creating a dangling symlink. With no further comments to Eric's suggestion, I'm assuming this is resolved. Discussion can continue by replying to this thread. regards, - assaf
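A small sketch of Eric's suggestion, run in a scratch directory. It assumes GNU ln with `--relative` support; the directory and file names mirror the 'b/*' and 'c/' of the quoted command, and everything else is set up only for the illustration.

```shell
#!/bin/sh
dir=$(mktemp -d)
mkdir -p "$dir/b" "$dir/c"
touch "$dir/b/a"

# Create c/a as a symlink whose target is computed relative to c/.
( cd "$dir" && ln --relative -s b/* c/ )

target=$(readlink "$dir/c/a")
if [ -e "$dir/c/a" ]; then resolves=yes; else resolves=no; fi
echo "c/a -> $target (resolves: $resolves)"

rm -rf "$dir"
```

Without `--relative`, the link target would be stored literally as 'b/a', which does not resolve from inside 'c/'.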
bug#34009: warn that mkdir --mode doesn't affect parents created
severity 34009 wishlist retitle 34009 doc: mkdir: warn that --mode doesn't affect parents stop Hello, On 2019-01-07 8:36 a.m., 積丹尼 Dan Jacobson wrote: do warn that --mode doesn't affect any parents created. $ mkdir --mode 700 -p /tmp/g/h/i $ find /tmp/g -ls 55795 0 drwxr-xr-x 3 jidanni jidanni 60 Jan 7 23:30 /tmp/g 55796 0 drwxr-xr-x 3 jidanni jidanni 60 Jan 7 23:30 /tmp/g/h 55797 0 drwx------ 2 jidanni jidanni 40 Jan 7 23:30 /tmp/g/h/i Also warn on (info "(coreutils) mkdir invocation") more directly. Thanks. The info manual does contain a short sentence about parents' modes: "To set the file permission bits of any newly-created parent directories to a value [...]" But this can be improved. Marking as wishlist. Patches are welcomed. regards, - assaf
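One way to get restrictive modes on the parents as well is to set the umask in a subshell before calling mkdir, which is the approach the info manual sentence quoted above points to. The sketch below assumes GNU stat (for `-c '%a'`) and uses a scratch directory instead of /tmp/g.

```shell
#!/bin/sh
dir=$(mktemp -d)

# With umask 077, both the parents and the final directory are created
# without any group/other permission bits.
( umask 077 && mkdir -p "$dir/g/h/i" )

mode_g=$(stat -c '%a' "$dir/g")
mode_h=$(stat -c '%a' "$dir/g/h")
mode_i=$(stat -c '%a' "$dir/g/h/i")
echo "g=$mode_g h=$mode_h i=$mode_i"

rm -rf "$dir"
```

The subshell keeps the umask change from leaking into the rest of the script.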
bug#20775: cp -a -u destroys files after they are copied
severity 20775 wishlist retitle 20775 cp: improve hardlink dups handling with "cp -a -u" stop With no further comments in more than 3 years, I'm marking this as a "wish list" item. -assaf