from:"Assaf Gordon"

bug#49741: basenc --base64url decoding bug

2021-08-29 Thread Assaf Gordon


tag 49741 fixed
close 49741
stop

On 2021-08-22 4:15 p.m., Assaf Gordon wrote:

Attached a suggested fix.


pushed in:

https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=709d1f8253072804cc27189a6f2b873d8d563399

bug#50151: Coreutils, aarch64 and chroot

2021-08-25 Thread Assaf Gordon


tag 50151 notabug
close 50151
stop

On 2021-08-25 12:54 p.m., Frans de Boer wrote:

On 8/25/21 10:16 AM, Assaf Gordon wrote:

  qemu-aarch64 -strace -L /newroot \
  /newroot/usr/sbin/chroot /newroot /usr/bin/env --version 2>&1 \
  | tee log.txt

@assaf: your suggestions no. 1 and 2, had the predicted results. Thus, 
suggestion no. 3 failed because of suggestion no.2. I followed then 
suggestion 4 and attached the strace output to this message. It seems 
that chroot is working as expected, only env seems to fail with an error.


Not exactly:
The 'chroot' system-call *seems* to succeed,
followed by a failed "execve(2)" system call to execute another binary.
That "execve" system call fails - so it is not 'env' per-se,
it is any program that will try to execute another aarch64 binary.

Learning that, searching for "qemu-user", "chroot" and "architecture"
leads to several web pages detailing similar errors (and a few suggested
solutions):

https://wiki.gentoo.org/wiki/Crossdev_qemu-static-user-chroot

https://newbedev.com/how-can-i-chroot-into-a-filesystem-with-a-different-architechture

https://ownyourbits.com/2018/06/13/transparently-running-binaries-from-any-architecture-in-linux-with-qemu-and-binfmt_misc/


I hope you have some clue of what is going wrong.


With the above information, we can conclude this is not a bug
in coreutils - it is a limitation of the linux+qemu-user setup.

So I'm closing this item and marking it as "not a bug",
but discussion can continue by replying to this thread.

regards,
 - assaf

bug#50151: Coreutils, aarch64 and chroot

2021-08-25 Thread Assaf Gordon


Hello,

On 2021-08-24 2:39 a.m., Paul Eggert wrote:
However, I think it'll be a better use of our time for you to debug this 
one yourself. It doesn't sound like a Coreutils problem; it sounds like 
a problem in your virtual machine setup, and you're the best expert on 
that setup.


A few suggestions to check, which might help you and us to troubleshoot:

1. ensure the binaries are indeed for aarch64:

   file /newroot/usr/sbin/chroot
   file /newroot/usr/bin/env
   file /newroot/usr/bin/bash

it should say something like
  "ELF 64-bit LSB pie executable, ARM aarch64"
for all of them.


2. ensure each binary works by itself:

 qemu-aarch64 -L /newroot /newroot/usr/sbin/chroot --version
 qemu-aarch64 -L /newroot /newroot/usr/bin/env --version
 qemu-aarch64 -L /newroot /newroot/usr/bin/bash --version

(the actual version doesn't matter here, the main thing is that
the qemu user-mode emulator was able to run the binaries.)

On 2021-08-21 4:33 a.m., Frans de Boer wrote:


Running 'qemu-aarch64 -L /newroot /newroot/usr/bin/bash -c 
/usr/bin/env --help' does show the env help text. So, I guess chroot
is to blame?

Note that the above command runs your *host's* /usr/bin/env
because chroot is not used - the binary under qemu
 (/newroot/usr/bin/bash) sees your host's file system.

Observe with:

  qemu-aarch64 -L /newroot /newroot/usr/bin/bash -c /bin/uname -m
  qemu-aarch64 -L /newroot /newroot/usr/bin/env /bin/uname -m

I'm guessing you will see "x86_64", not "aarch64".

3. What you should try is:

  qemu-aarch64 -L /newroot \
 /newroot/usr/bin/bash -c /newroot/usr/bin/env --version
and:
  qemu-aarch64 -L /newroot \
 /newroot/usr/bin/env /newroot/usr/bin/bash --version

In both cases, one aarch64 binary will try to execute another aarch64 
binary. Do these work for you, or are you seeing an error?




4. Use qemu's "-strace" to see the syscalls, hopefully
that will help pinpoint the cause:

  qemu-aarch64 -strace -L /newroot \
  /newroot/usr/sbin/chroot /newroot /usr/bin/env --version 2>&1 \
  | tee log.txt

If the command results in an error, the "log.txt" file will show
more details about what failed.
If you're not familiar with 'strace' output, post it here as an email 
attachment.



Hope this helps,
 - assaf

P.S.

On 2021-08-24 2:39 a.m., Paul Eggert wrote:

A complete set of instructions for an outsider to reproduce the
problem from scratch.  Assume the outsider is running Fedora 34
x86-64 (since that's what I'm running :-).

I'm not familiar with Fedora, but on Debian/x86_64 the following works:

   apt-get install qemu-user
   apt-get install crossbuild-essential-arm64 libc6-arm64-cross

   cd coreutils
   ./configure --host=aarch64-linux-gnu
   make

then:

$ qemu-aarch64 -L /usr/aarch64-linux-gnu/ ./src/uname -m
aarch64

Somewhat related:

$ qemu-aarch64 -L /usr/aarch64-linux-gnu/ ./src/env ./src/uname -m
/lib/ld-linux-aarch64.so.1: No such file or directory

This fails because once "inside" qemu, the aarch64 binary searches for
"/lib/ld-linux-aarch64.so.1" but the file is in
"/usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1".
One possible work-around is to build static binaries.

I don't want to assume that is the culprit for Frans, so we'll wait for 
the logs...

bug#49741: basenc --base64url decoding bug

2021-08-22 Thread Assaf Gordon


On 2021-08-17 3:37 a.m., Jim Meyering wrote:

On Tue, Aug 17, 2021 at 2:02 AM Pádraig Brady  wrote:

On 16/08/2021 22:17, Assaf Gordon wrote:


Attached a suggested fix.


minor nit in NEWS:

a nit in the commit log:


Thanks, attached updated patch.
Will push this week if there are no other comments.

-assaf



From 090663068a23662b36ddc0603fc1c2c752b6aff1 Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Mon, 16 Aug 2021 15:03:36 -0600
Subject: [PATCH] basenc: fix bug49741: using wrong decoding buffer length

Emil Lundberg  reports in
https://bugs.gnu.org/49741 about a 'basenc --base64 -d' decoding bug.
The input buffer length was not divisible by 3, resulting in
decoding errors.

* NEWS: Mention fix.
* src/basenc.c (DEC_BLOCKSIZE): Change from 1024*5 to 4200 (35*3*5*8)
which is divisible by 3,4,5,8 - satisfying both base32 and base64;
Use compile-time verify() macro to enforce the above.
* tests/misc/basenc.pl: Add test.
---
 NEWS                 | 4 ++++
 src/basenc.c         | 4 +++-
 tests/misc/basenc.pl | 9 +++++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index ddec56bdf..efdb1450e 100644
--- a/NEWS
+++ b/NEWS
@@ -60,6 +60,10 @@ GNU coreutils NEWS-*- outline -*-
   invalid combinations of case character classes.
   [bug introduced in coreutils-8.6]
 
+  basenc --base64 --decode no longer silently discards decoded characters
+  on (1024*5) buffer boundaries
+  [bug introduced in coreutils-8.31]
+
 ** Changes in behavior
 
   cp and install now default to copy-on-write (COW) if available.
diff --git a/src/basenc.c b/src/basenc.c
index 5c97a3652..2ffdb2d27 100644
--- a/src/basenc.c
+++ b/src/basenc.c
@@ -213,7 +213,9 @@ verify (DEC_BLOCKSIZE % 12 == 0);  /* So complete encoded blocks are used.  */
 
 /* Note that increasing this may decrease performance if --ignore-garbage
is used, because of the memmove operation below.  */
-# define DEC_BLOCKSIZE (1024*5)
+# define DEC_BLOCKSIZE (4200)
+verify (DEC_BLOCKSIZE % 40 == 0); /* complete encoded blocks for base32 */
+verify (DEC_BLOCKSIZE % 12 == 0); /* complete encoded blocks for base64 */
 
 static int (*base_length) (int i);
 static bool (*isbase) (char ch);
diff --git a/tests/misc/basenc.pl b/tests/misc/basenc.pl
index 3383aaeef..ac5394731 100755
--- a/tests/misc/basenc.pl
+++ b/tests/misc/basenc.pl
@@ -37,6 +37,13 @@ my $base64url_out_nl = $base64url_out;
 $base64url_out_nl =~ s/(..)/\1\n/g; # add newline every two characters
 
 
+# Bug 49741:
+# The input  is 'abc' in base64, in an 8K buffer (larger than 1024*5,
+# the buffer size which caused the bug).
+my $base64_bug49741_in = "YWJj" x 2000 ;
+my $base64_bug49741_out = "abc" x 2000 ;
+
+
 my $base32_in = "\xfd\xd8\x07\xd1\xa5";
 my $base32_out = "7XMAPUNF";
 my $x = $base32_out;
@@ -111,6 +118,8 @@ my @Tests =
  ['b64u_7', '--base64url -d',  {IN=>$base64_out},
   {EXIT=>1},  {ERR=>"$prog: invalid input\n"}],
 
+ ['b64_bug49741', '--base64 -d',  {IN=>$base64_bug49741_in},
+  {OUT=>$base64_bug49741_out}],
 
 
 
-- 
2.20.1
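
The divisibility argument in the commit message can be checked with plain
shell arithmetic. This is only a sketch: the helper name check_blocksize is
made up for illustration, and the moduli 12 and 40 are simply the ones used
in the verify() lines of the patch.

```shell
# check_blocksize is a hypothetical helper, not part of coreutils.
# It reports whether SIZE satisfies the two verify() conditions from the patch.
check_blocksize() {
  size=$1
  rem64=$((size % 12))   # condition checked for base64
  rem32=$((size % 40))   # condition checked for base32
  if [ "$rem64" -eq 0 ] && [ "$rem32" -eq 0 ]; then
    echo "$size: ok"
  else
    echo "$size: rejected (size%12=$rem64, size%40=$rem32)"
  fi
}

check_blocksize $((1024*5))   # old DEC_BLOCKSIZE
check_blocksize 4200          # new DEC_BLOCKSIZE
```

The old value fails the base64 condition (5120 % 12 is 8), while 4200
passes both.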

bug#49741: basenc --base64url decoding bug

2021-08-16 Thread Assaf Gordon


Hello Emil and all,

Thanks for the clear and easily reproducible bug report.

Attached a suggested fix.
Comments very welcomed,

- Assaf

From 11330058443e7cc92b4a53322d810725d42b4e34 Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Mon, 16 Aug 2021 15:03:36 -0600
Subject: [PATCH] basenc: fix bug49741: using wrong decoding buffer length

Emil Lundberg  reports in
https://bugs.gnu.org/49741 about a 'basenc --base64 -d' decoding bug.
The input buffer was not divisible by 3, resulting in decoding errors.

* NEWS: Mention fix.
* src/basenc.c (DEC_BLOCKSIZE): Change from 1024*5 to 4200 (35*3*5*8)
which is divisible by 3,4,5,8 - satisfying both base32 and base64;
Use compile-time verify() macro to enforce the above.
* tests/misc/basenc.pl: Add test.
---
 NEWS                 | 4 ++++
 src/basenc.c         | 4 +++-
 tests/misc/basenc.pl | 9 +++++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index ddec56bdf..d490ed101 100644
--- a/NEWS
+++ b/NEWS
@@ -60,6 +60,10 @@ GNU coreutils NEWS-*- outline -*-
   invalid combinations of case character classes.
   [bug introduced in coreutils-8.6]
 
+  basenc --base64 --decode no longer silently discard decoded characters
+  on (1024*5) buffer boundaries
+  [bug introduced in coreutils-8.31]
+
 ** Changes in behavior
 
   cp and install now default to copy-on-write (COW) if available.
diff --git a/src/basenc.c b/src/basenc.c
index 5c97a3652..2ffdb2d27 100644
--- a/src/basenc.c
+++ b/src/basenc.c
@@ -213,7 +213,9 @@ verify (DEC_BLOCKSIZE % 12 == 0);  /* So complete encoded blocks are used.  */
 
 /* Note that increasing this may decrease performance if --ignore-garbage
is used, because of the memmove operation below.  */
-# define DEC_BLOCKSIZE (1024*5)
+# define DEC_BLOCKSIZE (4200)
+verify (DEC_BLOCKSIZE % 40 == 0); /* complete encoded blocks for base32 */
+verify (DEC_BLOCKSIZE % 12 == 0); /* complete encoded blocks for base64 */
 
 static int (*base_length) (int i);
 static bool (*isbase) (char ch);
diff --git a/tests/misc/basenc.pl b/tests/misc/basenc.pl
index 3383aaeef..ac5394731 100755
--- a/tests/misc/basenc.pl
+++ b/tests/misc/basenc.pl
@@ -37,6 +37,13 @@ my $base64url_out_nl = $base64url_out;
 $base64url_out_nl =~ s/(..)/\1\n/g; # add newline every two characters
 
 
+# Bug 49741:
+# The input  is 'abc' in base64, in an 8K buffer (larger than 1024*5,
+# the buffer size which caused the bug).
+my $base64_bug49741_in = "YWJj" x 2000 ;
+my $base64_bug49741_out = "abc" x 2000 ;
+
+
 my $base32_in = "\xfd\xd8\x07\xd1\xa5";
 my $base32_out = "7XMAPUNF";
 my $x = $base32_out;
@@ -111,6 +118,8 @@ my @Tests =
  ['b64u_7', '--base64url -d',  {IN=>$base64_out},
   {EXIT=>1},  {ERR=>"$prog: invalid input\n"}],
 
+ ['b64_bug49741', '--base64 -d',  {IN=>$base64_bug49741_in},
+  {OUT=>$base64_bug49741_out}],
 
 
 
-- 
2.20.1

bug#49741: basenc --base64url decoding bug

2021-08-13 Thread Assaf Gordon


Hi,

I will also work on it this weekend.

 -assaf


On 2021-08-12 7:37 p.m., Paul Eggert wrote:
Simon, this looks like some sort of minor buffering problem in 'basenc 
--base64', since plain 'base64' works correctly. Is this something you 
have time to look into?


https://bugs.gnu.org/49741

bug#44704: uniq: replace repeated lines with a message about how many repeated lines

2020-11-17 Thread Assaf Gordon


tag 44704 notabug
severity 44704 wishlist
stop

Hello,

On 2020-11-17 6:32 a.m., Brian J. Murrell wrote:

It would be a useful enhancement to uniq to replace all lines
considered non-uniq (i.e. those that would be removed from the output)
with a message about how many times the previous line was repeated.

I.e.

$ cat <
[...]

uniq supports the "--group" option, which adds a blank line after each
group of identical lines - this can be used down-stream to process
groups in any way you want.

Example:
  $ cat <<EOF > in
  first line
  second line
  repeated line
  repeated line
  repeated line
  repeated line
  repeated line
  third line
  EOF

  $ cat in | uniq --group=append
  first line

  second line

  repeated line
  repeated line
  repeated line
  repeated line
  repeated line

  third line


  $ cat in | uniq --group=append \
  | awk '$0=="" { print "do something after group" ; next } ;
 1 { print }'
  first line
  do something after group
  second line
  do something after group
  repeated line
  repeated line
  repeated line
  repeated line
  repeated line
  do something after group
  third line
  do something after group

And with counting:

$ cat in | uniq --group=append \
 | awk 'BEGIN { c = 0 } ;
$0=="" { print "Group has " c " lines" ; c=0 ; next } ;
1 { print ; c++ }'
  first line
  Group has 1 lines
  second line
  Group has 1 lines
  repeated line
  repeated line
  repeated line
  repeated line
  repeated line
  Group has 5 lines
  third line
  Group has 1 lines
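
Closer to the original request, the same approach can print each line once,
followed by a message when it was repeated. A sketch only: the function name
collapse_repeats and the message wording are made up here, it assumes GNU
uniq (for --group), and it assumes the input contains no empty lines (they
would be mistaken for group separators).

```shell
# collapse_repeats is a hypothetical helper, for illustration only.
# It prints the first line of each group, then a note if the group
# contained more than one line.
collapse_repeats() {
  uniq --group=append | awk '
    # An empty line marks the end of a group.
    $0 == "" {
      if (c > 1) print "[previous line repeated " (c - 1) " more time(s)]"
      c = 0
      next
    }
    c == 0 { print }   # first line of a group
    { c++ }            # count every line of the group
  '
}

printf 'first line\nrepeated line\nrepeated line\nrepeated line\nthird line\n' \
  | collapse_repeats
```

For the three "repeated line" entries above, this prints the line once,
followed by "[previous line repeated 2 more time(s)]".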


Hope this helps.
More information about "uniq --group=X" is here:

https://www.gnu.org/software/coreutils/manual/html_node/uniq-invocation.html

I'm marking this as "notabug/wishlist", but will likely close soon as
"wontfix" unless we come up with a convincing argument why "--group"
is not sufficient for your use case.

Regardless of the status, discussion can continue by replying to this 
thread.


regards,
 - assaf

bug#43684: Problem with numerical splitting with files > 90*l

2020-09-29 Thread Assaf Gordon





On 29/09/2020 02:18, ned haughton wrote:

When splitting with -d, the numbering screws up after 89:


In addition to Pádraig's explanation, please see the previous similar 
discussion here:

  https://lists.gnu.org/archive/html/bug-coreutils/2017-02/msg00050.html
  http://bugs.gnu.org/25832

regards,
 - assaf

bug#42340: Fwd: bug#42340: "join" reports that "sort"ed input is not sorted

2020-07-15 Thread Assaf Gordon


Hello,

On 2020-07-15 2:12 p.m., Beth Andres-Beck wrote:

If that is the intended behavior, the bug is that:

printf '12,\n1,\n' | sort -t, -k1 -s

1,
12,

does _not_ take the remainder of the line into account, and only sorts on
the initial field, prioritizing length.

It is at the very least unexpected that adding an `a` to the end of both
lines would change the sort order of those lines:

printf '12,a\n1,a\n' | sort -t, -k1 -s

12,a
1,a



Not a bug, just an incomplete usage :)

sort's -k/--key parameter takes two values (the second being optional):
the first and last column to use as the key. If the second value is 
omitted (as in your case), then the key is taken from the first field
to the end of the line.

And so:
"sort -k1,1" means take the first *and only the first* field as the key.
"sort -k1" means take the first field until the end of the line as the key.
"sort -k1,3" means take the first, second and third fields as the single key.
"sort -k1,1 -k2,2 -k3,3" means take the first field as the first key,
second field as the second key, and third field as the third key.
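
A small sketch of the difference between "-k1,1" and "-k1", using the C
locale so the result does not depend on the system's locale. The input and
the variable names are made up for illustration; "-s" is used so that lines
with equal keys keep their input order.

```shell
# Two lines with the same first field; input order is "a,2" then "a,1".
input='a,2
a,1'

# Key is field 1 only: both keys are "a", so -s keeps the input order.
first_field_only=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -k1,1 -s)

# Key runs from field 1 to the end of the line: "a,1" sorts before "a,2".
to_end_of_line=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -k1 -s)

printf -- '-k1,1:\n%s\n-k1:\n%s\n' "$first_field_only" "$to_end_of_line"
```

With "-k1,1" the output stays "a,2" then "a,1"; with "-k1" it becomes "a,1"
then "a,2".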

---

The "--debug" option can help illustrate what sort is doing,
by adding underscore characters to show which characters are being used 
as keys in each line. Consider the following:


   $ printf '12,\n1,\n' | sort -t, -k1 -s --debug
   sort: using ‘en_CA.utf8’ sorting rules
   1,
   __
   12,
   ___

   $ printf '12,\n1,\n' | sort -t, -k1,1 -s --debug
   sort: using ‘en_CA.utf8’ sorting rules
   1,
   _
   12,
   __

In the first example, the "-k1" means from first field till end of line,
the underscore includes the "," characters.
In the second example, the "-k1,1" means only the first field, and the 
comma is not used.


Now consider your second case of adding an "a" at the end of each line:

   $ printf '12,a\n1,a\n' | sort -t, -k1 -s --debug
   sort: using ‘en_CA.utf8’ sorting rules
   12,a
   ____
   1,a
   ___

   $ printf '12,a\n1,a\n' | sort -t, -k1,1 -s --debug
   sort: using ‘en_CA.utf8’ sorting rules
   1,a
   _
   12,a
   __

In the first example, "-k1" means: from first field until the end of the 
line, and so the entire string "12,a" is compared against "1,a".


**AND**, because the locale is a "utf-8" locale, punctuation characters 
are ignored (as mentioned in the previous email in this thread).

So effectively the compared strings are "12a" vs "1a".
The ASCII value of "2" is smaller than the ASCII value of "a", and
therefore "12a" appears before "1a".

If we force C locale, then the order is reversed:

   $ printf '12,a\n1,a\n' | LC_ALL=C sort -t, -k1 -s --debug
   sort: using simple byte comparison
   1,a
   ___
   12,a
   ____

Because now punctuation characters are used, and the ASCII value of ","
is smaller than the ASCII value of "2".

**HOWEVER**, this result of using "LC_ALL=C" together with "-k1" is
only correct by a happy accident :)
it is still very likely that "-k1" is not what you wanted - you 
probably meant to do "-k1,1".


---

Lastly, the "-s/--stable" option in the above contrived examples is 
superfluous - it doesn't affect the output order because there are no
equal field values (i.e. "1" vs "12").
A slightly better example to illustrate how "-s" affects ordering is this:

   $ printf "2,x\n1,a\n2,b\n" | sort -t, -k1,1
   1,a
   2,b
   2,x

   $ printf "2,x\n1,a\n2,b\n" | sort -t, -k1,1 -s
   1,a
   2,x
   2,b

Here, "1" comes before "2" - that's obvious. But should "2,b" come 
before "2,x" ?
If we do not use "-s/--stable", then "sort" ALSO does one additional 
comparison of the entire line as a last step (hence "sort --help" says
"[disable] last-resort comparison" about "-s/--stable").
The substring ",b" comes before ",x" - therefore "2,b" appears first.

If we add "-s/--stable", the last comparison step of the entire line is 
skipped, and the lines of "2" appear in the order they were in the input 
(hence - "stable").


By using "--debug" we can see the additional comparison step (indicated 
by additional underscore lines):


   $ printf "2,x\n1,a\n2,b\n" | sort -t, -k1,1 --debug
   sort: using ‘en_CA.utf8’ sorting rules
   1,a
   _
   ___
   2,b
   _
   ___
   2,x
   _
   ___


   $ printf "2,x\n1,a\n2,b\n" | sort -t, -k1,1 -s --debug
   sort: using ‘en_CA.utf8’ sorting rules
   1,a
   _
   2,x
   _
   2,b
   _

---

Hope this helps.
regards,
 - assaf

bug#42340: "join" reports that "sort"ed input is not sorted

2020-07-13 Thread Assaf Gordon


tags 42340 notabug
close 42340
stop

Hello,

On 2020-07-12 5:57 p.m., Beth Andres-Beck wrote:

In trying to use `join` with `sort` I discovered odd behavior: even after
running a file through `sort` using the same delimiter, `join` would still
complain that it was out of order.

[...]

Here is a way to reproduce the problem:


printf '1.1.1,2\n1.1.12,2\n1.1.2,1' | sort -t, > a.txt
printf '1.1.12,a\n1.1.1,b\n1.1.21,c' | sort -t, > b.txt
join -t, a.txt b.txt

  join: b.txt:2: is not sorted: 1.1.1,b

The expected behavior would be that if a file has been sorted by "sort" it
will also be considered sorted by join.

[...]
I traced this back to what I believe to be a bug in sort.c 


This is not a bug in sort or join, just a side-effect of the locale on 
your system on the sorting results.


By forcing a C locale with "LC_ALL=C" (meaning simple ASCII order),
the files are ordered in the same way 'join' expected them to be:

 $ printf '1.1.1,2\n1.1.12,2\n1.1.2,1' | LC_ALL=C sort -t, > a.txt
 $ printf '1.1.12,a\n1.1.1,b\n1.1.21,c' | LC_ALL=C sort -t, > b.txt
 $ join -t, a.txt b.txt
 1.1.1,2,b
 1.1.12,2,a
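
The same idea as a self-contained sketch that can be pasted into a shell:
sort both inputs on the join field under LC_ALL=C, then run join under the
same locale. The temporary directory and the variable names are only for
this illustration.

```shell
# Work in a throw-away directory so no existing files are touched.
tmpdir=$(mktemp -d)

# Sort both inputs on the first field only, using byte comparison.
printf '1.1.1,2\n1.1.12,2\n1.1.2,1\n'  | LC_ALL=C sort -t, -k1,1 > "$tmpdir/a.txt"
printf '1.1.12,a\n1.1.1,b\n1.1.21,c\n' | LC_ALL=C sort -t, -k1,1 > "$tmpdir/b.txt"

# sort -c exits non-zero if the file is not sorted under the given rules.
LC_ALL=C sort -c -t, -k1,1 "$tmpdir/b.txt" && echo "b.txt is sorted"

# Join under the same locale that was used for sorting.
joined=$(LC_ALL=C join -t, "$tmpdir/a.txt" "$tmpdir/b.txt")
printf '%s\n' "$joined"

rm -r "$tmpdir"
```

The important part is that sort and join agree on the ordering rules;
mixing a UTF-8 locale for one and the C locale for the other can bring the
"is not sorted" message back.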

---

More details:
I'm going to assume your system uses some locale based on UTF-8.
You can check it by running 'locale', e.g. on my system:
  $ locale
  LANG=en_CA.utf8
  LANGUAGE=en_CA:en
  LC_CTYPE="en_CA.utf8"
  ..
  ..

Under most UTF-8 locales, punctuation characters are *ignored* in the
compared input lines. This might be confusing and non-intuitive, but
that's the way most systems have been working for many years (locale
ordering is defined in the GNU C Library, and coreutils has no way to
change it).

Observe the following:

  $ printf '12,a\n1,b\n' | LC_ALL=en_CA.utf8 sort
  12,a
  1,b

  $ printf '12,a\n1,b\n' | LC_ALL=C sort
  1,b
  12,a

With a UTF-8 locale, the comma character is ignored, and then "12a" 
appears before "1b" (since the character '2' comes before the character
'b').

With "C" locale, forcing ASCII or "byte comparison", punctuation 
characters are not ignored, and "1,b" appears before "12,a" (because
the ASCII value of the comma ',' is 44, which is smaller than the ASCII
value of the digit '2', which is 50).
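
The byte values behind the C-locale ordering can be checked directly from
the shell. A small sketch; ascii_of is a made-up helper name.

```shell
# ascii_of is a hypothetical helper: print the numeric value of one character.
# A leading single quote in the argument makes printf output the
# character's code instead of parsing it as a number.
ascii_of() {
  printf '%d\n' "'$1"
}

ascii_of ','   # the delimiter
ascii_of '2'   # second character of "12,a"
ascii_of 'b'   # last character of "1,b"
```

This prints 44, 50 and 98: the comma sorts before '2' in byte order, and
'2' sorts before 'b' once the comma is ignored.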


---

Somewhat related:
Your sort command defines the delimiter ("-t,") but does not define 
which columns to sort by; sort then uses the entire input line - and 
there's no need to specify delimiter at all.


---

As such, I'm closing this as "not a bug", but discussion can continue by
replying to this thread.

regards,
 - assaf

bug#40530: feature proposal: coreutils -> sort: adding sorting ability for Hebrew numerals

2020-04-09 Thread Assaf Gordon

Hello,

> On Apr 9, 2020, at 3:23 PM, Zeev Pekar  wrote:
> 
> it would be nice to be able to sort (coreutils -> sort) Hebrew numerals:

An interesting idea, but I think it is a bit too niche to be included in the 
coreutils “sort” program (tradeoff of usefulness vs bloat).

However, such functionality is very suitable to an old idea of an auxiliary 
“decorate” program that will allow many more sorting options when used in 
tandem with “sort”.

I’ve started writing such a program some time ago, based on Pádraig's idea 
(never completed, but perhaps these days are perfect opportunity to complete 
it):
https://lists.gnu.org/archive/html/coreutils/2019-03/msg00056.html

Would you like to try your hand at coding the sorting rules for such 
Hebrew-numerals sort?

regards,
 - Assaf

bug#38003: date --date=-1month gives same month today

2019-10-31 Thread Assaf Gordon


tag 38003 notabug
close 38003
stop

Hello,

On 2019-10-31 2:34 a.m., Ilja Honkonen wrote:
Please CC me as I'm not on this list. Running date (GNU coreutils) 8.26 
on fedora 30 today (date --utc  -I: 2019-10-31) with --date=-1month 
gives the same month which doesn't make sense:

$ date --utc -I --date=-1month
2019-10-01


date gained a "--debug" option that helps diagnose such issues:

$ date --utc -I --debug --date=-1month
date: parsed relative part: -1 month(s)
[...]
date: using current date as starting value: '(Y-M-D) 2019-10-31'
[...]
date: warning: when adding relative months/years, it is recommended to specify the 15th of the months
date: after date adjustment (+0 years, -1 months, +0 days),
date: new date/time = '(Y-M-D) 2019-10-01 17:29:20'
date: warning: month/year adjustment resulted in shifted dates:
date:    adjusted Y M D: 2019 09 31
date:  normalized Y M D: 2019 10 01
[...]
date: final: (Y-M-D) 2019-10-01 17:29:20 (UTC)
2019-10-01

--

Subtracting 1 month from October 31st results in September 31st.
Since the date doesn't exist, it is normalized:
September 31st is "one day after September 30th", which
results in October 1st.

The "--debug" option also warns: when subtracting months,
it is recommended to specify the 15th (middle) of the month,
exactly to avoid such issues.

   $ date --utc -I --date="2019-10-15 -1month"
   2019-09-15
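
If only the previous month is needed (for example in a script), the day can
be pinned to the 15th before subtracting. A sketch, assuming GNU date; the
function name prev_month is made up for illustration.

```shell
# prev_month is a hypothetical helper: given YYYY-MM-DD, print the previous
# month as YYYY-MM. The day is replaced by the 15th before subtracting one
# month, so the result is never normalized into a neighbouring month.
prev_month() {
  year_month=${1%-*}    # strip the trailing "-DD"
  date --utc --date="${year_month}-15 -1month" +%Y-%m
}

prev_month 2019-10-31
prev_month 2019-03-31
prev_month 2020-01-10
```

This prints 2019-09, 2019-02 and 2019-12, regardless of how many days the
months involved have.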

regards,
 - assaf

bug#37702: Suggestion for 'df' utility

2019-10-13 Thread Assaf Gordon


Hello Bernhard,

On 2019-10-13 3:57 p.m., Bernhard Voelker wrote:

On 2019-10-13 23:28, Paul Eggert wrote:

In any sane system there would be only
four lines of non-header output (for tmpfs etc, /, /home, and
/media/eggert/B827-D456), but df is outputting 28 lines.


What is so special about tmpfs so that you would like to see it?


As an interesting use-case (though not common),
I recently configured a raspberry PI device,
and wanted to mount as many locations on tmpfs as possible,
e.g. "/tmp" "/var/tmp", "/var/log" etc.

It was very useful in those cases to be able to see separate
tmpfs file systems listed, with information about how big they
are and how much space was used.

Also in other systems where "/tmp" is a "tmpfs",
users might want to see how much space is available.

If we hide it by default, they can of course use "df /tmp"
or "df --all" - it's not about removing this option,
it is just about making users' life harder or easier,
and making unexpected changes.


I recently also encountered a change in a default behavior of
a program which I've been using a very long time - and it is *very*
frustrating to have something that worked "just fine" for so long
being changed.



Here on my openSUSE:Tumbleweed system, I see the following:

   $ df -T
   Filesystem Type 1K-blocks  Used Available Use% Mounted on

[...]

   /dev/loop0 ext2 31729 31729 0 100% /FULL_PARTITION_TMPDIR

[...]


(The /FULL_PARTITION_TMPDIR is used by a special coreutils test.)



That's an interesting case, where I would think you'd want to see it,
because you explicitly mounted it.



I think I could well live with adding 'devtmpfs' and 'tmpfs' to the
pseudo file systems in gnulib's "mountlist.c".


I agree, but think this needs to be communicated very well,
and in advance - perhaps announce this change ahead of time to
the respective package maintainers of each distribution - just so
they'll know it's coming (and also have a way to revert it if they don't
like it).



This seems to be a small change, and not satisfying the snap case.


Possibly hiding "squashfs" of readonly-mounts could get rid of those snaps?

regards,
  -assaf

bug#37702: Suggestion for 'df' utility

2019-10-13 Thread Assaf Gordon


On 2019-10-13 3:28 p.m., Paul Eggert wrote:
[..]
I mean c'mon, here's the output of 'df' on the Ubuntu 18.04.3 LTS 
workstation I'm typing this particular message on. In any sane system 
there would be only four lines of non-header output (for tmpfs etc, /, 
/home, and /media/eggert/B827-D456), but df is outputting 28 lines. This 
is ridiculous.




It is certainly inconvenient if that's not what you are looking for
(and certainly most desktop users aren't).

But I'm not sure if it's easy to find a set of criteria
that would work well while having minimal unexpected side effects of 
hiding entries people in other systems do expect to see.


Out of curiosity,
can you share the output of the following commands on the same system?

lsblk

df -x tmpfs -x devtmpfs -x squashfs


Thanks,
 - assaf

bug#37702: Suggestion for 'df' utility

2019-10-13 Thread Assaf Gordon


Hi all,

On 2019-10-13 2:27 p.m., Paul Eggert wrote:

On 10/13/19 2:41 AM, Pádraig Brady wrote:

I wonder could we key (also) on used==0||available==0.


Yes, looking at the sample output I gave earlier, I'd say we could by 
default drop filesystems where usage is 1% or less. That would solve the 
problem for my workstation. This is roughly akin to the "used==0" test 
you're suggesting.




I would humbly suggest caution with such unexpected user-facing changes
to the default output of 'df' - learning the lessons from changing the 
quotes in 'ls'.


Countless users have been using 'df' in their own ways, and have gotten
used to certain outputs.

This thread originated by a request to "clean up" the output on newer
ubuntu machines which use "snap" packages as /dev/loopN .

Let's not turn that into a drastic change that will affect many other
existing systems - the users on other systems did not ask for any changes.

---

Specifically for "default drop filesystems where usage is 1% or less" -
I can think of a few cases off the top of my head where this would be
extremely confusing:

- I recently installed a 33TB raid file system. The usage on that system
is at 1% and will stay like so for at least several days.

- Amazon cloud services (AWS) offers an NFS4 service (they call it 
"EFS") that has a reported size of 8 exabytes. There too usage could be at 
1% for a long long time.


---


For cases where I want to list only the "real" storage, I typically use
an alias such as:

   alias dff='df -h -x tmpfs -x devtmpfs'

And it would be very easy and least disruptive to recommend
to ubuntu users to add "-x squashfs" or another file system to ignore.
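
As a sketch, the alias can be turned into a small shell function so that
the list of ignored file system types lives in one place. The names
df_exclude_args and dff are only for illustration, and the list of types is
an example, not a recommendation.

```shell
# Hypothetical helpers, for illustration only.
# df_exclude_args turns a list of file system types into "-x TYPE"
# arguments, one pair per line.
df_exclude_args() {
  for fstype in "$@"; do
    printf '%s %s\n' -x "$fstype"
  done
}

# dff lists "real" storage; extra arguments are passed through to df.
dff() {
  # Unquoted on purpose: the helper's output must be split into words.
  df -h $(df_exclude_args tmpfs devtmpfs squashfs) "$@"
}
```

Usage is then simply "dff" or "dff /home"; a user who wants to hide another
type only needs to extend the list in one place.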


Perhaps we can come up with a recommended list of "lesser" file systems
to ignore (or conditions such as read-only file
systems) and add it as a new option, but please let's not make it the
default.



My two cents,
 - assaf

bug#37093: wc runs 100% cpu when in pipeline or tee >(wc)

2019-08-20 Thread Assaf Gordon


tag 37093 notabug
close 37093
stop

Hello,

On 2019-08-19 10:44 p.m., Edward Huff wrote:

In the demo below, dd uses 0.665s to write 1GiB of zeros.
sha256sum uses 4.285s to calculate the sha256 of 1GiB of zeros.
wc uses 32.160s to count 1GiB of zeros.


[...]


baseline results:
$ dd if=/dev/zero count=$((1024*1024)) bs=1024 | tee >(sha256sum>&2) | wc
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.5007 s, 33.0 MB/s
49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -
   0   0 1073741824
$


First,
Try to avoid UTF8 locales (i.e., force a C/POSIX locale with LC_ALL=C)
which makes 'wc' much faster.

On my computer:

With UTF8 locale:

  $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \
| tee >(sha256sum>&2) | time --portability wc
  1048576+0 records in
  1048576+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 46.5928 s, 23.0 MB/s
  49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -
0   0 1073741824
  real 46.59
  user 46.37
  sys 0.19

With C locale:

  $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \
   | tee >(sha256sum>&2) | LC_ALL=C time --portability wc
  1048576+0 records in
  1048576+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.60285 s, 125 MB/s
  49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -
0   0 1073741824
  real 8.60
  user 5.22
  sys 0.26


Second,
The "word counting" feature in 'wc' is the main cpu-hog.
If you avoid that (i.e. counting only lines, or only characters),
'wc' is even faster (and it automatically ignores UTF8 issues):

  $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \
   | tee >(sha256sum>&2) \
   | \time --portability wc -c
  1048576+0 records in
  1048576+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.59429 s, 141 MB/s
  49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -
  1073741824

  real 7.59
  user 0.10
  sys 0.71

Notice that the "real time" wasn't changed much (from 8.6s to 7.59s), 
but the actual work performed by 'wc' (measured in "user time") is down
drastically.
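
For a quick check that does not need a 1GiB stream, the byte-only mode can
be tried on a small input. A sketch: count_bytes is a made-up wrapper name,
and 'head -c' is assumed to be available (it is in GNU coreutils).

```shell
# count_bytes is a hypothetical wrapper: count bytes only, in the C locale,
# so wc does no word splitting and no multibyte decoding.
count_bytes() {
  LC_ALL=C wc -c
}

# 1 MiB of zeros is enough to see that the count is correct.
nbytes=$(head -c 1048576 /dev/zero | count_bytes)

# The arithmetic expansion drops any padding some wc implementations add.
echo "bytes: $((nbytes))"
```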


Third,
If you are comfortable with compiling Coreutils from source,
you can build it using the optimized hashing functions from OpenSSL, like so:

 ./configure --with-openssl
 make

Then, "sha256sum" will be faster (about 2x faster on my computer).

If you don't want to re-compile it, consider using "openssl" directly
to calculate the checksum, like so:

  dd if=/dev/zero count=1K bs=1M | tee >(openssl sha256>&2) | wc -c


Fourth,
To save a few more microseconds, consider using dd with a larger block size 
(bs=) and fewer blocks (count=), e.g.:


   $ time dd if=/dev/zero of=/dev/null count=1M bs=1K
   1048576+0 records in
   1048576+0 records out
   1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.865853 s, 1.2 GB/s

   real 0m0.868s
   user 0m0.288s
   sys  0m0.579s

   $ time dd if=/dev/zero of=/dev/null count=1K bs=1M
   1024+0 records in
   1024+0 records out
   1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.0998688 s, 10.8 GB/s

   real 0m0.102s
   user 0m0.000s
   sys  0m0.102s

This won't reduce the total time by much, but will result in
fewer sys-calls, and less CPU kernel time (at least by a tiny bit).
The effect is more noticeable when reading or writing to a physical disk.



Lastly,
If you use GNU time instead of the shell's built-in 'time' function,
you can specify custom output format,
and easily show the timing of each program in the pipeline.
Example:

$ FMT="\n=== CMD: %C ===\nreal %e\tuser %U\tsys %S\n"
$ \time -f "$FMT" dd if=/dev/zero count=1M bs=1K \
 | \time -f "$FMT" tee >(\time -f "$FMT" sha256sum>&2) \
 | \time -f "$FMT" wc -c
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.77339 s, 138 MB/s

=== CMD: dd if=/dev/zero count=1048576 bs=1024 ===
real 7.77   user 0.36   sys 1.65


=== CMD: tee /dev/fd/63 ===
real 7.77   user 0.10   sys 1.30

49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -

=== CMD: sha256sum ===
real 7.77   user 7.47   sys 0.27

1073741824

=== CMD: wc -c ===
real 7.77   user 0.05   sys 0.76


As such, I'm closing this as "not a bug",
but discussion can continue by replying to this thread.

regards,
 - assaf

bug#37058: Error message with local deployment of Galaxy-k8s

2019-08-16 Thread Assaf Gordon

tag 37058 notabug
close 37058
stop

Hello,

Two issues are mixed here.

First:

On 2019-08-16 2:17 p.m., Gao, Jianliang wrote:
I followed https://github.com/phnmnl/phenomenal-h2020/wiki/QuickStart-Installation-for-Local-PhenoMeNal-Workflow with Older Galaxy chart to deploy local galaxy-k8s instance with minikube on Windows 10. The following message came from the logs of my pod. I can't connect to my local instance.

[...]

kubectl logs galaxy-k8s-tr6fc
[ run_galaxy_config.sh ] -- Galaxy sqlite directory created since we are not
using postgresql
[ run_galaxy_config.sh ] -- Replaced galaxy ini for the user's injected one

[...]

dpkg-preconfigure: unable to re-open stdin:
[WARNING]: It is unneccessary to use '{{' in loops, leave variables in loop
expressions bare.

[...]

galaxy.tools.deps WARNING 2019-08-16 19:20:48,175 Path
'./database/dependencies' does not exist, ignoring
galaxy.tools.deps WARNING 2019-08-16 19:20:48,175 Path
'./database/dependencies' is not directory, ignoring
galaxy.tools.deps.installable WARNING 2019-08-16 19:20:48,190 Conda not
installed and auto-installation disabled.
galaxy.tools.deps.installable WARNING 2019-08-16 19:20:48,190 Conda not
installed and auto-installation disabled.

These are issues related to your Galaxy setup.

(for other readers: "Galaxy" in this context is a web-based framework
for bioinformatics analysis, see https://galaxyproject.org/ and
https://usegalaxy.org ).

Such questions are best asked in their support forums:
https://galaxyproject.org/support/
https://help.galaxyproject.org

This includes problems in the underlying layers, such as the 'dpkg' errors
above, which result from deploying Galaxy in VMs, Kubernetes,
containers, etc.

tail: unrecognized file system type 0x794c7630 for 'paster.log'. please report
this to bug-coreutils@gnu.org. reverting to polling

This warning indeed comes from the coreutils program 'tail',
however it is harmless in your situation.
For more details, see here:
https://www.gnu.org/software/coreutils/filesystems.html

---

A cursory look at the error logs makes it seem like
"bug-coreutils@gnu.org" is the place to ask general questions about
the "Galaxy" server (because it is the last thing mentioned),
but that is not the case.
We can only help with coreutils programs (e.g. 'tail').

Please contact the Galaxy team for galaxy-related issues.

Hope this helps.
regards,
- assaf

bug#36985: tail

2019-08-09 Thread Assaf Gordon


close 36985
stop

Hello,

On 2019-08-09 12:55 a.m., Rob Hearne wrote:

root@kafka-robh-vmdub-04:/kafka/bin# tail -f Control
tail: unrecognized file system type 0x794c7630 for ‘Control’. please report
this to bug-coreutils@gnu.org. reverting to polling



This has been fixed in version 8.25 (released in 2016).
For more details, see
https://www.gnu.org/software/coreutils/filesystems.html

-assaf

bug#36901: Enhance directory and file moves where target already exists

2019-08-03 Thread Assaf Gordon

Hello,

On Fri, Aug 02, 2019 at 10:47:18PM -0700, L A Walsh wrote:
> It's not a wish list that 'mv' doesn't work as documented.

The "wishlist" refers to the topic:
You are asking to add new functionality to 'mv'.
That is a "wishlist" item.

(answering out of order:)

> > On 2019-08-02 9:56 p.m., L A Walsh wrote:
> >> But you say posix wants it to perform as a rename?
[...]
> >>
> >> So if I have:
> >> mkdir A B
> >> touch A/foo B/fee
> >> So when I look at the system call on linux for rename:
> >> oldpath can specify a directory.  In this case, newpath must
> >> either not
> >> exist, or it must specify an empty directory.
> >>  (complying with POSIX_C_SOURCE >= 200809L)
> >>
> >> So move should give an error: Nope:
> >>
> >> mv A B
> >>> tree B
> >> B
> >> ├── A
> >> │   └── foo
> >> └── fee
> >>
> >> 1 directory, 2 files
> >>
> >> So mv is violating POSIX - it didn't do the rename, but moved
> >> A under B and neither dir had to be empty.
> >>
> >> Saying it has to follow POSIX when it doesn't appear to, seems
> >> a bit contradictory?

I previously quoted one small part of the entire "mv" POSIX specification
(item #3, regarding using the 'rename(2)' function).

It would be wise to read the entire specification before making claims
about violating POSIX.
Specifically, at the top of the page:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html
   SYNOPSIS
  mv [-if] source_file target_file
  mv [-if] source_file... target_dir
   DESCRIPTION
  [...]
  In the second synopsis form, mv shall move each file named by a
  source_file operand to a destination file in the existing directory
  named by the target_dir operand [...] This second form is assumed
  when the final operand names an existing directory

In this regard GNU 'mv' is compliant with POSIX.
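
To see the second synopsis form in action, here is a small sketch that
repeats your A/B example in a scratch directory. The function name is made
up for illustration, and it assumes 'mktemp -d' is available:

```shell
# Hypothetical demo: 'mv A B' with an existing directory B moves A *into* B.
mv_into_existing_dir() {
  tmp=$(mktemp -d) || return 1
  (
    cd "$tmp" || exit 1
    mkdir A B
    touch A/foo B/fee
    mv A B                  # B exists, so the second synopsis form applies
    find B | LC_ALL=C sort  # list the resulting tree
  )
  status=$?
  rm -rf "$tmp"
  return $status
}

mv_into_existing_dir
```

The listing shows B, B/A, B/A/foo and B/fee: the same tree as in your
message, and exactly what the second form describes.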

> > On 2019-08-02 9:56 p.m., L A Walsh wrote:
> >> On 2019/08/02 19:47, Assaf Gordon wrote:
> >>> Can new merging features be added to 'mv'? yes.
> >>> But it seems to me these would be better suited for 'higher level'
> >>> programs (e.g. a GUI file manager).
> >> ---
> >> If the command was named 'ren', then I'd expect it to be dummer,
> >> but 'mv'/move seem like it should be able to move files from
> >> one dir into another.
> >>
> >> But you say posix wants it to perform as a rename?
> >> I know, create a 're' command (or 'rn') for rename, and have
> >> it do what 'mv' would do.  Maybe posix would realize it would
> >> be better to have re/rn behave like rename, and 'mv' to
> >> behave it was moving something.

The Austin Group (https://www.opengroup.org/austin/), which is in charge
of developing and maintaining the POSIX standard, is the place
to go when wanting to change things in POSIX (or add new things).

You can write to them, suggest a modification,
and if they change the standard, GNU coreutils will surely follow.

As for renaming 'mv' or creating a new 'rn' command -
part of POSIX is to codify existing behavior (that is - programs which
were in common use *before* POSIX).  It's not always logical, it's not always
ideal, but that's what has been in use for many years.

According to mv's Wikipedia page (https://en.wikipedia.org/wiki/Mv), 'mv' was
first introduced in 1971, 48 years ago.
With hindsight of nearly 5 decades it's easy to point to faults in a
program. If we were designing 'mv' today from scratch, I'm sure we would
improve many of its aspects.

But given that it is a long-standing program and its usage and quirks
are well established, I'm inclined to say it is highly unlikely
we will change mv's default behaviour or replace it with a different
name.

Adding new functionality (e.g. a new '--merge-directory' option)
is possible, and concrete patches are always welcomed.
However, given all the above, there is no guarantee that such a new option
will be accepted.
I still think that such specific features are better suited for more
sophisticated programs (whether GUI or command line).

regards,
 - assaf

bug#36901: Enhance directory and file moves where target already exists

2019-08-02 Thread Assaf Gordon

severity 36901 wishlist
retitle 36901 mv: merge directories where target already exists
stop

Hello,

(for context: this is a new topic, diverged at https://bugs.gnu.org/36831#38 )

For completeness, quoting your second message ( from 
https://bugs.gnu.org/36831#50 ):

On 2019-08-02 9:56 p.m., L A Walsh wrote:
> 
> On 2019/08/02 19:47, Assaf Gordon wrote:
>> Can new merging features be added to 'mv'? yes.
>> But it seems to me these would be better suited for 'higher level'
>> programs (e.g. a GUI file manager).
> ---
>   But neither the person who posted the original bug on this
> nor I are using a GUI, we are running 'mv' GUI, we use the cmd line on
> linux, so that wouldn't
> be of any use.
> 
> If the command was named 'ren', then I'd expect it to be dummer,
> but 'mv'/move seem like it should be able to move files from
> one dir into another.
> 
> But you say posix wants it to perform as a rename?
> I know, create a 're' command (or 'rn') for rename, and have
> it do what 'mv' would do.  Maybe posix would realize it would
> be better to have re/rn behave like rename, and 'mv' to
> behave it was moving something.
> 
> So if I have:
> mkdir A B
> touch A/foo B/fee
> 
> So when I look at the system call on linux for rename:
> oldpath can specify a directory.  In this case, newpath must
> either not
> exist, or it must specify an empty directory.
>  (complying with POSIX_C_SOURCE >= 200809L)
> 
> So move should give an error: Nope:
> 
> mv A B
>> tree B
> B
> ├── A
> │   └── foo
> └── fee
> 
> 1 directory, 2 files
> 
> So mv is violating POSIX - it didn't do the rename, but moved
> A under B and neither dir had to be empty.
> 
> Saying it has to follow POSIX when it doesn't appear to, seems
> a bit contradictory?
>

bug#36831: Enhance directory move. (was Re: bug#36831: enhance 'directory not empty' message)

2019-08-02 Thread Assaf Gordon


Hello,

On 2019-08-02 9:56 p.m., L A Walsh wrote:

On 2019/08/02 19:47, Assaf Gordon wrote:

Can new merging features be added to 'mv'? yes.
But it seems to me these would be better suited for 'higher level'
programs (e.g. a GUI file manager).

---
But neither the person who posted the original bug on this
nor I are using a GUI, we are running 'mv' GUI, we use the cmd line on linux, 
so that wouldn't
be of any use.


The original post was about the error *message*, asking to make it
clearer. That is the topic of this thread (and the previous patch),
so let's leave them at that.


I see you started a new thread ( https://bugs.gnu.org/36901 ),
so I'll reply there.

bug#36831: Enhance directory move. (was Re: bug#36831: enhance 'directory not empty' message)

2019-08-02 Thread Assaf Gordon

Hello,

On Fri, Aug 02, 2019 at 02:41:31AM -0700, L A Walsh wrote:
> On 2019/07/28 23:28, Assaf Gordon wrote:
> >
> >
> > $ mkdir A B B/A
> > $ touch A/bar B/A/foo
> > $ mv A B
> > mv: cannot move 'A' to 'B/A': Directory not empty
> >
> > And the reason (as you've found out) is that the target directory 'B/A'
> > is not empty (has the 'foo' file in it).
> > Had this been allowed, moving 'A' to 'B/A' would result in the 'foo'
> > file disappearing.
> >   
> ---
> Why must foo disappear?
> 
> Microsoft Windows handles this situation by telling the user that
> the target directory already exists and giving the option to *MERGE*
> the directories.
> 
> If you attempt to move a file into a directory that already contains
> a file by the same name, it pops up another notice asking [...]

Certainly, GUI programs (and more 'feature-rich' programs than 'mv')
offer many "merging" options.

I'm sure Midnight-Commander, KDE/Dolphin, XFCE/Thunar, Gnome/Nautilus
and many other free software GUI file managers have some "merging"
capabilities.

But 'mv' is more basic and does not have this capability.
Partly that is because it adheres to the POSIX standard, which
mandates:
"3. The mv utility shall perform actions equivalent to the
rename() function [...]"
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html

Some rsync options (--remove-source-files) can mimic 'mv' with merging,
but then they are more like "copy+delete" than actual "rename/move".

Can new merging features be added to 'mv'? yes.
But it seems to me these would be better suited for 'higher level'
programs (e.g. a GUI file manager).
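
For command-line users who want merging today, the "copy+delete" idea can
be sketched in plain shell. 'merge_dir' is a made-up helper name, not an
existing tool. It assumes a cp that treats 'SRC/.' as "the contents of
SRC" (GNU cp does). Unlike rename(2) it is not atomic, and it silently
overwrites same-named files.

```shell
# merge_dir SRC DEST: hypothetical helper, merges SRC into DEST/$(basename SRC).
merge_dir() {
  src=$1
  target=$2/$(basename "$1")
  mkdir -p "$target" &&
    cp -R "$src/." "$target/" &&   # copy the contents; same-named files are overwritten
    rm -r "$src"                   # delete the source only if the copy succeeded
}

# Usage, in a scratch directory:
tmp=$(mktemp -d)
mkdir "$tmp/A" "$tmp/B" "$tmp/B/A"
touch "$tmp/A/bar" "$tmp/B/A/foo"
merge_dir "$tmp/A" "$tmp/B"
ls "$tmp/B/A"                      # lists both 'bar' and 'foo'
```

Add '-p' to cp if timestamps and permissions should be preserved.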

regards,
 - assaf

bug#36831: enhance 'directory not empty' message

2019-08-01 Thread Assaf Gordon

On Thu, Aug 01, 2019 at 03:58:51PM -0700, Paul Eggert wrote:
> Thanks, that's better, but we're still missing some opportunities for 
> improvement.
> 
> > mv: cannot move 'A' to 'B/A': Target directory not empty
> 
> This should be "Destination" not "Target". 
[...] 
> You meant "mv" not "rm".
[...]
> > +static char*
> Space before "*".
[...]
> > +strerror_target (int e)
> Change name to "strerror_dest"
[...] 
> This function should return NULL instead of aborting when the errno value is
> inapplicable. That way, its callers need not hardcode which errno values it
> handles.

Thanks for the review and suggestions - attached an updated patch.

> Come to think of it, the same improvement should be made to ln, cp, install
> and shred. Basically, to any program that uses 'rename' or 'link' or similar
> syscalls, and which reports an error if the syscall fails.

OK, I will work on that next.

-assaf
From 8dc6158a6fde668e55312b5fb69384f438b7e55a Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Mon, 29 Jul 2019 00:23:20 -0600
Subject: [PATCH] mv: improve error messages when destination directory is at
 fault

Suggested by Alex Mantel in
https://bugs.gnu.org/36831 .

$ mkdir A B B/A
$ touch A/bar B/A/foo

Before:

$ mv A B
mv: cannot move 'A' to 'B/A': Directory not empty

After:

$ mv A B
mv: cannot move 'A' to 'B/A': Destination directory not empty

The following errors are handled:
EDQUOT, EEXIST, ENOTEMPTY, EISDIR, ENOSPC, ETXTBSY.

* src/copy.c (copy_internal): Print custom messages for errors
that explicitly fault the destination directory.
(strerror_dest): New function, return custom, translatable error
messages for errors relating to 'destination' component.
* tests/mv/dir2dir.sh: Adjust expected error message.
* NEWS: Mention change.
---
 NEWS                |  6 +
 src/copy.c          | 53 ++---
 tests/mv/dir2dir.sh |  8 ---
 3 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/NEWS b/NEWS
index fd0543351..3d80665ae 100644
--- a/NEWS
+++ b/NEWS
@@ -44,6 +44,12 @@ GNU coreutils NEWS -*- outline -*-
   stat(1) also supports a new --cached= option to control cache
   coherency of file system attributes, useful on network file systems.
 
+** Improvements
+
+  mv now prints clearer error messages when a failure relates to the
+  destination directory (e.g., "Destination directory not empty" instead
+  of "Directory not empty").
+
 
 * Noteworthy changes in release 8.31 (2019-03-10) [stable]
 
diff --git a/src/copy.c b/src/copy.c
index 65cf65895..602c8307b 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1867,6 +1867,44 @@ source_is_dst_backup (char const *srcbase, struct stat const *src_st,
   return dst_back_status == 0 && SAME_INODE (*src_st, dst_back_sb);
 }
 
+/* Return custom error messages replacing the default libc's
+   messages. These messages explicitly fault the destination component
+   in the error.
+
+   Return NULL if E (errno value) is not handled (and by implication
+   should use the system's default text for the error message).  */
+static char *
+strerror_dest (int e)
+{
+  /* TRANSLATORS: These strings should mimic libc's standard
+     error messages (from strerror(3)), but explicitly mention
+     the fault is with the destination directory. */
+  switch (e)
+    {
+    case EDQUOT:
+      return _("Disk quota exceeded on destination device");
+    case EEXIST:
+    case ENOTEMPTY:
+      return _("Destination directory not empty");
+    case EISDIR:
+      return _("Tried to overwrite a directory with a file");
+    case ENOSPC:
+      return _("No space left on destination device");
+    case ETXTBSY:
+      /* NOTE: The error is "Text file busy" - but "text" in that context
+         refers to "text segment" of an executable file (as opposed to
+         "data segment" and "BSS segment").
+
+         This error message is meant for users, and 'text file' can be easily
+         confused with an actual text file (i.e., one containing only ASCII
+         characters). Thus, say 'executable' instead of 'text'.  */
+      return _("Destination executable file is busy");
+    default:
+      return NULL;
+    }
+}
+
+
 /* Copy the file SRC_NAME to the file DST_NAME.  The files may be of
    any type.  NEW_DST should be true if the file DST_NAME cannot
    exist because its parent directory was just created; NEW_DST should
@@ -2477,9 +2515,18 @@ copy_internal (char const *src_name, char const *dst_name,
  If the permissions on the directory containing the source or
  destination file are made too restrictive, the rename will
  fail.  Etc.  */
-  error (0, rena

bug#36831: enhance 'directory not empty' message

2019-08-01 Thread Assaf Gordon

Hello,

On Wed, Jul 31, 2019 at 08:03:45PM -0700, Paul Eggert wrote:
> Assaf Gordon wrote:
> > An explicit error explicitly saying "cannot move", and mention the source and
> > destination, and also "blames" the target directory seems the most
> > user-friendly and least ambiguous.
> 
> Sure, but that handles only the ENOTEMPTY/EEXIST case. How would you handle
> the EDQUOT, EISDIR, and ENOSPC cases? Will you invent a separate diagnostic
> for each case, or just treat them as in my proposed patch? I assume the
> latter, but either way I'd like to see a patch that handles these properly
> too. Also, please handle ETXTBUSY while you're at it (sorry, I missed that
> one).
> 
> > For the second and third cases,
> > "No space" and "Quota exceeded" seem to me to always relate to the
> > destination, and I don't think users get confused about those
> > (other opinions of course welcomed).
> 
> What's obvious to experts like us is not always obvious to users. If users
> get confused by the current diagnostic for ENOTEMPTY/EEXIST, I don't see why
> they wouldn't also get confused for ETXTBUSY etc.
> 
> > Your patch also added "EISDIR", for which rename(2) says:
> >  "newpath is an existing directory, but oldpath is not a directory."
> > 
> > But I don't think this error can happen with gnu mv.
> 
> It can, as a result of a race condition if some other process is mutating
> the file system while 'mv' is running. Admittedly unlikely, but we might as
> well improve this errno value while we're improving the others.

All good points.

Please see attached updated version.

It does add an explicit error string for each error code, but I hope the
implementation is reasonable and easy to maintain and translate.

-assaf
From 8ee71b24d74d7cfe81f151de430d38935cf04675 Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Mon, 29 Jul 2019 00:23:20 -0600
Subject: [PATCH] mv: improve error messages when target directory is at fault

Suggested by Alex Mantel in
https://bugs.gnu.org/36831 .

$ mkdir A B B/A
$ touch A/bar B/A/foo

Before:

$ mv A B
mv: cannot move 'A' to 'B/A': Directory not empty

After:

$ mv A B
mv: cannot move 'A' to 'B/A': Target directory not empty

The following errors are handled:
EDQUOT, EEXIST, ENOTEMPTY, EISDIR, ENOSPC, ETXTBSY.

* src/copy.c (copy_internal): Print custom messages for errors
that explicitly fault the target directory.
(strerror_target): New function, return custom and translatable error
messages.
* tests/mv/dir2dir.sh: Adjust expected error message.
* NEWS: Mention change.
---
 NEWS                |  6 +
 src/copy.c          | 56 ++---
 tests/mv/dir2dir.sh |  6 ++---
 3 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/NEWS b/NEWS
index fd0543351..4ec4d0df0 100644
--- a/NEWS
+++ b/NEWS
@@ -44,6 +44,12 @@ GNU coreutils NEWS -*- outline -*-
   stat(1) also supports a new --cached= option to control cache
   coherency of file system attributes, useful on network file systems.
 
+** Improvements
+
+  rm now prints clearer error messages when a failure relates to the
+  target directory (e.g., "Target directory is not empty" instead of
+  "Directory not empty").
+
 
 * Noteworthy changes in release 8.31 (2019-03-10) [stable]
 
diff --git a/src/copy.c b/src/copy.c
index 65cf65895..9cf02ad9c 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1867,6 +1867,38 @@ source_is_dst_backup (char const *srcbase, struct stat const *src_st,
   return dst_back_status == 0 && SAME_INODE (*src_st, dst_back_sb);
 }
 
+static char*
+strerror_target (int e)
+{
+  /* TRANSLATORS: These strings should mimic libc's standard
+     error messages (from strerror(3)), but explicitly mention
+     the fault is with the target directory. */
+  switch (e)
+    {
+    case EDQUOT:
+      return _("Disk quota exceeded on target device");
+    case EEXIST:
+    case ENOTEMPTY:
+      return _("Target directory not empty");
+    case EISDIR:
+      return _("Tried to overwrite a directory with a file");
+    case ENOSPC:
+      return _("No space left on target device");
+    case ETXTBSY:
+      /* NOTE: The error is "Text file busy" - but "text" in that context
+         refers to "text segment" of an executable file (as opposed to
+         "data segment" and "BSS segment").
+
+         This error message is meant for users, and 'text file' can be easily
+         confused with an actual text file (i.e., one containing only ASCII
+         characters). Thus, say 'executable' instead of 'text'.  */
+      return _("Target executable file is busy");
+    default:
+      assert (0);
+    }
+}
+
+
 /* Co

bug#36831: enhance 'directory not empty' message

2019-07-31 Thread Assaf Gordon

Hello Paul,

On Mon, Jul 29, 2019 at 06:50:46PM -0500, Paul Eggert wrote:
> On 7/29/19 1:28 AM, Assaf Gordon wrote:
> > +  if (rename_errno == ENOTEMPTY || rename_errno == EEXIST)
> > +{
> > +  error (0, 0, _("cannot move %s to %s: Target directory not empty"),
> > + quoteaf_n (0, src_name), quoteaf_n (1, dst_name));
> 
> Although this is an improvement, it is not general enough, as other errno
> values are relevant only for the destination. Better would be to have a
> special case for errno values that matter only for the destination, and use
> the existing code for errno values where we don't know whether the problem
> is the source or the destination. Something like the attached, say.

> +case EDQUOT: case EEXIST: case EISDIR: case ENOSPC: case ENOTEMPTY:
> +  error (0, rename_errno, "%s", quotearg_colon (dst_name));
> +  break;
> +
> +

Thanks for the review.

At the risk of bikeshedding, I'd like to argue for the prior method.
While it is not general enough, I think it provides a clearer error message.

For example, with the more general implementation the errors would be:

  $ mv A B
  mv: B/A: Directory not empty

  $ mv A B
  mv: B/A: No space left on device

  $ mv A B
  mv: B/A: Quota exceeded

In the first case,
I think this error is potentially more confusing than
before: while it doesn't mention the source directory, it also doesn't
say "cannot move" - so it is only implied it is an error (an
inexperienced user might dismiss this as a warning).

Also, there could be a source directory named very similarly
to the destination directory, and from a quick glance it would not be easy to
understand what happened.

An explicit error explicitly saying "cannot move", and mention the source and
destination, and also "blames" the target directory seems the most
user-friendly and least ambiguous.

---

For the second and third cases,
"No space" and "Quota exceeded" seem to me to always relate to the
destination, and I don't think users get confused about those
(other opinions of course welcomed).

---

Your patch also added "EISDIR", for which rename(2) says:
 "newpath is an existing directory, but oldpath is not a directory."

But I don't think this error can happen with gnu mv.
If we try to move a file onto a directory, we get:

  $ mkdir C C/D ; touch D
  $ mv D C
  mv: cannot overwrite directory 'C/D' with non-directory

And this case is specifically handled in copy.c line 2131, before
calling rename(2)  (and also this is an example of a custom error
message instead of using stock libc messages).

---

Happy to hear your opinion,
 - assaf

bug#36831: enhance 'directory not empty' message

2019-07-29 Thread Assaf Gordon

Hello,

On Sun, Jul 28, 2019 at 08:58:59PM +0200, Alex Mantel wrote:
[...] 
> Ah, the target directory does exist! Hmm... But i'd like the message to be
> like:
> 
>    $ mv thing/ ../things
>    mv: cannot move 'thing' to '../things/things': Targetdirectory not empty
> 
>   ^ this little thing here,
>     it explains everyting.
> 
> Change text from 'Directory not empty' to 'Targetdirectory not empty'.

Thanks for the report.

To clarify, the scenario is:

$ mkdir A B B/A
$ touch A/bar B/A/foo
$ mv A B
mv: cannot move 'A' to 'B/A': Directory not empty

And the reason (as you've found out) is that the target directory 'B/A'
is not empty (has the 'foo' file in it).
Had this been allowed, moving 'A' to 'B/A' would result in the 'foo'
file disappearing.
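
For anyone who wants to reproduce this safely, here is a small sketch that
runs the scenario in a scratch directory and confirms that nothing is lost
when 'mv' refuses. The function name is made up, and the error text is
discarded because it varies between systems:

```shell
# Hypothetical demo: mv refuses to replace a non-empty directory.
demo_not_empty() {
  tmp=$(mktemp -d) || return 1
  (
    cd "$tmp" || exit 1
    mkdir A B B/A
    touch A/bar B/A/foo
    if mv A B 2>/dev/null; then
      echo "mv succeeded"
    else
      echo "mv failed"
    fi
    # After the failed move, both files are still where they started.
    [ -e A/bar ] && [ -e B/A/foo ] && echo "A/bar and B/A/foo are both intact"
  )
  rm -rf "$tmp"
}

demo_not_empty
```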

---

How is a user expected to know this error is about the target
directory?

There is a bit of a trade-off here between user-friendliness (especially
for non-technical users) and more technical knowledge.
If we go one step 'lower' to the programming interface, almost all
sources mention this is about the 'target' directory not being empty:

POSIX says:
https://pubs.opengroup.org/onlinepubs/009695399/functions/rename.html
[EEXIST] or [ENOTEMPTY]
The link named by new is a directory that is not an empty directory.

Linux's rename(2) manual page says:
ENOTEMPTY or EEXIST
newpath is a nonempty directory, that is, contains entries
other than "." and "..".

FreeBSD's rename(2) manual page says:
[ENOTEMPTY]  The to argument is a directory and is not empty.

AIX rename(2) manual page says:
 ENOTEMPTY
   The ToPath parameter specifies an existing directory that is
   not empty.

So there is some merit in claiming this helpful piece of information is
lost when the error message is reported to the user.

---

In GNU coreutils this error message originates from 'copy.c' line 2480:
https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/copy.c#n2480

error (0, rename_errno,
  _("cannot move %s to %s"),
  quoteaf_n (0, src_name), quoteaf_n (1, dst_name));

And herein lies the (technical) problem: The actual message "Directory
not empty" is not in the source code - it is a system error message
that corresponds to the value of 'rename_errno' variable
(ENOTEMPTY/EEXIST). It originates from glibc (or another libc).

So there is no trivial way to change the error message in coreutils.

Attached a patch to add special handling for this error.

---

What do others think? If this is a desired improvement, I'll finish the
patch with news/tests/etc.

regards,
 - assaf
From 430b30104234db719bf15e6fc681a62312c7124f Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Mon, 29 Jul 2019 00:23:20 -0600
Subject: [PATCH] mv: improve ENOTEMPTY/EEXIST error message

Suggested by Alex Mantel in
https://bugs.gnu.org/36831 .

$ mkdir A B B/A
$ touch A/bar B/A/foo

Before:

$ mv A B
mv: cannot move 'A' to 'B/A': Directory not empty

After:

$ mv A B
mv: cannot move 'A' to 'B/A': Target directory not empty

* src/copy.c (copy_internal): Add special handling for ENOTEMPTY/EEXIST.
TODO: NEWS, tests.
---
 src/copy.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/copy.c b/src/copy.c
index 65cf65895..a5af570bf 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -2450,6 +2450,14 @@ copy_internal (char const *src_name, char const *dst_name,
   return true;
 }

+  if (rename_errno == ENOTEMPTY || rename_errno == EEXIST)
+{
+  error (0, 0, _("cannot move %s to %s: Target directory not empty"),
+ quoteaf_n (0, src_name), quoteaf_n (1, dst_name));
+  forget_created (src_sb.st_ino, src_sb.st_dev);
+  return false;
+}
+
   /* WARNING: there probably exist systems for which an inter-device
  rename fails with a value of errno not handled here.
  If/as those are reported, add them to the condition below.
-- 
2.11.0

bug#36674: Sort Suggestion

2019-07-15 Thread Assaf Gordon

tag 36674 notabug
close 36674
stop

Hello,

On Mon, Jul 15, 2019 at 11:42:01AM -0700, Marshall Lake wrote:
> Even though this isn't a bug, I was asked to send the following to this
> email address.

(General suggestions and discussions are better suited for
coreut...@gnu.org mailing list, that way the system won't open a new
bug item.)

> 
> Re:  SORT Command from GNU coreutils 8.25
> 
> A suggestion for an additional option to the SORT command is to ignore
> non-alphanumeric characters.
> 
> As an example, in attempting to sort an index ...
> 
> Abbott, William259
> 
> sorts before:
> 
> Abbot, William 099
> 
> If non-alphanumeric characters were ignored then the same two records
> would sort as:
> 
> Abbot, William 099
> Abbott, William259
> 
> 

There's actually something else at play here:
In your case, sort does ignore non-alphanumeric characters,
but it ALSO ignores white space.
That happens because your locale is set to some language
(for example, en_US.UTF8).

Using such a locale makes sort ignore all non-alphanumeric characters,
whitespace, and upper/lower case.

In essence, you are comparing "AbbottWilliam" (two 't's) to
'AbbotWilliam' (one 't') - and then the second 't' is compared to a 'w',
and is determined to come first.

If you force a POSIX/C locale, then all characters are considered,
and the result will be as you requested.

Observe the following:

  $ printf "%s\n" AbbottWilliam AbbotWilliam | LC_ALL=en_CA.utf8 sort
  AbbottWilliam
  AbbotWilliam

  $ printf "%s\n" "Abbott William" "Abbot William" | LC_ALL=en_CA.utf8 sort
  Abbott William
  Abbot William

  $ printf "%s\n" "Abbott William" "Abbot William" | LC_ALL=C sort
  Abbot William
  Abbott William

  $ printf "%s\n" "Abbott, William" "Abbot, William" | LC_ALL=C sort
  Abbot, William
  Abbott, William

Note that 'sort' already has an option for dictionary style sorting:
   -d, --dictionary-order: consider only blanks and alphanumeric characters.

However, locale rules take precedence over it, so effectively it only
works in "C" locale:

  $ printf "%s\n" "Ab,,b,,ott William" "Abbot William" | LC_ALL=C sort
  Ab,,b,,ott William
  Abbot William

  $ printf "%s\n" "Ab,,b,,ott William" "Abbot William" | LC_ALL=C sort -d
  Abbot William
  Ab,,b,,ott William

You can read past discussions about the confusion resulting from locale
sorting rules here:
   https://debbugs.gnu.org/11621
   https://debbugs.gnu.org/12783

As such, I'm closing this as "not a bug", but discussion can continue
by replying to this thread.

-assaf

bug#36671: tail: unrecognized file system type 0x794c7630 for ‘/var/log/messages’. please report this to bug-coreutils@gnu.org. reverting to polling

2019-07-15 Thread Assaf Gordon

tag 36671 notabug
close 36671
stop

Hello,

On Mon, Jul 15, 2019 at 06:22:47PM +0200, John Koppolu wrote:
> tail: unrecognized file system type 0x794c7630 for ‘/var/log/messages’.
> please report this to bug-coreutils@gnu.org. reverting to polling

You've previously reported this 4 days ago,
please see the reply there:
  https://bugs.gnu.org/36600#8

-assaf

bug#36600: unrecognized file system type 0x794c7630 for ‘/var/log/messages’. please report this to bug-coreutils@gnu.org. reverting to polling

2019-07-11 Thread Assaf Gordon

tag 36600 notabug
close 36600
stop

Hello,

On Thu, Jul 11, 2019 at 05:53:16PM +0200, John Koppolu wrote:
> unrecognized file system type 0x794c7630 for ‘/var/log/messages’. please
> report this to bug-coreutils@gnu.org. reverting to polling
>

This file system (overlayfs, commonly used with Docker containers) was
added in version 8.25. Consider upgrading Coreutils if possible.

See https://www.gnu.org/software/coreutils/filesystems.html for
more details.
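
If you want to check which file system a path lives on (and whether your
coreutils knows it), 'stat -f' can print the type. This is a sketch that
assumes GNU stat; 'fs_type' is just an illustrative wrapper name:

```shell
# Hypothetical wrapper around GNU 'stat -f'.
# %T prints the file system type by name, %t prints the same type in hex.
fs_type() {
  stat -f -c '%T (0x%t)' -- "${1:-.}"
}

fs_type /
```

Running 'fs_type /var/log/messages' inside the container shows the type
name and the hex value for the file system holding that file.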

regards,
 - assaf

bug#35939: version sort is incorrect with hyphen-minus

2019-06-26 Thread Assaf Gordon

Hello Paul,

On Wed, Jun 26, 2019 at 12:57:14PM -0700, Paul Eggert wrote:
> GNU sort uses the same algorithm as glibc strverscmp,

I think that both sort and ls use 'filevercmp' - a simplified version
that does not support locales (and doesn't fail).

The change (from 'strverscmp') was made in:

  commit e505736f8211a608b00dfe75fb186a5211e1a183
  Author: Kamil Dudka 
  Date:   Fri Oct 3 11:03:40 2008 +0200
  ls and sort: use filevercmp instead of strverscmp

https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=e505736f8211a608b00dfe75fb186a5211e1a183

> Has the Debian version-comparison algorithm changed since 1997? If so, could
> you give details about the changes to the Debian algorithm?

I don't think the algorithm changed in Debian,
and also in gnulib there are only a handful of relevant commits, all 10
years old:

  9121662f1 2008-10-03 filevercmp: new module
  0443c2f39 2009-03-05 filevercmp: Move hidden files up in ordering.
  1721cf06d 2009-03-24 filevercmp: handle simple~ and numbered.~3~ backup suffixes
  4fd008794 2009-04-09 filevercmp: fix regression
  cc96df30d 2009-04-09 filevercmp: correct today's change

I think (also based on Ian's confirmation) that this discrepancy has been
there from the beginning.

I now notice that there's an additional difference: coreutils/gnulib has
special handling for extensions, hidden files and backup files.

As Ian wrote, a documentation improvement is probably the best fix.
I'll try to come up with a suggested change.

-assaf

P.S.

For completeness, here are a few other threads with details/explanations
about 'version-sort':
https://bugs.gnu.org/18168
https://bugs.gnu.org/22275
https://bugs.gnu.org/22455
https://bugs.gnu.org/33786

bug#35939: version sort is incorrect with hyphen-minus

2019-06-26 Thread Assaf Gordon

(Adding Ian Jackson for dpkg/debian-version details)

Hello,

On Tue, May 28, 2019 at 02:53:39AM +0200, Vincent Lefevre wrote:
> With GNU coreutils 8.30 under Debian/unstable, I get:
> 
> $ LC_ALL=C ls
> ab-cd  abb  abe
> $ LC_ALL=C ls -v
> abb  abe  ab-cd
> 
> The hyphen-minus character should still be regarded as being less
> than the letters (there are no digits, so both are expected to be
> equivalent). The GNU coreutils manual says:
> 
[...]

Thanks for the report and the clear details.

To summarize,
"ls -v" and "sort -V" (coreutils' version sort) behave differently from
other implementations with regard to the minus character:

$ printf "%s\n" abb ab-cd | sort -V
abb
ab-cd

$ v1="abb"
$ v2="ab-cd"
$ dpkg --compare-versions "$v1" lt "$v2" && printf "$v1\n$v2\n" \
    || printf "$v2\n$v1\n"
ab-cd
abb

If I understand correctly,
the reason is that in Debian's version comparison algorithm [1], the minus
character has a special meaning: it separates the "upstream version"
part from the "debian revision" part.

In Debian's implementation [2], a version string is first split into three
parts (epoch, upstream version, debian revision) using ":" as the epoch
delimiter and "-" as the revision delimiter. Only then are the three parts
compared, separately [3].

[1] https://www.debian.org/doc/debian-policy/ch-controlfields.html#version
[2] https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/lib/dpkg/parsehelp.c#n191
[3] https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/lib/dpkg/version.c#n140

On the other hand, coreutils' implementation (from gnulib [4]) does not
break the version string into three parts - it treats the entire string as a
single "upstream version" part.
The rules for sorting the "upstream version" string say:

  "... The lexical comparison is a comparison of ASCII values modified so
  that all the letters sort earlier than all the non-letters and so that a
  tilde sorts before anything" (from [1])

[4] https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/filevercmp.c

Therefore, dpkg first separates "ab" from "cd", then compares "ab" to
"abb" - and "ab" comes first.
coreutils compares "ab-cd" to "abb" (or technically, just "ab-" to
"abb"), and because "letters sort earlier than all the non-letters", "abb"
comes first.
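
The two orderings can be put side by side with the file names from the
report. A small sketch; the comments describe the behavior discussed above
for coreutils 8.30, and I have not checked other versions:

```shell
# Byte order (C locale): '-' (0x2d) sorts before 'b' (0x62),
# so "ab-cd" comes first.
printf '%s\n' abb abe ab-cd | LC_ALL=C sort

# Version order: letters sort before non-letters,
# so "ab-cd" comes last.
printf '%s\n' abb abe ab-cd | LC_ALL=C sort -V
```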

I hope this helps explain the differences (I also hope this explanation is
correct, and I invite others to chime in).

regards,
 - assaf

bug#35654: We've found a vulnerability of gnu chown, please check it and request a cve id for us.

2019-06-26 Thread Assaf Gordon

tag 35654
close 35654
stop

Hello,

On Thu, May 09, 2019 at 11:53:11PM +0800, st0n3 ss wrote:
> Hello! we have found a vulnerability of command chown, please check it.If
> it is a vulnerability. please request a cve id for use, thank you!chown -h
> bypass

Given Paul's and Bob's detailed answers, I'm closing this as "not a bug".

Discussion can continue by replying to this thread.

regards,
 - assaf

bug#36130: split bug

2019-06-26 Thread Assaf Gordon

tag 36130 notabug
close 36130
stop

Hello,

On Mon, Jun 10, 2019 at 04:50:20PM -0600, Assaf Gordon wrote:
> On 2019-06-10 12:28 p.m., Heather Wick wrote:
> > Verbose: This seems to have made the same number of files this time; not
> > sure why the other 3-4 times I ran it it did not. They appear to be the
> > same size, with paired last reads
> [...]
> 
> Glad to hear it worked.
> 
> Could it be that in previous times the queued job ran out of disk space?
> 
> That would be my first guess, as such things are common in shared
> grid/cluster environments, particularly if your job runs in a temporary
> and limited storage location (e.g. "/tmp/job-").


With no further comments, I'm closing this ticket.
If more issues arise (or this was not an adequate solution) we can always
re-open this ticket.

regards,
 -assaf

bug#35632: date Parse of '13:00 + 2 hours' Broken.

2019-06-26 Thread Assaf Gordon

tag 35632 notabug
close 35632
stop

Hello,

(sorry for the delayed reply)

On Wed, May 08, 2019 at 12:57:10PM +0100, Ralph Corderoy wrote:
> 
> Using date from coreutils 8.31-1 on Arch Linux.
> This surprised me.
> 
> $ TZ=UTC0 /bin/date -d '1pm + 2 hours'
> Wed  8 May 15:00:00 UTC 2019
> $ TZ=UTC0 /bin/date -d '13:00 + 2 hours'
> Wed  8 May 12:00:00 UTC 2019
> 
> The documentation doesn't suggest `1pm' and `13:00' are treated
> differently.  `--debug' helps.
> 
> $ TZ=UTC0 /bin/date --debug -d '1pm + 2 hours'
> date: parsed time part: 01:00:00pm
> date: parsed relative part: +2 hour(s)
> ...
> $ TZ=UTC0 /bin/date --debug -d '13:00 + 2 hours'
> date: parsed time part: 13:00:00 UTC+02
> date: parsed relative part: +1 hour(s)
> date: input timezone: parsed date/time string (+02)
> ...
> 
> It looks like parsing is broken in the second case.

Thank you for providing detailed output with "--debug",
it makes things easier to troubleshoot.

When encountering a time string (HH:MM or HH:MM:SS) followed by a plus
sign and a number, date's parser *always* treats it as a timezone
(giving timezones higher priority than time adjustments).


> The result I wanted can also be obtained my omitting the `+'.
> 
> $ TZ=UTC0 /bin/date -d '1pm 2 hours'
> Wed  8 May 15:00:00 UTC 2019
> $ TZ=UTC0 /bin/date -d '13:00 2 hours'
> Wed  8 May 15:00:00 UTC 2019

And this is indeed one possible solution.

Other similar issues are detailed here:
https://lists.gnu.org/archive/html/bug-coreutils/2018-10/msg00126.html
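
When the adjustment is computed by a script, another way to sidestep the
ambiguity is to do the arithmetic on epoch seconds, so that nothing that
looks like a time zone ever reaches the parser. This is only a sketch:
add_seconds is a made-up helper name, and UTC is assumed throughout.

```shell
# add_seconds TIMESTAMP SECONDS   (hypothetical helper, assumes UTC)
# Converts TIMESTAMP to epoch seconds, adds SECONDS, prints the result.
add_seconds() {
  base=$(TZ=UTC0 date -d "$1" +%s) || return 1
  TZ=UTC0 date -d "@$((base + $2))" '+%Y-%m-%d %H:%M:%S'
}

# 13:00 plus two hours (7200 seconds) prints 2019-05-08 15:00:00
add_seconds '2019-05-08 13:00' 7200
```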

As such, I'm closing this ticket, but discussion can continue by
replying to this thread.

regards,
 - assaf

bug#36383: date command processes timezone differently when doing math

2019-06-26 Thread Assaf Gordon

tag 36383 notabug
close 36383
stop

Hello,

On Tue, Jun 25, 2019 at 04:10:07PM -0700, Brian Woods wrote:
> When doing a math operation to a date command it appear to process the
> timezone differently.
[...]
>
> #echo $datNow
> 2019-06-25 15:21:34
>
> #date -d "$datNow + 1 minute" "+%Y-%m-%d %H:%M:%S" --debug
> date: parsed date part: (Y-M-D) 2019-06-25
> date: parsed time part: 15:21:34 UTC+01
> date: parsed relative part: +1 minutes
> date: input timezone: parsed date/time string (+01)

Thank you for providing detailed examples with "--debug",
makes things much easier to troubleshoot.

The issue is that a time string (HH:MM:SS) followed by a plus
sign and a number is *always* taken to be a time zone.

Using a value other than 1 will show it more clearly:

  $ date -d "$datNow + 8 minutes" "+%Y-%m-%d %H:%M:%S" --debug
  date: parsed date part: (Y-M-D) 2019-06-25
  date: parsed time part: 15:21:34 UTC+08
  date: parsed relative part: +1 minutes
  date: input timezone: parsed date/time string (+08)

The "+8" part is treated as timezone,
and the remaining text ("minutes") is taken as a one-minute time
adjustment.

One solution is to just remove the plus sign:

  $ date -d "$datNow 8 minutes" "+%Y-%m-%d %H:%M:%S" --debug
  date: parsed date part: (Y-M-D) 2019-06-25
  date: parsed time part: 15:21:34
  date: parsed relative part: +8 minutes
  date: input timezone: system default
  [...]
  2019-06-25 15:29:34

Another is to specify the time zone:

  $ date -d "$datNow +00:00 +8 minutes" "+%Y-%m-%d %H:%M:%S" --debug
  date: parsed date part: (Y-M-D) 2019-06-25
  date: parsed time part: 15:21:34 UTC+00
  date: parsed relative part: +8 minutes
  date: input timezone: parsed date/time string (+00)
  [...]
  2019-06-25 09:29:34


More examples of adjusting time strings are here (your example is similar
to case #1):
https://lists.gnu.org/archive/html/bug-coreutils/2018-10/msg00126.html
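
For a reproduction that does not depend on the local time zone, the two
spellings can be compared with TZ pinned to UTC. A small sketch; the
variable names are arbitrary:

```shell
ts='2019-06-25 15:21:34'
fmt='+%Y-%m-%d %H:%M:%S'

# "+ 8" directly after the time is taken as a time zone, and only
# the word "minutes" (one minute) is left as the relative part.
with_plus=$(TZ=UTC0 date -d "$ts + 8 minutes" "$fmt")

# Without the plus sign, "8 minutes" is the relative part.
without_plus=$(TZ=UTC0 date -d "$ts 8 minutes" "$fmt")

printf 'with plus:    %s\nwithout plus: %s\n' "$with_plus" "$without_plus"
```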

As such, I'm closing this ticket but discussion can continue by replying
to this thread.

regards,
 - assaf

bug#36130: split bug

2019-06-10 Thread Assaf Gordon

Hello,

On 2019-06-10 12:28 p.m., Heather Wick wrote:
> Thank you so much for your response. Here are the results of the tests
> you sent:
>
> Verbose: This seems to have made the same number of files this time; not
> sure why the other 3-4 times I ran it it did not. They appear to be the
> same size, with paired last reads

[...]

Glad to hear it worked.

Could it be that in previous times the queued job ran out of disk space?

That would be my first guess, as such things are common in shared
grid/cluster environments, particularly if your job runs in a temporary
and limited storage location (e.g. "/tmp/job-").

I would suspect that the exit-code you are seeing is the exit code
of the entire job (that is - of the shell script that is being qsub'd),
and not necessarily that of 'split' (then again, this might not be 
correct if you explicitly checked the exit code of 'split').

Given that your grid environment already has configuration issues
(the bash and "module" related errors), I would not be surprised if
the exit code is not reliable.

I would strongly encourage you to always look into the STDERR file
of the job to verify no other errors occurred.

Or, perhaps write shell scripts more defensively, like so:

  [...]
  zcat MH1_R1.fastq.gz | split -l 4000 - DHT_R1_ \
    && echo split MH1_R1 OK \
    || echo split MH1_R1 FAILED
  [...]

Then check STDOUT for positive confirmation that each program succeeded.
Or perhaps:

  # define a shell function "die" to print an error and terminate
  die()
  {
    base=$(basename "$0")
    echo "$base: error: $*" >&2
    exit 1
  }

  zcat MH1_R1.fastq.gz | split -l 4000 - DHT_R1_ \
    || die "split MH1_R1 failed"

And then run at least one job that will fail on purpose,
and ensure you see the error message in the STDERR log,
and you get a non-zero exit code (and then ensure you use 'die'
on every command).
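
Such a deliberate failure could look roughly like this. It is only a
sketch: 'false' stands in for the failing program, the log file is a
temporary file instead of the job's real STDERR log, and 'basename --' is
used so that an odd "$0" is not mistaken for an option.

```shell
# print an error and terminate, as in the 'die' function above
die()
{
  base=$(basename -- "$0")
  echo "$base: error: $*" >&2
  exit 1
}

errlog=$(mktemp)

# 'false' always fails; the subshell keeps 'exit' from ending this script
( false || die "deliberate failure" ) 2> "$errlog"
status=$?

echo "exit code: $status"
cat "$errlog"
```

If the exit code is not 1 here, or the message is missing from the log,
then the job's exit status cannot be trusted either.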

It is sometimes recommended to use "set -e" for "easy"
error handling in shell scripts, but I would recommend against it.
Many reasons detailed here: https://mywiki.wooledge.org/BashFAQ/105

It might be more frustrating to add such extra checks on every
program, but from my humble experience, grid environments bring
on so many more intermittent and transient problems that it is
definitely worth it.

> STDERR:
> The only thing in the stderr file is an odd duck of:
>
> -sh: module: line 1: syntax error: unexpected end of file
> -sh: error importing function definition for `BASH_FUNC_module'
> Python 3.6.8 :: Anaconda, Inc.
> /bin/sh: module: line 1: syntax error: unexpected end of file
> /bin/sh: error importing function definition for `BASH_FUNC_module'
>
> but this prints for every job I run with this particular flavor of
> conda/bash and doesn't seem to affect anything else (as far as I know)

These errors are specific to your grid/cluster environment,
and the best place to ask is the IT or bioinformatics department in
your institute (whoever is in charge of the cluster).

Broadly speaking, "module" is a mechanism that eases the use of
various software packages. It is usually set up by your IT administrators.
A typical use-case is to have different versions of programs in
non-standard locations, e.g.
   samtools version 1.6 in /opt/it/programs/samtools-1.6
 and
   samtools version 1.9 in /opt/bioinfo/tools/new/samtools/

and then cluster users (e.g. you) just need to add:
   "module load samtools-1.9"
and have the command "samtools" just work without knowing
the gritty details of where the program is.

It seems that in your case, something relating to the "module"
setup is broken.

More information here: 
https://en.wikipedia.org/wiki/Environment_Modules_(software)

> All jobs finished well below allotted memory and with exit status 0,
> even when split didn't make the right number of output files.
>
> Do you know any reason why the behavior would be inconsistent?

The "allotted memory" is a non-issue for this "split" command:
it will always use very little memory regardless of how big
the input files are.

As for "exit status 0" - I can't be sure, but I suspect the exit status
you see is the one of the entire job (i.e. the shell script),
and perhaps it does not represent the exit code of the "split" program.

If you have the STDERR files of the jobs which failed, it's worth
checking them for any additional error messages.

> Pairing check: unfortunately my server's version of bash doesn't support
> paste in this way, I've run into this issue before but I forget what the
> workaround is. I can't run this command interactively because my server
> times out (these files are > 3 billion lines each, so it takes a long
> time to zcat them)

Ah yes, the construct:

   program <(other program)

is a "bash" feature that is not available in simple shell scripts
(interactive use vs non-interactive and other things).

One work-around is to run (from inside your script):

  bash -c "paste <(zcat MH1_R1.fastq.gz) <(zcat MH1_R2.fastq.gz)" \
   | awk 'NR%4!=1
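
Another option that needs no bash at all is to use named pipes. A rough
sketch, assuming 'mkfifo' is available; the tiny FASTQ files and all names
below are made up so the example runs on its own (in a real job the two
'cat' commands would be 'zcat' on the R1 and R2 files):

```shell
tmp=$(mktemp -d)

# two made-up paired records per file
printf '%s\n' '@read1 1:N:0' ACGT + IIII '@read2 1:N:0' TTGA + IIII > "$tmp/R1.fastq"
printf '%s\n' '@read1 2:N:0' CCGT + IIII '@read2 2:N:0' GGTA + IIII > "$tmp/R2.fastq"

# the writers run in the background and feed the pipes
mkfifo "$tmp/p1" "$tmp/p2"
cat "$tmp/R1.fastq" > "$tmp/p1" &
cat "$tmp/R2.fastq" > "$tmp/p2" &

# header lines are every 4th line; $1 and $3 are the two sequence IDs
mismatches=$(paste "$tmp/p1" "$tmp/p2" \
  | awk 'NR%4!=1 { next } $1!=$3 { print "Error in line " NR ": " $1 " vs " $3 }')
wait

echo "mismatches: ${mismatches:-none}"
rm -r "$tmp"
```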

bug#36130: split bug

2019-06-07 Thread Assaf Gordon

Hello,

On Fri, Jun 07, 2019 at 09:48:44PM -0400, Heather Wick wrote:
> Yes, sorry, I should have specified that I already checked that the
> original fastq files are indeed paired and sorted with the same number of
> lines and same starting/ending IDs, narrowing down the issue to a problem
> with split.

It could be a problem with "split", but we'll need to dig a bit deeper
to be able to pinpoint the exact issue.

Could you please try the following commands and post the results?

zcat MH1_R1.fastq.gz \
   | split --verbose -l 4000 - DHT_R1_ > DHT_R1.log ; echo DHT_R1 exit code: $?
zcat MH1_R2.fastq.gz \
   | split --verbose -l 4000 - DHT_R2_ > DHT_R2.log ; echo DHT_R2 exit code: $?
wc -l DHT_R1.log DHT_R2.log

Two more questions:
1. Can you post the result of "split --version"?
2. You mentioned "jobs" - if you are running these as submitted jobs on
a cluster (e.g. with "qsub"), can you double-check the STDERR log files
to ensure no errors were encountered?

If we still can't pinpoint the issue, the next steps would be to check
the DHT_R{1,2}.log files, and then try to compare the content of the
split files.

I assume the input files are indeed correctly paired, but just to check,
if you could try the following command, it should not print anything
to the screen (indicating all sequence IDs are paired):

paste <(zcat MH1_R1.fastq.gz) <(zcat MH1_R2.fastq.gz) \
   | awk 'NR%4!=1 { next }
          $1!=$3 { print "Error in line " NR ":" $1 " vs " $3 }'

regards,
 - assaf

bug#36130: split bug

2019-06-07 Thread Assaf Gordon

Hello,

On Fri, Jun 07, 2019 at 02:23:15PM -0400, Heather Wick wrote:
> I am using split to split up some large, paired fastq files [...]:
>
>   zcat MH1_R1.fastq.gz | split - -l 4000 DHT_R1_
>   zcat MH1_R2.fastq.gz | split - -l 4000 DHT_R2_
>
> This creates 96 chunks for the R1 and 95 chunks for R2, even though the
> orignal fastq files have the same number of reads.
>
> Do you have any suggestions for how to proceed? Perhaps zcatting and piping
> the files is not the best way to call split?

To help diagnose the issue, please run the following commands
and tell us the results:

1. number of lines in each file:

   zcat MH1_R1.fastq.gz | wc -l
   zcat MH1_R2.fastq.gz | wc -l

2. The first two sequence IDs:

   zcat MH1_R1.fastq.gz | head -n8 | grep ^@
   zcat MH1_R2.fastq.gz | head -n8 | grep ^@

3. Last two sequence IDs:

   zcat MH1_R1.fastq.gz | tail -n8 | grep ^@
   zcat MH1_R2.fastq.gz | tail -n8 | grep ^@

These will just verify the FASTQ files are indeed paired with no
surprises. The files should have the same number of lines,
and matching sequence IDs in the first and last lines.
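
As a cross-check, the number of chunks that 'split -l' should produce
follows directly from the line count, rounded up. A small sketch with
generated input; the helper name expected_chunks and the temporary
directory are only for illustration:

```shell
# expected_chunks TOTAL_LINES LINES_PER_CHUNK   (hypothetical helper)
# Ceiling division: a partial last chunk still counts as one file.
expected_chunks() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

tmp=$(mktemp -d)
seq 10000 | split -l 4000 - "$tmp/chunk_"   # 4000 + 4000 + 2000 lines
actual=$(ls "$tmp" | wc -l)

echo "expected: $(expected_chunks 10000 4000), actual: $actual"
rm -r "$tmp"
```

With the real data, the total would come from the "wc -l" commands in
step 1; if both files report the same count, both should yield the same
number of chunks.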

regards,
 - assaf

bug#35587: sort order wrt lower/upper case

2019-05-05 Thread Assaf Gordon


tags 35587 notabug
close 35587
stop

Hello,

On 2019-05-05 1:01 p.m., Toralf Förster wrote:

I'd expect "B" being the first line here:

echo a B c d | xargs -n 1 | sort

using sys-apps/coreutils-8.30 at a stable hardened Gentoo Linux, but it is "a". 
Is this a bug or a feature?


This is just a matter of your locale (e.g. "de_DE.UTF8"?),
which sorts letters without regard to case.

If you force C locale you'll get "B" first:

  $ echo a B c d | xargs -n 1 | LC_ALL=C sort
  B
  a
  c
  d
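
If case-insensitive ordering is what you want regardless of the locale,
"sort -f" (--ignore-case) in the C locale is one way to get it. A short
sketch using the same input:

```shell
# C locale: plain byte order, uppercase sorts before lowercase
echo a B c d | xargs -n 1 | LC_ALL=C sort

# C locale with -f: lowercase is folded to uppercase before comparing
echo a B c d | xargs -n 1 | LC_ALL=C sort -f
```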


Adding "--debug" will show a warning and help diagnose such issues in 
the future:


  $ sort --debug
  sort: using ‘ca_EN.utf8’ sorting rules
  ...
  ...

As such, I'm closing this as not-a-bug, but discussion can continue by 
replying to this thread.


-assaf

bug#34825: New fails in tests/{misc,cp} in v8.31 on OpenIndiana

2019-04-16 Thread Assaf Gordon


tags 34825 fixed
close 34825
stop

Hello,

On 2019-04-10 5:05 a.m., Michal Nowak wrote:


the patch worked on OpenIndiana as well.



Thanks for confirming, I'm closing this bug.

regards,
 -assaf

bug#35289: closed (Re: bug#35289: date+%-Y -d "- N years" errors when N > 111)

2019-04-16 Thread Assaf Gordon


Hello,

On 2019-04-15 5:10 p.m., O. Emmerson wrote:

For me it gives:

$ ./inv-year
time() = 1555369320
localtime() = 2019-04-16 00:02:00
   (mday=16 wday=2, isdst=1)
struct tm (after adjustment) = 0009-04-16 00:02:00
   (mday=16 wday=2, isdst=1)
inv-year: mktime() failed: Value too large for defined data type

>

On 2019-04-15 6:50 p.m., C de-Avillez wrote:
[...]

root@u1904:~# gcc -o inv-year inv-year.c
root@u1904:~# ./inv-year
time() = 1555375408
localtime() = 2019-04-16 00:43:28
   (mday=16 wday=2, isdst=0)
struct tm (after adjustment) = 0009-04-16 00:43:28
(mday=16 wday=2, isdst=0)
mktime() after date adjustment = -61874061392

So: a pristine 19.04 runs it. My laptop (which is my work machine,
full of other packages & programs), does not.



Thank you both for testing.

So, to summarize:
whenever "inv-year" fails - it is a problem with glibc on your
setup, *not* a problem in coreutils' date(1) program.

If there is a setup where "inv-year" succeeds but date(1) still fails,
then it is a problem in coreutils.

I'm glad to hear latest Ubuntu 19.04 is working fine
(though the reason for the earlier failure is still a mystery).

As Paul suggested, trying 'strace' on the failing system
might reveal more details.

regards,
 - assaf

bug#35289: closed (Re: bug#35289: date+%-Y -d "- N years" errors when N > 111)

2019-04-15 Thread Assaf Gordon


Hello,

On 2019-04-15 11:55 a.m., C de-Avillez wrote:

19.04:


It is worth noting that Ubuntu 19.04 has not been officially released
yet, so you are testing on a development branch (or a release-candidate,
or a specially built infrastructure, as hinted by your path).


cerdea@piatam:/data/buildd/coreutils$ date +%-Y -d '- 2010 years'
date: invalid date ‘- 2010 years’
1 cerdea@piatam:/data/buildd/coreutils$ date --version
date (GNU coreutils) 8.30


[...]


On Mon, Apr 15, 2019 at 12:16 PM O. Emmerson  wrote:


$ file /bin/date
/bin/date: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for
GNU/Linux 3.2.0, BuildID[sha1]=26fa7f6c43c354d8c5647ebf946255a2b8e3c53d,
stripped


I just downloaded a recent daily snapshot of
ubuntu desktop live CD for amd64 from 
http://cdimage.ubuntu.com/daily-live/current/ .

The file "disco-desktop-amd64.iso" dated 2019-04-13 22:28 size 1.9GB,
with the following checksum:

  $ sha1sum disco-desktop-amd64.iso
  b89fb143b51e17482a3882abe2f5f4e3b69942fe  disco-desktop-amd64.iso

Booting with QEMU as live-cd, I tested the same command and got "9" as
the (correct) result. So this can't be easily reproduced.

An interesting benefit of reproducible builds is that I see on the
live-cd image the sha1 checksum of "/bin/date" is the same as you listed
above. This hints to me the problem is somewhere else in your setup.

As this is not an official release, we really can't support it.
You'll have to dig further and see what is the issue.

A good starting point is adding the "--debug" option to date(1)
and examining its output.

regards,
 - assaf

bug#35109: date 'tomorrow' bug

2019-04-02 Thread Assaf Gordon


tags 35109 notabug
close 35109
stop

Hello,

On 2019-04-02 7:23 a.m., Maximilian Gleißner wrote:

I have encountered a possible bug with the date function using both SuSE
LEAP 15.0 and SuSE 10.2.
This bug occurs when asking date for 'tomorrow' when there is a daylight
saving timechange.


This is not a bug, just a usage issue.


Note: The machine is located in the GMT+1 timezone, and daylight savings
time changed on 31.03.2019 02:00 jumping to 03:00


Exactly - and 'date' adjusts the time accordingly by adding
an hour when the DST transition is crossed
(technically it's not date(1) but glibc, if that matters).


To replicate the bug:
date -s "2019-03-30 23:XX"  # where XX is any valid minute, e.g. 23:35
date -d 'tomorrow'
# expected output: 2019-03-31 23:XX
# actual output:   2019-04-01 00:XX


Note that 'date' printed one more critical piece of information:

   $ date
   Sat Mar 30 23:10:41 GMT 2019

   $ date -d tomorrow
   Mon Apr  1 00:10:43 BST 2019

The timezone shifted from GMT to BST - and the time was adjusted 
accordingly by adding an hour, and crossing into April 1st.


Similarly, if you waited 5 hours from 2019-03-30 23:35
it would be 5am, not 4am - and date needs to account for that:

$ date
Sat Mar 30 23:18:47 GMT 2019
$ date -d "+5 hours"
Sun Mar 31 05:18:49 BST 2019



I am aware you recommend not using local timezones and daylight savings
time, but I still think this should/could be implemented better.



The GNU coreutils team does not recommend such a thing at all.
In fact, team member Prof. Paul Eggert is the editor and maintainer of the
Time Zone database ( https://en.wikipedia.org/wiki/Tz_database ) which
is used by almost every operating system and many programming languages
( https://en.wikipedia.org/wiki/Tz_database#Use_in_software_systems ).

There is a strong recommendation however, to specify "noon" (12pm)
whenever doing date arithmetic, precisely to avoid DST issues.
See:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-date-command-is-not-working-right_002e

  $ date
  Sat Mar 30 23:24:08 GMT 2019

  $ date -d "12pm tomorrow"
  Sun Mar 31 12:00:00 BST 2019
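
The same effect can be looked at without touching the system clock by
giving 'date' a fixed starting point. A sketch, assuming the hand-written
POSIX TZ string below is an adequate stand-in for UK time in 2019 (it is
spelled out so the example does not depend on the installed tz database):

```shell
uk='GMT0BST,M3.5.0/1,M10.5.0'   # assumed POSIX-style rule for GMT/BST

# five elapsed hours across the spring-forward gap:
# 23:35 GMT plus 5 hours is 04:35 UTC, shown as 05:35 BST
start=$(TZ=$uk date -d '2019-03-30 23:35' +%s)
after5h=$(TZ=$uk date -d "@$((start + 5 * 3600))" '+%Y-%m-%d %H:%M:%S %Z')
echo "$after5h"

# anchoring at noon keeps "tomorrow" well away from the transition
noon=$(TZ=$uk date -d '2019-03-30 12:00 tomorrow' '+%Y-%m-%d %H:%M:%S %Z')
echo "$noon"
```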

On the other hand, it is the European Union that wants to do away
with daylight saving time:
https://www.bbc.com/news/world-europe-45366390

To learn more about the inner-working of GNU date
and similar issues with DST, please see past discussions here:
  https://bugs.gnu.org/8357
  https://bugs.gnu.org/11101
  https://bugs.gnu.org/18159
  https://bugs.gnu.org/30795


As such, I'm marking this as "not a bug", but discussion can continue by
replying to this thread.

regards,
 - assaf

bug#34488: Add sort --limit, or document workarounds for sort|head error messages

2019-03-28 Thread Assaf Gordon


tags 34488 fixed
close 34488
stop

Hello,

The original request of "sort --limit" resulted in
an improved "env" with new options,
which was included in the recent version 8.31.

I'm therefore closing this item.

-assaf

bug#34700: rm refuses to remove files owned by the user, even in force mode

2019-03-28 Thread Assaf Gordon


tags 34700 notabug
severity 34700 wishlist
retitle 34700 rm: add new --force option to deal with read-only directories
stop

Hello,
As explained by several people in this thread,
this is not a bug in "rm -f", but the mandated behavior.

Bob and others provided work-arounds ( https://bugs.gnu.org/34700#17 ).

As for adding a new "--really-force" option
(https://bugs.gnu.org/34700#11) - I'm marking this as a wish-list
item.

-assaf

bug#34825: New fails in tests/{misc,cp} in v8.31 on OpenIndiana

2019-03-28 Thread Assaf Gordon


retitle 34825 OpenIndiana: tests/{misc,cp} fail in v8.31
stop

Hello,

On 2019-03-11 1:10 a.m., Michal Nowak wrote:
on OpenIndiana 2018.10 (illumos kernel) the test suite has three new 
fails in v8.31 (amd64) compared to v8.30:


FAIL tests/misc/timeout-parameters.sh (exit status: 1)
FAIL tests/cp/no-deref-link1.sh (exit status: 1)
FAIL tests/cp/no-deref-link2.sh (exit status: 1)


The two 'ln' related bugs might be the same as
this item: https://bugs.gnu.org/34894

with the fix committed here:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=3e0dff3925b5e521cae468087950e85b60002d1c

Can you check whether it solved the issue on OpenIndiana as well?

-assaf

bug#34894: Another solaris 10 ln issue on 8.31

2019-03-28 Thread Assaf Gordon


tags 34894 fixed
close 34894
stop

On 2019-03-17 3:17 p.m., John Marino wrote:

On 3/17/2019 15:28, Paul Eggert wrote:

John Marino wrote:

After applying the recent patch to 8.31 ln to fix functionality on
solaris 10, I saw some improvement but I think there's something else
wrong.


Thanks. Could you please try the attached patch, which I installed on
master?


Installed here:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=3e0dff3925b5e521cae468087950e85b60002d1c


Okay, that seems to fix the regression on ln.


Thanks for confirming, I'm closing this bug.

-assaf

bug#34923: Message/race bug in 'dd'

2019-03-28 Thread Assaf Gordon


tags 34923 notabug
severity 34923 wishlist
retitle 34923 dd: add messages about IO errors
stop

On 2019-03-19 9:44 p.m., Paul Eggert wrote:

Daniel A. Gauthier wrote:

NOTICE that the "+nn" value on the line is always one off.  It says +0
after the first error, +1 after the second, etc. until the correct count
of error/short blocks is given at the end.


The count is supposed to just count short blocks, not errors. This is a 
POSIX requirement. I installed the attached documentation patch to try 
to make this clearer.


The documentation improvement was added here:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=59e01d13e600be0d2c7f08f3aff8cf11936b3ea1
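
The "+N" part is easy to see with a read that is shorter than the block
size. A small sketch (the three-byte input is arbitrary):

```shell
# 3 bytes read with a 512-byte block size:
# no full blocks, one partial (short) block
stats=$(printf 'abc' | LC_ALL=C dd bs=512 of=/dev/null 2>&1)
echo "$stats"
```

The first line of the statistics should read "0+1 records in": zero full
blocks plus one short block, with no error involved.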


Perhaps dd should output a separate line to stderr that summaries I/O 
errors; it could do that without violating POSIX.


Marking this as a wishlist item.

-assaf

bug#34968: (no subject)

2019-03-28 Thread Assaf Gordon


tags 34968 notabug
close 34968
stop

On 2019-03-24 9:12 a.m., Bernhard Voelker wrote:

I don't know Dutch, but this looks to me like the regular output of "sha256sum 
--help"
from an older version of coreutils (<8.25, because the --ignore-missing option
is not yet there).  What is wrong with it?


With no further replies, I'm closing this bug.

-assaf

bug#34988: mv: check before asking users useless questions

2019-03-28 Thread Assaf Gordon


tags 34988 notabug
severity 34988 wishlist
retitle 34988 mv: omit useless 'overwrite?' question
stop

On 2019-03-25 4:47 p.m., Paul Eggert wrote:

On 3/24/19 11:05 PM, 積丹尼 Dan Jacobson wrote:

$ mv a b
mv: overwrite 'b'? y
mv: cannot overwrite non-directory 'b' with directory 'a'

User thinks well why didn't you check before uselessly asking me?


POSIX requires the useless question. That being said, the question could
be omitted in the case you describe, if POSIXLY_CORRECT is not specified.



Marking this as a wishlist item.

-assaf

bug#34905: uname: -i/-p returns "unknown"

2019-03-28 Thread Assaf Gordon


tags 34905 moreinfo
retitle 34905 uname: -i/-p returns "unknown"
stop

On 2019-03-19 9:48 p.m., Paul Eggert wrote:

Wellington Almeida wrote:

When using the -p and -i functions in the uname command I noticed that
it returned an unknown result, can this be a bug?


It could be a bug in the uname command, but more likely it's a kernel 
bug. Try running "strace uname -pi".
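
For scripts that only need the hardware name, "uname -m" is the portable
choice; -p and -i are non-portable and may print "unknown". A tiny sketch:

```shell
# -m prints the machine hardware name (for example x86_64 or aarch64)
machine=$(uname -m)
echo "machine: $machine"
```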

bug#35032: date ISO 8601 / RFC 3339 formats

2019-03-28 Thread Assaf Gordon


severity 35032 wishlist
retitle 35032 date: adjust rfc8601/3339 formats to W3C standard
stop

On 2019-03-28 11:20 a.m., Nicolas Mailhot wrote:
Would it be possible to make them both optional in --rfc-3339, and both 
mandatory in --iso-8601 ? Or add a --w3c option that conforms to the W3C 
profile? This is all so sad… Some languages like Go do no understand 
neither of date's output, because they follow the W3C profile.


I'm marking this as a "wishlist" item.
For reference, here are previous similar requests:

https://bugs.gnu.org/6132 - date: --rfc-3339=TIMESPEC option doesn't print 'T'

https://bugs.gnu.org/6453 - date -- Add new options for ISO 8601 date formats (-O)

https://bugs.gnu.org/14097 - date: add parsing support for ISO 8601 basic format
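
For comparison, the two built-in formats and a hand-written format string
can be printed side by side. A sketch using a fixed timestamp in UTC; the
custom format is only one guess at what a stricter consumer might accept:

```shell
# fixed instant (the epoch) in UTC, so the output does not depend on the clock
date -u -d @0 --iso-8601=seconds
date -u -d @0 --rfc-3339=seconds

# explicit format: 'T' separator and a literal 'Z' for UTC
stamp=$(date -u -d @0 '+%Y-%m-%dT%H:%M:%SZ')
echo "$stamp"
```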


-assaf

bug#33646: [PATCH] doc: improve wording of the --kibibytes option description

2019-03-15 Thread Assaf Gordon


tags 33646 fixed
close 33646
stop

On 2019-03-15 8:38 a.m., Kamil Dudka wrote:

Bug: https://bugzilla.redhat.com/1527391
---

   doc/coreutils.texi | 8 +---


I can see no more comments on this.  Could you please proceed to push it?



Thanks for the reminder.

Pushed here:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=6bd78f27fdc2df89b1219921c6f5735885f15e37

-assaf

bug#34488: Add sort --limit, or document workarounds for sort|head error messages

2019-02-25 Thread Assaf Gordon


Hello,

Thanks for all comments.

On 2019-02-24 11:33 a.m., Paul Eggert wrote:
Thanks for doing all that. Although Pádraig is not enthusiastic about a 
shortcut like -p, I'm a bit warmer to it, as it's an important special 
case to fix a wart in POSIX. No big deal either way.


For now I kept "-p"; it can be removed later, of course.
The first patch includes Pádraig's recent suggestions (slightly modified).


The documentation should mention that SIGCHLD is special [...]


The documentation should say what happens if mutually-contradictory 
options are specified, [...]


The documentation should echo this suggestion in 
<http://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html>: 


I've added those, and I welcome any suggested improvements to
grammar/phrasing/etc.


> There should be options --block-signal[=SIG], --unblock-signal[=SIG],
> and --setmask-signal[=SIG] that affect the signal mask, which is also
> inherited by the child. These can be implemented via pthread_sigmask.

The second patch adds these new options (separated to ease review).
As for documentation - I'm not sure what to add beyond the basic
option description. When should these be used?

A third small patch adds "env --list-signal-actions" and
"env --list-blocked-signals" - to ease diagnostics.
Might be worth adding for completeness (e.g., for users who
need to somehow know if SIGPIPE is being ignored by the shell
or not):

$ ( trap '' PIPE && src/env --list-signal-actions )
PIPE   (13): ignore
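
The underlying situation can be reproduced in a few lines without the new
options. A sketch, assuming 'sh' follows the POSIX rule that signals
ignored on entry stay ignored; the temporary file is arbitrary:

```shell
errlog=$(mktemp)

# SIGPIPE is ignored in the subshell and inherited by 'sh' and 'seq';
# when 'head' exits, 'seq' gets EPIPE from write() instead of being killed,
# and reports it on stderr.
out=$( ( trap '' PIPE; LC_ALL=C sh -c 'seq inf | head -n1' ) 2> "$errlog" )

echo "stdout: $out"
echo "stderr: $(cat "$errlog")"
```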

Comments are very welcome,
 - assaf



From 02cba657e2f63c05f859daf18a7d1032fdc32c6f Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Fri, 15 Feb 2019 12:31:48 -0700
Subject: [PATCH 1/3] env: new options
 -p/--default-signal=SIG/--ignore-signal=SIG

New options to set signal handlers to default (SIG_DFL) or ignore
(SIG_IGN). This is useful to overcome the POSIX limitation that a shell must
not override inherited signal state, e.g. the second 'trap' here is
a no-op:

   trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1'

Instead use:

   trap '' PIPE && sh -c 'env -p seq inf | head -n1'

Similarly, the following will prevent CTRL-C from terminating the
program:

   env --ignore-signal=INT seq inf > /dev/null

See https://bugs.gnu.org/34488#8 .

* NEWS: Mention new options.
* doc/coreutils.texi (env invocation): Document new options.
* man/env.x: Add example of --default-signal=SIG usage.
* src/env.c (signals): New global variable.
(shortopts,longopts): Add new options.
(usage): Print new options.
(parse_signal_params): Parse comma-separated list of signals, store in
signals variable.
(reset_signal_handlers): Set each signal to SIG_DFL/SIG_IGN.
(main): Process new options.
* src/local.mk (src_env_SOURCES): Add operand2sig.c.
* tests/misc/env-signal-handler.sh: New test.
* tests/local.mk (all_tests): Add new test.
---
 NEWS |   3 +
 doc/coreutils.texi   |  58 
 man/env.x|  69 ++
 src/env.c| 138 +++-
 src/local.mk |   1 +
 tests/local.mk   |   1 +
 tests/misc/env-signal-handler.sh | 146 +++
 7 files changed, 415 insertions(+), 1 deletion(-)
 create mode 100755 tests/misc/env-signal-handler.sh

diff --git a/NEWS b/NEWS
index e73cb52b8..ddbbaf138 100644
--- a/NEWS
+++ b/NEWS
@@ -81,6 +81,9 @@ GNU coreutils NEWS-*- outline -*-
   test now supports the '-N FILE' unary operator (like e.g. bash) to check
   whether FILE exists and has been modified since it was last read.
 
+  env now supports '--default-signal[=SIG]' and '--ignore-signal[=SIG]'
+  options to set signal handlers before executing a program.
+
 ** New commands
 
   basenc is added to complement existing base64,base32 commands,
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index eb1848882..c2c202b28 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -17246,6 +17246,64 @@ chroot /chroot env --chdir=/srv true
 env --chdir=/build FOO=bar timeout 5 true
 @end example
 
+@item --default-signal[=@var{sig}]
+Reset signal @var{sig} to its default signal handler. Without @var{sig} all
+known signals are reset to their defaults. Multiple signals can be
+comma-separated. The following command runs @command{seq} with SIGINT and
+SIGPIPE set to their default (which is to terminate the program):
+
+@example
+env --default-signal=PIPE,INT seq 1000 | head -n1
+@end example
+
+In the following example:
+
+@example
+trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1'
+@end example
+
+The first trap command sets SIGPIPE to ignore.  The second trap command
+ostensibly sets it back to its default, but POSIX mandates that the shell
+must not change inherited state of the signal - so it is a no-op.
+
+Using @option{--default-signal=PI

bug#33468: A bug with yes and --help

2019-02-19 Thread Assaf Gordon


Hello,

On 2019-02-19 1:24 a.m., Bernhard Voelker wrote:

On 2/18/19 11:20 AM, Assaf Gordon wrote:

[...] what do you think ?


To Eric's suggestion, I'd remove the RESET_OPTIND function argument,
because it's never used.


+1


Re. OPTIND: what about resetting the values of all involved externals
to their previous value?

+  int saved_optind = optind;
...
+  /* Restore previous values.  */
+  optind = saved_optind;



I believe restoring optind is incorrect here - did the tests pass
after this change ?

For example, I get the following:
---
$ ./src/dd --
./src/dd: unrecognized operand ‘--’
Try './src/dd --help' for more information.

$ ./src/nohup --
./src/nohup: ignoring input and appending output to 'nohup.out'
./src/nohup: failed to run command '--': No such file or directory

$ ./src/yes -- | head -n1
--

---

All these programs expect 'optind' to point to the first non-option
argument (because they all called "getopt_long" directly before your
patch, and parse_gnu_standard_options_only() now calls getopt_long()
for them, indirectly).

So restoring it to its initial value of 1 is going to confuse the
programs when they look into argv[optind] .
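
As a loose analogy only (this is shell, not the C code under review):
the shell's getopts builtin follows the same convention, with OPTIND
playing the role of optind. The function name first_operand and the
option letters are invented for this sketch.

```shell
# Print the first non-option argument, the way the C programs above
# look at argv[optind] once option parsing is finished.
first_operand() {
    OPTIND=1                    # start a fresh scan
    while getopts 'ab' opt; do
        :                       # a real program would act on "$opt" here
    done
    shift $((OPTIND - 1))       # drop the parsed options (and "--", if any)
    printf '%s\n' "${1-}"
}

first_operand -a -- file
```

If the index were put back to 1 before the shift, "$1" would still be
"-a" instead of the operand, which is the kind of confusion described
above.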

Unless I got confused (it's rather late here).
I'll double-check in the morning.

regards,
 - assaf

bug#33468: A bug with yes and --help

2019-02-18 Thread Assaf Gordon


Hello,

On 2019-02-15 1:19 p.m., Eric Blake wrote:

On 2/15/19 12:32 PM, Assaf Gordon wrote:

There is at least one change in behavior, not sure if this is
bad enough to be a regression or doesn't really matter:

   $ yes-OLD me -- --help | head -n1
   me -- --help

   $ yes-NEW me -- --help | head -n1
   me --help


I would argue bug-fix.

[...]

So, I would suspect (although I have not yet tested) that as patched, you
would get:

$ yes-NEW me -- --help | head -n1
me --help
$ POSIXLY_CORRECT=1 yes-NEW me -- --help | head -n1
me -- --help
$ yes-NEW -- me -- --help
me -- --help


Indeed - that's how it behaves with the patch.

Thanks for explaining.


In the gnulib patch:
s/optional/option/



In the coreutils patch:
s/non-options/non-option/


Attached updates with your suggested fixes.



Also, all coreutils callers pass reset_optind==false; does the gnulib
interface still need to provide a reset_optind parameter, given that
setting the parameter true forces reliance on the getopt-gnu module as
currently coded?


The "getopt-gnu" was already a dependency before this patch,
not sure if removing this parameter will save much hassle - what do you
think ?

-assaf


>From 08d0505683cebed0fc10cff082255fd79da2d989 Mon Sep 17 00:00:00 2001
From: Bernhard Voelker 
Date: Thu, 29 Nov 2018 09:06:26 +0100
Subject: [PATCH] long-options: add parse_gnu_standard_options_only

Discussed in https://bugs.gnu.org/33468 .

* lib/long-options.c (parse_long_options): Use EXIT_SUCCESS instead
of 0.
(parse_gnu_standard_options_only): Add function to
process the GNU default options --help and --version and fail for any other
unknown long or short option. See
https://gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html .
* lib/long-options.h (parse_gnu_standard_options_only): Declare it.
* modules/long-options (depends-on): Add stdbool, exitfail.
* top/maint.mk (sc_prohibit_long_options_without_use): Update
syntax-check rule, add new function name.
---
 lib/long-options.c   | 68 +++-
 lib/long-options.h   | 17 +
 modules/long-options |  2 ++
 top/maint.mk |  2 +-
 4 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/lib/long-options.c b/lib/long-options.c
index 037f74b3a..b7acdb040 100644
--- a/lib/long-options.c
+++ b/lib/long-options.c
@@ -29,6 +29,7 @@
 #include 
 
 #include "version-etc.h"
+#include "exitfail.h"
 
 static struct option const long_options[] =
 {
@@ -71,7 +72,7 @@ parse_long_options (int argc,
 va_list authors;
 va_start (authors, usage_func);
 version_etc_va (stdout, command_name, package, version, authors);
-exit (0);
+exit (EXIT_SUCCESS);
   }
 
 default:
@@ -87,3 +88,68 @@ parse_long_options (int argc,
  the probably-new parameters when/if getopt is called later.  */
   optind = 0;
 }
+
+/* Process the GNU default long options --help and --version (see also
+   https://gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html),
+   and fail for any other unknown long or short option.
+   Use with SCAN_ALL=true to scan until "--", or with SCAN_ALL=false to stop
+   at the first non-option argument (or "--", whichever comes first).
+
+   if RESET_OPTIND=true, the global optind variable will be reset to zero,
+   preparing (and requiring) a follow-up gnu-compatible getopt() call
+   (non-gnu getopt functions use optreset=optind=1 instead of 0 for reset).
+
+   if RESET_OPTIND=false, optind is left as-is (suitable for programs
+   which do not process further option parameters (but could still
+   process parameters directly by examining argv[optind]).  */
+void
+parse_gnu_standard_options_only (int argc,
+ char **argv,
+ const char *command_name,
+ const char *package,
+ const char *version,
+ bool scan_all,
+ bool reset_optind,
+ void (*usage_func) (int),
+ /* const char *author1, ...*/ ...)
+{
+  int c;
+  int saved_opterr;
+
+  saved_opterr = opterr;
+
+  /* Print an error message for unrecognized options.  */
+  opterr = 1;
+
+  const char *optstring = scan_all ? "" : "+";
+
+  if ((c = getopt_long (argc, argv, optstring, long_options, NULL)) != -1)
+{
+  switch (c)
+{
+case 'h':
+  (*usage_func) (EXIT_SUCCESS);
+  break;
+
+case 'v':
+  {
+va_list authors;
+va_start (authors, usage_func);
+version_etc_va (stdout, command_name, package, version, authors);
+exit (EXIT_SUCCESS);
+  }
+
+default:
+  (*usage_func) (exit_failure);
+  break;
+}
+}
+
+  /* Restore pr

bug#34488: Add sort --limit, or document workarounds for sort|head error messages

2019-02-18 Thread Assaf Gordon


Hello,

Thanks for all comments (on and off list).
Attached an updated patch with documentation.

The supported options are:

 --default-signal[=SIG]  reset signal SIG to its default signal handler.
 without SIG, all known signals are included.
 multiple signals can be comma-separated.
 --ignore-signal[=SIG]   set signal SIG to be IGNORED.
 without SIG, all known signals are included.
 multiple signals can be comma-separated.
 -p  same as --default-signal=PIPE

(lower-case "-p" so as not to conflict with BSD, but of course it can be
changed to another letter).

The new 'env-signal-handler.sh' test passes on GNU/linux, non-gnu/linux
(alpine), and Free/Open/Net BSD.
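
A minimal usage sketch, assuming the installed 'env' already implements
--default-signal as described above (older or non-GNU versions will
not). The wrapper name with_default_sigpipe is hypothetical.

```shell
# Run a command with SIGPIPE forced back to its default action.
# Returns 2 when the installed env lacks --default-signal.
with_default_sigpipe() {
    if ! env --default-signal=PIPE true 2>/dev/null; then
        echo "env does not support --default-signal" >&2
        return 2
    fi
    env --default-signal=PIPE "$@"
}

# Even with SIGPIPE ignored in the calling shell, seq should be
# terminated by the signal instead of reporting a write error.
( trap '' PIPE
  with_default_sigpipe seq 100000 | head -n1 )
```

The probe with 'true' keeps the failure mode explicit instead of
silently running the command with the inherited signal state.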

Comments very welcomed,
 - assaf

>From 3542f1762c9f14e2275fe5e61d5d7f6275b420a9 Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Fri, 15 Feb 2019 12:31:48 -0700
Subject: [PATCH] env: new options -p/--default-signal=SIG/--ignore-signal=SIG

New options to set signal handlers to default (SIG_DFL) or ignore
(SIG_IGN). This is useful to overcome POSIX limitation that shell must
not override inherited signal state, e.g. the second 'trap' here is
a no-op:

   trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1'

Instead use:

   trap '' PIPE && sh -c 'env -p seq inf | head -n1'

Similarly, the following will prevent CTRL-C from terminating the
program:

   env --ignore-signal=INT seq inf > /dev/null

See https://bugs.gnu.org/34488#8 .

* NEWS: Mention new options.
* doc/coreutils.texi (env invocation): Document new options.
* man/env.x: Add example of --default-signal=SIG usage.
* src/env.c (signals): New global variable.
(shortopts,longopts): Add new options.
(usage): Print new options.
(parse_signal_params): Parse comma-separated list of signals, store in
signals variable.
(reset_signal_handlers): Set each signal to SIG_DFL/SIG_IGN.
(main): Process new options.
* src/local.mk (src_env_SOURCES): Add operand2sig.c.
* tests/misc/env-signal-handler.sh: New test.
* tests/local.mk (all_tests): Add new test.
---
 NEWS |   3 +
 doc/coreutils.texi   |  43 
 man/env.x|  35 ++
 src/env.c| 127 +-
 src/local.mk |   1 +
 tests/local.mk   |   1 +
 tests/misc/env-signal-handler.sh | 146 +++
 7 files changed, 355 insertions(+), 1 deletion(-)
 create mode 100755 tests/misc/env-signal-handler.sh

diff --git a/NEWS b/NEWS
index fdde47593..5a8e8a3de 100644
--- a/NEWS
+++ b/NEWS
@@ -67,6 +67,9 @@ GNU coreutils NEWS-*- outline -*-
   test now supports the '-N FILE' unary operator (like e.g. bash) to check
   whether FILE exists and has been modified since it was last read.
 
+  env now supports '--default-signal[=SIG]' and '--ignore-signal[=SIG]'
+  options to set signal handlers before executing a program.
+
 ** New commands
 
   basenc is added to complement existing base64,base32 commands,
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index be35de490..57b209e07 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -17227,6 +17227,49 @@ chroot /chroot env --chdir=/srv true
 env --chdir=/build FOO=bar timeout 5 true
 @end example
 
+@item --default-signal[=@var{sig}]
+Reset signal @var{sig} to its default signal handler. Without @var{sig} all
+known signals are reset to their defaults. Multiple signals can be
+comma-separated. The following command runs @command{seq} with SIGINT and
+SIGPIPE set to their default (which is to terminate the program):
+
+@example
+env --default-signal=PIPE,INT seq 1000 | head -n1
+@end example
+
+In the following example:
+
+@example
+trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1'
+@end example
+
+The first trap command sets SIGPIPE to ignore.  The second trap command
+ostensibly sets it back to its default, but POSIX mandates that the shell
+must not change inherited state of the signal - so it is a no-op.
+
+@option{--default-signal=PIPE} (or its shortcut @option{-p}) can be
+used to force the signal to its default behavior:
+
+@example
+trap '' PIPE && sh -c 'env -p seq inf | head -n1'
+@end example
+
+
+@item --ignore-signal[=@var{sig}]
+Ignore signal @var{sig} when running a program. Without @var{sig} all
+known signals are set to ignore. Multiple signals can be
+comma-separated. The following command runs @command{seq} with SIGINT set
+to be ignored - pressing @kbd{Ctrl-C} will not terminate it:
+
+@example
+env --ignore-signal=INT seq inf > /dev/null
+@end example
+
+
+@item -p
+Equivalent to @option{--default-signal=PIPE} - sets SIGPIPE to its default
+behavior (terminate a program upon SIGPIPE).
+
 @item -v
 @itemx --debug
 @opindex -v
diff --git a/man/env.x b/man/env.x
index 8ee

bug#34488: Add sort --limit, or document workarounds for sort|head error messages

2019-02-17 Thread Assaf Gordon


Hello,

On 2019-02-17 1:12 p.m., Paul Eggert wrote:

Assaf Gordon wrote:

I don't mind either way (env feature or new program).


This should be a new feature of 'nohup' not 'env', as 'nohup' is already 
about signal handling.  I don't see a need for a new program.


With 'nohup' I don't think there will be an easy (or at least intuitive)
way to 'untrap' SIGPIPE without affecting the output: STDOUT will be
redirected to 'nohup.out' automatically (unless we add more options like
"--no-redirect").


Example:

env -C /foo/bar PROGRAM            ## only change directory
env --default-signal=PIPE PROGRAM  ## only untrap SIGPIPE
env -i PROGRAM                     ## only empty environment

but

nohup --default-signal=PIPE PROGRAM

Will untrap SIGPIPE *and* SIGHUP *and* redirect stdout to a file.
So we'll need to add:

  nohup --no-redirect-stdout --default-signal=PIPE PROGRAM

Also,
nohup's manual page warns:
   "NOTE:  your  shell  may  have  its own version of nohup, which
usually supersedes the version described here.  Please refer to
your shell's documentation for details about the options it
supports."

And if there is a built-in "nohup", it will confuse users who want to
use our new feature (and then more support questions, and we have to
explain how to use "env nohup" or "\nohup").

What do you think?

-assaf

bug#34488: Add sort --limit, or document workarounds for sort|head error messages

2019-02-17 Thread Assaf Gordon


On 2019-02-16 4:56 p.m., Bernhard Voelker wrote:

On 2/15/19 10:40 PM, Assaf Gordon wrote:

$ seq  | env --default-signal PIPE sort -n | sed 5q | wc -l



  src/env.c| 90 +++-


That's quite a lot of new code.

What about a new program ... quick shot (and maybe an unlucky name): 'trap' ?



I don't mind either way (env feature or new program).

"trap" will get mixed-up with the shell's built-in command.
How about "untrap" (because the goal is to undo the 'trap' command),
and also there's no "untrap" executable name in debian, so no name 
conflicts?


will send an updated patch later today.

-assaf

bug#34488: Add sort --limit, or document workarounds for sort|head error messages

2019-02-15 Thread Assaf Gordon


Hello,

On 2019-02-15 8:20 a.m., Eric Blake wrote:

On 2/15/19 8:43 AM, 積丹尼 Dan Jacobson wrote:

sort: write failed: 'standard output': Broken pipe
sort: write error

[...]

Perhaps coreutils should teach 'env' a command-line option to forcefully
reset SIGPIPE back to default behavior [...]   If we
did that, then even if your sh is started with SIGPIPE ignored (so that
the shell itself can't restore default behavior), you could do this
theoretical invocation:

$ seq  | env --default-signal PIPE sort -n | sed 5q | wc -l
5


That is a nice idea, I could've used it myself a couple of times.

Attached a suggested patch.
If this seems like a good direction, I'll complete it with NEWS/docs/etc.

Usage is:
env --default-signal=PIPE
env -P ##shortcut to reset SIGPIPE
env --default-signal=PIPE,INT,FOO


This also works nicely with the recent 'env -S' option,
so a script like so can always start with default SIGPIPE handler:

#!/usr/bin/env -S -P sh
seq inf | head -n1



comments welcomed,
 - assaf

>From d65ddf38cd5cf60ba6fc4f1bf60f7324a3e6bebd Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Fri, 15 Feb 2019 12:31:48 -0700
Subject: [PATCH] env: new option -D/--default-signal=SIG [FIXME]

See https://bugs.gnu.org/34488#8 .
---
 src/env.c| 90 +++-
 src/local.mk |  1 +
 tests/local.mk   |  1 +
 tests/misc/env-signal-handler.sh | 68 ++
 4 files changed, 159 insertions(+), 1 deletion(-)
 create mode 100755 tests/misc/env-signal-handler.sh

diff --git a/src/env.c b/src/env.c
index 3a1a3869e..ebda91589 100644
--- a/src/env.c
+++ b/src/env.c
@@ -21,12 +21,16 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include "system.h"
 #include "die.h"
 #include "error.h"
+#include "operand2sig.h"
 #include "quote.h"
+#include "sig2str.h"
 
 /* The official name of this program (e.g., no 'g' prefix).  */
 #define PROGRAM_NAME "env"
@@ -48,7 +52,15 @@ static bool dev_debug;
 static char *varname;
 static size_t vnlen;
 
-static char const shortopts[] = "+C:iS:u:v0 \t";
+/* if true, at least one signal handler should be reset.  */
+static bool reset_signals ;
+
+/* if element [SIGNUM] is true, the signal's handler should be reset
+   to its default. */
+static bool signal_handlers[SIGNUM_BOUND];
+
+
+static char const shortopts[] = "+C:iPS:u:v0 \t";
 
 static struct option const longopts[] =
 {
@@ -56,6 +68,7 @@ static struct option const longopts[] =
   {"null", no_argument, NULL, '0'},
   {"unset", required_argument, NULL, 'u'},
   {"chdir", required_argument, NULL, 'C'},
+  {"default-signal", optional_argument, NULL, 'P'},
   {"debug", no_argument, NULL, 'v'},
   {"split-string", required_argument, NULL, 'S'},
   {GETOPT_HELP_OPTION_DECL},
@@ -88,8 +101,17 @@ Set each NAME to VALUE in the environment and run COMMAND.\n\
   -C, --chdir=DIR  change working directory to DIR\n\
 "), stdout);
   fputs (_("\
+  --default-signal=SIG  reset signal SIG to its default signal handler.\n\
+multiple signals can be comma-separated.\n\
+"), stdout);
+  fputs (_("\
   -S, --split-string=S  process and split S into separate arguments;\n\
 used to pass multiple arguments on shebang lines\n\
+"), stdout);
+  fputs (_("\
+  -P   same as --default-signal=PIPE\n\
+"), stdout);
+  fputs (_("\
   -v, --debug  print verbose information for each processing step\n\
 "), stdout);
   fputs (HELP_OPTION_DESCRIPTION, stdout);
@@ -525,6 +547,63 @@ parse_split_string (const char* str, int /*out*/ *orig_optind,
   *orig_optind = 0; /* tell getopt to restart from first argument */
 }
 
+static void
+parse_signal_params (const char* optarg)
+{
+  char signame[SIG2STR_MAX];
+  char *opt_sig;
+  char *optarg_writable = xstrdup (optarg);
+
+  opt_sig = strtok (optarg_writable, ",");
+  while (opt_sig)
+{
+  int signum = operand2sig (opt_sig, signame);
+  if (signum < 0)
+usage (EXIT_FAILURE);
+
+  signal_handlers[signum] = true;
+
+  opt_sig = strtok (NULL, ",");
+}
+
+  free (optarg_writable);
+}
+
+static void
+reset_signal_handlers (void)
+{
+
+  if (!reset_signals)
+return;
+
+  if (dev_debug)
+  devmsg ("Resetting signal handlers:\n");
+
+  for (int i=0; ihttps://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ seq
+trap_sigpipe_or_skip_
+
+# Paraphrasing http://bugs.gnu.org/34488#8:
+# POSIX requires that sh started with an inherited ignored SIGPIPE must
+# silently ignore all attempts from within the shell to restore SIGPIPE
+# handling to child pr

bug#33468: A bug with yes and --help

2019-02-15 Thread Assaf Gordon


Hello Eric and all,


Thanks for the quick and detailed review.
I've amended all the issues you mentioned.

On 2019-02-13 8:20 p.m., Eric Blake wrote:

  15 files changed, 46 insertions(+), 141 deletions(-)


Nice diffstat.


These are of course Bernhard's improvements,
I just did the testing (and some minor things).


diff --git a/NEWS b/NEWS


Is "argument" better than "option" here?  Or, maybe:

now always process --help and --version options, regardless of any other
arguments present before any optional -- end-of-options marker.


I've used your phrasing, and also separated "nohup" from the rest
of the programs, as it accepts --help/--version only as its first
argument, not anywhere on the command line.

Attached updated patches, with tests.

comments welcomed,
 - assaf

 P.S.
There is at least one change in behavior, not sure if this is
bad enough to be a regression or doesn't really matter:

  $ yes-OLD me -- --help | head -n1
  me -- --help

  $ yes-NEW me -- --help | head -n1
  me --help



gnulib-0001-long-options-add-parse_gnu_standard_options_only.patch.gz
Description: application/gzip


0001-all-detect-help-and-version-more-consistently.patch.gz
Description: application/gzip

bug#34488: Add sort --limit, or document workarounds for sort|head error messages

2019-02-15 Thread Assaf Gordon


severity 34488 wishlist
retitle 34488 doc: sort: expand on "broken pipe" (SIGPIPE) behavior
stop

Hello,

On 2019-02-15 7:43 a.m., 積丹尼 Dan Jacobson wrote:

Things start out cheery, but quickly get ugly,

$ for i in 9 99 999  9; do seq $i|sort -n|sed 5q|wc -l; done
5
5
5
5
sort: write failed: 'standard output': Broken pipe
sort: write error
5
sort: write failed: 'standard output': Broken pipe
sort: write error

Therefore, kindly add a sort --limit=n,


I don't think this is wise, as "head -n5" does exactly that in a much more
generic way.


and/or on (info "(coreutils) sort invocation")
admit the problem, and give some workarounds, lest
our scripts occasionally spew error messages seemingly randomly,
just when the boss is looking.


Just to clarify: why do you think this is a "problem"?

This is the intended behavior of most proper programs:
Upon receiving SIGPIPE they should terminate with an error,
unless SIGPIPE is explicitly ignored.
The errors are not "random" - they happen because you explicitly
cut short the output of a program.

It is an important indication about how your pipe works,
and sort is not to blame, e.g.:

$ seq 10 | head -n1
1
seq: write error: Broken pipe

$ seq 100| cat | head -n1
1
cat: write error: Broken pipe
seq: write error: Broken pipe

This is a good indication that the entire output was not consumed,
and is very useful and important in some cases, e.g. when a program
crashes before consuming all input.

Here's a contrived example:

   $ seq 100 | sort -S 200 -T /foo/bar
   sort: cannot create temporary file in '/foo/bar': No such file or directory
   seq: write error: Broken pipe

I force "sort" to fail (limiting its memory usage and pointing it to
a non-existent temporary directory).
It is then good to know that seq's output was cut short and not consumed.
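
A script can also observe this programmatically. The sketch below
(the helper name producer_status is made up) reports the producer's
exit status when the consumer stops early. With SIGPIPE at its default,
common shells such as bash and dash report 128+13 = 141 for a producer
killed by the signal; with SIGPIPE ignored, the producer sees a write
error and exits with its own failure status.

```shell
# Run a producer command in a pipeline whose consumer stops after one
# line, and report the producer's exit status on stderr.
producer_status() {
    { "$@"; echo "producer exit status: $?" >&2; } | head -n1
}

producer_status seq 100000
```

Either way the status is non-zero, which is the useful signal that the
output was not fully consumed.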

If you know in advance you will trim the output of a program,
either hide the stderr with "2>/dev/null",
or use the shell's "trap PIPE" mechanism.


And no fair saying "just save the output" (could be big) "into a file
first, and do head(1) or sed(1) on that."


If you want to consume all input and just print the first 5 lines,
you can use "sed -n 1,5p" instead of "sed 5q" - no need for a temporary
file.
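
To make the difference concrete, a small sketch (the function name
head_drain is invented here): it prints the first N lines but keeps
reading to end of input, so the upstream command never writes to a
closed pipe.

```shell
# Print the first N lines of stdin while still reading the rest,
# so the upstream command never sees a broken pipe.
head_drain() {
    sed -n "1,${1}p"
}

seq 100000 | sort -n | head_drain 5
```

The trade-off is that the whole input is still produced and read,
whereas "sed 5q" stops the pipeline early.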


I'm marking this as a documentation "wishlist" item,
and patches are always welcomed.

regards,
 - assaf

bug#34475: Mention even more worries for test -a

2019-02-15 Thread Assaf Gordon


severity 34475 wishlist
retitle 34475 doc: test: expand on -a/-o usage
stop

Hello,

On 2019-02-13 6:00 p.m., 積丹尼 Dan Jacobson wrote:

First, on the test(1) man page, at

[...]> Say instead
[...]

I'm marking this as "wishlist" item, patches always welcomed.

-assaf

bug#34487: dd (coreutils) 8.30 – A written ISO image cannot not be booted from BIOS

2019-02-15 Thread Assaf Gordon


tags 34487 notabug
close 34487
stop

Hello,


On 2019-02-15 5:02 a.m., Ricky Tigg wrote:

Hi. An ISO image cannot not be booted from BIOS


[...]
# dd if=debian-9.7.0-amd64-DVD-3.iso of=/dev/sdc 


A CD/DVD image is not the same as a hard drive.
The internal structure differs,
and one can not be copied to the other and expected to work.

Since this is not a bug in dd, I'm closing this item.


For general help regarding operating system installation,
please contact the relevant operating system's mailing list.
In this case, likely Debian-user mailing list:
  https://lists.debian.org/debian-user/
See also https://wiki.debian.org/DebianMailingLists .

regards,
 -assaf

bug#33468: A bug with yes and --help

2019-02-13 Thread Assaf Gordon


Hello,

On 2019-02-12 7:00 p.m., Eric Blake wrote:

On 2/12/19 7:21 PM, Assaf Gordon wrote:


+  optind = 1;


Why are you doing this in every caller, instead of doing it just once
inside the body of parse_gnu_standard_options_only(), so that the state
is left unchanged at optind==1 if there were no options parsed?


That was just an ugly hack.

Here are more complete patches (both for gnulib and for coreutils).

All existing tests pass (including nohup's exit code)
but I did not yet write new tests for these improvements.

Comments welcomed.
 -assaf







0001-all-detect-help-and-version-more-consistently-FIXME.patch.gz
Description: application/gzip


gnulib-0001-long-options-add-parse_gnu_standard_options_only.patch.gz
Description: application/gzip

bug#33468: A bug with yes and --help

2019-02-12 Thread Assaf Gordon


Hello,

A follow-up and more details:

On 2019-01-12 11:30 a.m., Assaf Gordon wrote:

On 2019-01-12 8:42 a.m., Eric Blake wrote:

On 1/11/19 6:23 PM, Assaf Gordon wrote:


-  optind = 0;
+  optind = 1;


Ouch. You're hitting the portability problem of the difference between
BSD and glibc.


I only tested on Debian Stretch (with Debian GLIBC 2.24-11+deb9u3),
did not yet test on BSDs.

With "optind=1", I see the following:

===
   $ ./src/hostid
   ec68f06c

[...]

With "optind=0" I see the following:

===
   $ ./src/hostid
   ./src/hostid: extra operand ‘./src/hostid’
   Try './src/hostid --help' for more information.




Eric's suggestion was not wrong, "optind=0"
was already used (and worked just fine) in parse_long_options.

But there's a catch: after calling "parse_long_options"
(which sets optind=0), every program called "getopt_long"
again, and that call set optind to a non-zero value.

Bernhard's patch removed the (now unneeded) getopt_long call:
===
-  parse_long_options (argc, argv, PROGRAM_NAME, PACKAGE, Version,
-  usage, AUTHORS, (char const *) NULL);
-  if (getopt_long (argc, argv, "", long_options, NULL) != -1)
-usage (EXIT_FAILURE);
+  parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE, Version,
+                                   true, usage, AUTHORS, (char const *) NULL);

===

And so all these programs were left with "optind=0" when they checked for
non-option arguments, e.g.:


===
  if (optind < argc)
    {
      error (0, 0, _("extra operand %s"), quote (argv[optind]));
      usage (EXIT_FAILURE);
    }
===

which resulted in all the parsing errors I reported previously.


Perhaps "parse_gnu_standard_options_only" should use "_getopt_long_r"
and avoid the need to reset anything?



_getopt_long_r was ostensibly fine, but turned out to be messy:
when coreutils is built on glibc systems, all of gnulib's getopt
replacement modules are not used, and so _getopt_long_r is not
available.


As all the programs in this patch accept only --help and --version
(and non-option arguments), the attached ugly hack seems to solve the
issue.
There's probably a prettier way.

With this patch, the only issues left are nohup's exit code (1 instead 
of 125) and "dd --", see https://bugs.gnu.org/33468#29


regards,
 - assaf

>From eb4ed1a5417a2d50941181aa1d8e06b674c661a8 Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Tue, 12 Feb 2019 17:58:47 -0700
Subject: [PATCH] all: parse_gnu_standard_options_only fixup

---
 src/cksum.c| 1 +
 src/dd.c   | 1 +
 src/hostid.c   | 1 +
 src/hostname.c | 1 +
 src/link.c | 1 +
 src/logname.c  | 1 +
 src/nohup.c| 1 +
 src/sleep.c| 1 +
 src/tsort.c| 1 +
 src/unlink.c   | 1 +
 src/uptime.c   | 1 +
 src/users.c| 1 +
 src/whoami.c   | 1 +
 src/yes.c  | 1 +
 14 files changed, 14 insertions(+)

diff --git a/src/cksum.c b/src/cksum.c
index cda61516a..b62249862 100644
--- a/src/cksum.c
+++ b/src/cksum.c
@@ -291,6 +291,7 @@ main (int argc, char **argv)
 
   parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE, Version,
true, usage, AUTHORS, (char const *) NULL);
+  optind = 1;
 
   have_read_stdin = false;
 
diff --git a/src/dd.c b/src/dd.c
index b361e7d5a..f47e8a788 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -2393,6 +2393,7 @@ main (int argc, char **argv)
 
   parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE, Version,
true, usage, AUTHORS, (char const *) NULL);
+  optind = 1;
   close_stdout_required = false;
 
   /* Initialize translation table to identity translation. */
diff --git a/src/hostid.c b/src/hostid.c
index d9ea8929b..f023a3da1 100644
--- a/src/hostid.c
+++ b/src/hostid.c
@@ -66,6 +66,7 @@ main (int argc, char **argv)
 
   parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE_NAME, Version,
true, usage, AUTHORS, (char const *) NULL);
+  optind = 1;
 
   if (optind < argc)
 {
diff --git a/src/hostname.c b/src/hostname.c
index 761f775b4..3a9d1dd80 100644
--- a/src/hostname.c
+++ b/src/hostname.c
@@ -83,6 +83,7 @@ main (int argc, char **argv)
 
   parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE_NAME, Version,
true, usage, AUTHORS, (char const *) NULL);
+  optind = 1;
 
   if (argc == optind + 1)
 {
diff --git a/src/link.c b/src/link.c
index d70d434d9..d21f36099 100644
--- a/src/link.c
+++ b/src/link.c
@@ -69,6 +69,7 @@ main (int argc, char **argv)
 
   parse_gnu_standard_options_only (argc, argv, PROGRAM_NAME, PACKAGE_NAME, Version,
true, usage, AUTHORS, (char const *) NULL);
+  optind = 1;
 
   if (argc < optind + 2)
 {
diff --git a/src/logname.c b/src/logname.c
index cea43720f..fb9c2bbab 100644
--- a/src/log

bug#34345:

2019-02-09 Thread Assaf Gordon


On 2019-02-09 1:18 p.m., Ricky Tigg wrote:

Covered object by values '1994 s', '2014.25 s' seems to be a unique
time elapsed. Those values can therefore be expected to be identical,
either '1994 s' or '2014.25 s' – 2014 s and 25 hundredths of s –.



The command was:

> # dd if=/dev/zero of=/dev/sdc status=progress
> 8003555840 bytes (8.0 GB, 7.5 GiB) copied, 1994 s, 4.0 MB/s
> dd: writing to '/dev/sdc': No space left on device
> 15638481+0 records in
> 15638480+0 records out
> 8006901760 bytes (8.0 GB, 7.5 GiB) copied, 2014.25 s, 4.0 MB/s

The first status report (with 1994s) is printed due to "status=progress"
and is updated periodically.
The last status line (with 2014.25s) was printed about 20 seconds later,
hence the time difference.

bug#34220: failure to building with CompCert, patch proposed

2019-02-08 Thread Assaf Gordon


tags 34220 wontfix
close 34220
stop

Hello,

On 2019-01-27 9:03 p.m., Paul Eggert wrote:

DAVID MONNIAUX wrote:
under CompCert, floating-point values are not simplified at compile 
time

[...]
please file a bug report for CompCert so 
that its maintainers can fix the bug in the compiler.

Given the above, I'm closing this as "won't fix".

Discussion can continue by replying to this thread.

-assaf

bug#34340: cp -a doesn't copy acls

2019-02-08 Thread Assaf Gordon


tags 34340 moreinfo
stop

Hello,

On 2019-02-05 4:50 p.m., L A Walsh wrote:

and it is not on the manpage, but tar copies
acls and has them on the manpage.

I guess it is an oversight that cp copies over 'xattrs'
but not acls?


First,
Can you verify the 'cp' binary you are using was compiled with
ACL support? Something like:

  $ ldd $(which cp) | grep acl
  libacl.so.1 => /lib/x86_64-linux-gnu/libacl.so.1 (0x7f0b68066000)


Second,
Are you using a local file-system or a remote one?
There is a previous bug about ACLs on NFS4: https://bugs.gnu.org/20884 .

Third,
Do you have a reproducible case?
e.g. on my local system:

   $ touch a
   $ setfacl -m "u:nobody:w" a
   $ cp -a a b

   $ getfacl b
   # file: b
   # owner: gordon
   # group: gordon
   user::rw-
   user:nobody:-w-
   group::r--
   mask::rw-
   other::r--

If you have a reproducible case, please also run it with "strace" to
help us troubleshoot the issue more clearly, e.g.

   strace -o cp-acl.log cp -a a b

And attach the 'cp-acl.log' file.

regards,
 - assaf

bug#34345: coreutils v.8.30 – Two fractional digits accuracy. Appropriate notation regarding time elapsed.

2019-02-08 Thread Assaf Gordon

severity 34345 wishlist
retitle 34345 dd: report elapsed time as HH:MM:SS.NNN
stop

Hello,

Ricky Tigg's original message was sent to coreut...@gnu.org
(not to bug-coreutils@gnu.org), and did not create a new bug report:

   https://lists.gnu.org/r/coreutils/2019-02/msg3.html

This is of course absolutely fine - but PLEASE keep the same mailing
list when replying (e.g. don't reply to bug-coreutils@gnu.org if
the message was sent to coreut...@gnu.org).

The original message said:

On 2019-02-06 2:36 a.m., Ricky Tigg wrote:
> Enhancements request
> Hi. Command executed:
>
> # dd if=/dev/zero of=/dev/sdc status=progress
> 8003555840 bytes (8.0 GB, 7.5 GiB) copied, 1994 s, 4.0 MB/s
> dd: writing to '/dev/sdc': No space left on device
> 15638481+0 records in
> 15638480+0 records out
> 8006901760 bytes (8.0 GB, 7.5 GiB) copied, 2014.25 s, 4.0 MB/s
>
>
> - Could values reported at '(8.0 GB, 7.5 GiB)' be displayed using a two
> fractional digits accuracy (model 1.23 GiB).

That is not likely to change.
The printing code uses a common function (human_readable) that is used
in several other coreutils programs, and the convention is to print up
to 3 digits (or 2 digits with one decimal point).
Examples:
   1.1 KB
   11 KB
   111 KB
   11 MB
   111 MB
   1.1 GB
   11 GB
   111 GB
   1.1 TB

The exact number of bytes is printed at the beginning of the line,
and you can use 'numfmt' to format it to your liking:

$ dd if=/dev/zero of=/dev/zero bs=1M count=10 2>&1 \
  | tail -n1 | numfmt --to=si --format '%2.5f'
10.48576M bytes (10 MB, 10 MiB) copied, 0.00315612 s, 3.3 GB/s
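
If only the two-fractional-digit form is wanted, the exact byte count can also be handed to 'numfmt' on its own. A minimal sketch, assuming a reasonably recent GNU numfmt (one whose --format accepts a precision such as '%.2f'); the byte count is the one from the report above:

```shell
# Format an exact byte count with two fractional digits, in IEC units.
bytes=8006901760
human=$(numfmt --to=iec-i --suffix=B --format='%.2f' "$bytes")
echo "$human"
```

Replacing '--to=iec-i' with '--to=si' should give the powers-of-1000 variant instead.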

> - Could values reported at '1994 s', '2014.25 s' to be displayed
> according to notation <*hour*>*h*<*minutes*>*'*<*seconds*>*''*.

An interesting idea.
I couldn't find previous discussion about it, so I'll mark this item
as a "wishlist".

As always, patches are welcomed.
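
Until something like that exists in dd itself, the conversion can be done after the fact. A rough sketch, assuming awk is available; 'fmt_elapsed' is a made-up helper name, not part of coreutils:

```shell
# Hypothetical helper (not part of dd): print seconds as HH:MM:SS.NN
fmt_elapsed() {
  awk -v t="$1" 'BEGIN {
    h = int(t / 3600)                 # whole hours
    m = int((t - h * 3600) / 60)      # whole minutes
    s = t - h * 3600 - m * 60         # remaining seconds, with fraction
    printf "%02d:%02d:%05.2f\n", h, m, s
  }'
}

fmt_elapsed 1994
fmt_elapsed 2014.25
```

For the two values in the report this prints 00:33:14.00 and 00:33:34.25. Values whose fraction rounds up to a full minute are not handled specially here.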

> Potential issue:
> Since values '1994 s', '2014.25 s' expressed here may in the present
> context cover nothing but a same object, non-identical values may be
> interpreted as a programming issue. Regards.

I don't understand the above - can you clarify ?

regards,
 - assaf

bug#34349: unrecognized file system type 0x794c7630

2019-02-06 Thread Assaf Gordon


Hello,

On 2019-02-06 5:16 a.m., Matt Wilder wrote:

tail: unrecognized file system type 0x794c7630 for ‘/var/log/syslog’.
please report this to bug-coreutils@gnu.org. reverting to polling


Thank you for the report.

This has been fixed in version 8.25 and later, for more details
see https://www.gnu.org/software/coreutils/filesystems.html .

regards,
 - assaf

bug#34143: [coreutils 8.28] du -x is reporting a lower disk usage for /mnt when partitions are mounted

2019-01-20 Thread Assaf Gordon


tags 34143 notabug
close 34143
stop

Hello,

On 2019-01-19 3:11 p.m., Joseph Paul wrote:

It may not be a bug at all, but I was surprised to find out that 'du
-x' is reporting a lower disk usage on /mnt when partitions are
mounted.


This is not a bug.

Technically, as you wrote below, du simply skips (and does not count)
any directory that is not on the same filesystem.

[...]

linux$ du -x /mnt
4       /mnt/data
4       /mnt/VL1800
4       /mnt/nfs/nas
8       /mnt/nfs
20      /mnt

/mnt is now bigger.

Is this a normal result, because even when mounted, physically, the
directories '/mnt/VL1800' and '/mnt/data' still  exist on the '/'
filesystem, or not ?
Shouldn't they still occupy 4Kb of disk space each on the '/'
filesystem when partitions are mounted ?


They do occupy as much disk space as before,
but du has no way to know how much they occupy,
because the kernel reports that they are on a different device
and you requested -x/--one-file-system.

We can even take it a step further, and mount a new filesystem
on a non-empty directory - all the directory's content won't be counted:

As root, create the directory structure:

cd /tmp
mkdir -p a a/b a/c a/d

Now fill the "b" directory with a large file:

dd if=/dev/zero of=a/b/bigfile bs=1M count=1

Before any mounts, "b" is counted:

# du -x a
4 a/c
1028  a/b
4 a/d
1040  a

Now create a temporary file system loop file, and mount it over "b":

dd if=/dev/zero of=disk.img bs=1M count=10
mkfs.ext3 disk.img
mount -o loop disk.img a/b

Re-checking disk-usage, "b" is not even listed,
and its content (1MB) is not counted:

   # du -x a
   4   a/c
   4   a/d
   12  a

---

To see why du skips it, you can check the Device-ID associated with each 
directory:


   # stat -c "%n   Device-ID: %D   Mount-Point: %m" a a/b a/c a/d
   a   Device-ID: 812   Mount-Point: /tmp
   a/b   Device-ID: 700   Mount-Point: /tmp/a/b
   a/c   Device-ID: 812   Mount-Point: /tmp
   a/d   Device-ID: 812   Mount-Point: /tmp

Your device numbers will differ, but the number for "a/b" will not be
the same as for the rest.

When du sees a different device number, it simply skips the directory.
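
The same comparison can be made by hand for any two paths. A small sketch, assuming GNU stat (for '-c %d'); 'same_fs' is a made-up helper name, and the scratch directory only stands in for a real mount point:

```shell
# Hypothetical helper: succeed if both paths live on the same device,
# which is the comparison that "du -x" relies on.
same_fs() {
  [ "$(stat -c %d -- "$1")" = "$(stat -c %d -- "$2")" ]
}

top=$(mktemp -d)
mkdir "$top/sub"
if same_fs "$top" "$top/sub"; then
  echo "same filesystem: du -x would descend into sub"
else
  echo "different filesystem: du -x would skip sub"
fi
```

With something mounted on "$top/sub" the second branch would be taken.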

Once unmounted, the device-id returns to the old value,
and "a/b" will be counted with its content:

   # umount a/b
   # stat -c "%n   Device-ID: %D   Mount-Point: %m" a a/b a/c a/d
   a   Device-ID: 812   Mount-Point: /tmp
   a/b   Device-ID: 812   Mount-Point: /tmp
   a/c   Device-ID: 812   Mount-Point: /tmp
   a/d   Device-ID: 812   Mount-Point: /tmp


As such, I'm closing this as "not a bug",
but discussion can continue by replying to this thread.


regards,
 - assaf

bug#32198: tail -f -F unexpected behavior

2019-01-18 Thread Assaf Gordon


tags 32198 notabug
close 32198
stop

Hello,

It seems your message has not been replied to in a long while.
Sorry about that.

On 2018-07-18 8:24 a.m., Matthew Guidry wrote:

I was doing some experimentation with nano v2.9.3 and tail,
watching the output of tail after saving in nano and encountered some
strange behavior.


This is not a bug at all (not in tail nor in nano).

It is the result of updating a file in-place (i.e. changing existing
bytes) which 'tail' already consumed.


I had two terminals open side by side; one with nano and one with tail.
I opened a file called test.txt in nano and saved with ^w in the first
terminal.
I went to the second terminal and ran tail -f test.txt to watch the file.

I went back to the nano terminal and returned twice and saved.
The tail terminal reports this change properly.
With the file still open in nano, I write any number of characters and save.

The tail terminal reports this change But skips the first character.

To better see what happens, open a third terminal,
and run the following command (after initially saving the file):

watch -n1 od -tc test.txt

Which will show the content of the file, updated once a second.

I will use a similar but slightly different flow:

1. When you first save (in nano) the file, it is empty.
The "od" terminal will show:

000

2. Type "12345" (don't press ENTER), and save (ctrl-O).
The "od" terminal will show:

   000  1   2   3   4   5  \n
   006

The "tail" terminal will show:

   12345

AND the cursor in the "tail" terminal will go to the next line
(as there is a newline in the file).


3. Still in nano, on the same line, type "67890" (don't press ENTER),
and save (CTRL-O).
The "od" terminal will show:

  000   1   2   3   4   5   6   7   8   9   0  \n
  011

The "tail" terminal will show:

  12345
  7890

Here, the "6" character was not displayed by "tail".
The reason is that that character in offset 6 of the file used to be a
newline, and "tail" already consumed it.
When the line was changed, nano went back and changed existing data in
the file (or re-wrote the file completely - not sure about the 
implementation). "tail" has no way to detect that or "go back" in the file.
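
The effect can be imitated without nano or a running "tail -f". This is only a sketch of the idea (a reader that remembers how many bytes it has consumed and resumes from there), not of how tail is implemented:

```shell
f=$(mktemp)
printf '12345\n' > "$f"                   # first save: 6 bytes, ending in a newline
offset=$(wc -c < "$f")                    # a follower has now consumed 6 bytes
printf '1234567890\n' > "$f"              # second save: byte 6 is now "6"
seen=$(tail -c +"$((offset + 1))" "$f")   # resume right after the consumed bytes
echo "$seen"
```

The "6" sits inside the part that was already consumed, so only "7890" is seen.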


This is a carefully constructed example, where the data change is
small enough that "tail" almost doesn't notice it.

If you make larger changes, or delete some parts of the file,
nano will rewrite the file completely and "tail" will issue a warning
such as:

   tail: test.txt: file truncated

and then re-read the file.

As such I'm closing this as "not a bug", but discussion can continue
by replying to this thread.

regards,
 - assaf

bug#32455: cp gets confused by symlinks to parent directory

2019-01-18 Thread Assaf Gordon


tags 32455 notabug
close 32455
stop

Hello,

It seems your message has not been replied to in a long while.
Sorry about that.

On 2018-08-16 8:47 a.m., Mike Crowe wrote:

If cp is passed the -d option and told to copy a symlink to the directory
containing the symlink then it ends up removing the target directory so it
is unable to create the symlink.


If my understanding is correct, the "-d" flag is not relevant to the
issue. The problem is that "self" is a symlink to a directory:



Reproduction script:

--8<--
#!/bin/sh
set -x

rm -rf temp
mkdir -p temp/src temp/dest

ln -s . temp/src/self

# This one works
cp -vd temp/src/self temp/dest/self


This works because "temp/dest/self" does not exist.
In this case, "temp/dest" is taken as the destination directory,
and "self" is taken as the name of the file/dir/symlink to create.

That is, you could run "cp -vd temp/src/self temp/dest/foobar"
to create "foobar" as a copy of "self".


# This one fails
cp -vd temp/src/self temp/dest/self


Here, "temp/dest/self" already exists, and it is a symlink to
a directory.
Meaning, the request is: copy "temp/src/self" into the directory
"temp/dest/self/" (and create "temp/dest/self/self").

This would have gone well, except that because "self" is a symlink
to ".", it can be resolved indefinitely:

  $ file temp/dest/self
  temp/dest/self: symbolic link to .

  $ file temp/dest/self/self/self/self/self/self
  temp/dest/self/self/self/self/self/self: symbolic link to .

  $ file temp/dest/self/self/self/self/self/self/self
  temp/dest/self/self/self/self/self/self/self: symbolic link to .

"cp" first removes "temp/dest/self/self" (which is valid),
but then, "temp/dest/self" is gone (since it is the same file path after 
resolving it).


Hence, "cp" fails by saying "no such directory" on "temp/dest/self/self".

When this step is done, "temp/dest/self" does not exist,
and so:


# This one works again
cp -vd temp/src/self temp/dest/self


This works as before.

You can observe what happens on the kernel level
by adding "strace -e trace=file" before the "cp" commands,
this might help in deeper understanding.


To illustrate this differently:

When creating regular directories and files,
then deleting the innermost files, it is naively expected
that the parent directories still exist:

mkdir -p a/b/c/d
touch a/b/c/d/e
rm a/b/c/d/e

That is, a normal program can call "dirname("a/b/c/d/e")"
to get the parent directory of "e", and expect it to still
exist even after "e" is deleted.

But with your case:

   $ mkdir a
   $ ln -s . a/self
   $ rm a/self/self/self/self/self/self

All the "apparent" parent directories ("self/self/self/self/self")
are gone!
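
This can be tried safely in a scratch directory. A minimal sketch, assuming only mktemp, ln and rm:

```shell
d=$(mktemp -d)
mkdir "$d/a"
ln -s . "$d/a/self"

# Every longer path resolves back to the one and only symlink.
[ -L "$d/a/self/self/self" ] && echo "a/self/self/self is a symlink"

rm "$d/a/self/self/self"

# Removing it removed that single symlink, so the "parent" path is gone too.
[ -L "$d/a/self" ] || echo "a/self no longer exists"
```

The real directory "a" is untouched; only the symlink (and with it every "self/self/..." path) disappears.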



Expected behaviour:

There should be no error message emitted by the second invocation of cp,
and the target directory should be in the same state as it is after the
first or third attempts to copy the symlink.


Not exactly.

What you want is for the DEST parameter of "cp" to always be a file,
never to be considered a directory, i.e. "temp/dest/self"
should always be interpreted as file "self" in directory "temp/dest",
never as directory "temp/dest/self".
Luckily, there is already an option for that:

 -T, --no-target-directory
  treat DEST as a normal file

With "-T", repeated commands work as you expected:

  $ mkdir -p temp/src temp/dest
  $ ln -s . temp/src/self
  $ cp -vdT temp/src/self temp/dest/self
  'temp/src/self' -> 'temp/dest/self'
  $ cp -vdT temp/src/self temp/dest/self
  removed 'temp/dest/self'
  'temp/src/self' -> 'temp/dest/self'
  $ cp -vdT temp/src/self temp/dest/self
  removed 'temp/dest/self'
  'temp/src/self' -> 'temp/dest/self'
  $ cp -vdT temp/src/self temp/dest/self
  removed 'temp/dest/self'
  'temp/src/self' -> 'temp/dest/self'
  [... ad infinitum ...]


I hope this addresses the issue,
I'm closing this as "not a bug", but discussion can continue by replying
to this thread.

regards,
 - assaf

bug#34110: feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)

2019-01-18 Thread Assaf Gordon


Hello,

On 2019-01-18 2:56 a.m., René J.V. Bertin wrote:


the code isn't the most welcoming to dive into I've ever seen ;)


Two online resources that might help in exploring the code:

  http://www.maizure.org/projects/decoded-gnu-coreutils/

  https://opengrok.housegordon.com/source/xref/coreutils/

regards,
 - assaf

bug#8960: stdbuf on bi-arch systems

2019-01-18 Thread Assaf Gordon


severity 8960 wishlist
stop

(triaging old bugs)

Hello,

On 2011-07-04 10:15 a.m., Pádraig Brady wrote:

On 29/06/11 21:47, Bruno Haible wrote:

The program 'stdbuf' on bi-arch x86 / x86_64 systems cannot work on all kinds
of programs.

[...]

I would like to have a single binary that works on both x86 and x86_64 programs.

[...]

if stdbuf sets both LD_PRELOAD_32 and LD_PRELOAD_64 to the
appropriate libstdbuf.so, it should just work.


It's been more than 7 years since last comments/progress on this issue.
Is it still relevant / needed ?

If no one replies, I'll close it as "wontfix" in a few days.

regards,
 - assaf

bug#12339: Gnu rm, changed only recently (4-5 years), and didn't follow letter of posix...(statement follows)

2019-01-18 Thread Assaf Gordon


close 12339
stop

(triaging old bugs)


Hello,

This long and winding thread covers several topics
relating to rm(1), historical unix and POSIX compatibility
(and a bugfix or two in the mix).

An enlightening read for those interested...
( https://bugs.gnu.org/12339 )

But the bottom line is:

   rm -rf .

will not delete the content of the current directory
(while keeping the directory itself) and that is not likely to change.

Two suggested alternatives:

   find . -delete
   rm -rf * .[!.] .??*
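
The find variant can be checked in a scratch directory first. A sketch, assuming GNU find; '-mindepth 1' is an addition that is not in the suggestion above and only makes explicit that the starting directory itself is left alone:

```shell
d=$(mktemp -d)
mkdir -p "$d/sub/deeper" "$d/.hidden-dir"
touch "$d/file" "$d/.dotfile" "$d/..two-dots" "$d/sub/deeper/x"

find "$d" -mindepth 1 -delete      # -delete implies -depth, so children go first

remaining=$(ls -A "$d" | wc -l)
echo "entries left: $remaining"
```

The directory still exists afterwards, with no entries left in it (dot-files included).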


As such, and with no more comments in 6 years, I'm closing this bug.

PLEASE do not reply to this thread.

If there are other relevant issues (that have not been discussed
elsewhere, and have not been previously rejected),
please start a new thread by emailing coreut...@gnu.org .

regards,
 - assaf

bug#12400: rmdir runs "amok", users "curse" GNU...(as rmdir has no option to stay on 1 file system)...

2019-01-18 Thread Assaf Gordon


retitle 12400 rmdir: add --one-file-system option
severity 12400 wishlist
tags 12400 wontfix
stop

(triaging old bugs)

Hello,

On 2012-09-09 11:22 p.m., Bob Proulx wrote:

Linda Walsh wrote:

If you are going to only provide 1 mode of functionality, it should
be to only rmdir dirs on the same file system as the starting args.



[...]

But rmdir only removes the directories you tell it to remove.


[...]


If you want a recursive option why not use 'rm -rf'?

There is always 'find' with the -delete option.  But regardless there
has been the find -exec option.

   find /some/path -type d -delete

   find /some/path -depth -type d -exec rmdir {} +
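
For the "stay on one file system" part of the request, find's '-xdev' can be added to the commands above. A sketch in a scratch directory, assuming GNU find ('-empty' and '-delete' are extensions); the directory names are made up:

```shell
root=$(mktemp -d)
mkdir -p "$root/x/y/z" "$root/keep"
touch "$root/keep/file"

# Remove empty directories bottom-up without crossing mount points.
find "$root" -xdev -mindepth 1 -depth -type d -empty -delete

ls -A "$root"
```

Only "keep" remains, because it is the only directory that is not empty.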



With no further comments in 6 years, I'm closing this
request.

regards,
 - assaf

bug#33211: coreutils.mo is in both LC_TIME and LC_MESSAGES folders

2019-01-18 Thread Assaf Gordon


tags 33211 notabug
close 33211
stop

Hello,

On 2018-10-30 3:33 p.m., scootergrisen wrote:

I wonder if its a mistake that in Fedora i can see coreutils.mo in both:
/usr/share/locale/*/LC_TIME
/usr/share/locale/*/LC_MESSAGES

They seem to be identical.


This is not a mistake (nor a bug).

Not only are they identical, one is a symlink to the other:

  $ cd /usr/local/share/locale/ca
  $ ls -log LC_*/coreutils.mo
  -rw-r--r-- 1 379478 Dec 27 22:47 LC_MESSAGES/coreutils.mo
  lrwxrwxrwx 1     27 Dec 27 22:47 LC_TIME/coreutils.mo -> ../LC_MESSAGES/coreutils.mo



coreutils.mo is the only file i see in the /usr/share/locale/*/LC_TIME 
folder.


Most programs that use gettext (https://www.gnu.org/software/gettext/)
are concerned with user-visible messages, hence most translations
only use the LC_MESSAGES directory, and there's no need for other files.

A few coreutils programs (e.g. date, sort) do care about translation of
time-related strings (e.g. day and month names).
That's why coreutils also uses LC_TIME.

One can ask for the date/time to use one locale,
and messages to use another:

  $ export LC_TIME=ru_RU.UTF-8
  $ export LANGUAGE=ja_JP.UTF-8

  $ date
  Пт янв 18 01:06:10 MST 2019

  $ date -d ABCD
  date: `ABCD' は無効な日付です


Should the /usr/share/locale/*/LC_TIME/coreutils.mo files be removed so 
there is only the /usr/share/locale/*/LC_MESSAGES/coreutils.mo files?


No, both should exist, otherwise setting LC_TIME won't work.

Technically, the translated strings for both messages and time are
stored in the same file - that's why when coreutils is installed,
one is a symlink to the other.


Even more technically, when building from source,
the file "bootstrap.conf" contains the following:

  # Other locale categories that need message catalogs.
  EXTRA_LOCALE_CATEGORIES=LC_TIME

The directory "./po" is populated with the available translations
(e.g. "ru.po" and "ja.po").

During the build, the ".po" files are compiled into binary ".gmo" files.
During installation, the files are copied/symlinked:
  $ make install
  [...]
  make[2]: Entering directory '/home/gordon/projects/coreutils/po'
  installing af.gmo as /usr/local/share/locale/af/LC_MESSAGES/coreutils.mo
  installing af.gmo link as /usr/local/share/locale/af/LC_TIME/coreutils.mo
  installing be.gmo as /usr/local/share/locale/be/LC_MESSAGES/coreutils.mo
  installing be.gmo link as /usr/local/share/locale/be/LC_TIME/coreutils.mo
  [...]


Hope this addresses the issue.
I'm closing this as "not a bug", but discussion can continue by replying
to this thread.

regards,
 - assaf

bug#33646: [PATCH] doc: improve wording of the --kibibytes option description

2019-01-17 Thread Assaf Gordon


Hello,

On 2018-12-06 6:32 a.m., Kamil Dudka wrote:

Bug: https://bugzilla.redhat.com/1527391
---
  doc/coreutils.texi | 8 +++++---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index f8339d73f..e93fe71a0 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -7975,9 +7975,11 @@ Append @samp{*} for executable regular files, otherwise behave as for
  @opindex --kibibytes
  Set the default block size to its normal value of 1024 bytes,
  overriding any contrary specification in environment variables
-(@pxref{Block size}).  This option is in turn overridden by the
-@option{--block-size}, @option{-h} or @option{--human-readable}, and
-@option{--si} options.
+(@pxref{Block size}).  If @option{--block-size}, @option{-h},
+@option{--human-readable}, or @option{--si} options are used,
+they take precedence over @option{-k} or @option{--kibibytes}
+even if @option{-k} or @option{--kibibytes} is placed after
+the other options.
  
  The @option{-k} or @option{--kibibytes} option affects the
  per-directory block count written by the @option{-l} and similar



I'm ok with this improvement - if there are no comments
I'll push in the next few days.

-assaf

bug#33718: Syntaxe problem? I can't find the solution :-(

2019-01-17 Thread Assaf Gordon


tags 33718 moreinfo
stop

Hello,

On 2018-12-13 1:54 a.m., Rudy BROSTEAUX wrote:

Environment: AIX 7.2 TL3 SP1 (on IBM Power Systems)
Origin of the coreutils RPM used @release 8.30 is perzl.org
Installed using a yum server.

*** /root> /usr/bin/time timeout 2.3 sleep 5
timeout: warning: timer_create: Invalid argument




From: Bernhard Voelker 


No idea.  For an analysis, we need more information:
timeout version, OS/kernel version, and finally of course a reproducer, i.e., 
the exact command line you were using.

timeout works well here:

   $ /usr/bin/time -f '%e' timeout 2.3 sleep 4.5
   Command exited with non-zero status 124
   2.30


I tested coreutils-8.30 built from source on AIX 7.2,
both 64 bit and 32 bit, and both work fine:

  $ file src/timeout
  src/timeout: 64-bit XCOFF executable or object module not stripped
  $ /usr/bin/time ./src/timeout 2.3 sleep 5
  Real   2.31
  User   0.00
  System 0.00

  $ file src/timeout
  src/timeout: executable (RISC System/6000) or object module not stripped
  $ /usr/bin/time ./src/timeout 2.3 sleep 5
  Real   2.30
  User   0.00
  System 0.00


Perhaps it is a problem in the RPM package?

Please try to build from source code and see if you still experience the
issue.

regards,
 - assaf

bug#9089: pipe failure with cat and head of coreutils 6.12

2019-01-17 Thread Assaf Gordon


close 9089
stop

(triaging old bugs)

Hello,

On 2011-07-15 5:30 a.m., Philipp Thomas wrote:

I'm trying to track down a bug in cat of coreutils 6.12. Doing

cat /var/log/Xorg.0.log | head -n70

under ksh consistently fails with 'cat: write error: Connection reset by
peer'.  It does not fail when run under bash and it does not fail in current
coreutils .


I'm still able to reproduce this problem (i.e. "cat: write error"
with coreutils 6.12 under ksh on Linux 4.9.0).

However,
given that it has been seven and a half years since the last
comment, and even then it was already acknowledged that the problem does
not happen in later versions, and that it happens because ksh uses
socketpairs instead of pipes - I'm closing this bug.


If the issue of cat(1) supporting socketpair/ECONNRESET instead of
pipes/EPIPE is still relevant, we can re-open the bug.

regards,
 - assaf

bug#34110: feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)

2019-01-17 Thread Assaf Gordon


severity 34110 wishlist
retitle 34110 du: add dual-column showing apparent-size and disk-size
stop

Hello,

On 2019-01-17 3:13 a.m., René J.V. Bertin wrote:

On Wednesday January 16 2019 16:06:50 Assaf Gordon wrote:


I hope this helps to clarify "apparent-size".


Yes and no :) I understand what "apparent-size" does [] 
My whole point is that there might be a better name. 


The parameter name "--apparent-size" is not likely to be changed.
It has been named so for about 16 years (since 'fileutils 4.5.8'
which is even before 'coreutils' was created as a unified package).

Changing it would break existing scripts and user expectations.


I realise that you cannot really call the content size observable "real size" when 
reporting from a disk-usage viewpoint, but "content size" (--content-size, -C) should be 
clear enough?


Creating a second alias to "--apparent-size" is possible, but I'm not
sure it's warranted.

---

I think the discussion about "--apparent-size" is mostly concluded,
but the idea to have two-columns is an interesting feature request.

I'm marking this as a "wish list" item.
Concrete patches are welcomed.

regards,
 - assaf

bug#13738: Add --all option to 'users' command

2019-01-17 Thread Assaf Gordon


tags 13738 wontfix
close 13738
stop

(triaging old bugs)

Hello,

On 2013-02-18 2:01 p.m., Bob Proulx wrote:

anatoly techtonik wrote:

Bob Proulx wrote:

anatoly techtonik wrote:

The 'users' command shows users who are currently online. It will be nice
to have --all option to show all users.


Do you mean the equivalent to this?

   $ getent passwd | awk -F: '{print$1}'



[]

Solving the problem in general gets messy very quickly.  It is
therefore one of those that is better solved locally by providing the
tools needed to do what is needed on a case by case basis.  So far
after forty years of Unix and GNU systems this hasn't been needed and
therefore the use cases must be unusual.  The philosophy isn't to
solve all problems but just to make all problems solvable.

It would help if you could say a few words about the case in
which this would be helpful?


With no further comments in almost 6 years,
and this item already listed under our "rejected requests" page,
I'm closing this as "won't fix".

regards,
 - assaf

bug#16282: revisit; reasoning for not using ENV vars to provide workarounds for POSIX limitations?

2019-01-17 Thread Assaf Gordon


severity 16282 wishlist
tags 16282 wontfix
close 16282
stop


(triaging old bugs)

Hello,

On 2013-12-28 1:03 p.m., Paul Eggert wrote:

[...] if it makes a standard
utility behave in odd ways, it'll break scripts that
don't expect the odd behavior.  That's the essential
objection here.

Yes, we've used env vars in the past for this, but we've
come to regret it, and we don't want to make matters worse
in this respect without a compelling justification.


Given the above, and with no further comments in 5 years,
I'm closing this bug.

More details about the reasoning for rejecting new environment variables
are summarized here:
  https://www.gnu.org/software/coreutils/rejected_requests.html#envvar

regards,
 - assaf

bug#12820: FWIW, this is still happening as of gnulib 4a82904

2019-01-17 Thread Assaf Gordon


close 12820
stop

(triaging old bugs)

Hello,

On 2013-02-28 10:08 a.m., Paul Eggert wrote:


Perhaps there's a bug in nap () but if so the bug should
be fixed there.



Given the above, and with no further comments in almost 6 years,
I'm closing this bug.

Discussion can continue by replying to this thread.
 - assaf

bug#33785: df: don't suppress remote mounts

2019-01-17 Thread Assaf Gordon


tags 33785 notabug
close 33785
stop


Hello,

On 2018-12-19 10:05 a.m., Pádraig Brady wrote:

On 17/12/18 22:42, lzhong wrote:


According to the following commit

commit 2e81e62243409c5c574b899f52b08c000e4d99fd
  df: only suppress remote mounts of separate exports with --total


[...]

The remote mounts should not be suppressed after this change. However,
it turns out it doesn't work as the message described. The remote mounts
are still suppressed. And here is


The intent of the patch was not to suppress _separate_ exports on the server.
I.E. nas.example.com:/Photos and nas.example.com:/Download would not
be suppressed (even if they have the same device id).

If you want all nfs mounts you could `df -a -t nfs`


With no further comments, I'm closing this as "notabug".

Discussion can continue by replying to this thread.
  -assaf

bug#34115: coreutils v. 8.30– Document's content gets deleted using cat(1)

2019-01-17 Thread Assaf Gordon


tags 34115 notabug
close 34115
merge 34115 33823
stop


Hello,


On 2019-01-17 5:53 a.m., Ricky Tigg wrote:
[...]


$ cat > .inputrc
set enable-bracketed-paste on

Press *Return*, then *Ctrl D*.


[...]


Content of *.inputrc*,  which is expected to be still present, has been


This sounds very similar to the previous email
you sent ( https://bugs.gnu.org/33823 ).

As before, it is not a problem in "cat",
but perhaps a problem in your GUI, X11, xterminal,
or perhaps a problem in copy to the clipboard.

regards,
 - assaf

bug#34110: feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)

2019-01-16 Thread Assaf Gordon


Hello,

I'll address only the "apparent-size" issue (not the two-columns, or 
compressed file-systems):


On 2019-01-16 1:13 p.m., René J.V. Bertin wrote:


According to `du --help`, the apparent-size option reports a size that is not 
the actual disk usage. The numbers above seem to show the opposite.
If anything, I find the concept of "apparent size" more appropriate to the size a file 
occupies on the storage medium because ultimately that storage device will not give you more than 
"struct stat : st_size" bytes for uncompressed filesystems.
Another way to say it: with "--apparent-size", du returns the actual file size; 
without, it returns how large the file appears to be (judging from its disk footprint).


"apparent-size" shows how much content/data the file has.
Without "apparent-size", du shows the amount of storage consumed (or
"wasted"?) on the storage medium (accounting for sparse file holes, though
I'm not sure about compression).


To illustrate, create three files with specific sizes:

  $ head --bytes=1700 /dev/zero > a
  $ head --bytes=4097 /dev/zero > b
  $ truncate --size=1050000 c   # will be a sparse file

These are their sizes, as in the amount of bytes they contain:

  $ ls -log
  total 12
  -rw-r--r-- 1    1700 Jan 16 15:36 a
  -rw-r--r-- 1    4097 Jan 16 15:36 b
  -rw-r--r-- 1 1050000 Jan 16 15:37 c


These are their "apparent-sizes", rounded up to the nearest
1K block:

  $ du --apparent-size a b c
  2 a
  5 b
  1026  c

e.g. file "a" is 1700 bytes, rounded-up to 2K, and "du --apparent-size"
shows "2".

Using "--apparent-size --block-size=1" (and its equivalent, "--bytes")
will show the exact sizes:

  $ du --apparent-size --block-size=1 a b c
  1700     a
  4097     b
  1050000  c

Without "--apparent-size", du shows how much storage space is actually 
used/wasted/consumed on the storage medium by the files:


  $ du a b c
  4       a
  8       b
  0       c

How are these numbers calculated?

The simplest case is file "c" - it is completely sparse - so despite
logically containing 1,050,000 zeros, on the actual storage medium it 
consumes zero data blocks (ignoring inodes blocks and somesuch).


File "a" has 1,700 bytes of data.
On my filesystem the basic block size is 4096, as shown by "stat -f":

  $ stat -f /
File: "/"
  ID: 5a2cade519bada6a Namelen: 255 Type: ext2/ext3
->Block size: 4096   Fundamental block size: 4096<-
  Blocks: Total: 27559017   Free: 18845977   Available: 17435289
  Inodes: Total: 7036928Free: 6496730

Therefore, any file from size 1 to size 4096 will consume exactly one
disk block. On most common filesystems, disk blocks can not be shared
between files. Meaning that this block is fully consumed.

That's why for file "a" du shows "4" - meaning 4K bytes (exactly one
block) is consumed on the storage medium by this file.

Similarly for file "b" - its size is 4097, which is 1 byte more than one
filesystem block. Hence, file "b" consumes 2 blocks, coming up to 8K.
du then shows "8" for file "b".
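
The arithmetic for files "a" and "b" can be written down as a small sketch. It assumes the 4096-byte block size shown by "stat -f" above and deliberately ignores sparse files such as "c"; the helper names are made up and this is not how du computes its numbers:

```shell
# Hypothetical helpers reproducing the arithmetic above.
apparent_kib() { echo $(( ($1 + 1023) / 1024 )); }       # round up to 1K units
disk_kib()     { echo $(( ($1 + 4095) / 4096 * 4 )); }   # whole 4096-byte blocks, in K

for size in 1700 4097; do
  echo "$size bytes: apparent $(apparent_kib "$size")K, on disk $(disk_kib "$size")K"
done
```

This gives 2K and 4K for file "a", and 5K and 8K for file "b", matching the du output above.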


Now to your examples:


%> du -hcs /Volumes/nif64/tmp/.npm/ ; du -hcs --apparent-size /Volumes/nif64/tmp/.npm/
340M    /Volumes/nif64/tmp/.npm/
180M    /Volumes/nif64/tmp/.npm/

Same folder on btrfs (mounted with compress=lzo):
%> du -hcs /mnt/.npm/ ; du -hcs --apparent-size  /mnt/.npm
198M    /mnt/.npm/
181M    /mnt/.npm

In both cases, "du --apparent-size" shows about 180MB of actual data
(181MB in the second example). That is the amount of actual content
(number of total bytes in these files).

In the first case, these files consume 340MB of space on your disk.
In the second case, these files consume 198MB of space on your disk.
The reason they consume MORE than their actual data is explained above
with the file-system blocks.

This suggests to me that compression is not accounted for in these
values. If it were, then the consumed size (without "--apparent-size")
should've been less than the actual size (with "--apparent-size").

A quick on-line search shows that btrfs's default block size is 16K,
while ZFS's default record-size is 128KB. That might explain
why a similar amount of data (and I assume, a similar number of files and
sizes) consumes more disk space on ZFS (could be wrong, though; comments
are welcome).


I hope this helps to clarify "apparent-size".

I'll leave it to others to comment on how compressed file systems
come into play with du.

regards,
 - assaf

bug#33468: A bug with yes and --help

2019-01-12 Thread Assaf Gordon


Hello Eric,

On 2019-01-12 8:42 a.m., Eric Blake wrote:

On 1/11/19 6:23 PM, Assaf Gordon wrote:


-  optind = 0;
+  optind = 1;


Ouch. You're hitting the portability problem of the difference between
BSD and glibc.



Otherwise many things fail like so:

   $ ./src/dd
   ./src/dd: unrecognized operand ‘./src/dd’
   Try './src/dd --help' for more information.


That's the symptoms on BSD for optind = 0 (there, you HAVE to use
optreset=optind=1 for a complete reset; or plain optind=1 for a soft
reset where the man page is not clear if it will always work).  But on
glibc, optind=1 does a soft reset (works if the optstring does not start
with '-' or '+' and if you did not change POSIXLY_CORRECT), but MUST use
optind = 0 if you want a hard reset.


I only tested on Debian Stretch (with Debian GLIBC 2.24-11+deb9u3),
did not yet test on BSDs.

With "optind=1", I see the following:

===
  $ ./src/hostid
  ec68f06c

  $ ./src/sleep
  ./src/sleep: missing operand
  Try './src/sleep --help' for more information.

  $ ./src/uptime
   11:14:05  up 23 days 21:23,  4 users,  load average: 1.16, 1.05, 0.52

  $ ./src/users
  gordon gordon gordon gordon

  $ ./src/nohup
  ./src/nohup: missing operand
  Try './src/nohup --help' for more information.

  $ ./src/dd   ## waits for CTRL-C
  ^C
  0+0 records in
  0+0 records out
  0 bytes copied, 1.10243 s, 0.0 kB/s

  $ ./src/yes | head -n1
  y
===

With "optind=0" I see the following:

===
  $ ./src/hostid
  ./src/hostid: extra operand ‘./src/hostid’
  Try './src/hostid --help' for more information.

  $ ./src/sleep
  ./src/sleep: missing operand
  Try './src/sleep --help' for more information.

  $ ./src/users

  $ ./src/users | od -tx1
  0000000 02 e2 03 0a
  0000004

  $ ./src/users /var/log/wtmp
  ./src/users: extra operand ‘/var/log/wtmp’
  Try './src/users --help' for more information.

  $ ./src/nohup
  ./src/nohup: ignoring input and appending output to 'nohup.out'
  ^C

  $ ./src/dd
  ./src/dd: unrecognized operand ‘./src/dd’
  Try './src/dd --help' for more information.

  $ ./src/yes | head -n1
  ./src/yes
===

Perhaps "parse_gnu_standard_options_only" should use "_getopt_long_r"
and avoid the need to reset anything?

regards,
 - assaf

bug#33468: A bug with yes and --help

2019-01-11 Thread Assaf Gordon


Hello Berny and all,

On 2018-11-29 1:48 a.m., Bernhard Voelker wrote:


The attached are quite raw attempts to address this - yes, as a function
instead of a macro. ;-)

* [PATCH] long-options: add parse_gnu_standard_options_only
   gnulib patch!


For the gnulib patch, I believe the following is needed:

diff --git a/lib/long-options.c b/lib/long-options.c
index 52ef1f2f8..9567d5135 100644
--- a/lib/long-options.c
+++ b/lib/long-options.c
@@ -139,7 +139,7 @@ parse_gnu_standard_options_only (int argc,
   /* Restore previous value.  */
   opterr = saved_opterr;

-  /* Reset this to zero so that getopt internals get initialized from
+  /* Reset this to one so that getopt internals get initialized from
  the probably-new parameters when/if getopt is called later.  */
-  optind = 0;
+  optind = 1;
 }


Otherwise many things fail like so:

  $ ./src/dd
  ./src/dd: unrecognized operand ‘./src/dd’
  Try './src/dd --help' for more information.

The "1" value matches the instructions in the getopt_long(3) man page.


* [PATCH] all: detect --help and --version more consistently [FIXME]
   FIXME: NEWS, syntax-check, tests.


With the above 'optind=1' change, there are only two major differences:
---
  $ nohup-8.30 -/ ; echo $?
  nohup: invalid option -- '/'
  Try 'nohup --help' for more information.
  125

  $ ./src/nohup -/ ; echo $?
  src/nohup: invalid option -- '/'
  Try 'src/nohup --help' for more information.
  1


  $ dd-8.30 -- if=/dev/null
  0+0 records in
  0+0 records out
  0 bytes copied, 3.9014e-05 s, 0.0 kB/s

  $ ./src/dd -- if=/dev/null
  ./src/dd: unrecognized operand ‘--’
  Try './src/dd --help' for more information.
---

These in turn cause "tests/misc/invalid-opt",
"tests/misc/usage_vs_getopt", and "tests/dd/misc" to fail.

All other tests pass as before (tested only on Debian Stretch).

regards,
 - assaf

P.S.
https://bugs.gnu.org/29617  "seq: `seq 1 --help' doesn't give help"
will also likely be fixed by your patch.

bug#25159: chown bug ? or sys glitch ?

2019-01-11 Thread Assaf Gordon


close 25159
stop

On 2018-10-28 1:35 a.m., Assaf Gordon wrote:


On 2016-12-10 6:51 a.m., ahfc wrote:

Maybe a system glitch or a chown bug so just fyi.

[...]

chown: changing ownership of ‘/run/media/rest_/of_/path_/filename ':
Operation not permitted



If this is still an issue for you, can you provide more details,
in particular what is the file system on /run/media ?


With no further replies, I'm closing this bug.

regards,
 - assaf

bug#29285: Error building coreutils 8.28.32-a4eed under Archlinux from AUR

2019-01-11 Thread Assaf Gordon


close 29285
stop


On 2018-10-29 8:09 p.m., Assaf Gordon wrote:

On 2017-11-13 7:43 a.m., timofonic timofonic wrote:


As the coreutils build system reported, I'm sending the following
building error from using the coreutils-git Arch User Repository
package ( https://aur.archlinux.org/packages/coreutils-git/)



Do you still get similar failure with more recent coreutils versions?



With no further replies, I'm closing this bug.

regards,
 - assaf

bug#33204: Failed to modify 'Access Time' for files without extension using the Touch tool ver 8.4

2019-01-11 Thread Assaf Gordon


close 33204
stop

Hello,

On 2018-10-31 7:10 a.m., Eric Blake wrote:

On 10/30/18 3:49 AM, [sender name unreadable] wrote:

Hi, dear developers of GNU tools:
  I found a possible bug when using the Touch tool.


Most likely, this is not a bug in coreutils, but a limitation between 
the operating system and file system you are using. If it is a GNU/Linux 
system, using strace would confirm that.  But since you did not give us 
those details, it's hard to say if there's anything further we can do to 
help you.



With no further replies, I'm closing this bug.

regards,
 - assaf

bug#15328: Bug or dubious feature?

2019-01-11 Thread Assaf Gordon


tags 15328 notabug
close 15328
stop

Hello,

On 2013-09-10 3:01 p.m., Linda Walsh wrote:

Whatever the problem is, it's not in 'mv'...


Given the above, and no further comments in 5 years,
I'm closing this item.

regards,
 - assaf

bug#15727: Bug: cp <-a|-archive> (w/<-f|--remove-destination>) breaks if one of files is a dir and other not

2019-01-11 Thread Assaf Gordon


severity 15727 wishlist
retitle 15727 doc: cp: expand dirs-vs-files with -f/--remove-dest
stop


Hello,

On 2013-10-29 12:20 p.m., Linda Walsh wrote:
[...]

You need to make the docs much more clear about "cp"s limitations.

update isn't really update, and -T is certainly wrong at the very
least.  If you feel you'd rather document cp's limitations,
that's fine...

cp is a great tool, don't get me wrong!  But when it added update
and -T, --remove-destination, it started inferring or promising
more than you were willing to deliver.  That should be documented.


Based on the above (as the result of the long discussion),
I'm marking this as a documentation wish-list item:

clarify that "-f" and "--remove-destination" won't replace a file
with a directory (as explained by Pádraig and Bernhard in the thread).

Similarly for related limitations of "-T".

regards,
 - assaf

bug#22022: ls - error making symbolic links with relative paths

2019-01-11 Thread Assaf Gordon


retitle 22022 ln: error making symbolic links with relative paths
tags 22022 notabug
close 22022
stop

Hello,

On 2015-11-26 9:13 p.m., Eric Blake wrote:
[...]

You may be interested in trying 'ln --relative -sv b/* c/' instead,
which creates 'c/a' as a symlink to '../b/a', and therefore resolves
rather than creating a dangling symlink.
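
The difference can be seen in a scratch directory. This is a minimal sketch,
assuming GNU ln with "--relative" support; the names "$demo", "c1" and "c2"
are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: "$demo", "c1" and "c2" are hypothetical names.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir b c1 c2
touch b/a

# Plain -s stores the text "b/a" in the link. It is resolved relative to
# the link's own directory (c1/), so it points at c1/b/a, which does not exist.
ln -s b/a c1/

# --relative rewrites the target relative to the link's location.
ln --relative -s b/a c2/

plain=$(readlink c1/a)
rel=$(readlink c2/a)

echo "c1/a -> $plain"
if [ -e c1/a ]; then echo "  resolves"; else echo "  dangling"; fi
echo "c2/a -> $rel"
if [ -e c2/a ]; then echo "  resolves"; else echo "  dangling"; fi

cd / && rm -rf "$demo"
```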


With no further comments on Eric's suggestion,
I'm assuming this is resolved.

Discussion can continue by replying to this thread.

regards,
 - assaf

bug#34009: warn that mkdir --mode doesn't affect parents created

2019-01-11 Thread Assaf Gordon


severity 34009 wishlist
retitle 34009 doc: mkdir: warn that --mode doesn't affect parents
stop

Hello,

On 2019-01-07 8:36 a.m., 積丹尼 Dan Jacobson wrote:


do warn that --mode doesn't affect any parents created.

$ mkdir --mode 700 -p /tmp/g/h/i
$ find /tmp/g -ls
 55795  0 drwxr-xr-x   3 jidanni  jidanni        60 Jan  7 23:30 /tmp/g
 55796  0 drwxr-xr-x   3 jidanni  jidanni        60 Jan  7 23:30 /tmp/g/h
 55797  0 drwx------   2 jidanni  jidanni        40 Jan  7 23:30 /tmp/g/h/i

Also warn on (info "(coreutils) mkdir invocation") more directly. Thanks.


The info manual does contain a short sentence about parents' modes:
 "To set the file permission bits of any newly-created parent
  directories to a value [...]"

But this can be improved.
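
For illustration, the behaviour and the umask-based approach the manual
alludes to can be sketched as follows. This assumes GNU mkdir and stat;
"$demo" and the directory names under it are made up, and the umask is set
to 022 explicitly so the numbers are predictable.

```shell
#!/bin/sh
# Sketch only: "$demo" and the paths under it are hypothetical.
demo=$(mktemp -d)
umask 022

# --mode applies to the last component only; parents follow the umask.
mkdir --mode=700 -p "$demo/g/h/i"
parent_mode=$(stat -c %a "$demo/g")
leaf_mode=$(stat -c %a "$demo/g/h/i")

# Restricting the umask in a subshell affects the parents as well.
(umask 077 && mkdir -p "$demo/p/q/r")
strict_parent=$(stat -c %a "$demo/p")
strict_leaf=$(stat -c %a "$demo/p/q/r")

echo "--mode=700: parent=$parent_mode leaf=$leaf_mode"
echo "umask 077:  parent=$strict_parent leaf=$strict_leaf"

rm -rf "$demo"
```

The subshell keeps the stricter umask from leaking into the rest of the
session.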

Marking as wishlist. Patches are welcomed.

regards,
 - assaf

bug#20775: cp -a -u destroys files after they are copied

2019-01-11 Thread Assaf Gordon


severity 20775 wishlist
retitle 20775 cp: improve hardlink dups handling with "cp -a -u"
stop

With no further comments in more than 3 years,
I'm marking this as a "wish list" item.

-assaf

bug#32291: Fwd: ls -ltcr and ls -lrt report different modification dates

2019-01-11 Thread Assaf Gordon


tags 32291 notabug
close 32291
stop

Hello,

Seems your message was not replied to in 6 months - sorry about that.

On 2018-07-27 3:48 a.m., Ludovic Tolhurst-Cleaver wrote:


`ls -ltcr` seems to be the one showing the correct date here. I like to 
use `ls -ltc` because it's my initials. My colleague was running `ls -lrt`.



$ ls -ltcr ludo*

-rw-rw-rw- 1 pax pax 237817 Jul 20 06:53 
ludovic.tolhurst-cleaver_sabstt.com-log-20180720.gz


$ ls -lrt ludo*

-rw-rw-rw- 1 pax pax 237817 Jul 18 12:30 
ludovic.tolhurst-cleaver_sabstt.com-log-20180720.gz


The manual explains the "-c" option:
===
$ ls --help
  -c                         with -lt: sort by, and show, ctime (time of last
                               modification of file status information);
===

What you are seeing is the file's status-change timestamp (with "-c")
versus the file's content modification timestamp (without "-c").
You can view all timestamps at once with:
   stat ludo*
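
A small experiment reproduces the effect. This is only a sketch, assuming
GNU touch, stat and ls; "$demo" and the file name are made up.

```shell
#!/bin/sh
# Sketch only: "$demo" and "log.gz" are hypothetical names.
demo=$(mktemp -d)
f="$demo/log.gz"

touch -d '2018-07-18 12:30' "$f"   # set the content-modification time (mtime)
chmod 666 "$f"                     # a metadata change updates ctime only

mtime=$(stat -c %Y "$f")           # mtime, seconds since the epoch
ctime=$(stat -c %Z "$f")           # ctime, seconds since the epoch

ls -l  "$f"                        # shows mtime (July 2018)
ls -lc "$f"                        # shows ctime (the time of the chmod)
echo "mtime=$mtime ctime=$ctime"

rm -rf "$demo"
```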

As such, I'm closing this as "not a bug", but discussion can continue
by replying to this thread.

regards,
 - assaf
