bug#49217: [PATCH] tests: exercise shuf --input-range edge cases

2021-06-26 Thread Erik Auerswald
* tests/misc/shuf.sh: Test valid "shuf -i" edge cases that result
in a single line of input, or no line at all.  Test an invalid
range, too.
---
 tests/misc/shuf.sh | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tests/misc/shuf.sh b/tests/misc/shuf.sh
index 892386b3f..83e940ec4 100755
--- a/tests/misc/shuf.sh
+++ b/tests/misc/shuf.sh
@@ -39,6 +39,15 @@ compare in out > /dev/null && { fail=1; echo "not random?" 1>&2; }
 sort -n out > out1
 compare in out1 || { fail=1; echo "not a permutation" 1>&2; }
 
+# Exercise border conditions of shuf's -i option
+# LO == HI gives one line
+echo 1 > in1 || framework_failure_
+shuf -i 1-1 > out || fail=1
+compare in1 out || fail=1
+# LO == HI+1 gives no output
+shuf -i 1-0 > out || fail=1
+compare /dev/null out || fail=1
+
 # Exercize shuf's -r -n 0 options, with no standard input.
 shuf -r -n 0 in <&- >out || fail=1
 compare /dev/null out || fail=1
@@ -95,7 +104,7 @@ test "$c" -eq 3 || { fail=1; echo "Multiple -n failed">&2 ; }
 { shuf -i0-9 -n10 -i8-90 || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect multiple -i usage.">&2 ; }
 # Test invalid range
-for ARG in '1' 'A' '1-' '1-A'; do
+for ARG in '1' 'A' '1-' '1-A' '3-1'; do
 { shuf -i$ARG || test $? -ne 1; } &&
 { fail=1; echo "shuf did not detect erroneous -i$ARG usage.">&2 ; }
 done
-- 
2.17.1





bug#49217: [PATCH] doc: clarify valid ranges for shuf -i

2021-06-26 Thread Erik Auerswald
* doc/coreutils.texi (shuf invocation): Mention valid and invalid
edge cases for --input-range.
---
 doc/coreutils.texi | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index ea040458e..f59c5e962 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -4978,7 +4978,10 @@ Treat each command-line operand as an input line.
 @opindex --input-range
 @cindex input range to shuffle
 Act as if input came from a file containing the range of unsigned
-decimal integers @var{lo}@dots{}@var{hi}, one per line.
+decimal integers @var{lo}@dots{}@var{hi}, one per line.  If @var{lo} is
+equal to @var{hi}, this is a single line.  If @var{lo} is one bigger than
+@var{hi}, this is accepted as the empty range.  Other cases of @var{lo}
+greater than @var{hi} are rejected as invalid.
 
 @end table
 
-- 
2.17.1





bug#49217: 'shuf' returns nothing if the low range number is higher by 1 than the high number

2021-06-25 Thread Erik Auerswald
Hi Paul,

On Fri, Jun 25, 2021 at 09:29:04AM -0700, Paul Eggert wrote:
> On 6/24/21 11:49 PM, Erik Auerswald wrote:
> > $ shuf -i 2-0 ; echo %exit code $?
> > shuf: invalid input range: ‘2-0’
> > %exit code 1
> > $ shuf -i 1-0 ; echo %exit code $?
> > %exit code 0
> >
> >This looks inconsistent and possibly not exactly as intended.
> 
> It's exactly what I intended and there's no inconsistency. When you
> say 'shuf -i M-N' you select from a collection of N-M+1 lines.

It also specifies the contents of those lines, unless there is less than
one line.

> N-M+1 = 0 (no input lines) makes sense, but N-M+1 < 0 (negative number
> of input lines?) does not.

I do not think that it makes sense to specify the contents of no input
lines.  Perhaps we can agree to disagree on this?

Then the documentation does not describe it that way.  I think that can
lead to confusion.

The documentation describes the option as simulating input "from a file
containing the range of unsigned decimal integers LO...HI, one per line."
From this description it is not obvious that "1-0" is OK, but "2-0"
is not.  In both cases LO > HI, but one is accepted without error,
while the other is not.

I think that "select from a negative number of lines" makes just as much
sense as "select from no lines at all."  Here we seem to disagree, which
is OK with me.

Similarly to "shuf -iLO-HI", "seq FIRST LAST" produces LAST-FIRST+1 lines.
But seq does allow asking, to adapt your wording, for a negative number
of lines:

$ seq 2 0 ; echo %exit code $?
%exit code 0
$ seq 1 0 ; echo %exit code $?
%exit code 0
$ seq 0 0 ; echo %exit code $?
0
%exit code 0

The problem I see is that the intention behind "shuf -i" that can be
gleaned from your implementation, and that you have described above,
is not obvious from the documentation or from similar functionality in
the GNU Core Utilities.

I see three views regarding the case of LO > HI in this thread:

  1. The bug reporter expected LO > HI to always produce an error,
     or possibly to never produce an error.

  2. Your "shuf" implementation sees LO == HI+1 as the one allowed
     possibility to specify no input, based on the HI-LO+1 formula for
     the number of lines to choose from.

  3. The "seq" implementation in the GNU Core Utilities allows LO > HI
     and interprets it as the empty sequence.  I actually like this best.
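
As a rough illustration of these three views, here is a shell sketch of
how many input lines a range LO-HI stands for under each of them.  It is
only a sketch: the function names are invented for this example, nothing
like them exists in coreutils, and it assumes small integers that fit
the shell's arithmetic.

```shell
# Hypothetical helpers, not coreutils code: print the number of input
# lines that the range $1-$2 stands for, or fail if it is rejected.

# View 1 (strict reading): every LO > HI is an error.
lines_strict() {
  [ "$1" -le "$2" ] || return 1
  echo $(( $2 - $1 + 1 ))
}

# View 2 (current "shuf -i"): accepted as long as HI-LO+1 is not negative.
lines_shuf() {
  n=$(( $2 - $1 + 1 ))
  [ "$n" -ge 0 ] || return 1
  echo "$n"
}

# View 3 (like "seq"): every LO > HI is simply the empty range.
lines_seq() {
  n=$(( $2 - $1 + 1 ))
  [ "$n" -gt 0 ] || n=0
  echo "$n"
}
```

With these definitions, "lines_shuf 1 0" prints 0 and "lines_shuf 2 0"
fails, while "lines_seq 2 0" prints 0.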

Thus I think that it is not as clear and obvious as you seem to
expect that the current "shuf" behavior is the obviously correct one.
No offense intended!

I do not care deeply which behavior is selected.  I just want to make
it clearer for others, including me, to understand that the current
implementation is as intended.  Adding to the documentation (for users)
and the tests (for developers) seems to be helpful to me.

> >I'd like to document it and add test cases.
> 
> Feel free,

Thanks, I'll think about a wording both simple to understand and including
the special case.  I intend to send a patch to this bug report in a
couple of days.

> though we need to reserve the right to extend 'shuf' in
> the future. In other words, not every invocation of 'shuf' that
> provokes a diagnostic now will provoke a diagnostic in the future.

I like that.

HTH, HAND,
Erik
-- 
Be water, my friend.
-- Bruce Lee





bug#49217: 'shuf' returns nothing if the low range number is higher by 1 than the high number

2021-06-25 Thread Erik Auerswald
Hi,


On Fri, Jun 25, 2021 at 08:54:43AM +0200, Erik Auerswald wrote:
> On Fri, Jun 25, 2021 at 08:49:51AM +0200, Erik Auerswald wrote:
> > On Thu, Jun 24, 2021 at 09:19:36PM -0700, Paul Eggert wrote:
> > > On 6/24/21 4:46 PM, F8ER F8ER wrote:
> > > >For example, `shuf -i 101-100 -n 1` returns nothing with the exit code
> > > >= 0 (unexpected).
> > > 
> > > Actually, it's the expected behavior. It's the same behavior as
> > > 'shuf -n 1 </dev/null'. The "-n 1" doesn't mean "output
> > > exactly 1 line"; it means "output at most 1 line".
> > 
> > I think the reported issue is with producing no error with LO==HI+1,
> > but producing an error with LO<HI+1
>                                  ^
>                                LO>HI+1

The code seems to intentionally silently ignore LO == HI+1, but not
LO > HI+1.  But this is neither documented nor tested.  This may be
an intentionally interesting way to simulate reading from an empty
file containing no lines between LO and HI.

Please see my previous patch as a suggestion on how to make the code
less surprising.

I am fine with keeping the current behavior, but then I'd like to
document it and add test cases.  Please let me know if you'd rather
have a documentation change & tests patch than the current code
change & tests patch.

I do think that it would be better to either change the code or the
documentation, and add test cases, than to do nothing.

Thanks,
Erik
-- 
Simplicity is prerequisite for reliability.
-- Edsger W. Dijkstra





bug#49217: [PATCH] shuf: fix bug with "-i 1-0"

2021-06-25 Thread Erik Auerswald
"shuf -i 1-0" would mistakenly accept the invalid range
without an error message and produce no output.  Other
invalid ranges, e.g., "shuf -i 2-0", would be detected
and produce an error message, non-zero exit code, and
no output.

Bug reported by "F8ER F8ER."

* src/shuf.c (main): Fix bug.
* tests/misc/shuf.sh: Add a test case for the bug.
---
 src/shuf.c | 2 +-
 tests/misc/shuf.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/shuf.c b/src/shuf.c
index 1af1b533a..91430a88a 100644
--- a/src/shuf.c
+++ b/src/shuf.c
@@ -431,7 +431,7 @@ main (int argc, char **argv)
  _("invalid input range"), 0);
 
   n_lines = hi_input - lo_input + 1;
-  invalid |= ((lo_input <= hi_input) == (n_lines == 0));
+  invalid |= (lo_input > hi_input);
   if (invalid)
 die (EXIT_FAILURE, errno, "%s: %s", _("invalid input range"),
  quote (optarg));
diff --git a/tests/misc/shuf.sh b/tests/misc/shuf.sh
index 892386b3f..2a7cba4d3 100755
--- a/tests/misc/shuf.sh
+++ b/tests/misc/shuf.sh
@@ -95,7 +95,7 @@ test "$c" -eq 3 || { fail=1; echo "Multiple -n failed">&2 ; }
 { shuf -i0-9 -n10 -i8-90 || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect multiple -i usage.">&2 ; }
 # Test invalid range
-for ARG in '1' 'A' '1-' '1-A'; do
+for ARG in '1' 'A' '1-' '1-A' '1-0' '2-0'; do
 { shuf -i$ARG || test $? -ne 1; } &&
 { fail=1; echo "shuf did not detect erroneous -i$ARG usage.">&2 ; }
 done
-- 
2.17.1
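
The one-line check removed from src/shuf.c by this patch relies on
unsigned wraparound of "hi_input - lo_input + 1".  A rough shell model
of that expression may make it easier to read.  It assumes 8-bit
unsigned arithmetic instead of the real integer width, so that the
wraparound is easy to see; the function name and the 8-bit width are
assumptions made only for this illustration.

```shell
# Model of: invalid |= ((lo_input <= hi_input) == (n_lines == 0));
# Assumption: values are 0..255 and arithmetic wraps modulo 256.
range_is_invalid() {
  lo=$1 hi=$2
  n_lines=$(( (hi - lo + 1) & 255 ))   # wraps like unsigned arithmetic
  lo_le_hi=0; [ "$lo" -le "$hi" ] && lo_le_hi=1
  n_is_zero=0; [ "$n_lines" -eq 0 ] && n_is_zero=1
  # Invalid when both hold (the line count overflowed to 0) or when
  # neither holds (LO > HI+1).  Ordinary ranges and LO == HI+1 pass.
  [ "$lo_le_hi" -eq "$n_is_zero" ]
}
```

In this model "range_is_invalid 1 0" fails (the range is accepted),
"range_is_invalid 2 0" succeeds (the range is rejected), and the full
range 0-255 is rejected as well because its line count wraps to 0.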





bug#49217: 'shuf' returns nothing if the low range number is higher by 1 than the high number

2021-06-25 Thread Erik Auerswald
Hi,

On Fri, Jun 25, 2021 at 08:49:51AM +0200, Erik Auerswald wrote:
> On Thu, Jun 24, 2021 at 09:19:36PM -0700, Paul Eggert wrote:
> > On 6/24/21 4:46 PM, F8ER F8ER wrote:
> > >For example, `shuf -i 101-100 -n 1` returns nothing with the exit code
> > >= 0 (unexpected).
> > 
> > Actually, it's the expected behavior. It's the same behavior as
> > 'shuf -n 1 </dev/null'. The "-n 1" doesn't mean "output
> > exactly 1 line"; it means "output at most 1 line".
> 
> I think the reported issue is with producing no error with LO==HI+1,
> but producing an error with LO<HI+1
                                ^
                              LO>HI+1

Sorry!

Thanks,
Erik
-- 
Hofstadter's Law: It always takes longer than you expect, even when
  you take into account Hofstadter's Law.





bug#49217: 'shuf' returns nothing if the low range number is higher by 1 than the high number

2021-06-25 Thread Erik Auerswald
Hi,

On Thu, Jun 24, 2021 at 09:19:36PM -0700, Paul Eggert wrote:
> On 6/24/21 4:46 PM, F8ER F8ER wrote:
> >For example, `shuf -i 101-100 -n 1` returns nothing with the exit code
> >= 0 (unexpected).
> 
> Actually, it's the expected behavior. It's the same behavior as
> 'shuf -n 1 </dev/null'. The "-n 1" doesn't mean "output
> exactly 1 line"; it means "output at most 1 line".

I think the reported issue is with producing no error with LO==HI+1,
but producing an error with LO<HI+1
[...]

bug#47859: Additional seq outlandish example: seq 0 dangers

2021-04-18 Thread Erik Auerswald
Hi,

On Sun, Apr 18, 2021 at 09:26:28AM +0800, 積丹尼 Dan Jacobson wrote:
> On (info "(coreutils) seq invocation") we read
>    Be careful when using ‘seq’ with outlandish values: otherwise you
>    may...
> 
> Here's another 'fun/sad/DDOS yourself' example you might add:
> 
> One day I wrote a Makefile,
> m:
>   seq 0 9|sed s/$$/號.html/|xargs make
> but before using it, I thought I'd just test with one item,
> m:
>   seq 0  |sed s/$$/號.html/|xargs make
> well of course... as seq prints nothing here,
> this triggered a massive ever growing recursive loop...
> 
> Yes, all my fault for picking 0. I'll pick 1 next time.
> 
> P.S., perhaps document how to get seq to cough up just "0". One way I
> found was:
> $ seq 0 1 0
> 0

I would like to add more information to this bug report with the intent of
helping everybody involved now or in the future.

A slightly simpler method to make 'seq' print just '0' is:

$ seq 0 0
0

This is documented, but more generally, e.g., in 'seq --help':

$ seq --help
Usage: seq [OPTION]... LAST
  or:  seq [OPTION]... FIRST LAST
  or:  seq [OPTION]... FIRST INCREMENT LAST
Print numbers from FIRST to LAST, in steps of INCREMENT.
[...]
If FIRST or INCREMENT is omitted, it defaults to 1.  [...]
[...]

Thus, 'seq 0' is the same as 'seq 1 1 0' and 'seq 0 0' is the same as
'seq 0 1 0'.
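
The defaulting rule from the usage text can be sketched in shell as
follows.  This is an illustration only; "seq_full_form" is a made-up
name and not something that seq provides.

```shell
# Hypothetical helper: print the "FIRST INCREMENT LAST" triple that a
# one-, two- or three-operand seq invocation stands for.
seq_full_form() {
  case $# in
    1) echo "1 1 $1" ;;      # seq LAST
    2) echo "$1 1 $2" ;;     # seq FIRST LAST
    3) echo "$1 $2 $3" ;;    # seq FIRST INCREMENT LAST
    *) return 1 ;;
  esac
}
```

For example, "seq_full_form 0" prints "1 1 0", and "seq_full_form 0 0"
prints "0 1 0".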

The default value of '1' for omitted parameters affects other values, too,
not just '0':

$ seq -1
$ seq -1 -1
-1
$ seq -10
$ seq -10 -10
-10

When "FIRST" and "LAST" are the same, any valid "INCREMENT" value results
in 'seq' printing just one value, not just the default of '1':

$ seq 0 200 0
0
$ seq 0 -200 0
0
$ seq 0 0 0
seq: invalid Zero increment value: ‘0’
Try 'seq --help' for more information.

Thus IMHO a possible addition to the documentation should probably not
just single out 'seq 0', but mention any number smaller than the default
value for "FIRST" of '1'.

HTH, HAND
Erik
-- 
Inside every large problem is a small problem struggling to get out.
-- Hoare's Law of Large Problems





bug#47246: pr -f does not pause

2021-03-18 Thread Erik Auerswald
POSIX specifies that pr -f shall "[p]ause before beginning the first
page if the standard output is associated with a terminal" (see
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pr.html).
GNU pr does not do this.

[The recently reported bug#47243
(https://lists.gnu.org/archive/html/bug-coreutils/2021-03/msg00033.html)
made me look again at the POSIX specification for pr where I noticed
this.]

Thanks,
Erik





bug#47085: du: why does 'usage' show prefixes 'Z' or 'Y' if they are disallowed?

2021-03-11 Thread Erik Auerswald
Hi,

On Thu, Mar 11, 2021 at 08:53:03PM -0800, L A Walsh wrote:
> I thought to display 0 (or 0) for 1st arg by doing:
> 
> du -BY, as -B says I can list a unit for scaling, but for
> -BY and -BZ I get:
> du: -B argument 'Y' too large.
> 
> It doesn't even look to see how much space is used, it
> immediately returns Y & Z are "too large".

Speculation (i.e., I did not look at the code): Z means 2^70, Y means
2^80, so they are both too big for unsigned 64bit integers.  Thus they
may be too big for du?
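
As a back-of-the-envelope check of this speculation (not taken from the
du source): each suffix step multiplies by 2^10, so the N-th suffix
stands for 2^(10*N).  A small shell sketch, with a helper name invented
for this example:

```shell
# Hypothetical helper: does 1024^N fit in an unsigned 64-bit integer?
# N is the position of the suffix in K M G T P E Z Y (K=1 ... Y=8).
# 1024^N equals 2^(10*N), which fits exactly when 10*N <= 63.
block_unit_fits_uint64() {
  case $1 in
    K) n=1 ;; M) n=2 ;; G) n=3 ;; T) n=4 ;;
    P) n=5 ;; E) n=6 ;; Z) n=7 ;; Y) n=8 ;;
    *) return 2 ;;
  esac
  [ $(( 10 * n )) -le 63 ]
}
```

Under that assumption "block_unit_fits_uint64 E" succeeds, while the
same call with Z or Y fails.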

> Why are those suffixes listed as valid under the program 'usage'
> and manpage, when they are automatically disallowed?

Perhaps they are automatically used with sufficiently sized integer types,
i.e., this may be future proofing?

You could look at the code to get a deeper insight.

HTH,
Erik
-- 
Be water, my friend.
-- Bruce Lee





bug#45358: bootstrap fails due to a certificate mismatch

2021-03-09 Thread Erik Auerswald
Hi,

On Tue, Mar 09, 2021 at 11:28:18AM +0200, Grigoriy Sokolik wrote:
> I've rechecked:

I cannot reproduce the problem, the certificate is trusted by my system:

# via IPv4
$ gnutls-cli --verbose translationproject.org  [...]issuer `CN=DST Root CA X3,O=Digital Signature Trust Co.'[...]

On my Ubuntu 18.04 system, I find it via symlink from /etc/ssl/certs:

$ ls /etc/ssl/certs/DST_Root_CA_X3.pem -l
lrwxrwxrwx 1 root root 53 Mai 28  2018 /etc/ssl/certs/DST_Root_CA_X3.pem -> /usr/share/ca-certificates/mozilla/DST_Root_CA_X3.crt
$ certtool --certificate-info < /usr/share/ca-certificates/mozilla/DST_Root_CA_X3.crt | grep Subject:
Subject: CN=DST Root CA X3,O=Digital Signature Trust Co.

HTH,
Erik
-- 
[A]pplied cryptography mostly sucks.
-- Green's law of applied cryptography





bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-14 Thread Erik Auerswald
On Sun, Feb 14, 2021 at 11:04:21PM +0000, Pádraig Brady wrote:
> On 14/02/2021 19:22, Erik Auerswald wrote:
> >May I ask you to test the new patch (v4) as well?
> 
> This version looks good.
> I'll probably apply this after a little more local testing.

Thanks!





bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-14 Thread Erik Auerswald

Hi,

On 13.02.21 21:28, Leonard Janis Robert König wrote:

On Sat, 2021-02-13 at 21:15 +0100, Erik Auerswald wrote:

On 13.02.21 19:29, Leonard Janis Robert König wrote:

[...]
That being said, I don't see this exact distinction reflected in the
code, so perhaps I just misunderstood.


Disabling "Tabification" only when "-s" was active is missing.  That
resulted in the 2007 bug.  Making the needed special treatment always
used fixed the 2007 bug, but broke your use case.

That some special treatment is needed and intended can be gleaned
from the following comment (with line numbers from pr.c in the
current master branch @ 2de30c7350a77b091afa1eb284acdf082c0f6aa5):

1031  /* It's rather pointless to define a TAB separator with column
1032 alignment */


The code after that comment does not disable alignment, but changes
the separator from a TAB to a space.


My patch adds the special treatment, since it works both for the 2007
bug and this bug (bug#46422).


The attached version 4 of my patch does that in a way that more
clearly shows the intent.  I think this is a better fix for the
2007 bug than commit 553d347d3e08e00ee4f9df520b37c964c3f26e28.
Expanding TABs on input is enabled unless a single TAB is
used as column separator.  This conforms better to POSIX and
does not introduce the regression that causes the current bug
(bug#46422).

I have added more test cases, because manual testing showed that
the options "-s" and "-s$'\t'" were treated differently by pr.

Using "-s" to activate the default TAB separator should result
in the same output as using "-s$'\t'" to specify one TAB character
as separator, i.e., the default, explicitly.
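
The condition that the attached patch adds to init_parameters() can be
paraphrased in shell.  This is only a sketch of the decision, assuming
that multi-column output is already in effect; "expands_input_tabs" is
a made-up name.

```shell
# Hypothetical paraphrase of the v4 check:
#   if (! (col_sep_length == 1 && *col_sep_string == '\t'))
#     untabify_input = true;
# Succeeds when input TABs would be expanded for separator string $1.
expands_input_tabs() {
  tab=$(printf '\t')
  [ "$1" != "$tab" ]
}
```

Only a separator consisting of exactly one TAB makes the function fail;
a space, a comma, or a string of several TABs all leave input TAB
expansion enabled.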


[...] with the patch my rather obscure (and complex)
use case of printing thousands of lines of code works properly now!


Thanks for testing!


Thanks all to you


May I ask you to test the new patch (v4) as well?

Thanks,
Erik
diff --git a/src/pr.c b/src/pr.c
index 22d032ba3..5b003cb9a 100644
--- a/src/pr.c
+++ b/src/pr.c
@@ -1237,6 +1237,8 @@ init_parameters (int number_of_files)
 col_sep_string = column_separator;
 
   truncate_lines = true;
+  if (! (col_sep_length == 1 && *col_sep_string == '\t'))
+untabify_input = true;
   tabify_output = true;
 }
   else
diff --git a/tests/pr/pr-tests.pl b/tests/pr/pr-tests.pl
index b7d868cf8..d0ac40520 100755
--- a/tests/pr/pr-tests.pl
+++ b/tests/pr/pr-tests.pl
@@ -466,6 +466,27 @@ push @Tests,
 {IN=>{2=>"m\tn\to\n"}},
 {IN=>{3=>"x\ty\tz\n"}},
  {OUT=>join("\t", qw(a b c m n o x y z)) . "\n"} ];
+# -s and -s$'\t' use different code paths
+push @Tests,
+   ['merge-w-tabs-sepstr', "-m -s'\t' -t",
+{IN=>{1=>"a\tb\tc\n"}},
+{IN=>{2=>"m\tn\to\n"}},
+{IN=>{3=>"x\ty\tz\n"}},
+ {OUT=>join("\t", qw(a b c m n o x y z)) . "\n"} ];
+
+# Exercise a variant of the bug with pr -m -s (commit 553d347)
+# test 2 files, too (merging 3 files automatically aligns columns on TAB stops)
+push @Tests,
+   ['merge-2-w-tabs', '-m -s -t',
+{IN=>{1=>"a\tb\tc\n"}},
+{IN=>{2=>"m\tn\to\n"}},
+ {OUT=>join("\t", qw(a b c m n o)) . "\n"} ];
+# -s and -s$'\t' use different code paths
+push @Tests,
+   ['merge-2-w-tabs-sepstr', "-m -s'\t' -t",
+{IN=>{1=>"a\tb\tc\n"}},
+{IN=>{2=>"m\tn\to\n"}},
+ {OUT=>join("\t", qw(a b c m n o)) . "\n"} ];
 
 # This resulted in reading invalid memory before coreutils-8.26
 push @Tests,
@@ -474,6 +495,23 @@ push @Tests,
 {IN=>{2=>"a\n"}},
  {OUT=>"a\t\t\t\t  \t\t\ta\n"} ];
 
+# Exercise a bug with pr -t -2 (bug #46422)
+push @Tests,
+   ['mcol-w-tabs', '-t -2',
+{IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx   x\tx\tx\tx\tx\n"} ];
+
+# generalize case from commit 553d347 (problem results from -s, not -m)
+push @Tests,
+   ['mcol-w-tabs-w-tabsep', '-t -2 -s',
+{IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx\tx\tx\tx\tx\tx\n"} ];
+# -s and -s$'\t' use different code paths
+push @Tests,
+   ['mcol-w-tabs-w-tabsep-sepstr', "-t -2 -s'\t'",
+{IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx\tx\tx\tx\tx\tx\n"} ];
+
 @Tests = triple_test \@Tests;
 
 my $save_temps = $ENV{DEBUG};


bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-13 Thread Erik Auerswald

Hi,

On 13.02.21 19:29, Leonard Janis Robert König wrote:


first:  Thank you very much for the work, I really owe you one!


You're welcome. :-)


On Sat, 2021-02-13 at 17:58 +0100, Erik Auerswald wrote:

On 13.02.21 15:17, Erik Auerswald wrote:

On 11.02.21 20:20, Erik Auerswald wrote:

On Thu, Feb 11, 2021 at 06:09:28PM +0100, Leonard Janis Robert König wrote:

On Thu, 2021-02-11 at 16:45 +0100, Erik Auerswald wrote:

On Thu, Feb 11, 2021 at 04:12:54PM +0100, Leonard Janis Robert
König wrote:

On Thu, 2021-02-11 at 13:00 +0100, Erik Auerswald wrote:

On Wed, Feb 10, 2021 at 01:42:29PM +0100, Leonard Janis Robert
König wrote:

I'm sorry if this is not a bug but to be expected, but I think
pr doesn't get the alignment of tabs in multicolumn output
right.  [...]  This seems *kind* of related to multi-column
merged output, as was discussed some years ago here:
https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html


This thread contains the bug-introducing patch in message
https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00160.html
  
This is commit 553d347d3e08e00ee4f9df520b37c964c3f26e28.

That commit removed the 'assume -e' part of the POSIX
description
of the -COLUMN option from GNU pr.

[...]

I have found a fix to the problem described by you.  I am quite sure that
this is not *correct*, but I did not find a way to make print_sep_string()
account for tabs that did not break quite a few existing tests, even if
the merged files problem from 2007 and this columnating bug were both
fixed.  Thus I just tighten the 2007 bug fix to apply in fewer cases.
This way all existing tests pass, and a new one pertaining to this bug
report passes, too.  I do think this is in the same spirit as the "fix"
from 2007 (commit 553d347d3e08e00ee4f9df520b37c964c3f26e28).


I think the attached patch is a better fix than my previous one,
because it applies the special treatment of TAB as separator more
consistently.  It may still not be complete (the code seems quite
convoluted to me) but I do think it improves the situation
significantly, and does not make it worse.


Hm, I'm not sure whether I understand this special case.  When we have
a tab as column separator, doesn't this imply that the second column is
starting on a position n*8, (effectively equivalent to the first
column), thus guaranteeing that the alignment is honored?  So, if my


Whatever the reason (perhaps conforming to POSIX, perhaps other pr
implementations doing the same), GNU pr implements a special
treatment for TAB as column separator, and the thread from 2007
implies that the pr from HP-UX does as well.

The POSIX spec says:

"-s[char]

 Separate text columns by the single character char instead of
 by the appropriate number of <space> characters (default for
 char shall be <tab>)."

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pr.html

So use of -s needs to always result in one separator character
between columns.

This is implemented by GNU pr, and seemingly by pr from HP-UX, too
(https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html)

Of all the printable ASCII characters, only TAB results in
interactions with "Tabification," i.e., turning TABs into spaces
on input and spaces into TABs on output.  Thus only TAB as
separator may require the special treatment of disabling
"Tabification."

Omitting this special treatment resulted in the bug from 2007.
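
What turning TABs into spaces on input involves can be sketched as a
column computation.  The following is a minimal shell illustration,
assuming TAB stops every 8 columns and 0-based column numbers; the
function names are made up and this is not how pr.c is written.

```shell
# Hypothetical helper: column reached after a TAB at column $1,
# with TAB stops every 8 columns (0-based column numbers).
next_tab_stop() {
  echo $(( ($1 / 8 + 1) * 8 ))
}

# Number of spaces that would replace a TAB at column $1.
tab_width_at() {
  echo $(( $(next_tab_stop "$1") - $1 ))
}
```

For example, "tab_width_at 5" prints 3, and "tab_width_at 8" prints 8.
A TAB therefore moves the output position by a varying amount, which a
single-character separator such as a comma never does.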

Removing the implicit "-e" and "-i" from "-NUMBER" and "-m"
to fix the 2007 bug resulted in this bug (bug#46422), and does
not conform to the POSIX specification nor to the GNU pr info
documentation.

My v3 patch restricts this special treatment of "-s" to just
the cases where it is used without specifying a separator and
thus using the default of TAB, or when it is used with a single
TAB ("-s$'\t'").  Thus it restricts the 2007 change from commit
553d347d3e08e00ee4f9df520b37c964c3f26e28 to affect only those
use cases it should affect, instead of all multi-column use cases.

It may be possible to add some appropriate special treatment for
TAB as separator without disabling "Tabification."  But I do not
know how.  Just accounting for the output position change resulting
from printing a TAB in print_sep_string() does not work, i.e.,
breaks many of the existing tests.


[...]
That being said, I don't see this exact distinction reflected in the
code, so perhaps I just misunderstood.


Disabling "Tabification" only when "-s" was active is missing.  That
resulted in the 2007 bug.  Making the needed special treatment always
used fixed the 2007 bug, but broke your use case.

That some special treatment is needed and intended can be gleaned
from the following comment (with line numbers from pr.c in the
current master branch @ 2de30c7350a77b091afa1eb284acdf082c0f6aa5):

1031  /* It's rather pointless to define a TAB separa

bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-13 Thread Erik Auerswald

Hi,

On 13.02.21 15:17, Erik Auerswald wrote:

On 11.02.21 20:20, Erik Auerswald wrote:
On Thu, Feb 11, 2021 at 06:09:28PM +0100, Leonard Janis Robert König wrote:

On Thu, 2021-02-11 at 16:45 +0100, Erik Auerswald wrote:

On Thu, Feb 11, 2021 at 04:12:54PM +0100, Leonard Janis Robert
König wrote:

On Thu, 2021-02-11 at 13:00 +0100, Erik Auerswald wrote:

On Wed, Feb 10, 2021 at 01:42:29PM +0100, Leonard Janis Robert
König wrote:

I'm sorry if this is not a bug but to be expected, but I think
pr doesn't get the alignment of tabs in multicolumn output
right.  [...]  This seems *kind* of related to multi-column
merged output, as was discussed some years ago here:
https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html 



This thread contains the bug-introducing patch in message
https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00160.html 



This is commit 553d347d3e08e00ee4f9df520b37c964c3f26e28.


ah, thanks for digging, I read the message but must have missed
the patch.


That commit removed the 'assume -e' part of the POSIX description
of the -COLUMN option from GNU pr.

[...]

Your test case requires expanding tabs during input, which is
the reason that "expand | pr" could be used as a workaround (with
"expand | pr | unexpand", pr would not need to mess with tabs at all,
but I do think that GNU pr is currently buggy and should be fixed).


Absolutely, expand would be a workaround (I happen to use `pr -e | pr`
in my script, for other reasons).
[...]

I have found a fix to the problem described by you.  I am quite sure that
this is not *correct*, but I did not find a way to make print_sep_string()
account for tabs that did not break quite a few existing tests, even if
the merged files problem from 2007 and this columnating bug were both
fixed.  Thus I just tighten the 2007 bug fix to apply in fewer cases.
This way all existing tests pass, and a new one pertaining to this bug
report passes, too.  I do think this is in the same spirit as the "fix"
from 2007 (commit 553d347d3e08e00ee4f9df520b37c964c3f26e28).


I think the attached patch is a better fix than my previous one,
because it applies the special treatment of TAB as separator more
consistently.  It may still not be complete (the code seems quite
convoluted to me) but I do think it improves the situation
significantly, and does not make it worse.


It seems to me as if "untabify_input = true;" should be re-introduced
in one additional place to fix the regression from commit 553d347,
please see the attached patch version 3.


I'd like to ask the GNU Coreutils maintainers to consider merging
the attached patch.


The latest version, i.e., v3 for now.

Thanks,
Erik
diff --git a/src/pr.c b/src/pr.c
index 22d032ba3..e193d632c 100644
--- a/src/pr.c
+++ b/src/pr.c
@@ -1226,7 +1226,10 @@ init_parameters (int number_of_files)
   if (join_lines)
 col_sep_string = line_separator;
   else
-col_sep_string = column_separator;
+{
+  col_sep_string = column_separator;
+  untabify_input = true;
+}
 
   col_sep_length = 1;
   use_col_separator = true;
@@ -1235,6 +1238,8 @@ init_parameters (int number_of_files)
  alignment */
   else if (!join_lines && col_sep_length == 1 && *col_sep_string == '\t')
 col_sep_string = column_separator;
+  else
+untabify_input = true;
 
   truncate_lines = true;
   tabify_output = true;
diff --git a/tests/pr/pr-tests.pl b/tests/pr/pr-tests.pl
index b7d868cf8..ed5c61dc9 100755
--- a/tests/pr/pr-tests.pl
+++ b/tests/pr/pr-tests.pl
@@ -467,6 +467,14 @@ push @Tests,
 {IN=>{3=>"x\ty\tz\n"}},
  {OUT=>join("\t", qw(a b c m n o x y z)) . "\n"} ];
 
+# Exercise a variant of the bug with pr -m -s (commit 553d347)
+# test 2 files, too (merging 3 files automatically aligns columns on TAB stops)
+push @Tests,
+   ['merge-2-w-tabs', '-m -s -t',
+{IN=>{1=>"a\tb\tc\n"}},
+{IN=>{2=>"m\tn\to\n"}},
+ {OUT=>join("\t", qw(a b c m n o)) . "\n"} ];
+
 # This resulted in reading invalid memory before coreutils-8.26
 push @Tests,
['asan1', "-m -S'\t\t\t' -t",
@@ -474,6 +482,18 @@ push @Tests,
 {IN=>{2=>"a\n"}},
  {OUT=>"a\t\t\t\t  \t\t\ta\n"} ];
 
+# Exercise a bug with pr -t -2 (bug #46422)
+push @Tests,
+   ['mcol-w-tabs', '-t -2',
+{IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx   x\tx\tx\tx\tx\n"} ];
+
+# generalize case from commit 553d347 (problem results from -s, not -m)
+push @Tests,
+   ['mcol-w-tabs-w-tabsep', '-t -2 -s',
+{IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx\tx\tx\tx\tx\tx\n"} ];
+
 @Tests = triple_test \@Tests;
 
 my $save_temps = $ENV{DEBUG};


bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-13 Thread Erik Auerswald

On 11.02.21 20:20, Erik Auerswald wrote:

On Thu, Feb 11, 2021 at 06:09:28PM +0100, Leonard Janis Robert König wrote:

On Thu, 2021-02-11 at 16:45 +0100, Erik Auerswald wrote:

On Thu, Feb 11, 2021 at 04:12:54PM +0100, Leonard Janis Robert
König wrote:

On Thu, 2021-02-11 at 13:00 +0100, Erik Auerswald wrote:

On Wed, Feb 10, 2021 at 01:42:29PM +0100, Leonard Janis Robert
König wrote:

I'm sorry if this is not a bug but to be expected, but I think
pr doesn't get the alignment of tabs in multicolumn output
right.  [...]  This seems *kind* of related to multi-column
merged output, as was discussed some years ago here:
https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html


This thread contains the bug-introducing patch in message
https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00160.html

This is commit 553d347d3e08e00ee4f9df520b37c964c3f26e28.


ah, thanks for digging, I read the message but must have missed
the patch.


That commit removed the 'assume -e' part of the POSIX description
of the -COLUMN option from GNU pr.

[...]

Your test case requires expanding tabs during input, which is
the reason that "expand | pr" could be used as a workaround (with
"expand | pr | unexpand", pr would not need to mess with tabs at all,
but I do think that GNU pr is currently buggy and should be fixed).


Absolutely, expand would be a workaround (I happen to use `pr -e | pr`
in my script, for other reasons).
[...]

I have found a fix to the problem described by you.  I am quite sure that
this is not *correct*, but I did not find a way to make print_sep_string()
account for tabs that did not break quite a few existing tests, even if
the merged files problem from 2007 and this columnating bug were both
fixed.  Thus I just tighten the 2007 bug fix to apply in fewer cases.
This way all existing tests pass, and a new one pertaining to this bug
report passes, too.  I do think this is in the same spirit as the "fix"
from 2007 (commit 553d347d3e08e00ee4f9df520b37c964c3f26e28).


I think the attached patch is a better fix than my previous one,
because it applies the special treatment of TAB as separator more
consistently.  It may still not be complete (the code seems quite
convoluted to me) but I do think it improves the situation
significantly, and does not make it worse.

The code does not try to create equal width columns when using a
TAB as column separator.  This is made clear through comments:

1018 /* Tabification is assumed for multiple columns. */
...
1031 /* It's rather pointless to define a TAB separator with column
1032    alignment */

Thus the intent of the code seems to be to follow the general idea
of using equal width columns by "assuming Tabification," i.e.,
working as if the options -e and -i were given, as specified by
POSIX, unless the column separator has been changed to a TAB.
The attached patch results in following through with this in more
cases, fixing this bug (bug#46422) without introducing known
regressions.

The patch adds more test cases.  One is identical to the new test
from my previous patch; another generalizes the case from 2007
to use '-2 -s' to trigger special treatment with TAB as separator.

Creating three column output as done in the bug report from 2007
automatically aligns the columns with the default TAB stops of pr,
thus the patch adds another variant of the 2007 case merging two
files.  Merging files (-m) is done with a slightly different code
path from -NUMBER, while both create columns.

I'd like to ask the GNU Coreutils maintainers to consider merging
the attached patch.

Thanks,
Erik
diff --git a/src/pr.c b/src/pr.c
index 22d032ba3..d518b81ab 100644
--- a/src/pr.c
+++ b/src/pr.c
@@ -1226,7 +1226,10 @@ init_parameters (int number_of_files)
   if (join_lines)
 col_sep_string = line_separator;
   else
-col_sep_string = column_separator;
+{
+  col_sep_string = column_separator;
+  untabify_input = true;
+}
 
   col_sep_length = 1;
   use_col_separator = true;
diff --git a/tests/pr/pr-tests.pl b/tests/pr/pr-tests.pl
index b7d868cf8..ed5c61dc9 100755
--- a/tests/pr/pr-tests.pl
+++ b/tests/pr/pr-tests.pl
@@ -467,6 +467,14 @@ push @Tests,
 {IN=>{3=>"x\ty\tz\n"}},
  {OUT=>join("\t", qw(a b c m n o x y z)) . "\n"} ];
 
+# Exercise a variant of the bug with pr -m -s (commit 553d347)
+# test 2 files, too (merging 3 files automatically aligns columns on TAB stops)
+push @Tests,
+   ['merge-2-w-tabs', '-m -s -t',
+{IN=>{1=>"a\tb\tc\n"}},
+{IN=>{2=>"m\tn\to\n"}},
+ {OUT=>join("\t", qw(a b c m n o)) . "\n"} ];
+
 # This resulted in reading invalid memory before coreutils-8.26
 push @Tests,
['asan1', "-m -S'\t\t\t' -t",
@@ -474,6 +482,18 @@ push @Tests,
 {IN=>{2=>"a\n"}},
 

bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-11 Thread Erik Auerswald
Hi,

On Thu, Feb 11, 2021 at 06:09:28PM +0100, Leonard Janis Robert König wrote:
> On Thu, 2021-02-11 at 16:45 +0100, Erik Auerswald wrote:
> > On Thu, Feb 11, 2021 at 04:12:54PM +0100, Leonard Janis Robert
> > König wrote:
> > > On Thu, 2021-02-11 at 13:00 +0100, Erik Auerswald wrote:
> > > > On Wed, Feb 10, 2021 at 01:42:29PM +0100, Leonard Janis Robert
> > > > König wrote:
> > > > > I'm sorry if I this is not a bug but to be expected, but I thnk
> > > > > pr doesn't get the alignment of tabs in multicolumn output
> > > > > right.  [...]  This seems *kind* of related to multi-column
> > > > > merged output, as was discussed some years ago here:
> > > > > https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html
> > > > 
> > > > This thread contains the bug-introducing patch in message
> > > > https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00160.html
> > > > 
> > > > This is commit 553d347d3e08e00ee4f9df520b37c964c3f26e28.
> > > 
> > > ah, thanks for digging, I read the message but must have missed
> > > the patch.
> > > 
> > > > That commit removed the 'assume -e' part of the POSIX description
> > > > of the -COLUMN option from GNU pr.
> > > [...]
> > Your test case requires expanding tabs during input, which is
> > the reason that "expand | pr" could be used as a workaround (with
> > "expand | pr | unexpand", pr would not need to mess with tabs at all,
> > but I do think that GNU pr is currently buggy and should be fixed).
> 
> Absolutely, expand would be a workaround (I happen to use `pr -e | pr`
> in my script, for other reasons).
> 
> I've looked a bit further through the code but there's hardly a single
> place that needs to be touched in order to not introduce other bugs
> again.  For now I can only put it on my to-do list to fix, but no idea
> when I get around doing it.

I have found a fix to the problem described by you.  I am quite sure that
this is not *correct*, but I did not find a way to make print_sep_string()
account for tabs that did not break quite a few existing tests, even if
the merged files problem from 2007 and this columnating bug were both
fixed.  Thus I just tighten the 2007 bug fix to apply in fewer cases.
This way all existing tests pass, and a new one pertaining to this bug
report passes, too.  I do think this is in the same spirit as the "fix"
from 2007 (commit 553d347d3e08e00ee4f9df520b37c964c3f26e28).

See the following inline patch:

8<
diff --git a/src/pr.c b/src/pr.c
index 22d032ba3..ad1e36769 100644
--- a/src/pr.c
+++ b/src/pr.c
@@ -1237,6 +1237,8 @@ init_parameters (int number_of_files)
 col_sep_string = column_separator;
 
   truncate_lines = true;
+  if (!parallel_files)
+untabify_input = true;
   tabify_output = true;
 }
   else
diff --git a/tests/pr/pr-tests.pl b/tests/pr/pr-tests.pl
index b7d868cf8..0894d3804 100755
--- a/tests/pr/pr-tests.pl
+++ b/tests/pr/pr-tests.pl
@@ -474,6 +474,12 @@ push @Tests,
 {IN=>{2=>"a\n"}},
  {OUT=>"a\t\t\t\t  \t\t\ta\n"} ];
 
+# Exercise a bug with pr -t -2 (bug #46422)
+push @Tests,
+   ['mcol-w-tabs', '-t -2',
+{IN=>"x\tx\tx\tx\tx\nx\tx\tx\tx\tx\n"},
+ {OUT=>"x\tx\tx\tx\tx   x\tx\tx\tx\tx\n"} ];
+
 @Tests = triple_test \@Tests;
 
 my $save_temps = $ENV{DEBUG};
>8

It is up to the GNU Coreutils maintainers if they want to add this
additional band-aid to the interesting 'pr' code or not.  Adherents to
test-driven development would probably like this approach.  ;-)

Thanks,
Erik
-- 
Bugs are like mushrooms - found one, look around for more...
-- Al Viro





bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-11 Thread Erik Auerswald
Hi,

On Wed, Feb 10, 2021 at 01:42:29PM +0100, Leonard Janis Robert König wrote:
> I'm sorry if I this is not a bug but to be expected, but I thnk pr
> doesn't get the alignment of tabs in multicolumn output right.
> [...]
> This seems *kind* of related to multi-column merged output, as was
> discussed some years ago here:
> https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html

This thread contains the bug-introducing patch in message
https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00160.html

This is commit 553d347d3e08e00ee4f9df520b37c964c3f26e28.

That commit removed the 'assume -e' part of the POSIX description of
the -COLUMN option from GNU pr.

Reverting this patch (i.e., adding the one deleted line back to pr.c)
fixes *this* bug, but then re-introduces the bug reported in 2007, i.e.,
sub-test 'merge-w-tabs' of test 'pr-tests.pl' fails.

Thanks,
Erik
-- 
If it ain't broke, don't fix it.





bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-10 Thread Erik Auerswald
Hi,

On Wed, Feb 10, 2021 at 01:42:29PM +0100, Leonard Janis Robert König wrote:
> I'm sorry if I this is not a bug but to be expected, but I thnk pr
> doesn't get the alignment of tabs in multicolumn output right.
> [...]
> Unfortunately the POSIX spec is, in my reading, a bit unclear here. 

I do not think so:

-column
  Produce multi-column output that is arranged in column columns[...].
  The options -e and -i shall be assumed for multiple text-column output.

-e[char][gap]
  Expand each input <tab> to the next greater column position[...].

-i[char][gap]
  In output, replace multiple <space>s with <tab>s wherever[...].

https://pubs.opengroup.org/onlinepubs/009695399/utilities/pr.html
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pr.html

The way I read the POSIX spec, "pr" needs to account for the column
separation by adjusting whitespace while still using tab characters in
the output.

> But I think the behavior of GNU/pr is rather unexpected when printing
> multicolumn source code and not in line what the original authors
> intended.

I concur.

Thanks,
Erik
-- 
[M]ost parts of this industry just work by chance.
-- Thomas Gleixner





bug#46422: 'pr' screws up tabstops in multicolumn outpt?

2021-02-10 Thread Erik Auerswald
Hi,

On Wed, Feb 10, 2021 at 01:42:29PM +0100, Leonard Janis Robert König wrote:
> I'm sorry if I this is not a bug but to be expected, but I thnk pr
> doesn't get the alignment of tabs in multicolumn output right.
> 
> Consider the following test input, where everything from x->x is a tab
> (with tabs 8):

Email quoting disturbs the alignment with tabs, thus I omit those
examples.

> Run it through multicolumn pr, e.g.,
> 
> pr -t -2 test > out
> 
> The output looks [garbled.]
> [...]
> In contrast, on a SunOS 5.10 machine, I get:
> 
> 123456781234567812345678123456781   123456781234567812345678123456781
> x   x   x   x   x   x   x   x

This is lacking the first 'x', did you use a different input file?

> Basically, SunOS pr notices, that it cannot print "\tx\tx\tx\tx"
> anymore, since the separation between the pages messed that up.
> Instead it prints "\t x\t x\t x\t x".

You can work around the issue by using "expand" to change tabs into
spaces before using "pr":

$ expand test | pr -t -2
123456781234567812345678123456781   123456781234567812345678123456781
x   x   x   x   x   x   x   x   x   x
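
To illustrate what the workaround changes, here is a minimal sketch with
made-up sample input (not the file from the report).  It shows that
"expand" replaces each tab with spaces up to the next 8-column tab stop,
so that "pr" no longer has to deal with tabs at all:

```shell
# Hypothetical sample line: 'x', one tab, 'x'.
expanded=$(printf 'x\tx\n' | expand)

# The tab has become 7 spaces, so the second 'x' sits in column 9.
printf '[%s]\n' "$expanded"

# Feed tab-free text to pr, as in the workaround above.
printf 'x\tx\nx\tx\n' | expand | pr -t -2
```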

> [...]
> This seems *kind* of related to multi-column merged
> output, as was discussed some years ago here:
> https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html

Just keeping tabs for the second column cannot always work.

> [...]
> What do you think?

It seems to me the approach of "expand"ing the tabs to spaces before using
"pr" is the most general.

Thanks,
Erik
-- 
Be water, my friend.
-- Bruce Lee





bug#46060: Offer ls --limit=...

2021-01-24 Thread Erik Auerswald

Hi Dan,

On 23.01.21 22:13, 積丹尼 Dan Jacobson wrote:

I hereby propose "ls --limit=..."

$ ls --limit=1 # Would only print one result item:
A

You might say:
"Jacobson, just use "ls|sed q". Closed: Worksforme."

Ah, but I am talking about items, not lines:


You can use the ls option '-1' to print one item per line:

$ touch {a..z}
$ ls -1 | head -n8
a
b
c
d
e
f
g
h

You can use 'column' (from package "bsdmainutils" in Debian etc.)
to columnate the result:

$ ls -1 | head -n8 | column
a   b   c   d   e   f   g   h


Indeed, directories might be huge. And any database command already has
a --limit option these days, and does not rely on a second program to
trim its output because it can't control itself. Indeed, on some remote
connections one would only want to launch one program, not two. Thanks.


It might be nice not to have to create all the output that is to be
discarded, especially on remote and/or slow file systems.

The one program requirement could be fulfilled by a script or shell
function.
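
A minimal sketch of such a shell function follows.  The name ls_limit is
made up for illustration and is not an existing ls option or utility; ls
still has to produce its listing, so this does not help with slow file
systems, and file names containing newlines would be miscounted:

```shell
# ls_limit N [FILE...]: hypothetical helper that prints at most N entries,
# one per line.
ls_limit() {
  n=$1
  shift
  ls -1 -- "$@" | head -n "$n"
}
```

Used as "ls_limit 8 | column", it gives the same result as the pipeline
shown above.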

I am sorry if my email hinders possible acceptance of an implementation
of your suggestion, but I did want to show that there is a workaround
(adding non-GNU software to the mix, though).

Thanks,
Erik





bug#44444: RFE for 'env'?

2020-11-06 Thread Erik Auerswald
Hi,

On Thu, Nov 05, 2020 at 11:41:44AM -0800, L A Walsh wrote:
> On 2020/11/04 08:09, Erik Auerswald wrote:
> >Please see
> >https://www.gnu.org/software/coreutils/manual/html_node/env-invocation.html#g_t_002dS_002f_002d_002dsplit_002dstring-usage-in-scripts
> >for an explanation.
> 
> Ah...so what I asked for has already been added in a newer version.
> 
> I seem to have > env --version
> env (GNU coreutils) 8.26.18-5e871
> 
> What version of env should I try and test?

According to the NEWS file, env from Coreutils 8.30 introduced this option.

Thanks,
Erik
-- 
Thinking doesn't guarantee that we won't make mistakes. But not thinking
guarantees that we will.
-- Leslie Lamport





bug#44444: RFE for 'env'?

2020-11-04 Thread Erik Auerswald
Hi,

On Wed, Nov 04, 2020 at 07:27:17AM -0800, L A Walsh wrote:
> Rewriting this bug as the other one, apparently, was too unclear
> to be understood.
> 
> This gives an example, two in fact.
> 
> 
> On 2020/11/03 14:48, Bernhard Voelker wrote:
> >On 11/3/20 6:29 PM, L A Walsh wrote:
> >>I tried to use 'env' to find perl in my path and wanted to pass
> >>the -T option to perl.
> >>
> >>cat >/tmp/taint+print
> #!/usr/bin/env perl -T
> printf "Hello World\n";
> 
> I am unable to get this to run and print out:
> 
> "Hello World" \
> 
> Instead of expected output, I get:
> /usr/bin/env: ‘perl -T’: No such file or directory

That is not env; it is the Linux kernel passing 'perl -T' as a single
argument to env.

$ cat taint+print
#!/usr/bin/env perl -T
printf "Hello World\n";
$ ./taint+print
/usr/bin/env: ‘perl -T’: No such file or directory
$ /usr/bin/env perl -T taint+print
Hello World
$ 

Please see
https://www.gnu.org/software/coreutils/manual/html_node/env-invocation.html#g_t_002dS_002f_002d_002dsplit_002dstring-usage-in-scripts
for an explanation.
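
The difference can be sketched without involving the kernel or perl.
This assumes an env that supports -S (GNU Coreutils 8.30 or later); echo
merely stands in for perl so that the example has no other dependencies:

```shell
# Without -S the whole string is taken as the program name, which is what
# happens when the kernel hands 'perl -T' to env as one argument.
unsplit_status=0
env 'echo hello world' 2>/dev/null || unsplit_status=$?
echo "exit status without -S: $unsplit_status"

# With -S, env splits the string into a program and its arguments.
split_output=$(env -S 'echo hello world')
echo "output with -S: $split_output"

# The corresponding first line of the script above would be:
#   #!/usr/bin/env -S perl -T
```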

HTH,
Erik
-- 
Ow, you made me look at perl code.
-- Andrew Morton





bug#37702: Suggestion for 'df' utility

2020-05-31 Thread Erik Auerswald
Hi John,

On Sun, May 31, 2020 at 06:52:04PM +1000, John Pye wrote:
> The purpose of "df" is to show "disk free". Hence any filesystems that
> are read-only or which are FUSE-mounted one on of the local physical
> filesystems, or similar things (what others?) should be suppressed by
> default.

The FUSE case seems tricky, since, e.g., NTFS filesystems are mounted
via FUSE on, e.g., Ubuntu 18.04.  Thus FUSE should not be excluded by
default, I'd say.

HTH,
Erik





bug#37702: Suggestion for 'df' utility

2020-05-30 Thread Erik Auerswald

Hi all,

On 30.05.20 05:18, Bryce Harrington wrote:

On Fri, Oct 11, 2019 at 12:56:20PM -0700, Paul Eggert wrote:

On 10/11/19 11:20 AM, Pádraig Brady wrote:


if you want to exclude nested file systems like that,
you could try:

    alias df='df -x squashfs'


On my Fedora 30 workstation that option doesn't make any difference.
Regardless of whether '-x squashfs' is used, I see this output from 'df':

Filesystem      1K-blocks      Used  Available Use% Mounted on
devtmpfs          4065704         0    4065704   0% /dev
tmpfs             4081560     36616    4044944   1% /dev/shm
tmpfs             4081560      1696    4079864   1% /run
tmpfs             4081560         0    4081560   0% /sys/fs/cgroup
/dev/sda5        59614116  16910684   39645412  30% /
tmpfs             4081560       124    4081436   1% /tmp
/dev/sda2      1849433716 207781976 1547682948  12% /home
/dev/sda1         5095040    244468    4572044   6% /boot
tmpfs              816312        60     816252   1% /run/user/1000

and most of these lines are useless.


In the above example, it seems useful to exclude tmpfs as well:

alias df='df -x tmpfs -x squashfs'

That does remove the "useless" lines from df output on my Ubuntu 18.04
system, to be concrete.

What I do not like about this approach is the lack of an "unexclude"
option to add an excluded filesystem back in.  One could, e.g., use
\df to not use the alias, or use a different name for the alias (e.g.,
dfx), though.
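
As a sketch of the "different name" variant: dfx is only the example
name used above, and the excluded types are the ones from this thread.
A shell function leaves plain df untouched, so the full output stays
available without any "unexclude" option:

```shell
# dfx: hypothetical wrapper that hides tmpfs and squashfs file systems.
# 'command' makes sure the real df is run, not a function of that name.
dfx() {
  command df -x tmpfs -x squashfs "$@"
}
```

Further options can still be passed through, e.g., "dfx -h".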

I do remember a time when at least some distributions by default used
tmpfs for /tmp.  In that situation just this tmpfs filesystem should
probably not be excluded from the default df output.


For many years we've put up with the problem of too many filesystems in the
default plain 'df' output, and now's as good a time as any to fix that.
[...]
We can add a flag or two for the rare people who want to see these
normally-useless lines.

[...]
I'd like to help in fixing this issue.
[...]
I've taken a stab at a proof-of-concept implementation of #3, by adding
an environment variable DF_EXCLUDE_FSTYPES.
[...]
Further, even with a config file users would probably want a cli switch
and/or env var to override the config file settings.


I concur that a command line option to override config file (or env var)
settings seems useful if a config file and/or env var approach is used.
Just as it seems useful to me to allow unsuppressing of output that has
been suppressed as useless by a possible new df default behavior.

HTH,
Erik





bug#41563: Possible bug with 'sort -Vr' version sorting

2020-05-28 Thread Erik Auerswald
Hi,

On Thu, May 28, 2020 at 01:01:05PM +0200, Kamil Dudka wrote:
> On Thursday, May 28, 2020 11:02:43 AM CEST Erik Auerswald wrote:
> > On Thu, May 28, 2020 at 08:48:16AM +0200, Kamil Dudka wrote:
> > > It is the underscore in the .x86_64 suffix what breaks the version compare
> > > algorithm.  If you replace the underscore by an alphabetic character, it
> > > sorts as you expect:
> > > 
> > > # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | \
> > > sed 's/x86_64/x86X64/' | sort -Vr | sed 's/x86X64/x86_64/'
> > > 
> > > 3.10.0-1127.8.2.el7.x86_64
> > > 3.10.0-1127.el7.x86_64
> > > 3.10.0-1062.18.1.el7.x86_64
> > 
> > That is interesting.  The underscore can be replaced by a digit or even
> > removed as well.  Replacing it with a dot (.)  does not help.
> 
> If there is no underscore, the .el7.x86X64 suffix is recognized as file
> extension.  See the corresponding documentation:
> 
> https://www.gnu.org/software/coreutils/manual/html_node/Special-handling-of-file-extensions.html

Ah, el7.x86X64 or el7.x86164 is seen as an extension (i.e., a sequence
of suffixes), but el7.x86.64 or el7.x86_64 is not.  Since .8.2 does not
contain a letter, it is not seen as part of the extension.  Very subtle,
but documented.

Trivia: the usual 7-Zip extension .7z is not a suffix (i.e., not a file
extension) for this algorithm (according to the documented definition).
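
The documented definition can be checked mechanically.  The sketch below
assumes the regular expression given in the manual section linked above,
(\.[A-Za-z~][A-Za-z0-9~]*)*$, and anchors it at both ends to ask whether
a whole string counts as an extension.  is_extension is a made-up helper
name, not part of Coreutils:

```shell
# Succeeds if the argument consists only of one or more suffixes as
# described in the manual (assumed regular expression, see above).
is_extension() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^(\.[A-Za-z~][A-Za-z0-9~]*)+$'
}

for s in .el7.x86X64 .el7.x86164 .el7.x86_64 .el7.x86.64 .8.2 .7z; do
  if is_extension "$s"; then
    echo "$s: extension"
  else
    echo "$s: not an extension"
  fi
done
```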

Thus changing the platform indicator to look like a file extension,
and relying on the behavior that the distribution version information
is interpreted as a file extension as well, you create a file extension
where initially there was none.  This file extension is then ignored for
the comparison, unless that comparison results in equality.  This seems
to be a useful hack when working with Red Hat products.
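
For reference, both orderings can be reproduced without access to a
CentOS /boot directory.  This sketch assumes GNU sort and uses the three
version strings from the original report as literal input:

```shell
versions='3.10.0-1127.el7.x86_64
3.10.0-1127.8.2.el7.x86_64
3.10.0-1062.18.1.el7.x86_64'

# Plain version sort: no file extension is recognized, and 1127.el7
# ends up ahead of 1127.8.2.
plain=$(printf '%s\n' "$versions" | LC_ALL=C sort -Vr)

# The hack: make the platform indicator look like a file extension for
# the duration of the sort, then restore it.
hacked=$(printf '%s\n' "$versions" |
  sed 's/x86_64/x86X64/' | LC_ALL=C sort -Vr | sed 's/x86X64/x86_64/')

printf 'plain:\n%s\nhacked:\n%s\n' "$plain" "$hacked"
```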

Fascinating. :-)

> > This differs from Debian's "dpkg --compare-versions", where the results
> > of the comparison do not change by replacing the underscore with a
> > digit or character, or by removing it (the underscore is identified as
> > problematic, though):
> 
> The problem is that `dpkg --compare-versions` expects version numbers only.
> It does not work well if you feed it with file names including extensions:

I did not, as you can see in the examples.  I gave version information
to dpkg, though not Debian version information.  So of course this is
illegal input and the GIGO principle applies.

> $ dpkg --compare-versions 3.10.0-1127.8.2 '>>' 3.10.0-1127 && echo '>>' || echo '<='
> >>
> $ dpkg --compare-versions 3.10.0-1127.8.2.bz2 '>>' 3.10.0-1127.bz2 && echo '>>' || echo '<='
> <=
> 
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86_64 lt 3.10.0-1127.el7.x86_64 && echo less
> > dpkg: warning: version '3.10.0-1127.8.2.el7.x86_64' has bad syntax: invalid character in revision number
> > dpkg: warning: version '3.10.0-1127.el7.x86_64' has bad syntax: invalid character in revision number
> > less
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86.64 lt 3.10.0-1127.el7.x86.64 && echo less
> > less
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86X64 lt 3.10.0-1127.el7.x86X64 && echo less
> > less
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86164 lt 3.10.0-1127.el7.x86164 && echo less
> > less
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x8664 lt 3.10.0-1127.el7.x8664 && echo less
> > less
> > 
> > The way I read the GNU Coreutils documentation, removing the underscore
> > should not affect the version sort comparison result.
> 
> Not really.  See the link above to the documentation that covers this part.

Yes, you are correct.  I find this quite surprising, and see it as another
example where --version-sort fails to deliver on the short form promise
of "natural sort."  I am well aware that the long form description shows
that the sorting order is not "natural," but rather strange IMHO.

$ sort --help | grep -- --version-sort
  -V, --version-sort  natural sort of (version) numbers within text

But then I do not even understand what is "natural" about version numbers
anyway. ;-)

Thanks,
Erik
-- 
[M]ost parts of this industry just work by chance.
-- Thomas Gleixner





bug#41563: Possible bug with 'sort -Vr' version sorting

2020-05-28 Thread Erik Auerswald
Hi,

On Thu, May 28, 2020 at 08:48:16AM +0200, Kamil Dudka wrote:
> On Wednesday, May 27, 2020 2:07:32 PM CEST Danie de Jager via GNU coreutils 
> Bug Reports wrote:
> > 
> > I use sort -Vr to sort version numbers. I noticed this discrepancy on
> > the latest kernel version from Centos 7.8.
> > 
> > command to get output:
> > # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | 
> > sort -Vr
> > 
> > 3.10.0-1127.el7.x86_64
> > 3.10.0-1127.8.2.el7.x86_64
> > 3.10.0-1062.18.1.el7.x86_64
> 
> It is the underscore in the .x86_64 suffix what breaks the version compare 
> algorithm.  If you replace the underscore by an alphabetic character, it
> sorts as you expect:
> 
> # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | \
> sed 's/x86_64/x86X64/' | sort -Vr | sed 's/x86X64/x86_64/'
> 
> 3.10.0-1127.8.2.el7.x86_64
> 3.10.0-1127.el7.x86_64
> 3.10.0-1062.18.1.el7.x86_64

That is interesting.  The underscore can be replaced by a digit or even
removed as well.  Replacing it with a dot (.)  does not help.

This differs from Debian's "dpkg --compare-versions", where the results
of the comparison do not change by replacing the underscore with a
digit or character, or by removing it (the underscore is identified as
problematic, though):

$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86_64 lt 3.10.0-1127.el7.x86_64 && echo less
dpkg: warning: version '3.10.0-1127.8.2.el7.x86_64' has bad syntax: invalid character in revision number
dpkg: warning: version '3.10.0-1127.el7.x86_64' has bad syntax: invalid character in revision number
less
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86.64 lt 3.10.0-1127.el7.x86.64 && echo less
less
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86X64 lt 3.10.0-1127.el7.x86X64 && echo less
less
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86164 lt 3.10.0-1127.el7.x86164 && echo less
less
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x8664 lt 3.10.0-1127.el7.x8664 && echo less
less

The way I read the GNU Coreutils documentation, removing the underscore
should not affect the version sort comparison result.

Thanks,
Erik
-- 
There is no remedy for anything in life.
-- Ernest Hemingway





bug#41563: Possible bug with 'sort -Vr' version sorting

2020-05-27 Thread Erik Auerswald
Hi,

On Wed, May 27, 2020 at 02:07:32PM +0200, Danie de Jager via GNU coreutils Bug 
Reports wrote:
> I use sort -Vr to sort version numbers. I noticed this discrepancy on
> the latest kernel version from Centos 7.8.
> 
> command to get output:
> # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue
> | sort -Vr
> 3.10.0-1127.el7.x86_64
> 3.10.0-1127.8.2.el7.x86_64
> 3.10.0-1062.18.1.el7.x86_64
> 
> I'd expect the middle value to be the highest version number. Is this
> by design or a bug? If it is a bug please let me know if I must log it
> somewhere.

I'd say this is by design:

Sorting compares runs of non-digits, then runs of digits.  Thus each
"dot" (.) terminates a run of digits.  The "problem" is an unbalanced
number of digit and non-digit runs in the version numbers.

See the following two manual sections:
http://www.gnu.org/software/coreutils/manual/coreutils.html#Version_002dsort-ordering-rules
http://www.gnu.org/software/coreutils/manual/coreutils.html#Punctuation-Characters

The "version sort" is based on Debian's version sort (but different).
It seems as if Red Hat version numbers follow different rules.

HTH,
Erik
-- 
Be water, my friend.
-- Bruce Lee





bug#38299: A bug while trying to decode a non encode base64

2019-11-21 Thread Erik Auerswald
Hi,

On Thu, Nov 21, 2019 at 12:04:11PM +0530, vardhaman narasagoudar wrote:
> On Thu, Nov 21, 2019 at 12:51 AM Paul Eggert  wrote:
> > On 11/20/19 6:22 AM, Martin Schulte wrote:
> > > vardhamanbn1 is a valid encoding
> >
> > Thanks for explaining; closing the bug report.
> 
> Thanks for replying the query, but if I check online (
> https://www.base64decode.org/) for decoding  the same in online .
> 
> I get  an error  message (which is valid) e.g:-
> 
> 1) if I try to decode "99"  I get an error message
> 
> "No printable characters found, try another source charset, or upload your
> data as a file for binary decoding."

The error message says that the decoded data is not printable.  It does
not say anything about invalid input data, although the input data is
not correctly Base64 encoded.

> Similarly we got return code as 1 "invalid input" in the terminal.
> 
> 2) Now if I try to decode "vardhamanbn1" I get the error message  (any 12
> characters or multiple of 12 characters which is a non-encoded value, if
> try to decode)
> "No printable characters found, try another source charset, or upload your
> data as a file for binary decoding."

You get the same error message about the decoded data.  This is correct.
The site even tells you that the interface you use does not support
binary, i.e., non-printable data.

> But when we try the same in terminal , we get the return code as 0 the
> symbol as per inputs given
>  "UTF-8 and thus leads to �."
> 
> Now as a work around we are using

That is not a workaround, but the necessary check for valid output data
for your application, since you seem to require a Base64 encoding of
UTF-8 data.

> a) [vardhaman@oc6085028360 ~]$ echo -n "vardhamanbn1" | base64 -d | iconv
> -f utf8
> iconv: illegal input sequence at position 0

Base64 can encode any binary data, not just valid UTF-8 text.

> also we tried on another sample
> 
> b) [vardhaman@oc6085028360 ~]$ echo  -n '99' | base64 -d | iconv -f utf8
> base64: invalid input
> iconv: illegal input sequence at position 0
> 
> without using "iconv -f utf8"
> 
> [vardhaman@oc6085028360 ~]$  echo  -n '99' | base64 -d
> base64: invalid input
> 
> 
> So we feel its something still with 12 & multiple of 12 characters leading
> to the issue, when we try to decode a non-decode value.

The magic number is actually 4, because each symbol in a base64 encoded
string represents 6 bits, thus 4 symbols give you 3 bytes of encoded data.
Any combination of Base64 symbols that forms a string of a length
divisible by 4 is a valid Base64 encoding.  This does not give any
guarantee about the data.
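
The arithmetic can be checked directly.  A minimal sketch, assuming GNU
base64 and using the two sample strings from this report:

```shell
# 12 Base64 symbols carry 12 * 6 = 72 bits, i.e., 9 bytes of data.
decoded_bytes=$(printf '%s' 'vardhamanbn1' | base64 -d | wc -c | tr -d ' ')
echo "decoded length: $decoded_bytes bytes"

# Two symbols are not a multiple of four (and there is no padding),
# so decoding is rejected as invalid input.
short_status=0
printf '%s' '99' | base64 -d >/dev/null 2>&1 || short_status=$?
echo "exit status for '99': $short_status"
```

Whether those 9 bytes are valid UTF-8 is a separate question that base64
cannot answer, hence the additional iconv check.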

> Or should we think characters of multiple of 12 will be treated as a base64
> format

Yes. Actually, any multiple of 4 characters.

>  e.g when I tried decoding 24 non-encode character:-
>  [vardhaman@oc6085028360 ~]$ echo -n 'vardhamanbn1vardhamanbn1' | base64
> --decode
> ��݅���݅�[vardhaman@oc6085028360 ~]$ echo $?
> 0

Thanks,
Erik
-- 
The laws of mathematics are very commendable, but the only law that
applies in Australia is the law of Australia.
-- Australian Prime Minister Malcolm Turnbull





bug#37062: Changes set are no applied to a locally plugged external device

2019-08-19 Thread Erik Auerswald
Hi Ricky,

On Sat, Aug 17, 2019 at 01:23:31PM +0300, Ricky Tigg wrote:
> Component under* Linux Fedora*: coreutils.x86_64   8.31-2.fc30  @updates
> 
> Changes set are no applied to a locally plugged external device

Perhaps the external device's filesystem does not support Unix access
controls. Perhaps the external device is mounted with options to fix user,
group, and access rights as seen by Linux.

> Commands executed:
> # whoami
> root
> # cd /run/media/yk/maxell/Varmuuskopiot/Fedora/
> # chown -R root:citser ../Fedora/ && chmod -R 750 ../Fedora/
> # ls -l ../; ls -l
> [...]
> drwxrwxrwx. 1 yk yk  440 16. 8. 23:32 Fedora
> [...]
> -rwxrwxrwx. 1 yk yk 155 16. 8. 23:32 config
> 
> As noticeable owner and groups mentions are both yk while they respectively
> should be root, citser.

You could try to find the active mount options by issuing:

mount | fgrep media

Thanks,
Erik
-- 
If it ain't broke, don't fix it.





bug#36831: enhance 'directory not empty' message

2019-08-01 Thread Erik Auerswald
Hi,

On Wed, Jul 31, 2019 at 04:05:05PM -0600, Assaf Gordon wrote:
> On Mon, Jul 29, 2019 at 06:50:46PM -0500, Paul Eggert wrote:
> > On 7/29/19 1:28 AM, Assaf Gordon wrote:
> > > +  if (rename_errno == ENOTEMPTY || rename_errno == EEXIST)
> > > +{
> > > +  error (0, 0, _("cannot move %s to %s: Target directory not 
> > > empty"),
> > > + quoteaf_n (0, src_name), quoteaf_n (1, dst_name));
> > 
> > Although this is an improvement, it is not general enough, as other errno
> > values are relevant only for the destination. Better would be to have a
> > special case for errno values that matter only for the destination, and use
> > the existing code for errno values where we don't know whether the problem
> > is the source or the destination. Something like the attached, say.
> 
> > +case EDQUOT: case EEXIST: case EISDIR: case ENOSPC: case 
> > ENOTEMPTY:
> > +  error (0, rename_errno, "%s", quotearg_colon (dst_name));
> > +  break;
> > +
> 
> [...]
> An explicit error explicitly saying "cannot move", and mention the source and
> destination, and also "blames" the target directory seems the most
> user-friendly and least ambiguous.

I agree with this reasoning and prefer Assaf's error message improvement.

Thanks,
Erik
-- 
If you're willing to restrict the flexibility of your approach,
you can almost always do something better.
-- John Carmack





bug#35032: date ISO 8601 / RFC 3339 formats

2019-03-28 Thread Erik Auerswald
Hi,

On Thu, Mar 28, 2019 at 10:43:42AM -0700, Paul Eggert wrote:
> On 3/28/19 10:20 AM, Nicolas Mailhot wrote:
> > Would it be possible to make them both optional in --rfc-3339, and
> > both mandatory in --iso-8601 ?
> 
> Sorry, I don't understand what you're proposing, specifically. Can you
> say exactly what you want, with specific calls to 'date' and what you
> want the output to look like, and why?

Sadly, you stripped too much of the original mail. I'll repeat the
relevant parts of that mail:

On Thu, Mar 28, 2019 at 06:20:14PM +0100, Nicolas Mailhot wrote:
> A long, long time ago, [...]
> Unfortunately, coreutils managed to make both of those incompatible
> with the W3C iso-8601 profile lots of software languages use:
> 
> 1. The W3C profile mandates T as time separator, and ":" as
> hour/minutes separator
> 2. RFC 3339 makes both optional
> 
> Then, logically, date removed the ":" for its --iso-8601 option,
> $ date --iso-8601=seconds
> 2019-03-28T18:09:47+0100
                       ^^
                       there should be a ':' for W3C compatibility

> and then removed T from its --rfc-3339 option
> $ date --rfc-3339=seconds
> 2019-03-28 18:10:11+01:00
            ^
            there should be a 'T' for W3C compatibility

> [...]

Nicolas asks for an ISO 8601 compatible format using both a 'T' as
separator between date and time, and a ':' as separator between hours
and minutes in the timezone designator, as well as the other contents
that are identical in --iso-8601 and --rfc-3339.

From looking at https://www.w3.org/TR/NOTE-datetime, the important part
is using both 'T' and a TZD with ':' in the middle, the other variability
(e.g. minutes, seconds, fractional seconds as decimals) can be chosen
as fits.
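
Such a timestamp can already be produced with an explicit format string.
A minimal sketch, assuming GNU date (the %:z conversion and "-d @SECONDS"
are GNU extensions); the fixed epoch value is only there to make the
output reproducible, and w3c_format is just a variable name chosen here:

```shell
# 'T' between date and time, ':' inside the time zone designator.
w3c_format='+%Y-%m-%dT%H:%M:%S%:z'

stamp=$(date -u -d @0 "$w3c_format")
echo "$stamp"    # 1970-01-01T00:00:00+00:00
```

Running date "$w3c_format" without -u and -d gives the current local
time in the same layout.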

Thanks,
Erik
-- 
I do like the 24 hour a day development process. I can describe a
problem, go to sleep, and have the answer in my mailbox with my first
cup of coffee.
-- Dave Täht





bug#34700: rm refuses to remove files owned by the user, even in force mode

2019-03-03 Thread Erik Auerswald

Hi,

On 3/3/19 09:40, L A Walsh wrote:

On 3/2/2019 11:31 AM, Bob Proulx wrote:

But regardless of that it does not change the fact that the entire
purpose of read-only directories is to prevent removing and renaming
of files within them.
   


 But not by the user owning them.


The rationale given by the Go developers is to prevent downloaded test
code to remove or alter files in the modules directory, not to prevent
the user from doing that.


[...]

I would suggest people with specific directories that inhibit deletion of
files inside although they should not (e.g. a "cache") to deliberately change
the permissions of said directories prior to deleting files inside. Using a
script like the above, even without the basic mistakes in the script, is
quite dangerous.
 

Yeah...I wouldn't do it, I'd write a script that invokes the app and
clears out the cache dir when the app exits if it bothered me enough.


The Go developers implemented "go clean -modcache" for that purpose.
https://github.com/golang/go/issues/27161#issuecomment-415213240
https://tip.golang.org/cmd/go/#hdr-Remove_object_files_and_cached_files


> Much better to let the computer do the repetitive deletions.  If I do it
> manually, it increases the chances of me creating a problem the more often
> I do it.
>
> Really -- scripts are much better at handling redundant/routine matters that
> turn parts of my brain off.  OTOH, some people are better at redundant
> detail and don't suffer the same problems I would.  People are different.


I concur to let software handle repetitive tasks.

If cleaning the cache occurs seldom, manually performing the changes, or
better invoking an existing specialized program (or script) for this
specific cache seems to be better than circumventing the safety net in a
general purpose utility. Especially if this circumvention means
transparently changing access rights on a directory that is not
mentioned in the utility invocation.

If cleaning the cache occurs all the time, using "go clean -modcache"
(or whatever program is appropriate for the specific cache) should be
the routine used. If there is no specialized program provided yet, a
script could be developed for that purpose.

Thanks,
Erik





bug#34700: rm refuses to remove files owned by the user, even in force mode

2019-03-02 Thread Erik Auerswald

Hi,

On 3/2/19 07:18, Bob Proulx wrote:

> Nicolas Mailhot wrote:
>
> > For their own reasons, the Go maintainers have decided the user Go cache
> > will now be read-only.
> > https://github.com/golang/go/issues/27161#issuecomment-433098406
> > That means cleaning up cache artefacts with rm does not work anymore
> > https://github.com/golang/go/issues/30502
>
> [...]
> However regardless of intentions and design if one really wants to
> smash it then this is easily scripted.  No code modifications are
> needed.
>
>   #!/bin/sh
>   chmod -R u+w $1
>   rm -rf $1


To everyone considering the above "script": do not use it! It does not 
even guard against spaces in file names. Besides being dangerously 
buggy, it does not even solve the problem of deleting a file inside a 
read-only directory.


I would suggest people with specific directories that inhibit deletion 
of files inside although they should not (e.g. a "cache") to deliberately 
change the permissions of said directories prior to deleting files 
inside. Using a script like the above, even without the basic mistakes 
in the script, is quite dangerous.
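
A minimal sketch of the deliberate approach for a single file, assuming
the directory is owned by the caller and is meant to be read-only again
afterwards. The function name rm_from_readonly_dir is made up for this
example; it is not part of rm or of any existing tool:

```shell
# Hypothetical helper: remove one file from a read-only directory we own.
# Only the parent directory of the named file is touched, every expansion
# is quoted, and the directory is made read-only again afterwards
# (assumption: it was read-only before the call).
rm_from_readonly_dir() {
    file=$1
    dir=$(dirname -- "$file")
    chmod u+w -- "$dir" || return 1
    rm -f -- "$file"
    status=$?
    chmod u-w -- "$dir"
    return "$status"
}
```

Usage would be something like: rm_from_readonly_dir "cache dir/some file"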


Thanks,
Erik





bug#33371: RFC: option for numeric sort: ignore-non-numeric characters

2018-11-14 Thread Erik Auerswald
Hi,

On Tue, Nov 13, 2018 at 06:32:55PM -0800, L A Walsh wrote:
> I have a bunch of files numbered from 1-over 2000 without leading zeros
> (think rfc's)...
> They have names with a non-numeric prefix & suffix around the number.

Are prefix and suffix constant? RFC files are usually named rfc${NR}.txt.

> It would be nice if sort had the option to ignore non-numeric
> data and only sort on the numeric data in the 'lines'/'files'.

Perhaps --version-sort could work for you?

$ for r in rfc{1..100}.txt; do echo "$r"; done | sort | sort -V

(The first sort un-sorts the sorted input data, the second sorts it
again.)
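
A smaller sketch of the same idea with made-up file names, plus an
alternative that only works if the prefix really is the constant
three-character string "rfc":

```shell
# Unsorted sample names in the rfc${NR}.txt style (made-up input).
printf '%s\n' rfc10.txt rfc2.txt rfc100.txt rfc1.txt | sort -V
# prints rfc1.txt, rfc2.txt, rfc10.txt, rfc100.txt (one per line)

# With a constant three-character prefix, a numeric key starting at the
# fourth character gives the same order without --version-sort.
printf '%s\n' rfc10.txt rfc2.txt rfc100.txt rfc1.txt | sort -k 1.4n
```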

> [...]
> Or is there an options for this already, and my manpage out of date?

AFAIK not exactly.

Thanks,
Erik
-- 
It's impossible to learn very much by simply sitting in a lecture,
or even by simply doing problems that are assigned.
-- Richard P. Feynman





bug#32127: RFE -- in the way "cp -rl" -- enable 'ln' to do likewise?

2018-07-18 Thread Erik Auerswald
Hi,

On Wed, Jul 18, 2018 at 04:36:44AM -0600, Mike Hodson wrote:
> On Wed, Jul 18, 2018 at 4:24 AM L A Walsh  wrote:
> 
> > In the case of creating a link to a directory there is
> > no choice in creating a "working solution".  If you want a link
> > there, it HAS to be a symlink.  That the user would bother to
> > use the 'ln' (link) command in the first place is a sufficiently
> > convincing "argument" that they really DID want a link there.

This sounds reasonable to me: a link was requested, it might not matter
which kind, and only one kind of link can be created. Thus 'ln' could
try to do the right thing and create a (symbolic) link.

> > That they didn't explicitly specify the type should additionally
> > be taken that they didn't care enough to specify the type -- only
> > that the link be created.
> >
> > I hope that clarifies that I'm not attempting to always
> > find some "automatic action", but saw that in this case, it
> > wouldn't be hard to figure out what was wanted and that doing
> > so wouldn't be hard to undo if it was not.
> 
> I wager that some people *aren't* aware that you cannot hardlink a
> directory, and instead of writing hundreds of NEW bug reports "linking
> broken" "why can't I link a directory" leaving 'ln' as it has been since
> the dawn of time is the better option.

Printing a helpful warning message that a symbolic link has been created
instead of a hard link, because a hard link cannot be created (perhaps
less verbose), would help at least a little bit. Requiring a new option
to enable that behavior would avoid confusing users, at least until
distributions start to add it to their default aliases.
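
Until something like that exists in ln, the behavior can be approximated
with a shell function. This is only a sketch: the name ln_or_symlink is
made up, and it handles just the two-operand form. Note that with ln -s
a relative target is interpreted relative to the link's directory.

```shell
# Hypothetical wrapper: create a hard link where possible, and fall back
# to a symbolic link (with a warning) when the target is a directory.
ln_or_symlink() {
    target=$1
    link_name=$2
    if [ -d "$target" ]; then
        echo "ln_or_symlink: '$target' is a directory," \
             "creating a symbolic link instead of a hard link" >&2
        ln -s -- "$target" "$link_name"
    else
        ln -- "$target" "$link_name"
    fi
}
```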

> You don't think this will happen? I assure you it will.
> 
> Look at the YEARS of new users being introduced, as their distributions
> finally 'stabilize' newer coreutils, to the new "Quoted Filenames" in 'ls'
> . So many people have been totally confused, angry, and rather taken aback
> that such an old utility did something different.

I immediately searched for the respective option and changed my aliases
to not quote 'ls' output. ;-)

I did not like that the default output of 'ls' was changed, but at least
I can disable this anti-feature.

> Let us all learn from history, on this same maillist, of when and when not
> to change the default workings of a 40 year old tool.

This sounds reasonable to me, but others have a different view, see your
'ls' example. ;-)

Anyway, IMHO Linda's arguments for the specific change requested
have merit. I personally would prefer an option to enable that new
functionality instead of making it the default, if someone were to
implement said functionality.

Thanks,
Erik
-- 
Design your product to please the users.
-- Paul Graham





bug#32127: RFE -- in the way "cp -rl" -- enable 'ln' to do likewise?

2018-07-17 Thread Erik Auerswald
Hi,

On Mon, Jul 16, 2018 at 11:14:21PM +0200, Bernhard Voelker wrote:
> On 07/14/2018 07:51 PM, L A Walsh wrote:
> > Paul Eggert wrote:
> >> On 07/12/2018 02:16 AM, L A Walsh wrote:
> >>> I'm asking why does 'ln' bother to tell the user that they are
> >>> wrong and do nothing useful?  Why not just go ahead and create a symlink
> >>
> >> The user didn't ask for a symlink,
> > User didn't ask for a physical or hardlink (-P) either.  Just asked
> > for a link, kind unspecified.
> > 
> >> and it sounds unwise for ln to be
> >> second-guessing that.
> > ---
> >   True - should **probably** have given them *SOME* link.  Since
> > they didn't specify Physical or Symlink...either would be fine.
> > 
> > 
> >> Sometimes, reporting an error and exiting is a
> >> better thing to do.
> > ---
> > Unless they claim to want one or the other (-P or -l), unless
> > it is an "undo-able" operation (like one that deletes data), why would
> > you guarantee ln doing the wrong thing, rather than having a better than
> > 50% chance of doing the right thing?
> 
> I disagree here: some people are not that familiar with the differences
> between symlinks and hardlinks, okay, but the consequences for using either
> type may be quite dramatic later on.  Therefore I think it's better to give
> a helping error instead of second-guessing what the user *may* have wanted.
> The point is: also an experienced user may sometimes forget to specify
> the -s option, and I'm sure they *want* a proper error message.

Just as food for thought: how about adding an option to ln to try to do
the right thing? That option could be used in an alias so that it is not
needed to always type it. Perhaps in time some GNU/Linux distributions
even add that option to their default aliases.

The option name could be --do-what-i-mean or --do-the-right-thing
or --fallback-symbolic (I am quite sure everyone can come up with
additional suggestions. ;) Please do not take these suggestions too
seriously. ;-)

Anyway, I understand both sides of this discussion, and I definitely do
not expect anyone to go ahead and implement my solution. It is just a
suggestion for whoever wants to have this functionality, and intends
to implement it themselves, on how to find a compromise that might be
acceptable upstream.

Peace,
Erik
-- 
Be water, my friend.
-- Bruce Lee





bug#31184: tail -f on Network FS not refreshing as soon as the file is changed.

2018-04-17 Thread Erik Auerswald
Hi Jewsco,

did you already try the -F option instead of -f?

Thanks,
Erik

On Tue, Apr 17, 2018 at 03:46:27PM +, Jewsco Pius Jacquez wrote:
> Padraig, thanks for your response,
> 
> The ---disable-inotify didn't refresh either.
> 
> [root@cmilsbtest03 ~]# stat -f -c '%t %T'  /media/samba/test.file
> ff534d42 cifs
> [root@cmilsbtest03 ~]# df -h /media/samba/test.file
> Filesystem              Size  Used Avail Use% Mounted on
> //10.124.61.52/finance   14G   13G  1.6G  89% /media/samba
> [root@cmilsbtest03 ~]# grep /media/samba /proc/mounts
> //10.124.61.52/finance/ /media/samba cifs 
> rw,relatime,sec=ntlm,cache=loose,unc=\134\13410.124.61.52\134finance,username=,uid=0,noforceuid,gid=0,noforcegid,addr=10.124.61.52,unix,posixpaths,serverino,acl,rsize=1048576,wsize=65536,echo_interval=60,actimeo=1
>  0 0
> [root@cmilsbtest03 ~]#
> 
> 
> Thanks,
> Jewsco
> 
> 
> -Original Message-
> From: Pádraig Brady [mailto:p...@draigbrady.com] 
> Sent: Tuesday, April 17, 2018 2:29 AM
> To: Jewsco Pius Jacquez ; 31...@debbugs.gnu.org
> Subject: Re: bug#31184: tail -f on Network FS not refreshing as soon as the 
> file is changed.
> 
> On 16/04/18 10:11, Jewsco Pius Jacquez wrote:
> > Hello,
> > 
> > We have a legacy application that is using tail -f command in the 
> > application and is running in Redhat 9 under a shared Samba filesystem.
> > 
> > We want to migrate the application to RHEL7 and we noticed that the tail -f 
> > command here is not refreshing as soon as the file get changed. In Redhat 
> > 9, it is working fine, every write on the file got reflected straight 
> > away(no waiting interval).
> > 
> > Is there a way that we can make the tail -f working as it was in Redhat 9? 
> > For this reason, we are not able to migrate our Legacy application.
> 
> To get around the issue, the undocumented ---disable-inotify option may help 
> (note the three dashes)
> 
> If that does help then there is an issue with the misdetection of a known 
> file system as local, when it should be treated as remote.
> Can you show the file system type for the file you're trying to tail, using:
> 
>   stat -f -c '%t %T' /path/to/your/file
> 
> cheers,
> Pádraig





bug#24813: Du Maximum files?

2016-10-28 Thread Erik Auerswald
Hi Ben,

On Fri, Oct 28, 2016 at 05:56:07AM +, Benny D. Miller Jr. wrote:
>  I am not so sure that this is a bug but a limitation. I am using
> "du" for a disk file listing/usage in the command:
> 
> du --all --time --human-readable --apparent-size $1;

You did not specify any problems with the above line pertaining to "du".

> printf "Total number of files: ";  find $1 -type f | wc -l;

The above line does not relate to "du", but to "find" and "wc".

> But it seems to maximum out the file count to 38341
> 
>  So I am thinking that this is a limitation to the counter in "wc"?

I don't think so. You can easily check wc's counter by generating known
input, e.g. as follows:

$ yes | head -n8 | wc -l
8
$ yes | head -n2678400 | wc -l
2678400

>  Just to let you know I have a script that takes a picture every
> minute and stores it to a 1TB HD and deletes anything older than 31
> days.
> Total file count should be near 2,678,400 files.

For some random directory I find the following:

$ find . -type f | wc -l
124168

I do not see the limit you observed.
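
If file names might contain newlines, counting lines over-counts. A
minimal sketch of a count that does not depend on the names, assuming a
find that supports -print0 (the function name count_files is made up):

```shell
# Hypothetical helper: count regular files below a directory.
# Each file contributes exactly one NUL byte, so newlines in file names
# do not matter; quoting "$1" keeps directory names with spaces intact.
count_files() {
    find "$1" -type f -print0 | tr -dc '\0' | wc -c
}
```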

Best regards,
Erik
-- 
Be water, my friend.
-- Bruce Lee





bug#24561: Unmathematical bc exponentiation behavior

2016-09-29 Thread Erik Auerswald
Hi Tobias,

On Wed, Sep 28, 2016 at 03:48:27PM +, Martens, Tobias wrote:
> echo "-(1)^2" | bc
> 1
> 
> I would have expected -1. This behavior is unmathematical and very
> confusing, because otherwise bc acts quite logic.

bc did exactly what you asked it to do. You probably meant to write:

echo "-(1^2)" | bc
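
The reason is operator precedence: bc applies unary minus before ^. A
short illustration with a base of 2, where the difference is easier to
see:

```shell
# Unary minus is applied before ^ in bc, so the base is negated first.
echo '-2^2'   | bc    # (-2)^2, prints 4
echo '-(2^2)' | bc    # prints -4
echo '0-2^2'  | bc    # binary minus binds weaker than ^, prints -4
```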

Thanks,
Erik
-- 
If it ain't broke, don't fix it.





bug#23773: su is not working for non-root users

2016-06-16 Thread Erik Auerswald
Hi,

On Thu, Jun 16, 2016 at 07:44:06AM +0200, Bernhard Voelker wrote:
> On 06/15/2016 06:31 PM, Al Mamun wrote:
> > I was trying to "su - nonrootuser" but it returns incorrect password but
> > the password is ok and I can login from ssh. Everything is good with root.
> > Only the non-root user is causing the problem.
> 
> The program su is no longer part of coreutils since version 8.18 (2012),
> so - depending on your distribution - chances are that you're using the
> implementation of the util-linux package.

To verify if you are using a GNU Coreutils su program you can use:

  su --version

A GNU program is supposed to tell you its version (see for example "ls
--version"), while the non-GNU su program on my systems does not recognize
this option:

$ su --version
su: unrecognized option '--version'
Usage: su [options] [LOGIN]
[...]

> Thus said, we cannot help you here, and therefore I'm marking this as
> done in our bug tracker.

Best regards,
Erik
-- 
[T]he fact that something *can* be done the stupid way is in no way an
argument that it *should* be done the stupid way.
-- Linus Torvalds





bug#22511: [request] Add "--preserve-setuid" to the chown command

2016-02-01 Thread Erik Auerswald
Hi,

On Mon, Feb 01, 2016 at 03:33:29AM +0100, William Di Luigi wrote:
> if I understand it correctly, chown clears the setuid bit for security
> reasons (since, when changing the owner or group for a file, you could
> potentially be allowing *new people* to run that file as root).
> 
> While this is good for security, sometimes you want to be able to
> preserve the setuid bit. For example, when packaging software
> (https://bbs.archlinux.org/viewtopic.php?pid=1600551)

How about using "install" to install files, setting owner and mode bits
in one go?
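
A minimal sketch of what I mean; the function name install_setuid and
the mode 4755 are only examples, not taken from the packaging script in
question:

```shell
# Hypothetical packaging step: copy a file and set its mode bits,
# including the setuid bit, in one go.
# Add -o OWNER and -g GROUP to set ownership too (that usually needs root).
install_setuid() {
    install -m 4755 -- "$1" "$2"
}
```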

HTH,
Erik
-- 
Always use the right tool for the job.
-- Rob Pike





bug#22001: Is it possible to tab separate concatenated files?

2015-11-27 Thread Erik Auerswald
Hi,

On Thu, Nov 26, 2015 at 08:28:13PM -0700, Eric Blake wrote:
> On 11/26/2015 04:52 PM, Linda Walsh wrote:
> 
> >> Because every plain
> >> text line in a file must be terminated with a newline.
> > 
> >That's only a recent POSIX definition.  It's not related to
> > real life.  When I looked for a text file definition on google, nothing
> > was mentioned about needing a newline on the last line -- except on
> > 1 site -- and that site was clearly not talking about 'text' files, but
> > Unix-text-record files w/each record terminated by a NL char.
> > 
> 
> Quit spreading FUD about POSIX.  That definition of text file is NOT a
> recent invention; even back in POSIX 2001 the definition read:
> 
> 3.392 Text File
> 
> A file that contains characters organized into one or more lines. The
> lines do not contain NUL characters and none can exceed {LINE_MAX} bytes
> in length, including the <newline>. Although IEEE Std 1003.1-2001 does
> not distinguish between text files and binary files (see the ISO C
> standard), many utilities only produce predictable or meaningful output
> when operating on text files. The standard utilities that have such
> restrictions always specify "text files" in their STDIN or INPUT FILES
> sections.
> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html

At least the definition of a "line" is needed as well to understand the
above (from the same URL):

 3.205 Line

 A sequence of zero or more non-<newline>s plus a terminating <newline>.

[...]
> 
> No, it has ALWAYS been a problem.  Even 40 years ago, before POSIX was
> invented, the only PORTABLE way to use programs like sed was to use it
> on text files [...]

The sed of Solaris 10 ignores trailing text after the last line, that
is after the last newline. I am quite sure this behavior has been in
older Solaris and SunOS versions as well.
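
To check whether a given file is affected, something like the following
sketch can be used. The function name ends_with_newline is made up, and
the check assumes the file does not end in a NUL byte:

```shell
# Hypothetical check: does a non-empty file end with a newline character?
# Command substitution strips a trailing newline, so the last byte is a
# newline exactly when the substitution result is empty.
ends_with_newline() {
    [ -s "$1" ] && [ -z "$(tail -c 1 -- "$1")" ]
}
```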

Best regards,
Erik
-- 
http://www.unix-ag.uni-kl.de/~auerswal/





bug#21349: who shows no users nowadays on Debian

2015-08-27 Thread Erik Auerswald
Hi,

On Wed, Aug 26, 2015 at 02:13:37PM -0600, Bob Proulx wrote:
> Erik Auerswald wrote:
> > This works on a current Debian/testing system (stable as well), so it might
> > be a recent Debian/Sid (unstable) issue. Perhaps you want to open a bug
> > report there?
>
> Updating utmp depends upon the terminal emulator.  XTerm updates it.

Logins via X used to update it as well (seldom used nowadays). The local
desktop session usually updates utmp as well, at least with XFCE on
Debian/testing this is still the case. GNOME Terminal updates utmp as well.
Screen updates it, too. When I last looked at it, Konsole (from KDE) did
not update utmp.

> AFAIK it doesn't have anything to do with Debian changing anything.

Sven Joachim wrote:
> It most probably has, the latest xterm version (319) only writes a utmp
> entry if you start a login shell (i.e. use the -ls option)

Linda A. Walsh noticed a similar thing:
> The same thing happens on openSuSE

Of course this is most likely caused by changes outside of coreutils. On a
desktop system without any terminal windows, the desktop session should
be shown in the who output. On all systems I could easily check that is
the case. I do not have any SystemD/GNOME or KDE systems to test.

Thanks,
Erik





bug#21349: who shows no users nowadays on Debian

2015-08-26 Thread Erik Auerswald
Hi Dan,

On Wed, Aug 26, 2015 at 07:14:41AM +0800, 積丹尼 Dan Jacobson wrote:
> (info "(coreutils) who invocation") says
>
> If given no non-option arguments, ‘who’ prints the following
> information for each user currently logged on: login name, terminal
> line, login time, and remote hostname or X display.
>
> Say if this means
> remote hostname or remote X display.
>
> or
>
> X display or remote hostname.
>
> By the way, now on Debian sid:
>
> $ who
> $ ps a
>   PID TTY      STAT   TIME COMMAND
>   593 tty7     Ss+    0:10 /usr/lib/xorg/Xorg :0 -nolisten tcp vt7
>  1310 pts/0    Ss     0:00 bash
>  1338 pts/1    Ss     0:00 su -
>  1339 pts/1    S+     0:00 -su
>  2005 tty1     Ss+    0:00 /sbin/agetty --noclear tty1 linux
>  2073 pts/0    R+     0:00 ps a

This works on a current Debian/testing system (stable as well), so it might
be a recent Debian/Sid (unstable) issue. Perhaps you want to open a bug
report there?

Thanks,
Erik





bug#21325: ls : feature request --width=zero

2015-08-23 Thread Erik Auerswald
Hi,

On Sat, Aug 22, 2015 at 08:58:01PM -0700, Paul Eggert wrote:
> Pádraig Brady wrote:
> > Also base64 -w0 has similar meaning.
>
> I didn't know that, but I don't like that either.  Utilities should
> use an explicit representation for infinity, if that's what they
> need.  'Inf', say.
>
> In the meantime, the patch that I installed is helpful even if we
> later add an explicit representation of infinity.

Using 0 to disable a length or width limit is quite common with networking
gear. I do not know of any example requiring a keyword like Inf, neither
in UNIX(-like) nor in other CLIs.

Anyway, an explicit Inf keyword is still better than some number that
relies on system limits, which may change with time.
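
For illustration, the base64 case mentioned above, where a width of 0
disables line wrapping (a sketch assuming GNU base64 and a head that
supports -c):

```shell
# 100 zero bytes encode to 136 base64 characters.
head -c 100 /dev/zero | base64 | wc -l        # wrapped at 76 columns: 2 lines
head -c 100 /dev/zero | base64 -w 0 | wc -l   # no wrapping, no newline: 0
```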

Thanks,
Erik





bug#21325: ls : feature request --width=zero

2015-08-23 Thread Erik Auerswald
Hi,

On Sun, Aug 23, 2015 at 04:35:06AM -0700, Paul Eggert wrote:
> Erik Auerswald wrote:
> > an explicit Inf keyword is still better than some number that
> > relies on system limits
>
> With the latest patch, there are no system limits; you can use as
> big a number as you like.  I'm aware of the "use 0 to denote
> infinity" tradition, but I'm still leery of using a valid width
> value to denote infinity; I'd rather use some other string.

Numbers between 32 bit SIZE_MAX and 64 bit SIZE_MAX will show
differing behavior between 32 and 64 bit platforms (and data models).
In practice this should be irrelevant, but it might result in very
obscure failures. Explicitly setting the width to infinity, that is
not adding any line breaks, avoids that.

I would not consider 0 to be a valid width value, and the ls command
agrees:

$ ls -w0
ls: invalid line width: 0
$ ls --version | head -n1
ls (GNU coreutils) 8.21

As always, the requested new command behavior can be achieved by a bit
of scripting (be it shell or awk or whatever). I do not even argue that
enabling ls output in a single line is useful, Beco did. I just try to
give another perspective on the request.

IMHO explicitly specifying that no width limit should be assumed is
better than implicitly interpreting some implementation defined range
of numbers as infinity. Beco's suggestion of 0 has precedent and seems
more obvious to me than requiring a special keyword.

I hope this email documents that there are differing views on the matter,
and thank you for reading and considering my point of view.

Erik





bug#20936: suggestion for a 'wart-ish' extension off of 'sort'

2015-06-30 Thread Erik Auerswald
Hi,

On Tue, Jun 30, 2015 at 12:28:09AM -0700, Linda Walsh wrote:
> I admit the ability to show a summary line might not be the first
> thing you'd think a pure-sorting utility might do, but it would be
> awfully handy if sort had a 'Numeric sum' option (-N -- preferred
> '-s', but it's already taken) to go with the -h sorting:
>
> ala:
>
>  du -sh *|sort -h|tail

Why not use 'du -shc * | sort -h | tail -n11'?
The total produced by du will sort after all the individual parts.

Thanks,
Erik





bug#20936: suggestion for a 'wart-ish' extension off of 'sort'

2015-06-30 Thread Erik Auerswald
Hi,

On Tue, Jun 30, 2015 at 02:35:17AM -0700, Linda Walsh wrote:
 
> On 6/30/2015 12:46 AM, Erik Auerswald wrote:
>
> > > du -sh *|sort -h|tail
> > Why not use 'du -shc * | sort -h | tail -n11'?
> > The total produced by du will sort after all the individual parts.
> Good idea  -- didn't know about '-c', but two things, 1 troubling,
> the other a confusion.  If you have a dir named 'total' it can be
> slightly confusing:

You'll always know that the last total is the total of the above. ;-)

> Ishtar:/tmp/dutest> du -shc * |sort -h|tail
> 1.5M    sperl,v
> 3.6M    total
> 5.0M    total
> Ishtar:/tmp/dutest> du -sh * |hsort -s|tail
> 1.5M    sperl,v
> 3.6M    total
> -
> 5.1M    TOTAL
>
> But a more obvious problem is 'du -shc' seems to be coming up with
> the wrong number -- i.e. 1.5+3.6 = 5.1, not 5.0.

Those are probably rounding errors that du avoids, but that hsort can no
longer avoid.

Thanks,
Erik





bug#20745: I would like to make a request for the sort command

2015-06-08 Thread Erik Auerswald
Hi,

On Mon, Jun 08, 2015 at 10:51:59AM +0100, Stephane Chazelas wrote:
> 2015-06-08 11:16:37 +0200, Erik Auerswald:
> [...]
> > FWIW I use 'sort' to sort IPv4 addresses in my ping_scan[1] script.
> >
> > The info documentation for sort provides another example, log files
> > sorted by IP address and time stamp. That specific example even needs
> > two runs of sort, because sort lacks built-in support for IP addresses.
> >
> > While IPv4 addresses are readily sorted by sort -s -t '.' -k 1,1n -k
> > 2,2n -k 3,3n -k 4,4n, this is not the case for IPv6 addresses. Having
> > an option for sorting IP addresses that supports both IPv4 and IPv6
> > seems like a useful addition to me.
> [...]
>
> I'm not even sure having a tool just for that specific task
> would make sense though. Here, it sounds more like a job for a
> high level language like perl/python... (what if I want to sort
> on roman numerals now, week day names, astrological signs...)

Well, IP addresses are often encountered on Internet connected computers.
;-)

> for instance, here using yash syntax (you can use named pipes or
> possibly coprocs with some other shells):
>
> ip2hex() {
>   perl -MSocket=:all -nle '
>     print unpack "(H2)*", inet_pton(/:/?AF_INET6:AF_INET, $_)'
> }
>
> mysort() {
>   (
>     exec 3>>|4
>     tee /dev/fd/3 |
>       cut -f1 3>&- | ip2hex 3>&- |
>       paste - /dev/fd/4 3>&-
>   ) | sort | cut -f2-
> }
>
> mysort << EOF
> 127.0.0.1   blah
> 6.6.6.6 foo
> ::1 bar
> EOF
>
> That's still quite awkward. A shame that piping capabilities in
> shells don't extend to more complex scenarii where the output
> of some command can be piped to two others the output of which
> can be merged back easily.
>
> named pipes can be used for that, but cleaning up and
> restricting access to them makes their usage quite messy.
>
> Of course, the whole thing can be done with perl.

I'd say the above is a very good reason for implementing the asked for
feature in sort.

Thanks,
Erik
-- 
Design your product to please the users.
-- Paul Graham





bug#20745: I would like to make a request for the sort command

2015-06-08 Thread Erik Auerswald
Hi,

On Fri, Jun 05, 2015 at 01:57:33PM -0600, Eric Blake wrote:
> On 06/05/2015 01:35 PM, Silverman, Jeffrey X. -ND wrote:
>
> >> This was previously discussed, and while has merit
> >> at the time it was thought not important enough to add:
> >>
> >>  http://www.gnu.org/software/coreutils/rejected_requests.html
> >>  http://lists.gnu.org/archive/html/coreutils/2011-06/msg00082.html
> >
> > I would like to join the debate.  Would you entertain that, or is the
> > issue settled.
> >
> > If I wrote the code, would you include it?
>
> [...]
> justification on why people want sorted IP addresses.

FWIW I use 'sort' to sort IPv4 addresses in my ping_scan[1] script.

The info documentation for sort provides another example, log files
sorted by IP address and time stamp. That specific example even needs
two runs of sort, because sort lacks built-in support for IP addresses.

While IPv4 addresses are readily sorted by sort -s -t '.' -k 1,1n -k
2,2n -k 3,3n -k 4,4n, this is not the case for IPv6 addresses. Having
an option for sorting IP addresses that supports both IPv4 and IPv6
seems like a useful addition to me.
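
For reference, the IPv4 invocation in action on a few made-up addresses:

```shell
# Each dot-separated field is compared numerically, most significant first.
printf '%s\n' 10.0.0.10 9.255.0.1 10.0.0.2 192.168.1.1 |
    sort -s -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n
# prints 9.255.0.1, 10.0.0.2, 10.0.0.10, 192.168.1.1 (one per line)
```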

Thanks,
Erik

[1] https://www.unix-ag.uni-kl.de/~auerswal/ping_scan/
-- 
Be water, my friend.
-- Bruce Lee





bug#20553: 'echo -e' does not escape backslash correctly

2015-05-12 Thread Erik Auerswald
Hi,

On Mon, May 11, 2015 at 11:17:34PM +0100, Stephane Chazelas wrote:
> 2015-05-11 23:50:25 +0200, Jo Drexl (FFGR-IT):
> > Hi guys,
> > I had to write a Windows bat file for twentysomething users and - as
> > Linux geek - wrote a small Bash script for it. The code in question is
> > as follows:
> >
> > echo -e net use z: srv\\aqs /persistent:no /user:%USERNAME%
> > $BG_PASSWD\r
> [...]
>
> If that's a bash script, then that has nothing to do with GNU
> coreutils as bash has its own builtin version of echo.
>
> In any case, there's no bug here. and GNU coreutils echo or the
> bash one behave the same.
>
> \ is used as an escape character both for the bash language
> within double quotes, and for echo -e.
>
> echo -e
>
> Passes 3 arguments to echo: echo, -e and \\

So you need to add another handful of \ characters:

$ echo -e 
\\
$ /bin/echo -e 
\\
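
An alternative that avoids counting backslashes is to combine single
quotes with printf, as in this sketch. The server and share names are
made-up placeholders, not the ones from the original script:

```shell
# Inside single quotes the shell leaves backslashes alone, and printf's
# %s conversion prints its argument without interpreting escapes.
# Only the format string is interpreted, which supplies the CR LF pair.
printf '%s\r\n' 'net use z: \\server\share /persistent:no'
```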

Erik





bug#20474: tr command

2015-05-01 Thread Erik Auerswald
Hi,

On Thu, Apr 30, 2015 at 11:10:52AM -0600, Eric Blake wrote:
> On 04/30/2015 10:31 AM, Joseph Piette wrote:
> > When transferring files from the Windows environment to the Linux
> > environment we execute a script to remove the \cr characters. The script
> > performs a simple
> >
> > tr -d '\r' < input > output
> > [...]
>
> Another thing to try: the 'dos2unix' command exists in many Linux
> distros as a way to automate the work without having to figure out the
> commands to run yourself.

Many other Linux distros provide the 'recode' command instead.
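
Where neither tool is installed, the tr approach can be wrapped in a
small function. This is only a sketch, and the name strip_cr is made up:

```shell
# Hypothetical helper: copy a file, dropping every carriage return.
# Input and output must be different files, because the output
# redirection truncates its target before tr starts reading.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}
```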

Erik





bug#20354: [feature request] ln with command line arguments in reverse order

2015-04-17 Thread Erik Auerswald
On Fri, Apr 17, 2015 at 01:45:02PM +0100, Pádraig Brady wrote:
> On 17/04/15 12:45, Erik Auerswald wrote:
> > On Fri, Apr 17, 2015 at 01:12:01PM +0200, Bernhard Voelker wrote:
> >> On 04/17/2015 10:39 AM, Ma Jiehong wrote:
> >>> Currently, 'cp', 'mv' and 'ln' share the same basic syntax, that is to
> >>> say the following:
> >>>
> >>> cp [OPTION]  SOURCE DEST
> >>> mv [OPTION] SOURCE DEST
> >>> ln [OPTIONS] TARGET LINK_NAME
> >> [...]
> >> output would be the better way.
> >
> > I'd say that using TARGET instead of SOURCE creates confusion that would be
> > avoided by using SOURCE and DEST as with cp and mv.
>
> Not really, as one could still consider that
> DEST was the destination of a symlink.
>
> How I think about it is:
>
>   cp [OPTION]  EXISTING NEW
>   mv [OPTION]  EXISTING NEW
>   ln [OPTIONS] EXISTING NEW

That's good wording.

Thanks,
Erik





bug#20354: [feature request] ln with command line arguments in reverse order

2015-04-17 Thread Erik Auerswald
On Fri, Apr 17, 2015 at 01:12:01PM +0200, Bernhard Voelker wrote:
> On 04/17/2015 10:39 AM, Ma Jiehong wrote:
> > Currently, 'cp', 'mv' and 'ln' share the same basic syntax, that is to say
> > the following:
> >
> > cp [OPTION]  SOURCE DEST
> > mv [OPTION] SOURCE DEST
> > ln [OPTIONS] TARGET LINK_NAME
> >
> > Which is the same exact rule, and is consistent.
> > [...]
> > In this case, the command would act like this:
> > ln --reverse-order LINK_NAME TARGET
>
> Adding an option to reverse the two may have its merits, but I guess this
> extra flexibility would only confuse the users even more.

If you do not know the original order beforehand, you do not know the
--reverse-order either. IMHO this option does not help.

> The situation would be better if the target would be an operand to that
> option, similar to mv's --target-directory=DIRECTORY option.

Careful here, --target-directory specifies a DESTination, while ln's TARGET
means SOURCE.

> However, I think this would just bloat the code for not much new
> functionality, and I'm convinced that a good translation for TARGET and
> LINK_NAME in --help output would be the better way.

I'd say that using TARGET instead of SOURCE creates confusion that would be
avoided by using SOURCE and DEST as with cp and mv.

Thanks,
Erik





bug#20354: [feature request] ln with command line arguments in reverse order

2015-04-17 Thread Erik Auerswald
On Fri, Apr 17, 2015 at 03:07:50PM +0200, Bernhard Voelker wrote:
> On 04/17/2015 02:45 PM, Pádraig Brady wrote:
> >   ln [OPTIONS] EXISTING NEW
>
> I still think this is a translation issue.
> And I don't think the synopsis has to look the same as for
> cp and mv.  If you really want it to be changed, What about
>
>   ln [OPTIONS] LINK_TARGET LINK_NAME
>
> ?

IMHO changing TARGET to LINK_TARGET does not help.

Why should the synopsis for similar commands look different?

cp -l SOURCE DEST
ln TARGET LINK_NAME

cp -s SOURCE DEST
ln -s TARGET LINK_NAME

Perhaps new users should be told to stay away from ln and always use cp?
*SCNR*

Anyway, I have seen the confusion about ln usage by inexperienced
users and just wanted to chime in. The original poster's suggestion of
a --reverse-order option would make matters worse IMHO.

Thanks,
Erik





bug#19243: echo comand bug

2014-12-02 Thread Erik Auerswald
Hi,

On Mon, Dec 01, 2014 at 11:12:51AM -0700, Eric Blake wrote:
> On 12/01/2014 10:56 AM, Chema F. Ledesma wrote:
> >
> > If you execute echo  it does something strange
> > repeating the last command before echo comand.
>
> Thanks for the report.  However, this is not a bug in 'echo', but a
> feature of your shell.

And because of this you can use single quotes instead of double quotes to
print those exclamation marks:

$ echo ''
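
A sketch of the quoting rule, assuming the argument in question was
something like !! (which an interactive bash expands to the previous
command when it appears unquoted or inside double quotes):

```shell
# Single quotes suppress history expansion, so this prints the two
# characters literally in interactive and non-interactive shells alike.
echo '!!'

# printf with a single-quoted operand behaves the same way.
printf '%s\n' '!!'
```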


Thanks,
Erik
-- 
Be water, my friend.
-- Bruce Lee





bug#15945: chown: Does now allow setting user and users login group with numerical user ID

2013-11-21 Thread Erik Auerswald
Hi,

On Thu, Nov 21, 2013 at 08:53:47AM -0700, Eric Blake wrote:
 On 11/21/2013 04:50 AM, Tormen wrote:
  
  I think I just found a bug in chown... \o/ ;)
  
  I tried:
  chown 1001: /tmp/bla
  
  Leading to:
  chown: invalid spec: `1001:'
 
 Drop the trailing colon.
 
  ... it should be a bug except if there is a technical detail I am
  missing here.
 
 If you provide a colon, you MUST also provide a group spec.  Per 'chown
 --help', the syntax is:
  chown [OPTION]...  [OWNER][:[GROUP]] FILE...
 so these are valid:
  chown 1001 /tmp/bla        # change owner only
  chown :1001 /tmp/bla       # change group only
  chown 1001:1001 /tmp/bla   # change both
 but this is invalid:
  chown 1001: /tmp/bla   # '' is not a valid group

Should this be changed to 'chown [OPTION]...  [OWNER][:GROUP] FILE...'
then?

Erik





bug#15092: Dirname Bug

2013-08-14 Thread Erik Auerswald
Hello Axel,

On Wed, Aug 14, 2013 at 09:16:14AM +0200, Axel Spallek wrote:
 the following throw errors:

 dirname --to-0040257282759-in.wav
 dirname "--to-0040257282759-in.wav"
 dirname '--to-0040257282759-in.wav'

 IMHO at least the last two ones should work.

If the arguments to a program start with a - or --, they are assumed to be
options. You can use -- to stop option processing:

$ dirname -- --to-0040257282759-in.wav
.

Quoting the arguments as in your last two examples does not affect this, as
the shell does not interpret leading - characters in the command line, the
program called does.
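
As a small sketch of that advice, a wrapper can always pass -- before
the operand, so that file names starting with a dash are never parsed
as options. The function name safe_dirname and the second path are
made up for illustration:

```shell
# Hypothetical helper: end option parsing before the operand.
safe_dirname() {
  dirname -- "$1"
}

safe_dirname --to-0040257282759-in.wav   # prints "."
safe_dirname ./-leading/dash.wav         # prints "./-leading"
```

Quoting "$1" here only protects against word splitting; it is the --
that keeps dirname from seeing an option.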

HTH,
Erik





bug#13394: Misalignment for seq -w

2013-01-09 Thread Erik Auerswald

Hi,

On 01/09/2013 11:34 AM, Bernhard Voelker wrote:

On 01/09/2013 11:14 AM, Marcel Böhme wrote:


There are the following problems with the -w parameter of the seq tool:
[...]


Hmm, according to the TEXI manual, the FIRST number should also use
a fixed point decimal representation when the -w option is used:
[...]
But that leaves the question open if there's a reason for this.
I.e. if it's just documented behavior, a requirement of some
standard or due to compatibility reasons.


That seems to be just documented behavior, since seq is not standardized 
by POSIX and other seq implementations ([1],[2],[3]) don't document 
this. On the contrary, a common example is 'seq -w 0 .05 .1'.


This example works fine with GNU seq:

$ seq -w 0 .05 .1
0.00
0.05
0.10

Even when counting to negative numbers:

$ seq -w 0 -.05 -.1
00.00
-0.05
-0.10

Starting with a negative number with a fractional step size breaks equal 
width for non-negative numbers:


$ seq -w -1 .5 1
-1.0
-0.5
0.0
0.5
1.0

$ seq --version | head -n1
seq (GNU coreutils) 8.13
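
One way to spot such a misalignment mechanically is to compare line
lengths. This is only a sketch; the helper name all_same_width is made
up, and it checks the width of whatever text it is given, not seq
itself:

```shell
# Succeeds if every input line has the same length as the first one.
all_same_width() {
  awk 'NR == 1 { w = length($0) }
       length($0) != w { bad = 1 }
       END { exit bad + 0 }'
}

# Possible use: seq -w -1 .5 1 | all_same_width || echo "not equal width"
```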

HTH,
Erik

[1] http://man.cat-v.org/unix_8th/1/seq
[2] http://man.cat-v.org/plan_9/1/seq
[3] http://www.freebsd.org/cgi/man.cgi?query=seq&manpath=FreeBSD+9.0-RELEASE





bug#13394: Misalignment for seq -w

2013-01-09 Thread Erik Auerswald

On 01/09/2013 01:05 PM, Pádraig Brady wrote:

On 01/09/2013 11:01 AM, Erik Auerswald wrote:

Hi,

On 01/09/2013 11:34 AM, Bernhard Voelker wrote:

On 01/09/2013 11:14 AM, Marcel Böhme wrote:


There are the following problems with the -w parameter of the seq tool:
[...]


Hmm, according to the TEXI manual, the FIRST number should also use
a fixed point decimal representation when the -w option is used:
[...]
But that leaves the question open if there's a reason for this.
I.e. if it's just documented behavior, a requirement of some
standard or due to compatibility reasons.


That seems to be just documented behavior, since seq is not
standardized by POSIX and other seq implementations ([1],[2],[3])
don't document this. On the contrary, a common example is 'seq -w 0
.05 .1'.

This example works fine with GNU seq:

$ seq -w 0 .05 .1
0.00
0.05
0.10

Even when counting to negative numbers:

$ seq -w 0 -.05 -.1
00.00
-0.05
-0.10

Starting with a negative number with a fractional step size breaks
equal width for non-negative numbers:

$ seq -w -1 .5 1
-1.0
-0.5
0.0
0.5
1.0

$ seq --version | head -n1
seq (GNU coreutils) 8.13


Looks like a bug. I'll fix with:

diff --git a/src/seq.c b/src/seq.c
index e1b467c..3eb53f8 100644
--- a/src/seq.c
+++ b/src/seq.c
@@ -332,6 +332,8 @@ get_default_format (operand first, operand step, operand last)
last_width--; /* don't include space for '.' */
if (last.precision == 0 && prec)
last_width++; /* include space for '.' */
+ if (first.precision == 0 && prec)
+ first_width++; /* include space for '.' */
size_t width = MAX (first_width, last_width);
if (width <= INT_MAX)
{


The patch looks plausible. ;-)

Thanks,
Erik





bug#13391: dd silently ignores lseek error

2013-01-08 Thread Erik Auerswald
Hi,

On Wed, Jan 09, 2013 at 01:14:22AM +, Pádraig Brady wrote:
 On 01/08/2013 08:55 PM, Paul Eggert wrote:
 On 01/08/13 10:11, Neil Klopfenstein wrote:
 Note that it begins reading at the _beginning of the ar file_ -- the 'skip'
 argument has failed silently.

 But the 'skip' hasn't failed.  It's merely being implemented via 'read'
 rather than via 'lseek'.  The records are being skipped correctly.

 It might be useful to give dd a new option, which causes it
 to insist on lseeking rather than reading in cases like these,
 and to report an error if the lseek fails.

 I had a look around for a tool to verify
 that a file/device supports the seek operation
 and couldn't find one.
 So this seems like useful functionality.
 Worth applying the attached?
 [...]
 
 * src/dd.c: Add the new O_SEEKABLE flag.
 (main): Verify leek() works if O_SEEKABLE is set.
 ^^lseek()

 [...]
/* else file_size && offset > OFF_T_MAX or file ! seekable */
  
 +
Stray newline?

do
 [...]

Besides these nitpicks the patch looks good to me.

HTH,
Erik





bug#13295: Possible bug - tr utility

2012-12-28 Thread Erik Auerswald

Hi Randy,

On 12/28/2012 06:37 PM, Killen, Randy wrote:

Hello -

I encountered the situation shown below so thought that I would report it to 
see if it might be a bug or is expected behavior.  Please let me know if you 
need additional information.

Randy


$
$ echo something | tr [:lower:] [:upper:]
SOMETHING
$ echo something | tr '[:lower:]' '[:upper:]'
SOMETHING
$
$ touch l
$ echo something | tr [:lower:] [:upper:]
tr: misaligned [:upper:] and/or [:lower:] construct
$ echo something | tr '[:lower:]' '[:upper:]'
SOMETHING
$ rm l
$
$ touch u
$ echo something | tr [:lower:] [:upper:]
tr: misaligned [:upper:] and/or [:lower:] construct
$ echo something | tr '[:lower:]' '[:upper:]'
SOMETHING
$ rm u
$
$ touch l
$ touch u
$ echo something | tr [:lower:] [:upper:]
something
$ echo something | tr '[:lower:]' '[:upper:]'
SOMETHING
$ rm l
$ rm u


This is expected behavior, caused by a lack of quoting: the shell 
(Bash) interprets [...] as a wildcard pattern for file name globbing 
(see glob(7)). If the 'nullglob' option of the shell is disabled (use 
'shopt nullglob' to display the current setting), a wildcard that 
matches no files is kept as is. Thus the wildcard [:lower:] is replaced 
by l if a file of that name exists, [:upper:] is replaced by u 
likewise, and each is passed on unchanged if no matching file exists.


Quoting the special characters '[' and ']' by writing '[:lower:]' and 
'[:upper:]' (including the quotes) keeps the shell from interpreting 
them as file globbing wildcards. Therefore, you should always quote 
character classes that are meant as arguments to a program.
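
The effect can be reproduced in a scratch directory. This is a sketch
only; the function name glob_demo is made up, and it assumes mktemp is
available:

```shell
# Show how an unquoted [:lower:] is expanded once a file named "l" exists.
glob_demo() {
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    touch l
    # Unquoted: the shell matches the bracket pattern against the file "l".
    echo unquoted: [:lower:]
    # Quoted: the pattern is passed through literally.
    echo quoted: '[:lower:]'
  )
  rm -rf "$dir"
}

glob_demo
```

The first line printed is "unquoted: l", the second is
"quoted: [:lower:]".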


HTH
Erik





bug#12301: tail -F ZFS

2012-08-29 Thread Erik Auerswald
Hi,

On Wed, Aug 29, 2012 at 11:33:13AM +0200, Bernhard Voelker wrote:
 On 08/29/2012 11:12 AM, Jim Meyering wrote:
  Can we be sure that 0x2fc12fc1 is used for all ZFS
  implementations?
 
  If there end up being two or more magic numbers for the same file
  system (or ZFS variants going by new names), we'll adapt.
 
 Ok, that sounds good. Thanks.

How can I check this magic number on a Solaris 10 system with ZFS?
Neither GNU coreutils nor a C compiler installed. :-(

$ uname -a
SunOS demchpux 5.10 Generic_13-08 sun4v sparc SUNW,SPARC-Enterprise-T5120
$ df -n /
/  : zfs
$ df -g /
/  (/ ):   131072 block size   512 frag size
   0 total blocks   82144537 free blocks 82144537 available   82388038 total files
82144537 free files 67174412 filesys id
 zfs fstype   0x0004 flag 255 filename length

Regards,
Erik





bug#11994: [PATCH] Re: bug#11994: sort(1) doesn't say SEE ALSO uniq(1)

2012-07-20 Thread Erik Auerswald
Hi,

On Fri, Jul 20, 2012 at 01:54:21AM +0800, jida...@jidanni.org wrote:
 sort(1) doesn't say SEE ALSO uniq(1), and vice versa.

The small attached patch adds that.

Erik
-- 
Golden rule #12: When the comments do not match the code, they probably
 are both wrong.
-- Steven Rostedt
From 8031b27b75f7f668e3ac4989297ce0a0f7e84e52 Mon Sep 17 00:00:00 2001
From: Erik Auerswald auers...@unix-ag.uni-kl.de
Date: Sat, 21 Jul 2012 00:48:17 +0200
Subject: [PATCH] doc: mention uniq(1) in sort(1) man-page and vice versa

* man/sort.x: Add SEE ALSO section with entry uniq(1).
* man/uniq.x: Add sort(1) to SEE ALSO section.
---
 man/sort.x |2 ++
 man/uniq.x |2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/man/sort.x b/man/sort.x
index 5c171dd..b0d4a1a 100644
--- a/man/sort.x
+++ b/man/sort.x
@@ -2,3 +2,5 @@
 sort \- sort lines of text files
 [DESCRIPTION]
 .\" Add any additional description here
+[SEE ALSO]
+uniq(1)
diff --git a/man/uniq.x b/man/uniq.x
index 98a95f9..013cef3 100644
--- a/man/uniq.x
+++ b/man/uniq.x
@@ -3,4 +3,4 @@ uniq \- report or omit repeated lines
 [DESCRIPTION]
 .\" Add any additional description here
 [SEE ALSO]
-comm(1), join(1)
+comm(1), join(1), sort(1)
-- 
1.7.10.4



bug#10355: Add an option to {md5,sha*} to ignore directories

2011-12-23 Thread Erik Auerswald

Hi Gilles,

On 12/23/2011 02:45 PM, Gilles Espinasse wrote:

I was using a way to check md5sum on a lot of file using
  for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
/$myfile >> ${ALLFILES}.md5; fi; done

But this is slow, compared with the xargs md5sum way.
time (for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
/$myfile >> ${ALLFILES}.md5; fi; done)

real    0m26.907s
user    0m40.019s
sys     0m10.253s

This is faster using xargs md5sum.
time (sed -e '/.\/$/d' -e 's|^.|/|g' ${ALLFILES} | xargs md5sum
> ${ALLFILES}.md5)

md5sum: /etc/ipsec.d/cacerts: Is a directory
md5sum: /etc/ipsec.d/certs: Is a directory
md5sum: /etc/ipsec.d/crls: Is a directory
md5sum: /etc/ppp/chap-secrets: No such file or directory
md5sum: /etc/ppp/pap-secrets: No such file or directory
md5sum: /etc/squid/squid.conf: No such file or directory

real    0m1.176s
user    0m0.780s
sys     0m0.400s

That run mostly 30 times faster.
In the above example, I already skipped most of the directories in the list,
removing lines that end with / but not all directories in my list match on
that condition.


How do you create the list of files to check?
You could use find $DIR -type f to list regular files only.
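
A minimal sketch of that suggestion, assuming the files all live below
one directory (the function name checksum_regular_files is made up).
Using -exec ... + batches many files into few md5sum calls, similar to
xargs, and find never hands a directory to md5sum:

```shell
# $1: directory to scan; prints one md5sum line per regular file.
checksum_regular_files() {
  find "$1" -type f -exec md5sum {} +
}

# Possible use: checksum_regular_files /etc > etc.md5
```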

Erik





bug#10079: [PATCH] ln: fix position of --backup values description

2011-11-19 Thread Erik Auerswald
From a8b44ddbfa9b1e75dbaabbe69bb535ed045fbf0a Mon Sep 17 00:00:00 2001
From: Erik Auerswald auers...@unix-ag.uni-kl.de
Date: Sat, 19 Nov 2011 22:07:29 +0100
Subject: [PATCH] ln: fix position of --backup values description

* src/ln.c (usage): A paragraph describing interactions of -s
with -L and -P somehow snuck in between the description of the
--backup option and the values used to control it. Fix this by
moving the value description up.
---
 src/ln.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/ln.c b/src/ln.c
index 3e63bea..88737ed 100644
--- a/src/ln.c
+++ b/src/ln.c
@@ -385,17 +385,17 @@ The version control method may be selected via the --backup option or through\n\
 the VERSION_CONTROL environment variable.  Here are the values:\n\
 \n\
 "), stdout);
-  printf (_("\
-Using -s ignores -L and -P.  Otherwise, the last option specified controls\n\
-behavior when the source is a symbolic link, defaulting to %s.\n\
-\n\
-"), LINK_FOLLOWS_SYMLINKS ? "-L" : "-P");
   fputs (_("\
   none, off   never make backups (even if --backup is given)\n\
   numbered, t make numbered backups\n\
   existing, nil   numbered if numbered backups exist, simple otherwise\n\
   simple, never   always make simple backups\n\
+\n\
 "), stdout);
+  printf (_("\
+Using -s ignores -L and -P.  Otherwise, the last option specified controls\n\
+behavior when the source is a symbolic link, defaulting to %s.\n\
+"), LINK_FOLLOWS_SYMLINKS ? "-L" : "-P");
   emit_ancillary_info ();
 }
   exit (status);
-- 
1.7.7.3






bug#10030: grep: strange behavior with '-' in character class

2011-11-13 Thread Erik Auerswald

Hi,

On 11/12/2011 08:05 PM, Thomas Dignan wrote:

echo /lib64/ld-linux-x86-64.so.2 | grep -o '[a-zA-Z\/0-9\-\.]*'


You probably want to use something like

 echo /lib64/ld-linux-x86-64.so.2 | grep -o '[-a-zA-Z/0-9.]*'

Note: The '-' should be the first character inside a character class if 
it is not used to describe a range.
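
For illustration, the suggested pattern can be wrapped in a small
filter (the name extract_path_chars is made up). With '-' placed first
it is taken literally, and '/' and '.' need no backslash inside the
brackets:

```shell
# Print each run of letters, digits, '-', '/' and '.' found on stdin.
extract_path_chars() {
  grep -o '[-a-zA-Z/0-9.]*'
}

echo /lib64/ld-linux-x86-64.so.2 | extract_path_chars
```

This prints the path unchanged, since every character in it belongs to
the class.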


Erik





bug#9762: tac fails when given multiple non-seekable inputs due to misuse of mkstemp()

2011-10-15 Thread Erik Auerswald
Hi,

On Sat, Oct 15, 2011 at 01:40:17PM -0700, Ambrose Feinstein wrote:
 Trivial reproduction:
 
 $ true | tac - -
 tac: cannot create temporary file in `/tmp': Invalid argument
 
 This is present in coreutils 8.14.

This is present in coreutils 8.13 as well:

$ tac <(echo a) <(echo b)
tac: cannot create temporary file in `/tmp': Invalid argument
a
$ tac --version
tac (GNU coreutils) 8.13
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Jay Lepreau and David MacKenzie.


Erik





bug#8736: chmod -p --parents

2011-05-27 Thread Erik Auerswald
Hi,

On Thu, May 26, 2011 at 10:13:05PM +0200, francky.l...@telenet.be wrote:
 
 In the past, I was an advocate of the -p --parents option for 
 mkdir. By now this is realised. Now I'm doing the same for chmod. 
 [...]
 I want to be able to execute the following: 
 
 chmod a+rx -p ~/dir1/dir2/dir3/file.ext 

I like the proposal.

 It should be more refined however: the dirs 
 should be rx, but the file only r. 

You can use a capital X to automatically handle the directory case. But
please take a look at the docs for the _exact_ meaning of X.
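
A sketch of what the capital X does, using the existing -R option
rather than the proposed -p (the function name open_for_reading is
made up). Directories gain r and x for everybody; regular files gain
only r unless they already carry an execute bit:

```shell
# $1: top of the tree to make readable for everybody.
open_for_reading() {
  chmod -R a+rX -- "$1"
}

# Possible use: open_for_reading ~/dir1
```

Note that this walks down from the named directory, whereas the
proposal is about the parents of a given file.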

Regards,
Erik
-- 
In the beginning, there were not enough colors.
-- Guy Keren





bug#8575: Shell Script in Red Hat Enterprise Linux Server release 5.5 (Tikanga)

2011-04-28 Thread Erik Auerswald

Hi,

On 04/28/2011 10:42 AM, Syed Nizamuddin wrote:

I get the following error .

basename: invalid option -- b
Try `basename --help' for more information.
basename: missing operand

I have basename used as

CMDE=`\basename $0 .sh`
echo $basename is $CMDE

Doesn't o/p anything. Please


Try

CMDE=`\basename -- $0 .sh`
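
The error message suggests that $0 starts with a dash (for example
-bash in a login shell); that is an assumption, the report does not
say so. A small sketch with a made-up helper name and a made-up script
path:

```shell
# "--" ends option parsing, so a name such as "-bash" is taken as an operand.
script_name() {
  basename -- "$1" .sh
}

script_name -bash                  # prints "-bash"
script_name /opt/tools/backup.sh   # prints "backup"
```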

Regards,
Erik





bug#8500: util: where am i

2011-04-15 Thread Erik Auerswald

Hi,

please don't top post, thanks. And keep on reading for inline comments. ;-)

On 04/15/2011 09:33 AM, Panagiotis Tsiamis wrote:

2011/4/15 Bjartur Thorlacius svartma...@gmail.com

On 4/14/11, Panagiotis Tsiamis ptsia...@gmail.com wrote:

Request for adding one more feature to the utility whoami.

The feature should be callable as
where am i or whereami

And should locate:
a) System hostname


hostname
uname -n


b) ip of the system


Bob had an excellent example:
ip addr | awk '/inet/{print $2}'

Of course this might be local and private addresses, not the IP address 
used for your internet connection.



c) current working directory


pwd


d) anything else that could be useful for identifying where you are located
currently.


Most GNU/Linux distributions configure the shell prompt to display the 
usually helpful info, i.e. user name, host name, current working 
directory. Some people use color (or even blinking) to highlight working 
as a privileged user (root).



I doubt that should be included in coreutils. I could see the utility
of such a utility, and think packagers of SSH servers could well
suggest it, but I can more easily imagine a number of installations
where `hostname;pwd` would be as good, if not better.


Most shell configurations provide this info all the time.


I don't disagree with your opinion that the ssh utility could do this
job (it could possibly also keep track of the systems that you recently
connected to), but together with ssh there are also rsh/rlogin, telnet,
and other remote connection software that can be used from the CLI.


You can use 'who', 'w', 'last', 'pinky' or 'finger' to find out from 
where you (and others) are connected (and some additional info as well).



I am discussing the possibility of
integrating such a command that keeps track of recent systems, the current
system, and the system connection path (hostA-hostB-hostC) and distributes
this information accordingly to each system you connect to or disconnect
from. If anyone has further ideas or is interested in a tool like this, I
hope they will reply.


This kind of tracking functionality should be strictly opt-in.

All in all I don't see a need for a 'whereami' utility at all.
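
For those who still want it, the information from a) and c) plus the
user name fits in a one-line shell function. This is just a sketch
built from commands mentioned in this thread plus id; the output
format is made up:

```shell
# Print user@host:directory, much like a typical shell prompt.
whereami() {
  printf '%s@%s:%s\n' "$(id -un)" "$(uname -n)" "$(pwd)"
}

whereami
```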

Regards,
Erik





bug#8391: chmod setuid setguid bits

2011-04-01 Thread Erik Auerswald
Hi,

On Thu, Mar 31, 2011 at 11:54:26AM -0700, Paul Eggert wrote:
 On 03/31/2011 11:25 AM, Christian wrote:
  and using 0755 is explicit enough, isn't it ?
 
 Unfortunately it's not that simple, as having 0755 mean
 something different from 755 would violate the principle
 of least surprise.

I am very surprised that explicitly specifying 0 for SUID, SGID, sticky is
silently ignored.

 Please see the thread starting at
 http://lists.gnu.org/archive/html/bug-coreutils/2006-07/msg00124.html.

Quoting from that message:
   set-user-ID and set-group-ID bits instead of clearing them.  If
   you want to clear the bits you can mention them explicitly, e.g.,
   `chmod 0755 DIR' and `chmod a-s,u=rwx,go=rx DIR'.
 ^^

How could one be more explicit?

Paul Eggert seemed to agree:
http://lists.gnu.org/archive/html/bug-coreutils/2006-07/msg00125.html
   However, I would argue that this is more confusing than
   what we've got right now, since chmod 0755 DIR clearly requests to
   clear the setgid bit.

Jim Meyering disagreed:
http://lists.gnu.org/archive/html/bug-coreutils/2006-07/msg00128.html
   Treating that leading '0' as significant violates the principle of
   least surprise.  Not to say that everyone who uses chmod(1) even knows
   what an octal number is, but enough of us are used to that leading zero
   being insignificant that I think it should remain negligible.
 [...]
   To me, it's not a clear request to clear the setgid bit.

Eric Blake suggested a weird looking (to me anyway ;) solution:
http://lists.gnu.org/archive/html/bug-coreutils/2006-07/msg00130.html
   Should we document chmod 00500 dir as an explicit way to clear the
   bit, or just require a textual mode string?

Furthermore, it was found that vendor's implementations of chmod surprise
in different ways.

I'd suggest adding a warning if chmod (and possibly other utils) encounters
an octal mode number with leading 0, as that might mean 'octal' or 'zero'.
I'd definitely prefer interpreting the leading 0 as a zero for the
SUID/SGID/sticky bits, but coreutils' viewpoint obviously differs...
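
Until then, the symbolic form quoted above avoids the question of what
a leading 0 means. A minimal sketch (the function name reset_dir_mode
is made up):

```shell
# Clear set-user-ID and set-group-ID and set rwxr-xr-x, spelled out
# symbolically so that no leading-zero interpretation is involved.
reset_dir_mode() {
  chmod a-s,u=rwx,go=rx -- "$1"
}

# Possible use: reset_dir_mode DIR
```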

Regards,
Erik
-- 
If you don't know what you are doing, advance designs will not help.
-- Eric Allman





bug#8391: chmod setuid setguid bits

2011-04-01 Thread Erik Auerswald
Hi,

On Thu, Mar 31, 2011 at 02:15:36PM -0600, Eric Blake wrote:
 On 03/31/2011 01:58 PM, Christian wrote:
  Am 31.03.2011 20:54, schrieb Paul Eggert:
  On 03/31/2011 11:25 AM, Christian wrote:
  and using 0755 is explicit enough, isn't it ?
  Unfortunately it's not that simple, as having 0755 mean
  something different from 755 would violate the principle
  of least surprise.  Please see the thread starting at
  http://lists.gnu.org/archive/html/bug-coreutils/2006-07/msg00124.html.
  I read it and I came to the conclusion
  755 should preserve s-bit: OK
  2755 should set sbit. OK
  0755 should remove sbit, cause it is explicit wanted.
  and not doing so is a lemming behaviour.
 
 No, 0755 is not explicit - it is ambiguous with people that are
 explicitly using printf %#3o to output a 3-digit octal string with
 leading 0 - I don't think we can change this.
 
 But my suggestion of 00755 _is_ explicit - after taking off the leading
 0 for specifying octal, you are _still_ left with four octal digits
 which includes the sticky bit explicitly being set to 0.

It is explicit, but it looks weird (to me) and is surprising, since the
leading 0 for 'octal' is clearly _not_ needed for chmod and friends. No
documentation I know states a need for the leading 0 to denote 'octal' for
the octal mode value. The value is always documented as being an octal
number.

Anyway, this is non-portable and just needs to be documented explicitly and
at length. I did not (yet) check the current coreutils docs and FAQ for
this, so it possibly already is.

Regards,
Erik
-- 
Golden rule #12: When the comments do not match the code, they probably
 are both wrong.
-- Steven Rostedt





bug#7463: truncate

2010-11-22 Thread Erik Auerswald
Hi,

On Mon, Nov 22, 2010 at 09:41:33AM -0500, Rupert Bruce wrote:
 truncate (GNU coreutils) 7.4

 Unexpected behavior:

 $ ls -l
 total 0
 $ truncate --size 0 *.log
 $ ls
 *.log

 I would expect truncate --size 0 *.log to truncate any files ending  
 with .log; instead I get a new file called *.log

That is caused by your shell, see the nullglob option (at least for
bash).
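
In bash, 'shopt -s nullglob' makes an unmatched pattern expand to
nothing. A sketch that also works in shells without nullglob is to
loop over the pattern and skip the literal leftover (the function name
truncate_logs is made up, and ': >' stands in for truncate --size 0):

```shell
# Empty every *.log file in the current directory, if there is any.
truncate_logs() {
  for f in *.log; do
    # An unmatched pattern stays literal ("*.log"); skip it.
    [ -e "$f" ] || continue
    : > "$f"
  done
  return 0
}
```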

Erik
-- 
Trying to understand [the GNU GPL] in terms of the goals and values
of open source is like trying [to] understand a CD drive's retractable
drawer as a cupholder.
-- Richard Stallman





bug#7042: df --help does not show `-m' option

2010-09-17 Thread Erik Auerswald
Hi,

On Fri, Sep 17, 2010 at 09:56:57AM +0100, Pádraig Brady wrote:
 On 16/09/10 23:34, Paul Eggert wrote:
  If we're going to make incompatible changes, I suggest that
  we solve the problem once and for all, by having df choose
  the default blocksize dynamically, based on the size of the
  output line describing the smallest disk.  For example, where
  df currently outputs this:
  
  Filesystem   1K-blocks     Used   Available Use% Mounted on
  /dev/sda1  11620338002  1437021 11618900981   1% /r/opt
  /dev/sda2     20971520  1335871    19635650   7% /home/eggert
  
  df would notice that the smallest file system is between 1GB and 1TB,
  so it would default to 1 GB blocks, as follows:
  
  Filesystem 1GB-blocks  Used Available Use% Mounted on
  /dev/sda1     11900GB   2GB   11898GB   1% /r/opt
  /dev/sda2        22GB   2GB      21GB   7% /home/eggert
  
  This is much more useful as an output format, because one can visually
  see which file systems are larger by seeing how many digits are there.
  Contrast this to the output of df --si:
  
  Filesystem  Size  Used Avail Use% Mounted on
  /dev/sda1    12T  1.5G   12T   1% /r/opt
  /dev/sda2    22G  1.4G   21G   7% /home/eggert
  
  which is harder to visually parse that way.

Yes, especially 1.5G of the 12T disk used looks a lot like 1.5T of 1.5T
used.

 That would break lots of scripts I'd say
 (they should use -P, but many don't).

That's obvious (all three points, sadly).

 In any case I don't think there is enough benefit
 in such a format change given the common wide range
 of device sizes attached to systems.

I like Paul's suggestion. Of course there are corner cases (mounting an
older USB stick with e.g. 128MB). The base could be selected by the
smallest mounted fixed disk.

Erik
-- 
[T]he essence of XML is this: the problem it solves is not hard,
and it does not solve the problem well.
-- The Essence of XML by Jérôme Siméon, Philip Wadler





bug#6790: Problem(bug?) with basic sort command in Linux

2010-08-04 Thread Erik Auerswald
Hi George,

On Wed, Aug 04, 2010 at 01:32:55AM +0530, George Thomas Irimben (georgeti) wrote:
 I would like to report a problem(bug?) I am facing with sort command in
 Linux.
 
 Sorting of a simple text file using simple sort command is giving me
 incorrect result.
 
 Here is the problem:
 
 Text file to sort has 3 lines
 my-lnx7% cat y
 abc/d,ABC
 abc/,XYZ
 abc/o,MNO
 
 sort command from Linux is giving me below result(According to me, this
 result is incorrect)
 
 my-lnx7% sort y
 abc/d,ABC
 abc/o,MNO
 abc/,XYZ
 
 But, result expected is as below. Because , is ahead of d in ASCII
 table. 
 Same found working on Unix using same input file, same command line.
 
 abc/,XYZ
 abc/d,ABC
 abc/o,MNO
 
 
 Pls let me know if this is a problem in Linux or I am missing something.

You missed the effects of locale settings (see
http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021).

$ printf 'abc/d,ABC\nabc/,XYZ\nabc/o,MNO\n' | LC_COLLATE=en_US.UTF-8 sort
abc/d,ABC
abc/o,MNO
abc/,XYZ
$ printf 'abc/d,ABC\nabc/,XYZ\nabc/o,MNO\n' | LC_COLLATE=C sort
abc/,XYZ
abc/d,ABC
abc/o,MNO

Erik
-- 
If you're willing to restrict the flexibility of your approach,
you can almost always do something better.
-- John Carmack





bug#6247: Request a pause

2010-05-22 Thread Erik Auerswald

Hi,

On 05/22/2010 06:16 PM, Mark A Powell wrote:

Hello,  I just read the man for the ls command.  I didn't see an
option for pause, as in DOS (dir/p).  There are times when it would be a
great help when, viewing more than one page of files in a directory.
I hope this is not something, I over looked.


There is no such option; you can pipe the output of ls into your pager 
for this. Of course this enables you to use all the features of said 
pager, e.g. searching or scrolling backwards.


Example:
ls | less

Erik





bug#6020: coreutils-8.x: a simple feature enhancement, and how to do it

2010-04-29 Thread Erik Auerswald
Hi,

two nit-picks regarding the test script below:

On Thu, Apr 29, 2010 at 12:39:46AM +0100, Pádraig Brady wrote:
 [...]
 @@ -0,0 +1,51 @@
 +#!/bin/sh
 +# Ensure sort -g sorts floating point limits correctly
 [...]
 +if test $VERBOSE = yes; then
 +  set -x
 +  mv --version
 ^^
 sort
would be nicer.

 +fi
 +
 +. $srcdir/test-lib.sh
 +getlimits_
 +
 +# See if sort should be using long doubles
 +grep '^#define HAVE_C99_STRTOLD 1' $CONFIG_HEADER > /dev/null ||
 ^^^
 -q
would be more concise.
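
A sketch of the -q variant, wrapped in a function so it can be tried
on any header file (the function name and its argument are made up;
the test script itself uses $CONFIG_HEADER):

```shell
# Succeeds, without printing anything, if the header defines HAVE_C99_STRTOLD.
has_c99_strtold() {
  grep -q '^#define HAVE_C99_STRTOLD 1' "$1"
}

# Possible use: has_c99_strtold "$CONFIG_HEADER" || echo "no long double support"
```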

Regards,
Erik






Re: [PATCH] sort: use posix_fadvise to announce access patterns on files opened for reading

2010-03-02 Thread Erik Auerswald
Hi,

On Mon, Mar 01, 2010 at 05:33:38PM -0800, Joey Degges wrote:
 Were you sure to remount your devices to clear the cache before running
 these tests? While testing this patch early on the cache caused me many
 incorrect readings. Another approach I took to clear the cache was to fill
 up all of my RAM with some other process.

To clear the caches you can use

echo 3 > /proc/sys/vm/drop_caches

(Linux kernel since version 2.6.16).

Erik




Re: Pinky command

2009-11-11 Thread Erik Auerswald
Hi,

On Wed, Nov 11, 2009 at 06:15:32PM -0700, Bob Proulx wrote:
 hemant.ru...@us.ing.com wrote:
  In old days, attackers used to create .project symbolic to passwd
  and group files to get the List of login ids and group via
  fingerd.
 
 The list of uids are already public in the /etc/passwd file.  That file
 is already world readable.  Therefore it isn't clear to me how using
 another command makes this a vulnerability.

Using fingerd, this could disclose login names to remote attackers.
This, of course, does not apply to local invocation of some tool that
uses normal user privileges.

Erik
-- 
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?




Re: the unicode arrow

2009-09-07 Thread Erik Auerswald
Hi,

On Mon, Sep 07, 2009 at 07:23:12AM -0600, Eric Blake wrote:
 According to Michal Svoboda on 9/6/2009 5:33 AM:
  When doing cp -va I can see neat quotes (depending on locale), as in
  „blah“, but the arrow is still composed of a dash and a greater-than
  symbol, as in ->. Is there any plan to make the arrow also neat, using
  the Unicode arrow symbol?
 
 This was discussed last month.  The verdict is no.
 http://lists.gnu.org/archive/html/bug-coreutils/2009-08/msg00048.html

Actually that discussion was about ls -l, which has a POSIX specified
output format. The cp -v case is different in that it is not POSIX
specified and already uses special characters (those neat quotes).
I'd say that cp -v could very well use an arrow symbol (but I don't
intend to write a patch, since this is not important to me ;-).

Erik
-- 
But hey, don't listen to me - I like C++, and approve of Java.
-- Andrew Morton




Re: new snapshot available: coreutils-7.5.65-61cc6

2009-09-07 Thread Erik Auerswald
Hi Jim,

On Mon, Sep 07, 2009 at 11:09:21AM +0200, Jim Meyering wrote:
 There have been disproportionately many bug fixes since coreutils-7.5.
 It's an interesting mix of fixes for recent regressions and for a few older 
 bugs.
 
 coreutils snapshot:
   http://meyering.net/cu/coreutils-ss.tar.gz
   http://meyering.net/cu/coreutils-ss.tar.xz
   http://meyering.net/cu/coreutils-ss.tar.gz.sig
   http://meyering.net/cu/coreutils-ss.tar.xz.sig
 aka
   http://meyering.net/cu/coreutils-7.5.65-61cc6.tar.gz
   http://meyering.net/cu/coreutils-7.5.65-61cc6.tar.xz

I've run the non-root checks without failures on my debian/sid x86 (32 bit)
system.

All 356 tests passed (32 tests were not run)
All 135 tests passed (16 tests were not run)

Erik




Re: new snapshot available: coreutils-7.4.125-eca6

2009-08-17 Thread Erik Auerswald
On Mon, Aug 17, 2009 at 10:51:54PM +0200, Jim Meyering wrote:
 Mike Frysinger wrote:
  Changes in coreutils since 7.4.115-c9c92:
 
  `make && make check` passes for me:
   - non-root user
   - glibc-2.10.1
   - gcc-4.4.1
   - linux-2.6.30.4
   - x86_64 system
 
 Good to hear.
 Thanks for the speedy feedback!

make && make check passes for me too:
- non-root
- Debian/sid (linux 2.6.30, gcc 4.3.4, glibc 2.9)
- x86_32 system

Erik




Re: no feedback on snapshot? coreutils-7.5 coming soon

2009-08-12 Thread Erik Auerswald
Hi Jim,

On Wed, Aug 12, 2009 at 12:42:59PM +0200, Jim Meyering wrote:
 AFAIK, I am the only one who has built the latest snapshot:
 
 http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/17604
 
 Though it's been only two days.
 
 Unless I hear of new bug reports or portability problems soon,
 expect coreutils-7.5 to be released in the next few days.

FWIW I built it and ran the basic tests (make check) without any errors
on Debian GNU/Linux unstable. I don't have access to any other systems
or hardware architectures for more interesting testing.

Erik
-- 
To do the Unix philosophy right, you have to be loyal to excellence.
-- Eric S. Raymond




Re: [PATCH] ls: Use pretty UTF-8 arrow when showing where symlinks point to

2009-08-06 Thread Erik Auerswald
Hello Lennart,

On Thu, Aug 06, 2009 at 07:24:42PM +0200, Lennart Poettering wrote:
 Diego Pettenò complained that ls -l doesn't use the UTF-8 arrow
 character to show where symlinks point to. This tiny patch fixes that.
 With this applied the character is used when the CODESET is UTF-8
 otherwise we fall back to the traditional -> arrow.
 
 Ah, ls -l is so much prettier now with this oh-so-important patch! 
 For verification:
 
 http://pastie.org/573270

What if the used font does not include this symbol? Could you check this
as well?

BTW the symbol in the URL above looks different than that shown in xterm
(I used iceweasel for the URL, default fonts for xterm).

IMHO use of this symbol should not be enabled by default.

Br,
Erik
-- 
Premature optimization is the root of all evil.
-- Donald Knuth




Re: Problem with Hostname

2009-07-11 Thread Erik Auerswald
Hi,

On Sat, Jul 11, 2009 at 05:04:49AM -0400, Alfred M. Szmidt wrote:
This command that accepts the -f option is *not* the GNU hostname
command.
 
 There is a small confusion, there are two versions of GNU hostname.
 One that supports -f (GNU Inetutils hostname), and one that doesn't
 (GNU Coreutils hostname).  The one in coreutils is not installed by
 default.  Maybe we should remove the one in coreutils?

Having two different GNU hostname programs is a really bad idea. If the
coreutils one is not installed by default and provides just a very basic
implementation, it should be removed IMHO.

Thanks,
Erik


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: make check problems with coreutils 7.2 and earlier

2009-04-24 Thread Erik Auerswald
Hello Jim,

On Fri, Apr 24, 2009 at 07:26:45AM +0200, Jim Meyering wrote:
 Tim Mooney wrote:
 
  ./configure --prefix=/local/gnu --exec-prefix=/local/gnu --build 
  x86_64-sun-solaris2.10 --sysconfdir=/etc/local/gnu 
  --libdir=/local/gnu/lib/64 --mandir=/local/gnu/share/man 
  --infodir=/local/gnu/share/info --localstatedir=/var/local/gnu --disable-nls
 
  checking for a BSD-compatible install... /local/gnu/bin/ginstall -c
  checking whether build environment is sane... yes
 
 Considering all of the programs from /usr/xpg4/bin mentioned
 in that output, I suspect you have an unusual PATH.
 What is your PATH?

Just for info: the /usr/xpg4/bin directory is one of several that make
Solaris standards-compliant. You can select different binary directories
to get compliance with different versions of the standards.

Sorry, I don't have access to a Solaris machine right now, so I can't
check all the details.

Best regards,
Erik
-- 
But if you assume a bomb as small as the non-functioning space in a
wristwatch can blow up an airplane, you've got problems far bigger than
one particular brand of wristwatch.
-- Bruce Schneier




Re: sort -g (generic numeric) is working only for the first key

2009-03-01 Thread Erik Auerswald
Hi Wasim,

On Sun, Mar 01, 2009 at 08:09:43PM +0530, Wasim Akram S.N. wrote:
 Hi,
 I don't know whether the following is really a bug.
 ...
 wa...@wasim:~/temp$ sort -g -k1,3 -t \t a

This tells sort to regard the first three fields as one key. I think
you need something like sort -g -k1,1 -k2,2 -k3,3 -t \t a which uses
the three fields as three keys. At least that results in your desired
output. ;-)
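
The difference between one combined key and three separate keys can be
tried with a small made-up sample. This is a sketch only: the data, the
helper names and the use of LC_ALL=C (to keep the tie-breaking order
predictable) are assumptions, not taken from the original report.

```shell
#!/bin/sh
# Sample data is made up for illustration; fields are separated by tabs.
tab=$(printf '\t')
sample() {
  printf '2\t10\tb\n2\t9\ta\n1\t100\tc\n'
}

# One key spanning fields 1-3: -g converts only the leading number of
# that combined key, so field 2 is not compared numerically.
one_key() { sample | LC_ALL=C sort -g -t "$tab" -k1,3; }

# Three separate keys: ties on field 1 are broken by field 2, then 3.
three_keys() { sample | LC_ALL=C sort -g -t "$tab" -k1,1 -k2,2 -k3,3; }

echo "one key:";    one_key
echo "three keys:"; three_keys
```

Passing a real tab character via "$tab" avoids relying on the shell or
sort to interpret \t.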

Br,
Erik




Re: new snapshot available: coreutils-6.12.208-2441

2008-09-27 Thread Erik Auerswald
Hi Jim,

On Sat, Sep 27, 2008 at 11:45:40AM +0200, Jim Meyering wrote:
 Here's a snapshot of the latest sources from coreutils
 and the parts of gnulib that it uses.  Please beat it up ;-)
 If things work out, I may even make a test release by Wednesday.

A quick glance showed me no obvious problems, but I'm using a rather
recent linux/x86 system, nothing fancy, so I would have expected nothing
different. ;-)

It would be nice if you could add my name to the THANKS file as well,
right now it's mentioned in the ChangeLog only. Thank you. ;-)

Regards,
Erik
-- 
Perfect is the enemy of good.
-- Linus Torvalds




Re: [PATCH] md5sum+sha*sum: add option --quiet/-q to suppress OK messages

2008-06-01 Thread Erik Auerswald
Hi,

On Sun, Jun 01, 2008 at 04:51:43PM +0200, Jim Meyering wrote:
 Erik Auerswald [EMAIL PROTECTED] wrote:
  And here it is... (attached).
 
 Thanks.
 Here are some minor changes I expect to amend into your patch.
 They alphabetize lists, tweak wording and correct a comment.
 Plus, in NEWS (as in ChangeLog/commit log), it's good to list all
 program names explicitly, rather than abbreviating like sha*sum.

This is fine with me, go ahead.

Erik




Re: Feat-Req: uniq -c delimiter should be changable

2008-05-17 Thread Erik Auerswald
Hi,

On Sat, May 17, 2008 at 04:36:30PM +0200, Maximilian Haeussler wrote:
 
 Let's say I only want the 50 most common lines of a file:
 cat textfile | sort | uniq -c | sort -n | tail -n 50 | tr -s ' ' | cut -f2
 
 This will only print the first word of each line with the current uniq
 version though if uniq -c understood -d/-t it would print the whole

You should tell `cut' to output every field starting after the number
field inserted by `uniq':

sort textfile | uniq -c | sort -n | tail -n50 | tr -s ' ' | cut -f3- -d' '
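
A small way to try the pipeline above on made-up input (a sketch;
most_common is a hypothetical wrapper name, and the sample lines are
invented for illustration):

```shell
#!/bin/sh
# Hypothetical wrapper around the pipeline above: print the N most
# frequent lines of the input (least frequent first), without counts.
most_common() {
  sort | uniq -c | sort -n | tail -n "$1" | tr -s ' ' | cut -d ' ' -f 3-
}

printf 'b c\na\nb c\nb c\na\nd\n' | most_common 2
```

After tr -s the first field is empty (the leading space), the second is
the count, so the original line starts at field 3. Note that tr -s also
squeezes runs of spaces inside the lines themselves.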

Erik




Re: [PATCH] md5sum+sha*sum: add option --quiet/-q to suppress OK messages

2008-05-13 Thread Erik Auerswald
Hi,

On Sun, May 04, 2008 at 06:33:45PM +0200, Erik Auerswald wrote:
 On Tue, Apr 22, 2008 at 07:10:58PM +0200, Jim Meyering wrote:
  Erik Auerswald [EMAIL PROTECTED] wrote:
   On Tue, Apr 22, 2008 at 06:05:48PM +0200, Jim Meyering wrote:
   Erik Auerswald [EMAIL PROTECTED] wrote:
IMHO md5sum and sha*sum are too verbose by default, especially when
checking a large collection of files with only a few failing 
validation.
Therefore I'd like to see an option added to suppress just the output
for successfully verified files.
  
   The only suggestion I can make so far is to omit the short-named -q 
   option.
   The --q abbreviation of --quiet is only one byte longer.
 
 The attached patch adds the above feature and adds an option --quiet to
 md5sum and sha*sum (no short option added).
 
 The signed copyright assignment is on its way; the patch is against the
 current git HEAD.

The copyright assignment process with the FSF is completed, find the
patch against current HEAD as an attachment.

Erik
-- 
I don't want to see the state of the file when I'm editing.
-- Ken Thompson
From f776112966d6771b3c7dd8228fa09fb129c8edde Mon Sep 17 00:00:00 2001
From: Erik Auerswald [EMAIL PROTECTED]
Date: Tue, 13 May 2008 14:13:30 +0200
Subject: [PATCH] md5sum+sha*sum: add option --quiet to suppress OK messages

* src/md5sum.c: add option --quiet to suppress OK messages
* doc/coreutils.texi: document option --quiet
* tests/misc/md5sum: add test for option --quiet
* NEWS: mention new option --quiet for md5sum+sha*sum in New
  features section

Signed-off-by: Erik Auerswald [EMAIL PROTECTED]
---
 NEWS               |    5 +++++
 doc/coreutils.texi |    9 +++++++++
 src/md5sum.c       |   30 ++++++++++++++++++++++++++----
 tests/misc/md5sum  |    9 +++++++++
 4 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 8cee7f5..69f5444 100644
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,11 @@ GNU coreutils NEWS-*- outline -*-
   tac: avoid segfault with --regex (-r) and multiple files, e.g.,
   "echo > x; tac -r x x".  [bug present at least in textutils-1.8b, from 1992]
 
+** New features
+
+  md5sum and sha*sum now know an option --quiet to suppress the printing
+  of 'OK' messages.
+
 
 * Noteworthy changes in release 6.11 (2008-04-19) [stable]
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 206f8dd..8a03581 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -3287,6 +3287,15 @@ If all listed files are readable and are consistent with the associated
 MD5 checksums, exit successfully.  Otherwise exit with a status code
 indicating there was a failure.
 
+@item --quiet
+@opindex --quiet
+@cindex verifying MD5 checksums
+This option is useful only when verifying checksums.
+When verifying checksums, don't generate an 'OK' message per successfully
+checked file.  Files that fail the verification are reported in the
+default one-line-per-file format.  If any files failed verification,
+a warning summarizing any failures is printed to standard error.
+
 @item -t
 @itemx --text
 @opindex -t
diff --git a/src/md5sum.c b/src/md5sum.c
index f83a7b1..1327ced 100644
--- a/src/md5sum.c
+++ b/src/md5sum.c
@@ -114,6 +114,9 @@ static bool status_only = false;
improperly formatted checksum line.  */
 static bool warn = false;
 
+/* With --quiet, don't print a message for successfully verified files */
+static bool quiet = false;
+
 /* The name this program was run with.  */
 char *program_name;
 
@@ -121,7 +124,8 @@ char *program_name;
non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
 enum
 {
-  STATUS_OPTION = CHAR_MAX + 1
+  STATUS_OPTION = CHAR_MAX + 1,
+  QUIET_OPTION
 };
 
 static const struct option long_options[] =
@@ -131,6 +135,7 @@ static const struct option long_options[] =
   { "status", no_argument, NULL, STATUS_OPTION },
   { "text", no_argument, NULL, 't' },
   { "warn", no_argument, NULL, 'w' },
+  { "quiet", no_argument, NULL, QUIET_OPTION },
   { GETOPT_HELP_OPTION_DECL },
   { GETOPT_VERSION_OPTION_DECL },
   { NULL, 0, NULL, 0 }
@@ -174,8 +179,9 @@ With no FILE, or when FILE is -, read standard input.\n\
 "), stdout);
   fputs (_("\
 \n\
-The following two options are useful only when verifying checksums:\n\
+The following three options are useful only when verifying checksums:\n\
       --status            don't output anything, status code shows success\n\
+      --quiet             no output for successfully verified files\n\
   -w, --warn              warn about improperly formatted checksum lines\n\
 \n\
 "), stdout);
@@ -527,8 +533,10 @@ digest_check (const char *checkfile_name)
 
 	  if (!status_only)
 		{
-		  printf ("%s: %s\n", filename,
-			  (cnt != digest_bin_bytes ? _("FAILED") : _("OK")));
+		  if (cnt != digest_bin_bytes)
+		    printf ("%s: %s\n", filename, _("FAILED"));
+		  else if (!quiet)
+		    printf ("%s: %s\n", filename, _("OK"));
 		  fflush (stdout

Re: [PATCH] md5sum+sha*sum: add option --quiet/-q to suppress OK messages

2008-05-13 Thread Erik Auerswald
Hi,

On Tue, May 13, 2008 at 03:12:57PM +0200, Jim Meyering wrote:
 Erik Auerswald [EMAIL PROTECTED] wrote:
  The copyright assignment process with the FSF is completed, find the
  patch against current HEAD as an attachment.
 
 Thanks again.
 
 I assume that means you sent it.
 The process is complete when the FSF says they've received it
 and everything is in order.

It means the FSF said it's completed:

On Wed, May 07, 2008 at 04:56:10PM -0400, Jonas Jacobson via RT wrote:
 Hello Erik,

 Your COREUTILS assignment/disclaimer process with the FSF is currently
 complete; please find attached a pdf* copy of the signed form.

On Tue, May 13, 2008 at 03:12:57PM +0200, Jim Meyering wrote:
 coreutils-6.12 (soon, I hope) will be a bug-fix-only release,
 So your option addition will be in the following release.

OK, thanks. Do you want me to send an updated patch after coreutils-6.12
is released?

Erik




Re: [PATCH] md5sum+sha*sum: add option --quiet/-q to suppress OK messages

2008-05-04 Thread Erik Auerswald
Hi,

On Tue, Apr 22, 2008 at 07:10:58PM +0200, Jim Meyering wrote:
 Erik Auerswald [EMAIL PROTECTED] wrote:
  On Tue, Apr 22, 2008 at 06:05:48PM +0200, Jim Meyering wrote:
  Erik Auerswald [EMAIL PROTECTED] wrote:
   IMHO md5sum and sha*sum are too verbose by default, especially when
   checking a large collection of files with only a few failing validation.
   Therefore I'd like to see an option added to suppress just the output
   for successfully verified files.
 
  The only suggestion I can make so far is to omit the short-named -q 
  option.
  The --q abbreviation of --quiet is only one byte longer.

The attached patch adds the above feature and adds an option --quiet to
md5sum and sha*sum (no short option added).
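
The behaviour this option is meant to provide can be tried as follows.
This is a minimal sketch, assuming an md5sum that already implements
--quiet as proposed here; the file names and contents are made up, and
LC_ALL=C is set only to keep the messages untranslated.

```shell
#!/bin/sh
# Sketch: verify two files, one of which is changed after its checksum
# was recorded. File names and contents are made up for illustration.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf 'good\n' > a.txt
printf 'also good\n' > b.txt
md5sum a.txt b.txt > sums.md5
printf 'changed\n' > b.txt

# Default --check output lists every file, OK or FAILED.
LC_ALL=C md5sum --check sums.md5 2>/dev/null || echo "exit status: $?"

# With --quiet only the failing file is listed on standard output;
# the summary warning still goes to standard error (discarded here).
quiet_out=$(LC_ALL=C md5sum --check --quiet sums.md5 2>/dev/null) || true
printf '%s\n' "$quiet_out"

# Remove the scratch directory again.
cd / && rm -rf "$dir"
```

The exit status is non-zero in both runs, so scripts can still detect the
failure without parsing the output.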

The signed copyright assignment is on its way; the patch is against the
current git HEAD.

Erik
From 2d18b7e53146be66de0102dd4f0b472ff12f59c6 Mon Sep 17 00:00:00 2001
From: Erik Auerswald [EMAIL PROTECTED]
Date: Sun, 4 May 2008 18:32:45 +0200
Subject: [PATCH] md5sum+sha*sum: add option --quiet to suppress OK messages
To: bug-coreutils@gnu.org

* src/md5sum.c: add option --quiet to suppress OK messages
* doc/coreutils.texi: document option --quiet
* tests/misc/md5sum: add test for option --quiet
* NEWS: mention new option --quiet for md5sum+sha*sum in New
  features section

Signed-off-by: Erik Auerswald [EMAIL PROTECTED]
---
 NEWS               |    5 +++++
 doc/coreutils.texi |    9 +++++++++
 src/md5sum.c       |   30 ++++++++++++++++++++++++++----
 tests/misc/md5sum  |    9 +++++++++
 4 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index faf2b1d..83072c6 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,11 @@ GNU coreutils NEWS-*- outline -*-
   Printing of such large-numbered, kernel-only (not in /etc/group) group-IDs
   was suppressed in 6.11 due to ignorance that they are useful.
 
+** New features
+
+  md5sum and sha*sum now know an option --quiet to suppress the printing
+  of 'OK' messages.
+
 
 * Noteworthy changes in release 6.11 (2008-04-19) [stable]
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index f42e736..d9e95e9 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -3287,6 +3287,15 @@ If all listed files are readable and are consistent with the associated
 MD5 checksums, exit successfully.  Otherwise exit with a status code
 indicating there was a failure.
 
+@item --quiet
+@opindex --quiet
+@cindex verifying MD5 checksums
+This option is useful only when verifying checksums.
+When verifying checksums, don't generate an 'OK' message per successfully
+checked file. Files that fail the verification are reported in the
+default one-line-per-file format. If any files failed verification,
+a warning summarizing any failures is printed to standard error.
+
 @item -t
 @itemx --text
 @opindex -t
diff --git a/src/md5sum.c b/src/md5sum.c
index f83a7b1..1327ced 100644
--- a/src/md5sum.c
+++ b/src/md5sum.c
@@ -114,6 +114,9 @@ static bool status_only = false;
improperly formatted checksum line.  */
 static bool warn = false;
 
+/* With --quiet, don't print a message for successfully verified files */
+static bool quiet = false;
+
 /* The name this program was run with.  */
 char *program_name;
 
@@ -121,7 +124,8 @@ char *program_name;
non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
 enum
 {
-  STATUS_OPTION = CHAR_MAX + 1
+  STATUS_OPTION = CHAR_MAX + 1,
+  QUIET_OPTION
 };
 
 static const struct option long_options[] =
@@ -131,6 +135,7 @@ static const struct option long_options[] =
   { "status", no_argument, NULL, STATUS_OPTION },
   { "text", no_argument, NULL, 't' },
   { "warn", no_argument, NULL, 'w' },
+  { "quiet", no_argument, NULL, QUIET_OPTION },
   { GETOPT_HELP_OPTION_DECL },
   { GETOPT_VERSION_OPTION_DECL },
   { NULL, 0, NULL, 0 }
@@ -174,8 +179,9 @@ With no FILE, or when FILE is -, read standard input.\n\
 "), stdout);
   fputs (_("\
 \n\
-The following two options are useful only when verifying checksums:\n\
+The following three options are useful only when verifying checksums:\n\
       --status            don't output anything, status code shows success\n\
+      --quiet             no output for successfully verified files\n\
   -w, --warn              warn about improperly formatted checksum lines\n\
 \n\
 "), stdout);
@@ -527,8 +533,10 @@ digest_check (const char *checkfile_name)
 
 	  if (!status_only)
 		{
-		  printf ("%s: %s\n", filename,
-			  (cnt != digest_bin_bytes ? _("FAILED") : _("OK")));
+		  if (cnt != digest_bin_bytes)
+		    printf ("%s: %s\n", filename, _("FAILED"));
+		  else if (!quiet)
+		    printf ("%s: %s\n", filename, _("OK"));
 		  fflush (stdout);
 		}
 	}
@@ -621,6 +629,7 @@ main (int argc, char **argv)
   case STATUS_OPTION:
 	status_only = true;
 	warn = false;
+	quiet = false;
 	break;
   case 't':
 	binary = 0;
@@ -628,6 +637,12 @@ main (int argc, char **argv)
   case 'w':
 	status_only = false;
 	warn

Re: [PATCH] md5sum+sha*sum: add option --quiet/-q to suppress OK messages

2008-04-22 Thread Erik Auerswald
Hi,

On Tue, Apr 22, 2008 at 06:05:48PM +0200, Jim Meyering wrote:
 Erik Auerswald [EMAIL PROTECTED] wrote:
  IMHO md5sum and sha*sum are too verbose by default, especially when
  checking a large collection of files with only a few failing validation.
  Therefore I'd like to see an option added to suppress just the output
  for successfully verified files.
 
  The attached patch does that by adding the option --quiet/-q, including
  documentation and a testcase.
 
 Thank you for a fine patch.
 That looks very good.
 The only suggestion I can make so far is to omit the short-named -q option.
 The --q abbreviation of --quiet is only one byte longer.

The short -q is commonly used for --quiet; I would expect it to exist
in every program that has short option names and this kind of
functionality.

If you insist on omitting it I'll remove it from the next version of the
patch, but I'd prefer it with -q.

 However, we'll need to deal with copyright paperwork
 before I can apply it.  Please follow the instructions in
 the file, HACKING, in the Copyright assignment section.

I'll see to this.

 P.S., please adjust your mail client.
 Currently, it emits an invalid Mail-Followup-To: line:
 
   Mail-Followup-To: auerswal, bug-coreutils@gnu.org

Sorry, should be fixed.

Erik




Re: wordcount (wc)

2008-04-21 Thread Erik Auerswald
Hi,

On Mon, Apr 21, 2008 at 04:27:35PM +0200, Almer S. Tigelaar wrote:
 I have been using the 'wc' program (version 5.97) to manually verify
 some counts outputted by a component part of an application I am
 developing.
 
 I noticed that:
   echo 12345 | wc -m
 Gives me '6' as output. But I don't entirely understand why.
 
 On multi-line input 'wc' seems to add '1' to the character count in each
 sentence. One would say then that this '1' is caused by counting
 'invisible' newline characters, but there is no newline in the example
 above.

There is a newline added by echo. Use echo -n to avoid this.
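
The difference is easy to see side by side. A short sketch; printf '%s'
is used here as a portable alternative to echo -n, which is my
substitution and not something from the original report:

```shell
#!/bin/sh
# echo appends a newline, which wc -m counts as one more character.
with_newline=$(echo 12345 | wc -m)

# printf '%s' writes its argument without a trailing newline.
without_newline=$(printf '%s' 12345 | wc -m)

echo "echo 12345:        $with_newline"
echo "printf '%s' 12345: $without_newline"
```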

Erik




[PATCH] md5sum+sha*sum: add option --quiet/-q to suppress OK messages

2008-04-20 Thread Erik Auerswald
Hi,

IMHO md5sum and sha*sum are too verbose by default, especially when
checking a large collection of files with only a few failing validation.
Therefore I'd like to see an option added to suppress just the output
for successfully verified files.

The attached patch does that by adding the option --quiet/-q, including
documentation and a testcase.

Erik
From c654c6aea71e636d627f09b06d2153dc99b3bac1 Mon Sep 17 00:00:00 2001
From: Erik Auerswald [EMAIL PROTECTED]
Date: Sun, 13 Apr 2008 18:12:11 +0200
Subject: [PATCH] md5sum+sha*sum: add option --quiet/-q to suppress OK messages
To: bug-coreutils@gnu.org

* src/md5sum.c: add option --quiet/-q to suppress OK messages
* doc/coreutils.texi: document option --quiet/-q
* tests/misc/md5sum: add test for option --quiet/-q
* NEWS: mention new option --quiet/-q for md5sum+sha*sum in New
  features section

Signed-off-by: Erik Auerswald [EMAIL PROTECTED]
---
 NEWS               |    3 +++
 doc/coreutils.texi |   11 +++++++++++
 src/md5sum.c       |   29 +++++++++++++++++++++++++----
 tests/misc/md5sum  |    8 ++++++++
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 04893c6..684b411 100644
--- a/NEWS
+++ b/NEWS
@@ -62,6 +62,9 @@ GNU coreutils NEWS-*- outline -*-
 
 ** New features
 
+  md5sum and sha*sum now know an option --quiet/-q to suppress the
+  printing of 'OK' messages.
+
   join now verifies that the inputs are in sorted order.  This check can
   be turned off with the --nocheck-order option.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index f42e736..ae17705 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -3287,6 +3287,17 @@ If all listed files are readable and are consistent with the associated
 MD5 checksums, exit successfully.  Otherwise exit with a status code
 indicating there was a failure.
 
+@item -q
+@itemx --quiet
+@opindex -q
+@opindex --quiet
+@cindex verifying MD5 checksums
+This option is useful only when verifying checksums.
+When verifying checksums, don't generate an 'OK' message per successfully
+checked file. Files that fail the verification are reported in the
+default one-line-per-file format. If any files failed verification,
+a warning summarizing any failures is printed to standard error.
+
 @item -t
 @itemx --text
 @opindex -t
diff --git a/src/md5sum.c b/src/md5sum.c
index f83a7b1..8e8d1bb 100644
--- a/src/md5sum.c
+++ b/src/md5sum.c
@@ -114,6 +114,9 @@ static bool status_only = false;
improperly formatted checksum line.  */
 static bool warn = false;
 
+/* With --quiet, don't print a message for successfully verified files */
+static bool quiet = false;
+
 /* The name this program was run with.  */
 char *program_name;
 
@@ -131,6 +134,7 @@ static const struct option long_options[] =
   { "status", no_argument, NULL, STATUS_OPTION },
   { "text", no_argument, NULL, 't' },
   { "warn", no_argument, NULL, 'w' },
+  { "quiet", no_argument, NULL, 'q' },
   { GETOPT_HELP_OPTION_DECL },
   { GETOPT_VERSION_OPTION_DECL },
   { NULL, 0, NULL, 0 }
@@ -174,8 +178,9 @@ With no FILE, or when FILE is -, read standard input.\n\
 "), stdout);
   fputs (_("\
 \n\
-The following two options are useful only when verifying checksums:\n\
+The following three options are useful only when verifying checksums:\n\
       --status            don't output anything, status code shows success\n\
+  -q, --quiet             no output for successfully verified files\n\
   -w, --warn              warn about improperly formatted checksum lines\n\
 \n\
 "), stdout);
@@ -527,8 +532,10 @@ digest_check (const char *checkfile_name)
 
 	  if (!status_only)
 		{
-		  printf ("%s: %s\n", filename,
-			  (cnt != digest_bin_bytes ? _("FAILED") : _("OK")));
+		  if (cnt != digest_bin_bytes)
+		    printf ("%s: %s\n", filename, _("FAILED"));
+		  else if (!quiet)
+		    printf ("%s: %s\n", filename, _("OK"));
 		  fflush (stdout);
 		}
 	}
@@ -609,7 +616,7 @@ main (int argc, char **argv)
 
   atexit (close_stdout);
 
-  while ((opt = getopt_long (argc, argv, "bctw", long_options, NULL)) != -1)
+  while ((opt = getopt_long (argc, argv, "bctwq", long_options, NULL)) != -1)
 switch (opt)
   {
   case 'b':
@@ -621,6 +628,7 @@ main (int argc, char **argv)
   case STATUS_OPTION:
 	status_only = true;
 	warn = false;
+	quiet = false;
 	break;
   case 't':
 	binary = 0;
@@ -628,6 +636,12 @@ main (int argc, char **argv)
   case 'w':
 	status_only = false;
 	warn = true;
+	quiet = false;
+	break;
+  case 'q':
+	status_only = false;
+	warn = false;
+	quiet = true;
 	break;
   case_GETOPT_HELP_CHAR;
   case_GETOPT_VERSION_CHAR (PROGRAM_NAME, AUTHORS);
@@ -659,6 +673,13 @@ main (int argc, char **argv)
       usage (EXIT_FAILURE);
     }
 
+  if (quiet && !do_check)
+    {
+      error (0, 0,
+             _("the --quiet option is meaningful only when verifying checksums"));
+      usage (EXIT_FAILURE);
+    }
+
   if (!O_BINARY && binary < 0

(no subject)

2008-04-13 Thread Erik Auerswald
From 0bd30949c1953fc5339fc5cf30cc2527d3e660d7 Mon Sep 17 00:00:00 2001
From: Erik Auerswald [EMAIL PROTECTED]
Date: Sun, 13 Apr 2008 18:12:11 +0200
Subject: [PATCH] md5sum+sha*sum: add option --quiet/-q to suppress OK messages

* src/md5sum.c: add option --quiet/-q to suppress OK messages
* doc/coreutils.texi: document option --quiet/-q
* tests/misc/md5sum: add test for option --quiet/-q
* NEWS: mention new option --quiet/-q for md5sum+sha*sum in New
  features section

Signed-off-by: Erik Auerswald [EMAIL PROTECTED]
---
 NEWS               |    3 +++
 doc/coreutils.texi |   11 +++++++++++
 src/md5sum.c       |   29 +++++++++++++++++++++++++----
 tests/misc/md5sum  |    8 ++++++++
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index e208b30..5a97f13 100644
--- a/NEWS
+++ b/NEWS
@@ -47,6 +47,9 @@ GNU coreutils NEWS-*- outline -*-
 
 ** New features
 
+  md5sum and sha*sum now know an option --quiet/-q to suppress the
+  printing of 'OK' messages.
+
   join now verifies that the inputs are in sorted order.  This check can
   be turned off with the --nocheck-order option.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 01c2f00..7bffd34 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -3287,6 +3287,17 @@ If all listed files are readable and are consistent with the associated
 MD5 checksums, exit successfully.  Otherwise exit with a status code
 indicating there was a failure.
 
+@item -q
+@itemx --quiet
+@opindex -q
+@opindex --quiet
+@cindex verifying MD5 checksums
+This option is useful only when verifying checksums.
+When verifying checksums, don't generate an 'OK' message per successfully
+checked file. Files that fail the verification are reported in the
+default one-line-per-file format. If any files failed verification,
+a warning summarizing any failures is printed to standard error.
+
 @item -t
 @itemx --text
 @opindex -t
diff --git a/src/md5sum.c b/src/md5sum.c
index 28bde99..821a3ad 100644
--- a/src/md5sum.c
+++ b/src/md5sum.c
@@ -114,6 +114,9 @@ static bool status_only = false;
improperly formatted checksum line.  */
 static bool warn = false;
 
+/* With --quiet, don't print a message for successfully verified files */
+static bool quiet = false;
+
 /* The name this program was run with.  */
 char *program_name;
 
@@ -131,6 +134,7 @@ static const struct option long_options[] =
   { "status", no_argument, NULL, STATUS_OPTION },
   { "text", no_argument, NULL, 't' },
   { "warn", no_argument, NULL, 'w' },
+  { "quiet", no_argument, NULL, 'q' },
   { GETOPT_HELP_OPTION_DECL },
   { GETOPT_VERSION_OPTION_DECL },
   { NULL, 0, NULL, 0 }
@@ -174,8 +178,9 @@ With no FILE, or when FILE is -, read standard input.\n\
 "), stdout);
   fputs (_("\
 \n\
-The following two options are useful only when verifying checksums:\n\
+The following three options are useful only when verifying checksums:\n\
       --status            don't output anything, status code shows success\n\
+  -q, --quiet             no output for successfully verified files\n\
   -w, --warn              warn about improperly formatted checksum lines\n\
 \n\
 "), stdout);
@@ -521,8 +526,10 @@ digest_check (const char *checkfile_name)
 
  if (!status_only)
{
- printf ("%s: %s\n", filename,
- (cnt != digest_bin_bytes ? _("FAILED") : _("OK")));
+ if (cnt != digest_bin_bytes)
+   printf ("%s: %s\n", filename, _("FAILED"));
+ else if (!quiet)
+   printf ("%s: %s\n", filename, _("OK"));
  fflush (stdout);
}
}
@@ -603,7 +610,7 @@ main (int argc, char **argv)
 
   atexit (close_stdout);
 
-  while ((opt = getopt_long (argc, argv, "bctw", long_options, NULL)) != -1)
+  while ((opt = getopt_long (argc, argv, "bctwq", long_options, NULL)) != -1)
 switch (opt)
   {
   case 'b':
@@ -615,6 +622,7 @@ main (int argc, char **argv)
   case STATUS_OPTION:
status_only = true;
warn = false;
+   quiet = false;
break;
   case 't':
binary = 0;
@@ -622,6 +630,12 @@ main (int argc, char **argv)
   case 'w':
status_only = false;
warn = true;
+   quiet = false;
+   break;
+  case 'q':
+   status_only = false;
+   warn = false;
+   quiet = true;
break;
   case_GETOPT_HELP_CHAR;
   case_GETOPT_VERSION_CHAR (PROGRAM_NAME, AUTHORS);
@@ -653,6 +667,13 @@ main (int argc, char **argv)
       usage (EXIT_FAILURE);
     }
 
+  if (quiet && !do_check)
+    {
+      error (0, 0,
+             _("the --quiet option is meaningful only when verifying checksums"));
+      usage (EXIT_FAILURE);
+    }
+
   if (!O_BINARY && binary < 0)
 binary = 0;
 
diff --git a/tests/misc/md5sum b/tests/misc/md5sum
index ca23d94..da58801 100755
--- a/tests/misc/md5sum
+++ b/tests/misc/md5sum
@@ -51,6 +51,14