bug#46815: cp integer overflow in progress (time remaining)

2021-02-27 Thread Paul Eggert

On 2/27/21 7:35 AM, Ronald Knol wrote:

This is "cp -argu  ". The source tree contains more
than 2TiB worth of data.

I believe the issue is in src/copy.c where (on line 355) an INT is used to
store "cur_size".

 int cur_size = g_iTotalWritten + *total_n_read / 1024;


GNU coreutils 'cp' lacks a 'g' option, and doesn't have the line number 
you mentioned. It sounds like you're dealing with a bug in a modified 
version of 'cp', which means you should direct your bug report to 
whoever made that modification.
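
For readers wondering why 2TiB is the threshold in the report, the arithmetic can be sketched as follows. It assumes, as the "/ 1024" in the quoted line suggests, that cur_size counts KiB, and that the shell's arithmetic is wider than 32 bits (true of bash and dash on common platforms). The variable names are illustrative only and do not come from any cp source.

```shell
# Assumption: cur_size holds a count of KiB in a 32-bit int.
int_max=2147483647                       # INT_MAX for a 32-bit int
kib=$(( 2 * 1024 * 1024 * 1024 ))        # 2 TiB expressed in KiB, i.e. 2^31

# Reduce modulo 2^32 into the signed 32-bit range to mimic the usual
# two's-complement wraparound (the C standard does not guarantee this).
wrapped=$(( (kib + 2147483648) % 4294967296 - 2147483648 ))

echo "KiB copied: $kib (INT_MAX is $int_max)"
echo "value after 32-bit wraparound: $wrapped"
```

The count exceeds INT_MAX by exactly one at 2TiB, which is consistent with the reporter seeing the problem only on trees larger than that.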






bug#45358: bootstrap fails due to a certificate mismatch

2021-02-16 Thread Paul Eggert

On 2/15/21 3:07 AM, Grigoriy Sokolik wrote:


But be careful, this is really bad advice: fetching anything without
consistency and authority validation is really insecure!


Yes, we should instead fix the underlying problem whatever it is (not 
sure what it is since that wasn't reported).






bug#46169: Parallelize merge sort

2021-01-29 Thread Paul Eggert

On 1/29/21 1:07 AM, Ole Tange wrote:

Could you consider implementing a parallel merge, so I can retire
parsort?


Yes, improving that part of 'sort' performance has been on my long list 
of things to do for quite some time. If someone else could take up the 
task it'd be done quicker. Anyway, thanks for reporting it as a bug,
so that we can track it in our bug database.






bug#46060: Offer ls --limit=...

2021-01-24 Thread Paul Eggert

On 1/23/21 1:13 PM, 積丹尼 Dan Jacobson wrote:

And any database command already has
a --limit option these days, and does not rely on a second program to
trim its output because it can't control itself. Indeed, on some remote
connections one would only want to launch one program, not two.


That argument would apply to any program, no? "cat", "diff", "sh", 
"node",


Not sure why "ls" needs a convenience flag that would complicate the 
documentation and maintenance and be so rarely useful.






bug#46048: split -n K/N loses data, sum of output files is smaller than input file.

2021-01-24 Thread Paul Eggert

On 1/24/21 8:52 AM, Pádraig Brady wrote:

-  if (lseek (STDIN_FILENO, start, SEEK_CUR) < 0)
+  if (lseek (STDIN_FILENO, start, SEEK_SET) < 0)


Dumb question: will this handle the case where you're splitting from 
stdin and stdin is a seekable file and its initial file offset is nonzero?
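
The situation in the question can be set up as in the sketch below, which only shows that a command inherits a nonzero offset on a seekable stdin. It says nothing about what any particular version of split does with it. The file name and contents are made up for illustration.

```shell
tmp=$(mktemp -d)
printf 'abcdefgh\n' > "$tmp/input"

# dd consumes the first 4 bytes, so the shared file offset is 4 when cat
# starts; cat therefore sees only the data after that offset.
rest=$( { dd bs=4 count=1 >/dev/null 2>&1; cat; } < "$tmp/input" )
echo "$rest"
```

A program in cat's position that seeks with SEEK_SET measures from the start of the file, while SEEK_CUR measures from the inherited offset, so the two differ exactly in this case.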






bug#45924: RFE: rmdir -r: recursively remove [empty] directories under the target.

2021-01-18 Thread Paul Eggert

On 1/18/21 8:08 AM, Bernhard Voelker wrote:

On 1/17/21 11:18 PM, Paul Eggert wrote:

find DIR -depth -type d -exec rmdir {} +


find(1) can also find empty directories and delete them:

   $ find DIR -type d -empty -delete


Thanks, I'd forgotten about that.
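
A minimal sketch of the two commands side by side, run on a hypothetical scratch tree (the names a, b, c, d and file are invented for the example):

```shell
# Scratch tree: a/b/c is a chain of empty directories, d holds a file.
make_tree() {
  mkdir -p "$1/a/b/c" "$1/d"
  : > "$1/d/file"
}

posix_tree=$(mktemp -d); make_tree "$posix_tree"
gnu_tree=$(mktemp -d);   make_tree "$gnu_tree"

# POSIX form. -depth makes find visit c, then b, then a, so each rmdir sees
# a directory that is already empty. rmdir still fails on the non-empty d
# and on the top directory, hence the discarded diagnostics and "|| true".
find "$posix_tree" -depth -type d -exec rmdir {} + 2>/dev/null || true

# GNU form. -delete implies -depth, and -empty skips d and the top directory.
find "$gnu_tree" -type d -empty -delete

ls -A "$posix_tree" "$gnu_tree"
```

Both trees should end up containing only d and its file. The POSIX form is noisier because it attempts rmdir on every directory, not just the empty ones.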

I added the attached to the manual, as the point seems worth documenting 
even if we don't change the code.

From eebed78799a7996dd80b66c493a0fc199705dea3 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 18 Jan 2021 21:08:39 -0800
Subject: [PATCH] doc: rmdir --recursive substitutes

* doc/coreutils.texi (rmdir invocation): Add note on how to remove
empty subdirectories recursively.
---
 doc/coreutils.texi | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index fe2fc52b7..94c9fbfa5 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -11006,7 +11006,19 @@ Give a diagnostic for each successful removal.
 
 @end table
 
-@xref{rm invocation}, for how to remove non-empty directories (recursively).
+@xref{rm invocation}, for how to remove non-empty directories recursively.
+
+To remove all empty directories under @var{dirname}, including
+directories that become empty because other directories are removed,
+you can use either of the following commands:
+
+@example
+# This uses GNU extensions.
+find @var{dirname} -type d -empty -delete
+
+# This runs on any POSIX platform.
+find @var{dirname} -depth -type d -exec rmdir @{@} +
+@end example
 
 @exitstatus
 
-- 
2.27.0



bug#45924: RFE: rmdir -r: recursively remove [empty] directories under the target.

2021-01-18 Thread Paul Eggert

On 1/18/21 2:53 AM, L A Walsh wrote:

Except that 'find DIR -depth -type d -exec rmdir {} +'
is anything but simple and not something anyone outside of
a minority of *nix users would have a clue about how to create, whereas
'rmdir -r DIR' is both direct and simple and more easily understandable


It's not that simple. For example, it's not clear whether rmdir -r 
should also remove directories containing only empty subdirectories, 
which is what you asked for. Perhaps some people would want that, 
perhaps they'd want to remove just empty leaf directories. Or perhaps 
they'd want rmdir to remove empty subdirectories only if it has 
permission to do so. Or maybe they'd want to also remove subdirectories 
whose directory entries are all hidden (start with '.'). Or there are 
lots of other possible things people could plausibly want.
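
As one illustration of how the choices differ, the "empty leaf directories only" variant can be sketched with find as well. This assumes a find that supports -empty, which is a GNU and BSD extension and not POSIX. The tree is hypothetical.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b/c"      # a chain of directories, only c is truly empty

# Without -depth, find tests each directory before visiting its children,
# so a and a/b still contain a subdirectory when -empty is evaluated.
# The "+" form batches the names, and only the leaf c is passed to rmdir.
find "$tmp" -type d -empty -exec rmdir {} +

find "$tmp" -type d | sort
```

Here a and a/b survive, whereas adding -depth would remove the whole chain. Which of the two a hypothetical "rmdir -r" ought to do is exactly the kind of question raised above.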


This is what 'find' is for. If people needed to do something like "rmdir 
-r" every day then it'd be plausible to add it even though there's a 
simple substitute. But people don't, so let's stick with what we have.






bug#45924: RFE: rmdir -r: recursively remove [empty] directories under the target.

2021-01-17 Thread Paul Eggert

On 1/16/21 4:29 PM, L A Walsh wrote:

Yes, you could do it some other way, like by using 'find'


That's what I'd do, yes. 'find DIR -depth -type d -exec rmdir {} +'. I 
doubt whether it's worth hacking on this at the C level (complicating 
the documentation too) when there's such a simple and portable way to do 
this unusual task already.






bug#14371: bug#45886: mkdir -m argument does not work correctly, applies incorrect permissions

2021-01-15 Thread Paul Eggert
Thanks for the bug report. I reproduced the problem and installed the 
attached patch to fix it.

From b8375c422ffe0e018cbb4cad187d1e909195d263 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 15 Jan 2021 02:57:59 -0800
Subject: [PATCH] mkdir: fix bug when -m's more generous than umask

Problem reported by David McCall (Bug#45886).
I introduced this problem when fixing Bug#14371.
* NEWS: Mention the fix.
* src/mkdir.c (struct mkdir_options): New members umask_ancestor,
umask_self, replacing umask_value.
(make_ancestor): Use them when temporarily adjusting umask.
(main): Set them, and set the umask to umask_self instead
of leaving it alone.
* tests/mkdir/perm.sh (tests): Add test case for bug.
---
 NEWS|  3 +++
 src/mkdir.c | 30 ++
 tests/mkdir/perm.sh |  1 +
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/NEWS b/NEWS
index c2474fee3..a6ba96450 100644
--- a/NEWS
+++ b/NEWS
@@ -20,6 +20,9 @@ GNU coreutils NEWS-*- outline -*-
   ls no longer crashes when printing the SELinux context for unstatable files.
   [bug introduced in coreutils-6.9.91]
 
+  mkdir -m no longer mishandles modes more generous than the umask.
+  [bug introduced in coreutils-8.22]
+
   nl now handles single character --section-delimiter arguments,
   by assuming a second ':' character has been specified, as specified by POSIX.
   [This bug was present in "the beginning".]
diff --git a/src/mkdir.c b/src/mkdir.c
index eccc9d382..b266cee8c 100644
--- a/src/mkdir.c
+++ b/src/mkdir.c
@@ -89,8 +89,11 @@ struct mkdir_options
  made.  */
   int (*make_ancestor_function) (char const *, char const *, void *);
 
-  /* Umask value in effect.  */
-  mode_t umask_value;
+  /* Umask value for when making an ancestor.  */
+  mode_t umask_ancestor;
+
+  /* Umask value for when making the directory itself.  */
+  mode_t umask_self;
 
   /* Mode for directory itself.  */
   mode_t mode;
@@ -130,20 +133,18 @@ make_ancestor (char const *dir, char const *component, void *options)
 error (0, errno, _("failed to set default creation context for %s"),
quoteaf (dir));
 
-  mode_t user_wx = S_IWUSR | S_IXUSR;
-  bool self_denying_umask = (o->umask_value & user_wx) != 0;
-  if (self_denying_umask)
-umask (o->umask_value & ~user_wx);
+  if (o->umask_ancestor != o->umask_self)
+umask (o->umask_ancestor);
   int r = mkdir (component, S_IRWXUGO);
-  if (self_denying_umask)
+  if (o->umask_ancestor != o->umask_self)
 {
   int mkdir_errno = errno;
-  umask (o->umask_value);
+  umask (o->umask_self);
   errno = mkdir_errno;
 }
   if (r == 0)
 {
-  r = (o->umask_value & S_IRUSR) != 0;
+  r = (o->umask_ancestor & S_IRUSR) != 0;
   announce_mkdir (dir, options);
 }
   return r;
@@ -282,8 +283,7 @@ main (int argc, char **argv)
   if (options.make_ancestor_function || specified_mode)
 {
   mode_t umask_value = umask (0);
-  umask (umask_value);
-  options.umask_value = umask_value;
+  options.umask_ancestor = umask_value & ~(S_IWUSR | S_IXUSR);
 
   if (specified_mode)
 {
@@ -293,10 +293,16 @@ main (int argc, char **argv)
  quote (specified_mode));
   options.mode = mode_adjust (S_IRWXUGO, true, umask_value, change,
   _bits);
+  options.umask_self = umask_value & ~options.mode;
   free (change);
 }
   else
-options.mode = S_IRWXUGO;
+{
+  options.mode = S_IRWXUGO;
+  options.umask_self = umask_value;
+}
+
+  umask (options.umask_self);
 }
 
   return savewd_process_files (argc - optind, argv + optind,
diff --git a/tests/mkdir/perm.sh b/tests/mkdir/perm.sh
index 4d36f19b5..083a47733 100755
--- a/tests/mkdir/perm.sh
+++ b/tests/mkdir/perm.sh
@@ -35,6 +35,7 @@ tests='
 050  :   -m 312   : drwx-w-rwx : d-wx--x-w- :
 160  :   empty: drwx--xrwx : drw---xrwx :
 160  :   -m 743   : drwx--xrwx : drwxr---wx :
+022  :   -m o-w   : drwxr-xr-x : drwxrwxr-x :
 027  :   -m =+x   : drwxr-x--- : d--x--x--- :
 027  :   -m =+X   : drwxr-x--- : d--x--x--- :
 -:   -: last   : last   :
-- 
2.27.0
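
The umask arithmetic in the patch can be checked by hand for the new test case (umask 022, -m o-w). The sketch below is shell arithmetic, not the C code. Its variable names mirror the struct members in the patch, and 0300 stands in for S_IWUSR | S_IXUSR.

```shell
umask_value=022          # the process umask, as in the new test case
mode=0775                # a=rwx adjusted by "o-w"

umask_ancestor=$(( umask_value & ~0300 ))   # clear the user w and x bits
umask_self=$(( umask_value & ~mode ))       # never mask bits the mode asks for

printf 'umask_self     = %04o\n' "$umask_self"
printf 'ancestor mode  = %04o\n' $(( 0777 & ~umask_ancestor ))
printf 'directory mode = %04o\n' $(( mode & ~umask_self ))
```

The results, 0755 for an ancestor and 0775 for the directory itself, correspond to the drwxr-xr-x and drwxrwxr-x columns of the row added to tests/mkdir/perm.sh.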



bug#45749: [gnu.org #1673209] [Coreutils manual] Broken page link (from and to gnu.org)

2021-01-09 Thread Paul Eggert
Thanks for reporting that. I fixed it in Coreutils master on Savannah by 
applying the attached patch, and this should propagate out to the 
website after the next Coreutils release. Closing the Coreutils bug report.

From 86640823d63e1c881ae56c5ae0cbc5f848ce7beb Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 9 Jan 2021 13:04:40 -0800
Subject: [PATCH] doc: modernize and fix regexp xref
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* doc/coreutils.texi: Fix regexp cross-reference that had become
out-of-date (Bug#45749).  Also, fix some obsolete references to
SunOS and to /usr/dict/words, and change “Linux” to “GNU/Linux”
where appropriate.  Unfortunately the pipeline example gets more
complicated since /usr/share/dict/words is not sorted the way that
‘comm’ wants.
---
 doc/coreutils.texi | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index e9dd21c4e..fe2fc52b7 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -7714,7 +7714,7 @@ high performance (``contiguous data'') file
 @item d
 directory
 @item D
-door (Solaris 2.5 and up)
+door (Solaris)
 @c @item F
 @c semaphore, if this is a distinct file type
 @item l
@@ -7728,7 +7728,7 @@ network special file (HP-UX)
 @item p
 FIFO (named pipe)
 @item P
-port (Solaris 10 and up)
+port (Solaris)
 @c @item Q
 @c message queue, if this is a distinct file type
 @item s
@@ -11824,7 +11824,7 @@ are also listed.
 @cindex file system space, retrieving old data more quickly
 Do not invoke the @code{sync} system call before getting any usage data.
 This may make @command{df} run significantly faster on systems with many
-disks, but on some systems (notably SunOS) the results may be slightly
+disks, but on some systems (notably Solaris) the results may be slightly
 out of date.  This is the default.
 
 @item --output
@@ -11925,7 +11925,7 @@ otherwise.  @xref{Block size}.
 @opindex --sync
 @cindex file system space, retrieving current data more slowly
 Invoke the @code{sync} system call before getting any usage data.  On
-some systems (notably SunOS), doing this yields more up to date results,
+some systems (notably Solaris), doing this yields more up to date results,
 but in general this option makes @command{df} much slower, especially when
 there are many or very busy file systems.
 
@@ -11980,7 +11980,7 @@ all systems.
 @opindex xfs @r{file system type}
 @opindex btrfs @r{file system type}
 A file system on a locally-mounted hard disk.  (The system might even
-support more than one type here; Linux does.)
+support more than one type here; GNU/Linux does.)
 
 @item iso9660@r{, }cdfs
 @cindex CD-ROM file system type
@@ -13564,9 +13564,8 @@ expression operators.
 @kindex \| @r{regexp operator}
 In the regular expression, @code{\+}, @code{\?}, and @code{\|} are
 operators which respectively match one or more, zero or one, or separate
-alternatives.  SunOS and other @command{expr}'s treat these as regular
-characters.  (POSIX allows either behavior.)
-@xref{Top, , Regular Expression Library, regex, Regex}, for details of
+alternatives.  These operators are GNU extensions.  @xref{Regular Expressions,,
+Regular Expressions, grep, The GNU Grep Manual}, for details of
 regular expression syntax.  Some examples are in @ref{Examples of expr}.
 
 @item match @var{string} @var{regex}
@@ -15204,7 +15203,7 @@ Switch to a different shell layer.  Non-POSIX.
 
 @item status
 @opindex status
-Send an info signal.  Not currently supported on Linux.  Non-POSIX.
+Send an info signal.  Not currently supported on GNU/Linux.  Non-POSIX.
 
 @item start
 @opindex start
@@ -16617,8 +16616,8 @@ parsed reliably.  In the following example, @var{kernel-version} is
 
 @example
 uname -a
-@result{} Linux dumdum.example.org 5.7.9-100.fc31.x86_64@c
- #1 SMP Fri Jul 17 17:18:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
+@result{} Linux dumdum.example.org 5.9.16-200.fc33.x86_64@c
+ #1 SMP Mon Dec 21 14:08:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
 @end example
 
 
@@ -19015,7 +19014,7 @@ might be used.  What it's really about is the ``Software Tools'' philosophy
 of program development and usage.
 
 The software tools philosophy was an important and integral concept
-in the initial design and development of Unix (of which Linux and GNU are
+in the initial design and development of Unix (of which GNU/Linux and GNU are
 essentially clones).  Unfortunately, in the modern day press of
 Internetworking and flashy GUIs, it seems to have fallen by the
 wayside.  This is a shame, since it provides a powerful mental model
@@ -19443,10 +19442,7 @@ A minor modification to the above pipeline can give us a simple spelling
 checker!  To determine if you've spelled a word correctly, all you have to
 do is look it up in a dictionary.  If it is not there, then chances are
 that your spelling is incorrect.  So, we need a dictionary.
-The conventional locat

bug#45700: rm should not prompt if ! isatty(2)

2021-01-06 Thread Paul Eggert

On 1/6/21 10:56 AM, John Wiersba via GNU coreutils Bug Reports wrote:

$ touch asdf && chmod a-w asdf && rm asdf 2>&1 | cat
rm: remove write-protected regular empty file 'asdf'?  # should *not* prompt

If the prompt cannot be seen, then it can't be properly answered, so there is 
no point in prompting and consequently leaving the user with a hanging command 
and no way to know what's being expected of them.  Instead rm should attempt to 
remove the file and succeed or fail based on the result.


POSIX requires the current behavior; see clause 3 in:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm.html

Although GNU rm needn't follow POSIX blindly, it's doubtful that rm 
should remove the file in this particular case, as the longstanding 
tradition is that plain "rm" does not remove unwriteable files without 
more confirmation.


Since you know about "rm -f" I suggest using that (that's what everyone 
else does...).
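
A small sketch of the suggested workaround, reusing the file name from the report in a scratch directory:

```shell
tmp=$(mktemp -d)
: > "$tmp/asdf"
chmod a-w "$tmp/asdf"

# -f suppresses the confirmation for the write-protected file, so nothing
# is read from standard input and nothing can hang behind the pipe.
rm -f "$tmp/asdf" 2>&1 | cat

ls -A "$tmp"
```

The removal depends on write permission for the containing directory, not for the file, which is why it succeeds once the prompt is out of the way.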






bug#45648: `dd` seek/skip which way is up?

2021-01-04 Thread Paul Eggert

On 1/4/21 7:44 PM, Bela Lubkin wrote:

TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document
them as pure synonyms for 'skip' and 'seek'.


Thanks for doing all that research. It's compelling, and I think your 
patch (or something like it) should go in. I'll wait for a bit to hear 
other opinions.
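
For reference, the existing operands behave as in this sketch: skip= applies to the input and seek= to the output. The sample data is invented. The iseek= and oseek= spellings are left out because whether a given dd accepts them is what the thread is discussing.

```shell
tmp=$(mktemp -d)

# skip= discards blocks from the input before copying.
skipped=$(printf 'abcdefgh' | dd bs=1 skip=2 count=3 2>/dev/null)

# seek= moves past blocks of the output before writing; conv=notrunc keeps
# the rest of the existing file.
printf 'XXXXXXXX' > "$tmp/out"
printf 'ab' | dd of="$tmp/out" bs=1 seek=3 conv=notrunc 2>/dev/null

echo "$skipped"
cat "$tmp/out"; echo
```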






bug#45648: `dd` seek/skip which way is up?

2021-01-04 Thread Paul Eggert

On 1/4/21 3:07 PM, Bernhard Voelker wrote:

I previously encountered a `dd` implementation which also accepted
'oseek=N' and 'iseek=N', which I found far more natural and easy to
remember.

What 'dd' implementation was this specifically?


Solaris dd has iseek and oseek. However, they are not aliases for skip 
and seek. If coreutils dd were to add these features I expect we should 
do them the Solaris way, instead of making them aliases for skip and 
seek. This would take more work than the proposed patches.


https://docs.oracle.com/cd/E36784_01/html/E36871/dd-1m.html





bug#45258: mkdir man page unclear in describing -m flag

2020-12-15 Thread Paul Eggert
Thanks for your bug report. I installed the attached patch; although it 
doesn't use the exact wording you proposed, I hope it works well enough.

From 3ee0e25426a513c5da891ce6a370abed156a3b83 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 15 Dec 2020 11:52:19 -0800
Subject: [PATCH] doc: document mkdir -m -p better
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Chris Colohan wrote that the man page did not do enough to dispel
a common misunderstanding that “contributed to one of the scariest
outages Google has ever seen” (Bug#45258).
* doc/coreutils.texi (mkdir invocation):
* src/mkdir.c (usage): Document -m vs -p better.
---
 doc/coreutils.texi | 13 +
 src/mkdir.c|  3 ++-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index df0655c20..44ce7d2e0 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -10693,6 +10693,8 @@ Set the file permission bits of created directories to @var{mode},
 which uses the same syntax as
 in @command{chmod} and uses @samp{a=rwx} (read, write and execute allowed for
 everyone) for the point of the departure.  @xref{File permissions}.
+This option affects only directories given on the command line;
+it does not affect any parents that may be created via the @option{-p} option.
 
 Normally the directory has the desired file mode bits at the moment it
 is created.  As a GNU extension, @var{mode} may also mention
@@ -10708,15 +10710,18 @@ overridden in this way.
 @opindex --parents
 @cindex parent directories, creating
 Make any missing parent directories for each argument, setting their
-file permission bits to the umask modified by @samp{u+wx}.  Ignore
+file permission bits to @samp{=rwx,u+wx},
+that is, with the umask modified by @samp{u+wx}.  Ignore
 existing parent directories, and do not change their file permission
 bits.
 
-To set the file permission bits of any newly-created parent
-directories to a value that includes @samp{u+wx}, you can set the
+If the @option{-m} option is also given, it does not affect
+file permission bits of any newly-created parent directories.
+To control these bits, set the
 umask before invoking @command{mkdir}.  For example, if the shell
 command @samp{(umask u=rwx,go=rx; mkdir -p P/Q)} creates the parent
-@file{P} it sets the parent's permission bits to @samp{u=rwx,go=rx}.
+@file{P} it sets the parent's file permission bits to @samp{u=rwx,go=rx}.
+(The umask must include @samp{u=wx} for this method to work.)
 To set a parent's special mode bits as well, you can invoke
 @command{chmod} after @command{mkdir}.  @xref{Directory Setuid and
 Setgid}, for how the set-user-ID and set-group-ID bits of
diff --git a/src/mkdir.c b/src/mkdir.c
index 8f07d666e..1f4588f10 100644
--- a/src/mkdir.c
+++ b/src/mkdir.c
@@ -65,7 +65,8 @@ Create the DIRECTORY(ies), if they do not already exist.\n\
 
   fputs (_("\
   -m, --mode=MODE   set file mode (as in chmod), not a=rwx - umask\n\
-  -p, --parents no error if existing, make parent directories as needed\n\
+  -p, --parents no error if existing, make parent directories as needed,\n\
+with their file modes unaffected by any -m option.\n\
   -v, --verbose print a message for each created directory\n\
 "), stdout);
   fputs (_("\
-- 
2.27.0
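
The documented behavior can be observed with a sketch like the following, which borrows the P/Q names from the manual text. The umask of 077 and the mode of 755 are arbitrary choices for the demonstration.

```shell
tmp=$(mktemp -d)
(
  cd "$tmp" || exit 1
  umask 077
  mkdir -p -m 755 P/Q
)

# cut trims any trailing ACL or SELinux marker that ls may append to the mode.
parent_mode=$(ls -ld "$tmp/P"   | cut -c1-10)
child_mode=$(ls -ld "$tmp/P/Q" | cut -c1-10)
echo "P:   $parent_mode"
echo "P/Q: $child_mode"
```

Only Q, the directory named on the command line, should get the 755 mode. The parent P is governed by the umask alone, which is the misunderstanding the patch tries to dispel.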



bug#45093: Character 149 causing ASCII BEL output to console in Windoze port of Gnu CoreUtils

2020-12-07 Thread Paul Eggert

On 12/7/20 11:38 AM, Robert S. Kissel wrote:

If you could possibly direct me to the maintainers of the pre-compiled
Windoze port (I'm certain that I downloaded it from the gnu.org
Web-site)


Sure about that? I'm not aware of any. At any rate, wherever you downloaded it 
from should have contact info.






bug#45093: Character 149 causing ASCII BEL output to console in Windoze port of Gnu CoreUtils

2020-12-07 Thread Paul Eggert

On 12/6/20 8:23 PM, Robert S. Kissel wrote:

I'm pretty sure this is a bug in the Windoze port of head and tail,


You should have better luck writing directly to the people who prepared that 
port, as they don't hang out on this mailing list and we largely don't worry 
about MS-Windows.






bug#44763: Error when 'make'ing latest version of coreutils

2020-11-21 Thread Paul Eggert
Thanks for reporting your recipe for working around all these problems. I've 
installed patches for the problems into coreutils and gnulib and am closing the 
bug report.


On 11/21/20 3:45 PM, Chris Elvidge wrote:
git commit -m 'build: update gnulib submodule to latest' gnulib 2>&1 | tee -a $outfiles/out_commit.1.txt


I suggest doing the bootstrap after this 'git commit', not earlier.


Because of the abovementioned patches, you should no longer need to do the 
following steps:



# Berny's addition
git clean -xdfq && ./bootstrap 2>&1 | tee -a $outfiles/out_bootstrap.2.txt

./configure 2>&1 | tee -a $outfiles/out_configure.1.txt

# do edit to make make work
# Akim's change - make it expect a long not a long long
sed -i -e '2301s/%"PRIdMAX"/%ld/' lib/parse-datetime.y
sed -n 2301p lib/parse-datetime.y

# do three edits to make make check work
# put 'return NULL;' back before '/*NOTREACHED*/' # explained by Berny
sed -i -e '184s#\(/\*NOTREACHED\*/\)#return NULL; \1#' \
  gnulib/tests/test-nl_langinfo-mt.c
sed -n 184p gnulib/tests/test-nl_langinfo-mt.c

sed -i -e '94s#\(/\*NOTREACHED\*/\)#return NULL; \1#' \
  gnulib/tests/test-setlocale_null-mt-all.c
sed -n 94p gnulib/tests/test-setlocale_null-mt-all.c

sed -i -e '94s#\(/\*NOTREACHED\*/\)#return NULL; \1#' \
  gnulib/tests/test-setlocale_null-mt-one.c
sed -n 94p gnulib/tests/test-setlocale_null-mt-one.c

# pause here to make sure edits done properly
read -p "Press return to continue" junk







bug#44763: Error when 'make'ing latest version of coreutils

2020-11-21 Thread Paul Eggert

On 11/21/20 6:37 AM, Chris Elvidge wrote:

parse-datetime.y: In function 'parse_datetime2':
parse-datetime.y:2301:27: error: format '%lld' expects argument of type 'long 
long int', but argument 2 has type 'time_t {aka long int}' [-Werror=format=]


That's due to a typo that I recently introduced to parse-datetime.y. Thanks for 
reporting it. (I didn't observe the problem since I tested on hosts with 64-bit 
time_t, not 32-bit.) I installed the attached patch into Gnulib and propagated 
this into Coreutils.

From fdf0468198631a456406edc09983972edb8fa5c4 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 21 Nov 2020 19:04:10 -0800
Subject: [PATCH] parse-datetime: fix printf format typo

* lib/parse-datetime.y (parse_datetime2): Fix format typo in
previous patch to this file.  Problem reported by Chris Elvidge in
<https://bugs.gnu.org/44763#32>.
---
 ChangeLog| 5 +
 lib/parse-datetime.y | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index de92d102e..229945e86 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2020-11-21  Paul Eggert  
 
+	parse-datetime: fix printf format typo
+	* lib/parse-datetime.y (parse_datetime2): Fix format typo in
+	previous patch to this file.  Problem reported by Chris Elvidge in
+	<https://bugs.gnu.org/44763#32>.
+
 	setlocale-null-tests: work around GCC bug 44511
 	* tests/test-setlocale_null-mt-all.c:
 	* tests/test-setlocale_null-mt-one.c:
diff --git a/lib/parse-datetime.y b/lib/parse-datetime.y
index 44ae90350..e8ed691c8 100644
--- a/lib/parse-datetime.y
+++ b/lib/parse-datetime.y
@@ -2298,7 +2298,8 @@ parse_datetime2 (struct timespec *result, char const *p,
   "%+"PRIdMAX" seconds, %+d ns),\n"),
 pc.rel.hour, pc.rel.minutes, pc.rel.seconds,
 pc.rel.ns);
-dbg_printf (_("new time = %"PRIdMAX" epoch-seconds\n"), t4);
+intmax_t t4i = t4;
+dbg_printf (_("new time = %"PRIdMAX" epoch-seconds\n"), t4i);
 
 /* Warn about crossing DST due to time adjustment.
Example: https://bugs.gnu.org/8357
-- 
2.27.0



bug#44763: Error when 'make'ing latest version of coreutils

2020-11-21 Thread Paul Eggert

On 11/21/20 5:17 AM, Pádraig Brady wrote:

The info in https://bugs.gnu.org/44739 must be incorrect,
and we've two counter checks to it now.


Yes, that sounds right. Closing that bug report.





bug#44704: uniq: replace repeated lines with a message about how many repeated lines

2020-11-17 Thread Paul Eggert

On 11/17/20 5:32 AM, Brian J. Murrell wrote:
> [previous line repeated 4 times]

uniq -c already does something like that, though it outputs "5" instead of "4". 
Not sure it's worth gussying up 'uniq' to provide exactly the functionality 
requested, as output reformatting is easy enough to do yourself using awk or 
Python or whatever.
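
One possible shape for such a reformatter is sketched below. The function name collapse_repeats is made up, and the message text follows the format quoted in the request.

```shell
collapse_repeats() {
  uniq -c | awk '{
    count = $1
    sub(/^ *[0-9]+ /, "")        # drop the count that uniq -c prepended
    print
    if (count > 1) { printf "[previous line repeated %d times]\n", count - 1 }
  }'
}

printf 'a\na\na\na\na\nb\n' | collapse_repeats
```

For five consecutive "a" lines this prints "a" once followed by "[previous line repeated 4 times]", then "b". The subtraction accounts for the difference between the 5 that uniq -c reports and the 4 in the requested message.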






bug#44695: error - GraphClust2 docker

2020-11-16 Thread Paul Eggert

On 11/16/20 10:58 AM, Christina Palka via GNU coreutils Bug Reports wrote:

I got the following error when attempting to install GraphClust2 using
Docker on Mac.


tail: unrecognized file system type 0x794c7630 for
‘/home/galaxy/logs/uwsgi.log’. please report this to bug-coreut...@gnu.or


Thanks for reporting that. As it happens, the problem was fixed in coreutils 
8.25 (2016-01-20); see:


https://debbugs.gnu.org/cgi/bugreport.cgi?bug=27513

so I'll close the bug report and you should be able to fix the problem by 
upgrading to a more-modern coreutils.






bug#44587: ls prints garbage when listing contents of a directory without exec permissions

2020-11-11 Thread Paul Eggert

On 11/11/20 3:24 PM, Jan Schaumann wrote:

$ ls -la dir
ls: cannot access dir/.: Permission denied
ls: cannot access dir/..: Permission denied
ls: cannot access dir/file: Permission denied
total 0
d? ? ? ? ?? .
d? ? ? ? ?? ..
-? ? ? ? ?? file
$


Expected output:

$ ls -la dir
ls: cannot access dir/.: Permission denied
ls: cannot access dir/..: Permission denied
ls: cannot access dir/file: Permission denied


As Bernhard mentioned, the actual output is intentional. The expected output 
would be less useful, as it would give the user a bit less information (e.g., it 
would not tell the user whether 'file' is a regular file or a directory).






bug#44248: Indentation of --help and --version

2020-10-26 Thread Paul Eggert
One way to attack the problem is (1) use only one-liners for option help, and 
(2) not worry about indentation so much (either in English or in German) as the 
excess indenting doesn't help readability enough to justify the translation 
hassle. To do that, I propose changes like the attached for comm. This will 
cause 'comm --help' output to look like the following, which is good enough and 
which will still work with help2man:


Usage: comm [OPTION]... FILE1 FILE2
Compare sorted files FILE1 and FILE2 line by line.

When FILE1 or FILE2 (not both) is -, read standard input.

With no options, produce three-column output.  Column one contains
lines unique to FILE1, column two contains lines unique to FILE2,
and column three contains lines common to both files.

  -1  suppress column 1 (lines unique to FILE1)
  -2  suppress column 2 (lines unique to FILE2)
  -3  suppress column 3 (lines that appear in both files)

  --check-order  check that the input is correctly sorted
  --nocheck-order  do not check that the input is correctly sorted
  --output-delimiter=STR  separate columns with STR
  --total  output a summary
  -z, --zero-terminated  line delimiter is NUL, not newline
  --help display this help and exit
  --version  output version information and exit

Note, comparisons honor the rules specified by 'LC_COLLATE'.

Examples:
  comm -12 file1 file2  Print only lines present in both file1 and file2.
  comm -3 file1 file2  Print lines in file1 not in file2, and vice versa.

GNU coreutils online help: 
Full documentation 
or available locally via: info '(coreutils) comm invocation'
diff --git a/src/comm.c b/src/comm.c
index 2bf8094bf..2893746cb 100644
--- a/src/comm.c
+++ b/src/comm.c
@@ -128,24 +128,23 @@ and column three contains lines common to both files.\n\
 "), stdout);
   fputs (_("\
 \n\
-  -1  suppress column 1 (lines unique to FILE1)\n\
-  -2  suppress column 2 (lines unique to FILE2)\n\
-  -3  suppress column 3 (lines that appear in both files)\n\
+  -1  suppress column 1 (lines unique to FILE1)\n\
+  -2  suppress column 2 (lines unique to FILE2)\n\
+  -3  suppress column 3 (lines that appear in both files)\n\
 "), stdout);
   fputs (_("\
 \n\
-  --check-order check that the input is correctly sorted, even\n\
-  if all input lines are pairable\n\
-  --nocheck-order   do not check that the input is correctly sorted\n\
+  --check-order  check that the input is correctly sorted\n\
+  --nocheck-order  do not check that the input is correctly sorted\n\
 "), stdout);
   fputs (_("\
   --output-delimiter=STR  separate columns with STR\n\
 "), stdout);
   fputs (_("\
-  --total   output a summary\n\
+  --total  output a summary\n\
 "), stdout);
   fputs (_("\
-  -z, --zero-terminatedline delimiter is NUL, not newline\n\
+  -z, --zero-terminated  line delimiter is NUL, not newline\n\
 "), stdout);
   fputs (HELP_OPTION_DESCRIPTION, stdout);
   fputs (VERSION_OPTION_DESCRIPTION, stdout);


bug#43828: invalid date converting from UTC, near DST

2020-10-06 Thread Paul Eggert

On 10/6/20 4:24 AM, Martin Fido wrote:

I have version 8.25:


Seems to have been fixed by coreutils 8.30:

$ TZ='Australia/Sydney' date -d '2020-10-04T02:00:00Z'
Sun 04 Oct 2020 01:00:00 PM AEDT





bug#43657: rm does not delete files

2020-09-28 Thread Paul Eggert

On 9/27/20 8:58 PM, Amit Rao wrote:

There's a limit? My first attempt didn't use a wildcard; i attempted to delete 
a directory.


'rm dir' fails because 'rm' by default leaves directories alone.


My second attempt was rm -rf dir/*


If "dir" has too many files that will fail due to shell limitations that have 
nothing to do with Coreutils. Use 'rm -rf dir' instead.
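
A sketch of two ways to stay clear of the argument-list limit (ARG_MAX on POSIX systems), using a scratch directory filled with 2000 invented file names:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/dir"
i=0
while [ "$i" -lt 2000 ]; do
  : > "$tmp/dir/file$i"
  i=$(( i + 1 ))
done

# find passes the names to rm in batches that fit under the limit,
# and leaves the directory itself in place.
find "$tmp/dir" -type f -exec rm -f {} +
remaining=$(ls -A "$tmp/dir" | wc -l)
echo "files left: $remaining"

# Removing the directory as a whole needs only one argument.
rm -rf "$tmp/dir"
```

In neither command does the shell expand a wildcard, so the number of files in the directory never reaches the command line.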






bug#43657: rm does not delete files

2020-09-27 Thread Paul Eggert

On 9/27/20 1:00 PM, Amit Rao wrote:

rm /path/*

does not delete files if there are a lot (say 2000) of them in a single
directory


What does the command do instead?

There is a limit as to how many arguments you can pass to 'rm'. If that's what 
you ran into, it's a problem with your kernel or your shell, not with 'rm'.






bug#43497: ls exit status on removed directory

2020-09-21 Thread Paul Eggert

On 9/18/20 4:15 PM, Philip Rowlands wrote:


$ mkdir /tmp/abc
$ cd /tmp/abc
$ rmdir /tmp/abc
$ ls

What happened:
no output, successful exit status

What was expected:
no output, unsuccessful exit status


POSIX says that the rmdir command is supposed to behave like the rmdir syscall. 
For the syscall, POSIX allows either of the two behaviors you mention, as it says
that if the rmdir syscall's argument is "the current working directory of any 
process, it is unspecified whether the function succeeds, or whether it shall 
fail and set errno to [EBUSY]". The Linux kernel rmdir syscall succeeds, so 
coreutils rmdir succeeds.



ls tried to list the contents of . but failed to do so, at least on Linux:
open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, 0x55e10c419cf0, 32768)  = -1 ENOENT (No such file or directory)


ls doesn't use getdents directly; it uses the readdir function of the GNU C 
library, which specifically tests for this situation and sets errno to 0, with 
this comment at 
:


  /* On some systems getdents fails with ENOENT when the
     open directory has been rmdir'd already.  POSIX.1
     requires that we treat this condition like normal EOF.  */

It's not clear to me that this comment is correct for current POSIX, but anyway 
this is a matter for the GNU C library not for coreutils ls, so if you think 
there's a bug there I suggest filing a glibc bug report 
.






bug#43415: coreutils 8.32: install: fchmod fails with EBADF

2020-09-15 Thread Paul Eggert

On 9/14/20 6:31 PM, Cameron Nemo via GNU coreutils Bug Reports wrote:

It seems like relying on the /proc link is not ideal,
and a bug is being hidden by such behavior.
Is there any chance that this can be resolved?


It really should be fixed in the Linux kernel: it needs a proper way to 
implement POSIX fchmodat  
with the AT_SYMLINK_NOFOLLOW flag, in order to plug some security holes 
involving symlink attacks. See:


https://bugzilla.redhat.com/show_bug.cgi?id=1810141
https://lkml.org/lkml/2020/6/9/548

In the meantime, mounting /proc may be your best bet. I vaguely recall there are 
other places in glibc that assume /proc.






bug#43162: chgrp clears setgid even when group is not changed

2020-09-01 Thread Paul Eggert

On 9/1/20 3:30 PM, Karl Berry wrote:

I was on centos7.

 (I don't observe your problem on my Fedora 31 box, for example).

Maybe there is hope for a future centos, then.


Maybe. Or it could be a filesystem or mounting issue. My filesystem was ext4 
mounted rw,relatime,seclabel, for what it's worth.


Anyway, closing the bug report.





bug#43162: chgrp clears setgid even when group is not changed

2020-09-01 Thread Paul Eggert

On 9/1/20 2:25 PM, Karl Berry wrote:

Is it necessary for chgrp to clear setgid on directories even when the
group is not actually changed? In my life at least, it is rather
annoying.


The chgrp command isn't doing that directly; it's merely invoking the fchownat 
syscall, and the syscall is clearing setgid.


POSIX requires chgrp to behave like the chown syscall even if the file's group 
is already correct, and it appears that the syscall clears the setgid bit on 
your platform (a behavior that POSIX allows, and even requires for regular 
files). So partly this is a platform issue (I don't observe your problem on my 
Fedora 31 box, for example).


I don't see an easy way to change chgrp without departing from POSIX, or perhaps 
adding a run-time option to the chown and chgrp commands. Not sure it's worth it.
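
To see which behavior a particular platform has, something like the
following sketch can be used (scratch directory name is arbitrary). It
re-applies the group a setgid directory already has and reports whether the
bit survived; either answer is allowed by POSIX for directories.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/d"
chmod g+s "$scratch/d"

# Re-apply the group the directory already has.
current_group=$(stat -c %g "$scratch/d")
chgrp "$current_group" "$scratch/d"

# test -g is true when the setgid bit is still set.
if [ -g "$scratch/d" ]; then
  setgid_state=kept
else
  setgid_state=cleared
fi
echo "setgid bit after no-op chgrp: $setgid_state"
rm -rf "$scratch"
```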






bug#42804: mkdir saying it can't create folder although it created it

2020-08-14 Thread Paul Eggert

On 8/11/20 3:03 PM, Nick Levinson via GNU coreutils Bug Reports wrote:


I don't know what an example transcript is


Ah, I was asking for the output of a shell session (or however you invoked 
'mkdir'). The idea is that we would like to reproduce the bug, and need a recipe 
to do that.


If you're using GNU/Linux, the strace output would be quite helpful, e.g., run 
the shell command:


strace -o tr.txt mkdir foobar

and then send us a copy of the file tr.txt, assuming 'mkdir foobar' fails in the 
way you describe.






bug#42804: mkdir saying it can't create folder although it created it

2020-08-10 Thread Paul Eggert

On 8/10/20 11:48 AM, Nick Levinson via GNU coreutils Bug Reports wrote:

When I use mkdir, when it succeeds sometimes it has no message, which is 
acceptable, but sometimes it says this:

mkdir: cannot create directory '': File exists
But the directory didn't exist before I used mkdir to make it, and I show 
hidden files, thus also hidden directories.


Unfortunately I do not understand this bug report. I don't get the connection 
between the diagnostic and hidden directories.


Can you give an example transcript of the bug and explain it a bit more? Thanks.





bug#42766: file names with spaces are quoted in the output from ls

2020-08-09 Thread Paul Eggert

On 8/8/20 9:09 AM, David Thomas wrote:

If most people think things are a bad idea, why do them?


I don't see any real evidence that most people think the change is a bad idea. 
Although there have been complaints, that doesn't mean most people are 
complaining, or that most people are unhappy about the change.


In practice I've found the new behavior to be significantly safer. I too often 
have to deal with files with shell metacharacters in their names (people send me 
all sorts of weird stuff). The old 'ls' behavior was quite dangerous in that 
respect.



at first I was typing out the quotes to cd into them. Then I discovered it 
still worked to cd into them without typing the quotes


What file names were these, exactly? If 'ls' is overquoting, that's something we 
could fix without affecting safety.
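
For comparison, here is a sketch of the two styles on a name containing a
space (the file name is just an example). People who prefer the old output
can ask for it with -N, --quoting-style=literal, or the QUOTING_STYLE
environment variable.

```shell
scratch=$(mktemp -d)
: > "$scratch/a b"

# Literal names, as older ls printed them (same as ls -N).
literal=$(ls --quoting-style=literal "$scratch")
# Shell-escaped names, the style newer ls uses when writing to a terminal.
escaped=$(ls --quoting-style=shell-escape "$scratch")

printf 'literal: %s\nescaped: %s\n' "$literal" "$escaped"
rm -rf "$scratch"
```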






bug#42470: Help text update suggestion for "date" util

2020-07-27 Thread Paul Eggert

On 7/25/20 8:07 AM, Wes Novack wrote:

Thank you! For future reference, what is the PR process?


See:

https://debbugs.gnu.org/

and look for "read more" if you're interested.





bug#42358: mv w/mkdir -p of destination

2020-07-15 Thread Paul Eggert

On 7/14/20 3:36 PM, L A Walsh wrote:

But I've found asking for features usually doesn't work and sometimes
results in work to preclude future
implementation of the feature.  Reporting bugs also, often gets ignored
until some large company reports
the same problem or until it causes a serious enough security incident.


You've often disagreed with design decisions made by maintainers, but this is 
the first time I recall you've accused them of large-company bias. Perhaps you 
should get your other grievances off your chest while you're at it.


I haven't noticed any such bias myself. Anyway, it does help to propose good 
patches, since my volunteer time is limited.






bug#42269: Remove non-GMP code from coreutils factor.c

2020-07-08 Thread Paul Eggert

On 7/8/20 12:34 PM, Torbjörn Granlund wrote:


Any number which does not happen to be B-smooth for, say B < 2^30, will
show easily measurable performance difference of 5x to 40x IIRC.


Ah, I had tried the example in the manual, (2^31 - 1) * (2^61 - 1). Even though 
it isn't B-smooth for B < 2^30, the performance difference was only 2x on my 
machine. I just now tried 2^127 - 1 and saw a similar performance difference, 
but 2^127 - 3 had a 15x difference so it's a better example.
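
For anyone wanting to repeat the comparison, a sketch with the two inputs
written out as decimal literals (so 'bc' is not needed); prefix either
command with 'time' to measure it. Timings will of course vary by machine
and build.

```shell
# Product of the eighth and ninth Mersenne primes, (2^31 - 1) * (2^61 - 1).
n=4951760154835678088235319297
factors=$(factor "$n")
echo "$factors"

# 2^127 - 3, the input reported above as showing the larger gap between
# the two-word code and the GMP code.
big_result=$(factor 170141183460469231731687303715884105725)
echo "$big_result"
```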


I installed the attached to try to document this better.


I have a patch which makes the non-GMP code some 2x - 3x faster.  It's
been maturing for several years now, so I suppose I should really finish
it.  (It got tangled with code which improves the GMP case by letting it
fall into the non-GMP code as numbers get smaller.  That sounds simple
but is quite messy for various reasons.  It is also not clear how much
complexity we could defend for this command of limited utility.)


Yes, 'factor' is just a minor utility needed for POSIX compliance. Although it'd 
be nice to get that 2x-3x improvement whenever you have the time, it's not 
urgent. Thanks for your guidance on the GMP issue.


From ba1489d763b66dd1fcec08ecb4cba5917745f6bf Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 8 Jul 2020 18:58:18 -0700
Subject: [PATCH] factor: explain why non-GMP code (Bug#42269)

* doc/coreutils.texi (factor invocation):
* src/factor.c: Explain why the two-word algorithm is useful.
---
 doc/coreutils.texi | 24 ++--
 src/factor.c   |  5 +
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 6ec1e6c31..656b8bc79 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -18368,14 +18368,17 @@ Print the program version on standard output, then exit without further
 processing.
 @end table
 
-Factoring the product of the eighth and ninth Mersenne primes
-takes about 4 milliseconds of CPU time on an Intel Xeon Silver 4116.
+If the number to be factored is small (less than @math{2^{127}} on
+typical machines), @command{factor} uses a faster algorithm.
+For example, on a circa-2017 Intel Xeon Silver 4116, factoring the
+product of the eighth and ninth Mersenne primes (approximately
+@math{2^{92}}) takes about 4 ms of CPU time:
 
 @example
-M8=$(echo 2^31-1|bc)
-M9=$(echo 2^61-1|bc)
-n=$(echo "$M8 * $M9" | bc)
-bash -c "time factor $n"
+$ M8=$(echo 2^31-1 | bc)
+$ M9=$(echo 2^61-1 | bc)
+$ n=$(echo "$M8 * $M9" | bc)
+$ bash -c "time factor $n"
 4951760154835678088235319297: 2147483647 2305843009213693951
 
 real	0m0.004s
@@ -18383,11 +18386,12 @@ user	0m0.004s
 sys	0m0.000s
 @end example
 
-Similarly, factoring the eighth Fermat number @math{2^{256}+1} takes
-about 14 seconds on the same machine.
+For larger numbers, @command{factor} uses a slower algorithm.  On the
+same platform, factoring the eighth Fermat number @math{2^{256} + 1}
+takes about 14 seconds, and the slower algorithm would have taken
+about 750 ms to factor @math{2^{127} - 3} instead of the 50 ms needed by
+the faster algorithm.
 
-The single-precision code uses an algorithm
-designed for factoring smaller numbers.
 Factoring large numbers is, in general, hard.  The Pollard-Brent rho
 algorithm used by @command{factor} is particularly effective for
 numbers with relatively small factors.  If you wish to factor large
diff --git a/src/factor.c b/src/factor.c
index c1c35a562..1b1607f16 100644
--- a/src/factor.c
+++ b/src/factor.c
@@ -53,6 +53,11 @@
 trick of multiplying all n-residues by the word base, allowing cheap Hensel
 reductions mod n.
 
+The GMP code uses an algorithm that can be considerably slower;
+for example, on a circa-2017 Intel Xeon Silver 4116, factoring
+2^{127}-3 takes about 50 ms with the two-word algorithm but would
+take about 750 ms with the GMP code.
+
   Improvements:
 
 * Use modular inverses also for exact division in the Lucas code, and
-- 
2.17.1



bug#42269: Remove non-GMP code from coreutils factor.c

2020-07-08 Thread Paul Eggert

On 7/8/20 9:57 AM, Torbjörn Granlund wrote:


The non-GMP code of coreutils was extremely well-tuned by me and Niels
Möller a couple of years ago.


How time flies! The code was merged in 2012.


By leaving just the GMP code, you would create a pretty useless factor
command.  Any naive old factor command would often beat it.  It would
make much more sense to remove the factor command altogether.


OK, thanks. Then let's forget about the patch I just proposed.

Could you give an example of where the 128-bit code shines, compared to the GMP 
code on the same arguments? I could add the example as a comment in the factor.c 
code, to let me and future maintainers know why it's useful for performance.






bug#42211: Problem in sort

2020-07-06 Thread Paul Eggert
On 7/5/20 9:53 PM, Richard Freedman wrote:

> I discovered that trying to use -c with --debug causes an error - but not in 
> the version that I have
> on my mac laptop !

Ah, I had meant to suggest using --debug without -c.

> when I try to specify a "key" even for a file with only 1 column - the 
> program stops on consecutive entries
> that are identical.

When all keys compare equal, 'sort' falls back on a last-resort comparison of
the entire line to break ties, and it's finding that your lines are out of
order. You don't want 'sort' to do that, so you should specify the -s (--stable)
option. -s is a GNU extension.
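
A minimal sketch of the effect, with made-up input: "1.0" and "1" are equal
as numbers, so only the last-resort comparison can tell them apart.

```shell
scratch=$(mktemp -d)
# "1.0" and "1" are equal under -n but differ as byte strings.
printf '1.0\n1\n2\n' > "$scratch/input"

# Without -s, the last-resort whole-line comparison sees "1.0" before "1"
# and reports disorder.
LC_ALL=C sort -n -c -k1,1 "$scratch/input" 2>/dev/null
plain_status=$?

# With -s, ties are left alone and the check passes.
LC_ALL=C sort -s -n -c -k1,1 "$scratch/input"
stable_status=$?

echo "without -s: $plain_status, with -s: $stable_status"
rm -rf "$scratch"
```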

Closing the bug report, as this should fix the problem for you.





bug#42211: Problem in sort

2020-07-05 Thread Paul Eggert
On 7/4/20 2:39 PM, Richard Freedman wrote:
> When I use sort -n -c on a specified column in a file sort reports an error 
> and then stops if two numbers are exactly the same.

Could you send us the input, and the output of "sort --debug -n -c -k3"
(assuming you're using column 3)? My guess is that the output will explain the
symptoms you're seeing, but if not then we'd like to see the test case. Thanks.





bug#8061: Introduce SEEK_DATA/SEEK_HOLE to extent_scan module

2020-06-25 Thread Paul Eggert
This email is a follow-up to <https://bugs.gnu.org/8601> dated 2011-05-01. Jeff,
thanks for reporting the problem. (There's a good chance this email will bounce
but I'll send it to your 2011 email address anyway.)

I recently ran into the same issue and derived the attached patches
independently. I then found your bug report, made sure the attached patches
fixed every problem that your proposal did, and installed the attached patches
into Savannah.

The attached patches 1-3 merely fix typos and refactor.

Patch 4 corresponds to your proposal; however, it differs in that its basic idea
is to use the FIEMAP code only as a fallback if SEEK_DATA doesn't work, rather
than try to add to the already-too-complicated code that fiddles with FIEMAPs.
(I don't observe any significant performance advantage to the FIEMAP stuff, but
maybe that's just me.)

Patch 5 adds opportunistic use of the copy_file_range syscall introduced in
Linux kernel 4.5 (2016) and reworked in 5.3 (2019). This should improve 'cp'
performance on kernels and file systems that support copy_file_range.
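
A rough way to exercise this code path from the shell, as a sketch only
(file names are arbitrary, and whether holes are preserved depends on the
file system): copy a file that is one large hole and compare the result.

```shell
scratch=$(mktemp -d)
# A 1 MiB file that is one big hole (no data blocks need be allocated).
truncate -s 1M "$scratch/sparse"

# cp chooses how to look for holes (SEEK_DATA, FIEMAP, or a plain read loop).
cp --sparse=auto "$scratch/sparse" "$scratch/copy"

cmp "$scratch/sparse" "$scratch/copy"
same_contents=$?
copy_size=$(stat -c %s "$scratch/copy")
# %b reports allocated blocks; a small number suggests the hole was kept.
echo "size: $copy_size bytes, allocated blocks: $(stat -c %b "$scratch/copy")"
rm -rf "$scratch"
```
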
From 4fe5259ab6c9e459a6db5938d143a9c65be113d9 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 25 Jun 2020 18:10:49 -0700
Subject: [PATCH 1/5] maint: typo fix

* NEWS: Fix typo.
---
 NEWS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index d36259641..d713fa724 100644
--- a/NEWS
+++ b/NEWS
@@ -17,7 +17,7 @@ GNU coreutils NEWS-*- outline -*-
 
   cp and install now default to copy-on-write (COW) if available.
 
-  On GNU/Linux systems, ls no longer issues an error message on
+  On GNU/Linux systems, ls no longer issues an error message on a
   directory merely because it was removed.  This reverts a change
   that was made in release 8.32.
 
-- 
2.25.4

From 51981008f9892d44231c432535deac4f9b3cbe5e Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 23 Jun 2020 19:18:04 -0700
Subject: [PATCH 2/5] cp: refactor extent_copy

* src/copy.c (extent_copy): New arg SCAN, replacing
REQUIRE_NORMAL_COPY.  All callers changed.
(enum scantype): New type.
(infer_scantype): Rename from is_probably_sparse and return
the new type.  Add args FD and SCAN.  All callers changed.
---
 src/copy.c | 119 +
 1 file changed, 55 insertions(+), 64 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index 54601ce07..f694f913f 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -422,9 +422,8 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
  size_t hole_size, off_t src_total_size,
  enum Sparse_type sparse_mode,
  char const *src_name, char const *dst_name,
- bool *require_normal_copy)
+ struct extent_scan *scan)
 {
-  struct extent_scan scan;
   off_t last_ext_start = 0;
   off_t last_ext_len = 0;
 
@@ -432,45 +431,25 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
  We may need this at the end, for a final ftruncate.  */
   off_t dest_pos = 0;
 
-  extent_scan_init (src_fd, &scan);
-
-  *require_normal_copy = false;
   bool wrote_hole_at_eof = true;
-  do
+  while (true)
 {
-  bool ok = extent_scan_read (&scan);
-  if (! ok)
-{
-  if (scan.hit_final_extent)
-break;
-
-  if (scan.initial_scan_failed)
-{
-  *require_normal_copy = true;
-  return false;
-}
-
-  error (0, errno, _("%s: failed to get extents info"),
- quotef (src_name));
-  return false;
-}
-
   bool empty_extent = false;
-  for (unsigned int i = 0; i < scan.ei_count || empty_extent; i++)
+  for (unsigned int i = 0; i < scan->ei_count || empty_extent; i++)
 {
   off_t ext_start;
   off_t ext_len;
   off_t ext_hole_size;
 
-  if (i < scan.ei_count)
+  if (i < scan->ei_count)
 {
-  ext_start = scan.ext_info[i].ext_logical;
-  ext_len = scan.ext_info[i].ext_length;
+  ext_start = scan->ext_info[i].ext_logical;
+  ext_len = scan->ext_info[i].ext_length;
 }
   else /* empty extent at EOF.  */
 {
   i--;
-  ext_start = last_ext_start + scan.ext_info[i].ext_length;
+  ext_start = last_ext_start + scan->ext_info[i].ext_length;
   ext_len = 0;
 }
 
@@ -498,7 +477,7 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
 {
   error (0, errno, _("cannot lseek %s"), quoteaf (src_name));
 fail:
-  extent_scan_free (&scan);
+  extent_scan_free (scan);
   return false;
 }
 
@@ -539,7 +518,7 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
   /* For now, do not tre

bug#41944: cp: default to --reflink=auto, revisted

2020-06-18 Thread Paul Eggert
Thanks, I'd forgotten that. The performance improvement is long overdue, so I
installed the attached.
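
For reference, a sketch of how the two modes differ in practice (file names
are arbitrary): 'auto' never fails merely because cloning is unsupported,
whereas 'always' does.

```shell
scratch=$(mktemp -d)
printf 'payload\n' > "$scratch/src"

# auto: try a copy-on-write clone, silently fall back to an ordinary copy.
cp --reflink=auto "$scratch/src" "$scratch/auto_copy"
auto_status=$?

# always: fail instead of falling back when the file system cannot clone.
if cp --reflink=always "$scratch/src" "$scratch/cow_copy" 2>/dev/null; then
  cow_supported=yes
else
  cow_supported=no
fi

echo "auto exit status: $auto_status, clone supported here: $cow_supported"
copied=$(cat "$scratch/auto_copy")
rm -rf "$scratch"
```
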
From 25725f9d41735d176d73a757430739fb71c7d043 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 18 Jun 2020 22:16:24 -0700
Subject: [PATCH] cp: default to COW
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Likewise for ‘install’.  Proposed in Bug#24400, and long past due.
* NEWS:
* doc/coreutils.texi (cp invocation):
* src/copy.h (enum Reflink_type): Document this.
* src/cp.c (cp_option_init):
* src/install.c (cp_option_init): Implement this.
---
 NEWS   |  2 ++
 doc/coreutils.texi | 19 ---
 src/copy.h |  4 ++--
 src/cp.c   |  2 +-
 src/install.c  |  2 +-
 5 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/NEWS b/NEWS
index 8ddd0e22f..655ff779f 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,8 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Changes in behavior
 
+  cp and install now default to copy-on-write (COW) if available.
+
   On GNU/Linux systems, ls no longer issues an error message on
   directory merely because it was removed.  This reverts a change
   that was made in release 8.32.
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 3432fb294..4bbb960b7 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -8864,12 +8864,14 @@ The @var{when} value can be one of the following:
 
 @table @samp
 @item always
-The default behavior: if the copy-on-write operation is not supported
+If the copy-on-write operation is not supported
 then report the failure for each file and exit with a failure status.
+Plain @option{--reflink} is equivalent to @option{--reflink=when}.
 
 @item auto
 If the copy-on-write operation is not supported then fall back
 to the standard copy behavior.
+This is the default if no @option{--reflink} option is given.
 
 @item never
 Disable copy-on-write operation and use the standard copy behavior.
@@ -8878,12 +8880,6 @@ Disable copy-on-write operation and use the standard copy behavior.
 This option is overridden by the @option{--link}, @option{--symbolic-link}
 and @option{--attributes-only} options, thus allowing it to be used
 to configure the default data copying behavior for @command{cp}.
-For example, with the following alias, @command{cp} will use the
-minimum amount of space supported by the file system.
-
-@example
-alias cp='cp --reflink=auto --sparse=always'
-@end example
 
 @item --remove-destination
 @opindex --remove-destination
@@ -8928,6 +8924,15 @@ This is useful in creating a file for use with the @command{mkswap} command,
 since such a file must not have any holes.
 @end table
 
+For example, with the following alias, @command{cp} will use the
+minimum amount of space supported by the file system.
+(Older versions of @command{cp} can also benefit from
+@option{--reflink=auto} here.)
+
+@example
+alias cp='cp --sparse=always'
+@end example
+
 @optStripTrailingSlashes
 
 @item -s
diff --git a/src/copy.h b/src/copy.h
index 874d6f71c..a0ad494b9 100644
--- a/src/copy.h
+++ b/src/copy.h
@@ -46,10 +46,10 @@ enum Sparse_type
 /* Control creation of COW files.  */
 enum Reflink_type
 {
-  /* Default to a standard copy.  */
+  /* Do a standard copy.  */
   REFLINK_NEVER,
 
-  /* Try a COW copy and fall back to a standard copy.  */
+  /* Try a COW copy and fall back to a standard copy; this is the default.  */
   REFLINK_AUTO,
 
   /* Require a COW copy and fail if not available.  */
diff --git a/src/cp.c b/src/cp.c
index 8db2c4b9e..a4ecbbc9f 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -793,7 +793,7 @@ cp_option_init (struct cp_options *x)
   x->move_mode = false;
   x->install_mode = false;
   x->one_file_system = false;
-  x->reflink_mode = REFLINK_NEVER;
+  x->reflink_mode = REFLINK_AUTO;
 
   x->preserve_ownership = false;
   x->preserve_links = false;
diff --git a/src/install.c b/src/install.c
index 22124d51b..a94053f4d 100644
--- a/src/install.c
+++ b/src/install.c
@@ -264,7 +264,7 @@ cp_option_init (struct cp_options *x)
 {
   cp_options_default (x);
   x->copy_as_regular = true;
-  x->reflink_mode = REFLINK_NEVER;
+  x->reflink_mode = REFLINK_AUTO;
   x->dereference = DEREF_ALWAYS;
   x->unlink_dest_before_opening = true;
   x->unlink_dest_after_failed_open = false;
-- 
2.17.1



bug#41664: du give file not accessible

2020-06-02 Thread Paul Eggert
On 6/2/20 2:42 AM, Sumit Gupta wrote:
> Is this expected or shall the command be modified to ignore such files?

It does seem reasonable that 'du' should ignore a file when the system call says
ENOENT, since the file isn't there and cannot be consuming disk space.





bug#37702: Suggestion for 'df' utility

2020-05-30 Thread Paul Eggert
On 5/30/20 4:49 AM, Erik Auerswald wrote:
> I concur that a command line option to override config file (or env var)
> settings seems useful if a config file and/or env var approach is used.

In other utilities we've been moving away from environment variables and/or
config files for the usual security and other-hassle reasons. So I'd prefer
having 'df' just do the "right" thing by default, and to have an option to
override that. The "right" thing should be to ignore all these pseudofilesystems
that hardly anybody cares about.





bug#41554: chmod allows removing x bit on chmod without a force flag, which can be inconvenient to recover from

2020-05-26 Thread Paul Eggert
On 5/26/20 6:30 PM, Will Rosecrans wrote:
> The underlying safety logic is similar to that behind the
> existing "--(no-)preserve-root"

I think not. There are all sorts of other things one shouldn't chmod either, but
we can't and shouldn't maintain a long list. Let's stop with "/".





bug#41480: Chars out of order in date.c string

2020-05-23 Thread Paul Eggert
On 5/23/20 4:41 AM, Anders Jonsson wrote:

> I noticed one thing when having a look at the Swedish translation of 
> coreutils.
> 
>>#: src/date.c:196
>>msgid ""
>>"  %F   full date; like %+4Y-%m-%d\n"

There must be some confusion here, because this translation is for coreutils
8.31 and later.

> This doesn't give the expected result when I try it in coreutils 8.30 in 
> Debian
> testing:

That's because the behavior of coreutils changed in 8.31. The translation string
you're talking about was introduced in coreutils 8.31, so I'm puzzled as to why
it'd be used with coreutils 8.30.

Here's the behavior change in 8.31:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=188d87b05190690d6f8b0577ec65ef221a711d08

and here's the closely-related documentation change in 8.31:

https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=2ab2f7a422652a9ec887e08ca8935b44e9629505
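
To see which behavior an installed 'date' has, a sketch like the following
may help. For a four-digit year the formats agree in any version; the
second command uses the first second of year 10000, where the 8.31 change
becomes visible, and its output is version-dependent so it is only printed.

```shell
# For a four-digit year, %F and the expanded form should agree.
iso=$(date -u -d '2020-05-23' +%F)
expanded=$(date -u -d '2020-05-23' +%Y-%m-%d)
echo "$iso $expanded"

# 253402300800 is 10000-01-01 00:00:00 UTC; how %F renders a five-digit
# year depends on the installed version.
date -u -d '@253402300800' +%F
```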





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-10 Thread Paul Eggert
On 5/7/20 7:06 PM, Eric Blake wrote:
> 
> (My personal wish: I would love a variation of mkdir that returns an open fd 
> on
> the just-created directory on success in a single syscall,

Yes! That would be a worthy addition.





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-02 Thread Paul Eggert
On 5/2/20 3:41 PM, Jonny Grant wrote:
> Is a more accurate strerror considered unreliable?
> 
> Current:
> mkdir: cannot create directory ‘test’: File exists
> 
> Proposed:
> mkdir: cannot create directory ‘test’: Is a directory

I don't understand this comment. As I understand it you're proposing a change to
the mkdir command not a change to the strerror library function, and the change
you're proposing would introduce a race condition to the mkdir command.

A better fix would be to change the mkdir system call so that it sets errno to
EISDIR in this situation. This would fix not only the mkdir utility, but also
lots of other programs; and it wouldn't introduce a race condition. So if you're
interested in getting the problem fixed, I suggest that you propose such a
change to the Linux kernel developers.





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-02 Thread Paul Eggert
On 5/2/20 6:26 AM, Jonny Grant wrote:
> If developers have race conditions in their shell scripts

I've personally fixed a bug in the GNU mkdir command that was triggered by such
races. Core utilities should be reliable even when these races are happening.





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-01 Thread Paul Eggert
On 5/1/20 1:21 PM, Jonny Grant wrote:
> yes, the fix pretty trivial for mkdir as you highlight EISDIR:
> stat(), S_ISDIR(sb.st_mode), and set errno to EISDIR or output 
> strerror(EISDIR)

That would introduce a race condition, and wouldn't behave correctly if some
other process changes the destination from a regular file to a directory between
the time we call mkdir and the time that we call stat.
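
For scripts that merely need the directory to exist afterwards, 'mkdir -p'
is a race-tolerant alternative. This is a sketch of that idiom with
arbitrary names, not the change proposed in this thread:

```shell
scratch=$(mktemp -d)
mkdir "$scratch/dir"
: > "$scratch/file"

# Succeeds when the directory already exists, whoever created it.
mkdir -p "$scratch/dir"
existing_dir_status=$?

# Still fails when the name is taken by a non-directory.
mkdir -p "$scratch/file" 2>/dev/null
regular_file_status=$?

echo "existing directory: $existing_dir_status, regular file: $regular_file_status"
rm -rf "$scratch"
```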





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-01 Thread Paul Eggert
On 5/1/20 9:16 AM, Jonny Grant wrote:
> rm: cannot remove 'test': Is a directory

That's because rm used unlink which failed with EISDIR, which is a different
error number.

Consider this example:

$ >d # Create an empty regular file.
$ mkdir d
mkdir: cannot create directory ‘d’: File exists

Here the system call mkdir("d", 0777) failed with errno == EEXIST (File exists).
Presumably you wouldn't object to the diagnostic here because d is a regular
file, not a directory. But the mkdir system call fails in exactly the same way
if d is a directory, so the error message is the same in both cases.

Directories are files, so the error message is correct even if it confused you.
I don't see any portable and efficient way to make the diagnostic less confusing
for you, without also making the diagnostic incorrect in some other scenarios (such
as the scenario described above).
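
The two cases can be put side by side with a sketch like this (scratch
names are arbitrary); both messages should end in the same EEXIST text:

```shell
scratch=$(mktemp -d)
: > "$scratch/d"        # d is an empty regular file
mkdir "$scratch/e"      # e is a directory

# Both calls fail in the mkdir syscall with EEXIST.
file_msg=$(LC_ALL=C mkdir "$scratch/d" 2>&1)
dir_msg=$(LC_ALL=C mkdir "$scratch/e" 2>&1)

printf '%s\n%s\n' "$file_msg" "$dir_msg"
rm -rf "$scratch"
```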





bug#40904: listing multiple subdirectories places filenames in different columns between each subdirectory

2020-04-27 Thread Paul Eggert
On 4/27/20 7:36 AM, Jim Clark wrote:
> When I list a hard drive "ls -AR > list.txt" and import it into Libreoffice
> Calc, then break the lines using "text-to-columns", I am not able to
> perform a fixed format break so that the filenames are placed in their own
> column.

I can't reproduce the problem. All the file names start at the beginning of the
line. Quite possibly you're using an alias, so that your 'ls' is not the plain
vanilla 'ls'. At any rate, 'find' is probably a better tool for what you want
to do.
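
For instance, a sketch along these lines (the tree is made up, and -printf
is a GNU find extension) emits one line per file with the directory and the
file name separated by a tab, which a spreadsheet can split reliably:

```shell
scratch=$(mktemp -d)
mkdir "$scratch/a" "$scratch/b"
: > "$scratch/a/x.txt"
: > "$scratch/b/y.txt"

# One line per file: directory, a tab, then the file name.
# -printf is a GNU find extension.
listing=$(cd "$scratch" && find . -type f -printf '%h\t%f\n' | LC_ALL=C sort)
printf '%s\n' "$listing"
rm -rf "$scratch"
```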





bug#40509: Use of fsetxattr() in cp tickles an EXT leak (possibly unnecessarily so)

2020-04-15 Thread Paul Eggert

On 4/15/20 7:11 AM, Gregg Leventhal wrote:


+xattr_size = flistxattr(src_fd, list, size);
+if ( xattr_size || errno == ERANGE )


Surely this should be 'if (flistxattr (src_fd, NULL, 0) < 0 && errno == 
ERANGE)'.


If you agree with this direction, I can continue, addressing other affected
code paths (i.e --preserve=mode).


This sounds like a good thing to do. Before you spend a lot of time on it, 
though, would you be willing to assign copyright to your work product to the FSF 
so that we could install the patch? If so, I can send you email on how to fill 
out the paperwork; if not, we'd better arrange for someone else to write the fix.






bug#40586: date and '%-N' does not appear to remove leading zeros anymore, but trailing zeros.

2020-04-12 Thread Paul Eggert

On 4/12/20 1:51 PM, Drake Jacovian wrote:

Obviously, removing trailing zeroes will change its value.


%-N is intended to be used after a decimal point, so removing trailing zeros 
does not change its value in its intended use.
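
A sketch of the intended use, with an arbitrary timestamp. The second
command's exact output depends on the installed version, so it is only
printed; either way the digits after the point denote the same fraction.

```shell
# %N always prints nine digits of nanoseconds.
nanos=$(date -u -d '2020-04-12 00:00:00.5' +%N)
echo "$nanos"

# After a decimal point, dropping trailing zeros keeps the value:
# .500000000 and .5 are the same fraction of a second.
date -u -d '2020-04-12 00:00:00.5' +%S.%-N
```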






bug#40540: Faster sort with locale

2020-04-10 Thread Paul Eggert

On 4/10/20 6:19 AM, Ole Tange wrote:

But would it be possible to convert the input string1 into a string in
a generalized format, which would sort the same way as the localized
sort, but using a simple compare?


I tried doing that a long time ago by using strxfrm, but it made 'sort' 
significantly slower. You're welcome to try again; perhaps things have changed.






bug#40509: Use of fsetxattr() in cp tickles an EXT leak (possibly unnecessarily so)

2020-04-08 Thread Paul Eggert

On 4/8/20 7:15 AM, Gregg Leventhal wrote:


rsync doesn't make set/get xattr calls and purports to preserve ACLs with
-A.


I'm not quite following your bug report, but it appears that you're saying that 
cp could somehow discover that it needn't use fgetxattr and fsetxattr on files 
that lack extended attributes, and for those files cp could stick with ordinary 
POSIX syscalls (e.g., umask, chmod) to give files proper permissions, and in 
that case 'cp' would presumably (a) operate more efficiently and (b) not trigger 
a bug in the EXT filesystem.


This sounds like a worthy suggestion, though of course it would be better to 
have a concrete proposal in the form of a coreutils patch, along with a few 
performance measurements. For starters, how does rsync do it?


Also, of course EXT should be fixed regardless of what coreutils does here.





bug#40220: date command set linux epoch time failed

2020-03-30 Thread Paul Eggert

On 3/29/20 9:32 PM, Bob Proulx wrote:

Both calls from GNU date are returning EINVAL.  Those are Linux kernel
system calls.  Those Linux kernel system calls are using
CLOCK_MONOTONIC.


OK, I think I understand now. For some reason Linux prohibits you from setting 
CLOCK_REALTIME to a value less than what CLOCK_MONOTONIC would report. I don't 
know why Linux has this restriction - it violates POSIX as near as I can tell - 
but at any rate as you say it's a problem with the Linux kernel, not with GNU 
'date'.
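
As a rough illustration only: on Linux, /proc/uptime gives seconds since
boot, which is close to (though not the same clock as) CLOCK_MONOTONIC. The
sketch below uses it to estimate the earliest time the kernel would accept;
the fallback value when /proc is missing is an assumption made for the
example.

```shell
# Seconds since boot, a rough stand-in for CLOCK_MONOTONIC on Linux.
if [ -r /proc/uptime ]; then
  uptime_seconds=$(cut -d. -f1 /proc/uptime)
else
  uptime_seconds=0   # assumption: no /proc, so no estimate is available
fi

# Setting the clock to an earlier instant than this is what, per the
# discussion above, the kernel rejects with EINVAL.
echo "earliest settable time (estimate): $(date -u -d "@$uptime_seconds")"
```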






bug#40220: date command set linux epoch time failed

2020-03-29 Thread Paul Eggert

On 3/28/20 9:12 AM, Bob Proulx wrote:

By reading the documentation for CLOCK_MONOTONIC in clock_gettime(2):


GNU 'date' doesn't use CLOCK_MONOTONIC, so why is CLOCK_MONOTONIC relevant to 
this bug report?


Is this some busybox thing? If so, user 'shy' needs to report it to the busybox 
people, not to bug-coreutils.






bug#40220: date command set linux epoch time failed

2020-03-28 Thread Paul Eggert

On 3/27/20 11:52 PM, Bob Proulx wrote:

I tested this in a victim system and if I was very quick I was able to
log in and set the time to :10 seconds but no earlier.


Sounds like some sort of atomic-time thing, since UTC and TAI differed by 10 
seconds when they started up in 1972. Perhaps the clock in question uses TAI 
internally?






bug#39929: coreutils-8.32 fails to build on aarch64

2020-03-07 Thread Paul Eggert

On 3/5/20 11:36 PM, Bernhard Voelker wrote:

s/emits/shall not emit/

P.S. Also the check for $host_triplet containing 'linux' in test is:
a) no longer needed, ...


Thanks for catching those; I installed the attached further patch.
From ab149bd415daf1cb8ecde0b948bc0a2663611a61 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 7 Mar 2020 10:29:51 -0800
Subject: [PATCH] ls: improve removed-directory test

* tests/ls/removed-directory.sh: Remove host_triplet test.
Skip this test if one cannot remove the working directory.
From a suggestion by Bernhard Voelker (Bug#39929).
---
 tests/ls/removed-directory.sh | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/tests/ls/removed-directory.sh b/tests/ls/removed-directory.sh
index fe8f929a1..63b209dee 100755
--- a/tests/ls/removed-directory.sh
+++ b/tests/ls/removed-directory.sh
@@ -1,7 +1,7 @@
 #!/bin/sh
-# If ls is asked to list a removed directory (e.g. the parent process's
-# current working directory that has been removed by another process), it
-# emits an error message.
+# If ls is asked to list a removed directory (e.g., the parent process's
+# current working directory has been removed by another process), it
+# should not emit an error message merely because the directory is removed.
 
 # Copyright (C) 2020 Free Software Foundation, Inc.
 
@@ -21,15 +21,10 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ ls
 
-case $host_triplet in
-  *linux*) ;;
-  *) skip_ 'non linux kernel' ;;
-esac
-
 cwd=$(pwd)
 mkdir d || framework_failure_
 cd d || framework_failure_
-rmdir ../d || framework_failure_
+rmdir ../d || skip_ "can't remove working directory on this platform"
 
 ls >../out 2>../err || fail=1
 cd "$cwd" || framework_failure_
-- 
2.17.1



bug#39929: coreutils-8.32 fails to build on aarch64

2020-03-05 Thread Paul Eggert

On 3/5/20 1:43 PM, Paul Eggert wrote:

Why is this code even there at all? If readdir(3) says that the current 
directory has no entries, shouldn't 'ls' just say that? Why should ls 
report an error simply because the current directory isn't reachable 
from the filesystem? Whether the current directory is unreachable has 
nothing to do with ls's job, which is to report whether the current 
directory has entries.


Attached is a proposed patch to fix this.
From 511d0c323bc90a0ab7e8f3672b07a1144885a9e8 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 5 Mar 2020 17:25:29 -0800
Subject: [PATCH] ls: restore 8.31 behavior on removed directories

* NEWS: Mention this.
* src/ls.c: Do not include 
(print_dir): Don't worry about whether the directory is removed.
* tests/ls/removed-directory.sh: Adjust to match new (i.e., old)
behavior.
---
 NEWS  |  6 ++
 src/ls.c  | 22 --
 tests/ls/removed-directory.sh | 10 ++
 3 files changed, 8 insertions(+), 30 deletions(-)

diff --git a/NEWS b/NEWS
index fdc8bf5db..653e7178b 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,12 @@ GNU coreutils NEWS-*- outline -*-
 
 * Noteworthy changes in release ?.? (-??-??) [?]
 
+** Changes in behavior
+
+  On GNU/Linux systems, ls no longer issues an error message on
+  directory merely because it was removed.  This reverts a change
+  that was made in release 8.32.
+
 
 * Noteworthy changes in release 8.32 (2020-03-05) [stable]
 
diff --git a/src/ls.c b/src/ls.c
index 24b983287..4acf5f44d 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -49,10 +49,6 @@
 # include 
 #endif
 
-#ifdef __linux__
-# include 
-#endif
-
 #include 
 #include 
 #include 
@@ -2896,7 +2892,6 @@ print_dir (char const *name, char const *realname, bool command_line_arg)
   struct dirent *next;
   uintmax_t total_blocks = 0;
   static bool first = true;
-  bool found_any_entries = false;
 
   errno = 0;
   dirp = opendir (name);
@@ -2972,7 +2967,6 @@ print_dir (char const *name, char const *realname, bool command_line_arg)
   next = readdir (dirp);
   if (next)
 {
-  found_any_entries = true;
   if (! file_ignored (next->d_name))
 {
   enum filetype type = unknown;
@@ -3018,22 +3012,6 @@ print_dir (char const *name, char const *realname, bool command_line_arg)
   if (errno != EOVERFLOW)
 break;
 }
-#ifdef __linux__
-  else if (! found_any_entries)
-{
-  /* If readdir finds no directory entries at all, not even "." or
- "..", then double check that the directory exists.  */
-  if (syscall (SYS_getdents, dirfd (dirp), NULL, 0) == -1
-  && errno != EINVAL)
-{
-  /* We exclude EINVAL as that pertains to buffer handling,
- and we've passed NULL as the buffer for simplicity.
- ENOENT is returned if appropriate before buffer handling.  */
-  file_failure (command_line_arg, _("reading directory %s"), name);
-}
-  break;
-}
-#endif
   else
 break;
 
diff --git a/tests/ls/removed-directory.sh b/tests/ls/removed-directory.sh
index e8c835dab..fe8f929a1 100755
--- a/tests/ls/removed-directory.sh
+++ b/tests/ls/removed-directory.sh
@@ -26,20 +26,14 @@ case $host_triplet in
   *) skip_ 'non linux kernel' ;;
 esac
 
-LS_FAILURE=2
-
-cat <<\EOF >exp-err || framework_failure_
-ls: reading directory '.': No such file or directory
-EOF
-
 cwd=$(pwd)
 mkdir d || framework_failure_
 cd d || framework_failure_
 rmdir ../d || framework_failure_
 
-returns_ $LS_FAILURE ls >../out 2>../err || fail=1
+ls >../out 2>../err || fail=1
 cd "$cwd" || framework_failure_
 compare /dev/null out || fail=1
-compare exp-err err || fail=1
+compare /dev/null err || fail=1
 
 Exit $fail
-- 
2.24.1



bug#39929: coreutils-8.32 fails to build on aarch64

2020-03-05 Thread Paul Eggert

On 3/5/20 9:39 AM, Pádraig Brady wrote:

Ah well.
Does the attached address this for you.


Eeeuw.

Why is this code even there at all? If readdir(3) says that the current 
directory has no entries, shouldn't 'ls' just say that? Why should ls 
report an error simply because the current directory isn't reachable 
from the filesystem? Whether the current directory is unreachable has 
nothing to do with ls's job, which is to report whether the current 
directory has entries.
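
The situation in question can be reproduced with a short shell sketch modeled on tests/ls/removed-directory.sh. The helper name ls_in_removed_dir and the use of mktemp are assumptions made for illustration. With the 8.31 behavior one would expect no output and a zero exit status; 8.32 on Linux instead reports an error on stderr.

```shell
#!/bin/sh
# Sketch: run 'ls' in a working directory that has just been removed.
# ls_in_removed_dir is a hypothetical helper, not part of coreutils.
ls_in_removed_dir() (
  tmp=$(mktemp -d) || exit 1
  mkdir "$tmp/d" && cd "$tmp/d" || exit 1
  # Some platforms refuse to remove the working directory.
  rmdir ../d || { cd / && rmdir "$tmp/d" "$tmp"; exit 1; }
  status=0
  ls || status=$?        # readdir finds no entries, not even "." or ".."
  cd / && rmdir "$tmp"
  exit "$status"
)

if ls_in_removed_dir; then
  echo "ls succeeded on the removed directory"
else
  echo "ls failed on the removed directory"
fi
```

The function body runs in a subshell, so the caller's working directory is left alone.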






bug#39850: "du" command can not count some files

2020-03-01 Thread Paul Eggert
I don't see a bug there, as the files you say "du" is not counting have counts 
of zero.






bug#38627: uniq -c gets wrong count with non-ascii strings

2020-02-23 Thread Paul Eggert

On 2/23/20 11:43 AM, Pádraig Brady wrote:


 #include "hard-locale.h"
 #include "posixver.h"
 #include "stdio--.h"
-#include "xmemcoll.h"


Please also remove the '#include "hard-locale.h"' line.

Thanks for fixing this.





bug#39693: Sv: bug#39693: Any chance of fixing --rfc-3339 to conform to the standard?

2020-02-21 Thread Paul Eggert

On 2/20/20 11:56 PM, Mads Bondo Dydensborg wrote:

Your statement is in conflict with the message exchange, referenced by the bug 
I linked to, with, as I understand it, the authors of the standard:


Not really. In that email exchange one of the authors of the RFC 
mentioned a goal of the RFC. The part of the RFC that I quoted, though, 
is an explicit exception to that particular goal. The RFC had several 
goals, they sometimes conflicted, and the RFC's text was a compromise. I 
was involved with the drafting of the RFC, and remember the history 
reasonably well.



The ISO output from date can not be used, as it uses a "," as fractional 
separator


You can use the following if you want subsecond resolution with both 'T' 
and '.':


date '+%Y-%m-%dT%H:%M:%S.%N%:z'

This won't work for some historical timestamps (e.g., the Netherlands 
before 1937) but RFC 3339 doesn't support them anyway so it's probably 
good enough.
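
For use in scripts, the format above can be wrapped in a small function. This is only a sketch: it assumes GNU 'date' (%N and %:z are GNU extensions), and the name rfc3339_t is made up for the example.

```shell
#!/bin/sh
# Sketch: RFC 3339 timestamp with 'T' separator and '.' fraction.
# Assumes GNU date; rfc3339_t is a hypothetical wrapper name.
rfc3339_t() {
  date "$@" '+%Y-%m-%dT%H:%M:%S.%N%:z'
}

rfc3339_t                 # current local time
rfc3339_t -u -d @0        # 1970-01-01T00:00:00.000000000+00:00
```

Any 'date' options, such as -u or -d, are passed through unchanged ahead of the format.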






bug#39693: Any chance of fixing --rfc-3339 to conform to the standard?

2020-02-20 Thread Paul Eggert

On 2/20/20 4:39 AM, Mads Bondo Dydensborg wrote:

As have been established in 2006 and again in 2010, the rfc-3339 mandates the use of 
"T" in a single field timestamp.


No, RFC 3339 explicitly allows the use of space. It says:

  NOTE: ISO 8601 defines date and time separated by "T".
  Applications using this syntax may choose, for the sake of
  readability, to specify a full-date and full-time separated by
  (say) a space character.

This paragraph was put into the RFC at my suggestion, precisely so that GNU 
"date" output wouldn't have to contain that "T".


If you want GNU 'date' to output the 'T', you can use 'date --iso-8601=s' 
instead of 'date --rfc-3339=s'. That's the point of having these two options for 
GNU 'date'. If it weren't for this difference in behavior, GNU 'date' wouldn't 
have needed a --rfc-3339 option in the first place, and we shouldn't change the 
meaning of --rfc-3339 to eviscerate the whole point of the option.
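
The difference is easy to see side by side. The following sketch assumes GNU 'date'; show_both is a hypothetical helper name, and the exact form of the UTC offset may vary between coreutils releases.

```shell
#!/bin/sh
# Sketch: the two options differ in the date/time separator.
# Assumes GNU date; show_both is a hypothetical helper.
show_both() {
  date "$@" --iso-8601=s    # 'T' between date and time
  date "$@" --rfc-3339=s    # space between date and time
}

show_both -u -d @0
```

The first line of output has a 'T' after the date and the second has a space; otherwise the two carry the same information.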






bug#39611: coreutils v8.31 fails to compile with -Ofast

2020-02-14 Thread Paul Eggert

On 2/14/20 3:21 PM, zsugabubus wrote:

$ export CFLAGS=-Ofast # Works with -O3
$ ./configure && make


Coreutils (and many other programs) is not compatible with -Ofast, which is not 
surprising as -Ofast is documented to not work in many cases. The obvious 
workaround is to not use -Ofast.






bug#39236: [musl] coreutils cp mishandles error return from lchmod

2020-02-07 Thread Paul Eggert

On 1/22/20 2:05 PM, Rich Felker wrote:

I think we're approaching a consensus that glibc should fix this too,
so then it would just be gnulib matching the fix.


I installed the attached patch to Gnulib in preparation for the upcoming 
glibc fix. The patch causes fchmodat with AT_SYMLINK_NOFOLLOW to work on 
non-symlinks, and similarly for lchmod on non-symlinks. The idea is to 
avoid this sort of problem in the future, and to let Coreutils etc. work 
on older platforms as if glibc 2.32 (or whatever) is already in place.
From b16a04394121e7396569a13161dba02c6752b19f Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 7 Feb 2020 16:34:12 -0800
Subject: [PATCH] fchmodat: AT_SYMLINK_NOFOLLOW fix for non-symlinks

Fix lchmod, and fchmodat with AT_SYMLINK_NOFOLLOW, so that
they act like chmod on non-symlinks.
* NEWS:
* doc/glibc-functions/lchmod.texi (lchmod):
* doc/posix-functions/fchmodat.texi (fchmodat):
Mention this.
* lib/fchmodat.c: Define __need_system_sys_stat_h before including
config.h, and undef it after including sys/stat.h the first time.
Include fcntl.h, stdio.h, unistd.h, intprops.h, and include
sys/stat.h a second time after defining orig_fchmodat.
(orig_fchmodat) [HAVE_FCHMODAT]: New function.
(fchmodat) [HAVE_FCHMODAT]: Work around the AT_SYMLINK_NOFOLLOW bug.
* lib/lchmod.c: New file.
* lib/sys_stat.in.h (fchmodat, lchmod):
Support replacing these functions.
* m4/fchmodat.m4 (gl_FUNC_FCHMODAT): If fchmodat exists,
test that AT_SYMLINK_NOFOLLOW works on non-symlinks.
* m4/lchmod.m4 (gl_FUNC_LCHMOD): Check for lstat.
Test that lchmod works on non-symlinks.
* m4/sys_stat_h.m4 (gl_SYS_STAT_H_DEFAULTS):
Default REPLACE_FCHMODAT and REPLACE_LCHMOD to 0.
* modules/fchmodat (Depends-on): Add fstatat, intprops, lchmod, unistd.
(Depends-on, configure.ac): Check REPLACE_FCHMODAT too.
* modules/lchmod (Files): Add lib/lchmod.c.
(Depends-on): Add errno, fcntl-h, fchmodat, intprops, lstat, unistd.
(configure.ac): Compile lchmod.c if needed.
(lib_SOURCES): Add lchmod.c.
* modules/sys_stat (sys/stat.h): Substitute REPLACE_FCHMODAT
and REPLACE_LCHMOD.
* tests/test-fchmodat.c: Include fcntl.h, sys/stat.h.
(main): Test fchmodat with AT_SYMLINK_NOFOLLOW on non-symlinks.
---
 ChangeLog | 35 
 NEWS  |  7 +++
 doc/glibc-functions/lchmod.texi   |  4 ++
 doc/posix-functions/fchmodat.texi | 11 ++--
 lib/fchmodat.c| 89 +--
 lib/lchmod.c  | 72 +
 lib/sys_stat.in.h | 41 +++---
 m4/fchmodat.m4| 48 -
 m4/lchmod.m4  | 52 --
 m4/sys_stat_h.m4  |  4 +-
 modules/fchmodat  | 10 ++--
 modules/lchmod| 13 -
 modules/sys_stat  |  2 +
 tests/test-fchmodat.c | 10 
 14 files changed, 348 insertions(+), 50 deletions(-)
 create mode 100644 lib/lchmod.c

diff --git a/ChangeLog b/ChangeLog
index 99e0c2e9e..71dcaba6c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,38 @@
+2020-02-07  Paul Eggert  
+
+	fchmodat: AT_SYMLINK_NOFOLLOW fix for non-symlinks
+	Fix lchmod, and fchmodat with AT_SYMLINK_NOFOLLOW, so that
+	they act like chmod on non-symlinks.
+	* NEWS:
+	* doc/glibc-functions/lchmod.texi (lchmod):
+	* doc/posix-functions/fchmodat.texi (fchmodat):
+	Mention this.
+	* lib/fchmodat.c: Define __need_system_sys_stat_h before including
+	config.h, and undef it after including sys/stat.h the first time.
+	Include fcntl.h, stdio.h, unistd.h, intprops.h, and include
+	sys/stat.h a second time after defining orig_fchmodat.
+	(orig_fchmodat) [HAVE_FCHMODAT]: New function.
+	(fchmodat) [HAVE_FCHMODAT]: Work around the AT_SYMLINK_NOFOLLOW bug.
+	* lib/lchmod.c: New file.
+	* lib/sys_stat.in.h (fchmodat, lchmod):
+	Support replacing these functions.
+	* m4/fchmodat.m4 (gl_FUNC_FCHMODAT): If fchmodat exists,
+	test that AT_SYMLINK_NOFOLLOW works on non-symlinks.
+	* m4/lchmod.m4 (gl_FUNC_LCHMOD): Check for lstat.
+	Test that lchmod works on non-symlinks.
+	* m4/sys_stat_h.m4 (gl_SYS_STAT_H_DEFAULTS):
+	Default REPLACE_FCHMODAT and REPLACE_LCHMOD to 0.
+	* modules/fchmodat (Depends-on): Add fstatat, intprops, lchmod, unistd.
+	(Depends-on, configure.ac): Check REPLACE_FCHMODAT too.
+	* modules/lchmod (Files): Add lib/lchmod.c.
+	(Depends-on): Add errno, fcntl-h, fchmodat, intprops, lstat, unistd.
+	(configure.ac): Compile lchmod.c if needed.
+	(lib_SOURCES): Add lchmod.c.
+	* modules/sys_stat (sys/stat.h): Substitute REPLACE_FCHMODAT
+	and REPLACE_LCHMOD.
+	* tests/test-fchmodat.c: Include fcntl.h, sys/stat.h.
+	(main): Test fchmodat with AT_SYMLINK_NOFOLLOW on non-symlinks.
+
 2020-02-05  Marc Dionne(tiny change)
 
 	mountlist: Consider AFS filesystems as remote
diff --git a/NEWS b/NEWS
index dc5cc71f9..bc81dfc28 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,13 @@ User visible incompatible changes
 
 D

bug#39273: unwanted behavior in the combination of an scenario regarding btrfs, ssh, borg, and 'df' from the core utils

2020-02-04 Thread Paul Eggert
penat(AT_FDCWD, "/dev/sda1", O_RDONLY) = 4
pread64(4,
"\334\301u\237\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"
..., 4096, 65536) = 4096
close(4)= 0
openat(AT_FDCWD, "/dev/sda1", O_RDONLY) = 4
ioctl(4, BLKGETSIZE64, [412294840320])  = 0
close(4)= 0
sysinfo({uptime=27223, loads=[16736, 24192, 20704],
totalram=25104957440, freeram=17284509696, sharedram=173166592,
bufferram=2154496, totalswap=0, freeswap=0, procs=704, totalhigh=0,
freehigh=0, mem_unit=1}) = 0
ioctl(3, BTRFS_IOC_SPACE_INFO, {space_slots=0} => {total_spaces=4}) = 0
ioctl(3, BTRFS_IOC_SPACE_INFO, {space_slots=4} => {total_spaces=4,
spaces=...}) = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
write(1, "Overall:\n", 9Overall:
)   = 9
write(1, "Device size:\t\t 383.98GiB\n", 29Device
size:  383.98GiB
) = 29
write(1, "Device allocated:\t\t  61.03Gi"..., 34Device
allocated: 61.03GiB
) = 34
write(1, "Device unallocated:\t\t 322.95"..., 36Device
unallocated:  322.95GiB
) = 36
write(1, "Device missing:\t\t 0.00B\n", 32Device
missing:0.00B
) = 32
write(1, "Used:\t\t\t  55.61GiB\n",
23Used:   55.61GiB
) = 23
write(1, "Free (estimated):\t\t 328.22Gi"..., 51Free
(estimated):328.22GiB  (min: 328.22GiB)
) = 51
write(1, "Data ratio:\t\t\t  1.00\n", 29Data
ratio:   1.00
) = 29
write(1, "Metadata ratio:\t\t  1.00\n", 32Metadata
ratio: 1.00
) = 32
write(1, "Global reserve:\t\t 141.11MiB\t"..., 46Global
reserve: 141.11MiB  (used: 0.00B)
) = 46
write(1, "\n", 1
)   = 1
ioctl(3, BTRFS_IOC_SPACE_INFO, {space_slots=0} => {total_spaces=4}) = 0
ioctl(3, BTRFS_IOC_SPACE_INFO, {space_slots=4} => {total_spaces=4,
spaces=...}) = 0
write(1, "Data,single: Size:59.00GiB, Used"..., 51Data,single:
Size:59.00GiB, Used:53.72GiB (91.06%)
) = 51
write(1, "   /dev/sda1\t  59.00GiB\n", 24   /dev/sda1 59.00GiB
) = 24
write(1, "\n", 1
)   = 1
write(1, "Metadata,single: Size:2.00GiB, U"..., 53Metadata,single:
Size:2.00GiB, Used:1.88GiB (94.17%)
) = 53
write(1, "   /dev/sda1\t   2.00GiB\n", 24   /dev/sda1  2.00GiB
) = 24
write(1, "\n", 1
)   = 1
write(1, "System,single: Size:32.00MiB, Us"..., 52System,single:
Size:32.00MiB, Used:16.00KiB (0.05%)
) = 52
write(1, "   /dev/sda1\t  32.00MiB\n", 24   /dev/sda1 32.00MiB
) = 24
write(1, "\n", 1
)   = 1
write(1, "Unallocated:\n", 13Unallocated:
)  = 13
write(1, "   /dev/sda1\t 322.95GiB\n", 24   /dev/sda1322.95GiB
) = 24
close(3)= 0
exit_group(0)   = ?
+++ exited with 0 +++
wismerhill:/home/lux #




wismerhill:/home/lux # btrfs fi df  /
Data, single: total=59.00GiB, used=53.72GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=2.00GiB, used=1.88GiB
GlobalReserve, single: total=141.11MiB, used=0.00B



---


I hope the output helps to track down the reason so this is solvable.
One observation is that the error appears on filesystems which are
pretty empty, less than 60% filled.

Phil :-)





On Wednesday, 29.01.2020 at 11:18 -0800, Paul Eggert wrote:

On 1/29/20 5:42 AM, Wismerhill wrote:


i tried to replicate the error, but i couldn´t do it exact the same
procedure,  the Disk  was all ready filled up to 70%, and was not empty
like when the error appeared.
I couldn´t move the files away due a space problem.
I tried with my Musicarchiv (ca. 750 GB) but the error didn´t appear.

As a workaround for my problem (how i filled up the Disk) i created a
borg backup local on a USB 1 TB Disk (Btrfs Filesystem) without
problems and used Rsync to copy that backup to the server (happens
befor i got your mail).
On the Server i got the same confusing freespace by then, df, and KDE
Plasma widgets show me 0 Byte left, but Rsync finished without error,
and the borg repositorie is  working troubleless remote.

As soon as i run again into that error i will do the procedure you
described me, and send in the requested datas.



Thanks for the heads-up. Please cc 39...@debbugs.gnu.org with any
further info that you may provide.







bug#39273: unwanted behavior in the combination of an scenario regarding btrfs, ssh, borg, and 'df' from the core utils

2020-01-29 Thread Paul Eggert

On 1/29/20 5:42 AM, Wismerhill wrote:


i tried to replicate the error, but i couldn´t do it exact the same
procedure,  the Disk  was all ready filled up to 70%, and was not empty
like when the error appeared.
I couldn´t move the files away due a space problem.
I tried with my Musicarchiv (ca. 750 GB) but the error didn´t appear.

As a workaround for my problem (how i filled up the Disk) i created a
borg backup local on a USB 1 TB Disk (Btrfs Filesystem) without
problems and used Rsync to copy that backup to the server (happens
befor i got your mail).
On the Server i got the same confusing freespace by then, df, and KDE
Plasma widgets show me 0 Byte left, but Rsync finished without error,
and the borg repositorie is  working troubleless remote.

As soon as i run again into that error i will do the procedure you
described me, and send in the requested datas.



Thanks for the heads-up. Please cc 39...@debbugs.gnu.org with any 
further info that you may provide.






bug#39273: unwanted behavior in the combination of an scenario regarding btrfs, ssh, borg, and 'df' from the core utils

2020-01-24 Thread Paul Eggert

On 1/24/20 11:50 AM, Wismerhill wrote:

'df' reports a wrong space calculation


What's wrong about the space calculation?

Please give the 'df' command that you ran, its faulty output, and also 
the output of 'strace' applied to the 'df' command that you ran. For 
example, on my machine, 'strace df' outputs the line:


statfs("/tmp", {f_type=TMPFS_MAGIC, f_bsize=4096, f_blocks=1018122, 
f_bfree=1007348, f_bavail=1007348, f_files=1018122, f_ffree=1018073, 
f_fsid={val=[0, 0]}, f_namelen=255, f_frsize=4096, 
f_flags=ST_VALID|ST_NOSUID|ST_NODEV}) = 0


and those are the numbers that 'df' uses to calculate what it should 
output. If those numbers are wrong, df's output will be wrong but it's 
not df's fault - it's the kernel or btrfs or whatever. If those numbers 
are right but df's output is wrong, then df is at fault.
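
To check the numbers by hand, the arithmetic can be sketched in shell. This is a simplification, not df's actual code: df_fields is a hypothetical helper, and real 'df' also deals with overflow, block-size options, and file systems that report odd values (it prints '-' rather than 0% when there is nothing to divide by).

```shell
#!/bin/sh
# Sketch of how df-style columns follow from statfs numbers.
# df_fields is a hypothetical helper, not part of coreutils.
# Usage: df_fields F_BLOCKS F_BFREE F_BAVAIL F_FRSIZE
df_fields() {
  blocks=$1 bfree=$2 bavail=$3 frsize=$4
  total=$(( blocks * frsize / 1024 ))            # "1K-blocks" column
  used=$(( (blocks - bfree) * frsize / 1024 ))   # "Used" column
  avail=$(( bavail * frsize / 1024 ))            # "Available" column
  nonroot=$(( used + avail ))
  if [ "$nonroot" -gt 0 ]; then
    # The percentage is rounded up, not to nearest.
    pct=$(( (used * 100 + nonroot - 1) / nonroot ))
  else
    pct=0                                        # simplification, see above
  fi
  echo "$total $used $avail $pct%"
}

# Numbers taken from the statfs line quoted above.
df_fields 1018122 1007348 1007348 4096    # 4072488 43096 4029392 2%
```

If the printed values match what 'df' shows for the file system, the kernel-supplied numbers are the place to look. With GNU 'stat', the same inputs can be read without strace via "stat -f -c '%b %f %a %S' /tmp".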






bug#39236: [musl] coreutils cp mishandles error return from lchmod

2020-01-22 Thread Paul Eggert

On 1/22/20 7:08 AM, Florian Weimer wrote:

I think you misread what I wrote: lchmod *always* returns ENOSYS.  Even
if the file is not a symbolic link.  Likewise, fchmodat with
AT_SYMLINK_NOFOLLOW *always* returns ENOTSUP.


That's too bad, because coreutils (and, I expect, many other 
applications) assumes that lchmod (and fchmodat with AT_SYMLINK_NOFOLLOW) 
acts like chmod except that it does not follow symlinks. The point is to 
make it less likely that the application will run afoul of a symlink race 
and chmod the wrong file. Isn't that how the Linux fstatat call behaves? 
And if so, why does glibc fstatat refuse to support this behavior?


To work around this bug, I suppose coreutils etc. should do something 
like the following:


1. Never use lchmod since the porting nightmare is bad enough without it.

2. On non-glibc systems (or glibc systems where the bug is fixed), use 
fchmodat with AT_SYMLINK_NOFOLLOW.


3. On glibc systems with the bug, use openat with AT_SYMLINK_NOFOLLOW 
and O_PATH, and then fchmod the resulting file descriptor.


Does this sound right? Or is there some O_PATH gotcha that I haven't 
thought about?


Come to think of it, perhaps the best thing would be to change Gnulib's 
lchmod and fchmodat modules so that they do what applications expect, 
even on buggy glibc systems. (Which would be ironic, since Gnulib's main 
goal is to put wrappers around other libraries so that they look more 
like glibc.)






bug#38627: uniq -c gets wrong count with non-ascii strings

2019-12-16 Thread Paul Eggert
On 12/15/19 11:40 AM, Roy Smith wrote:
> With the following input:
> 
>> $ cat x
>> "ⁿᵘˡˡ"
>> "ܥܝܪܐܩ"
> 
> 
> Running "uniq -c" says there's two copies of the same line!
> 
>> $ uniq -c x
>>   2 "ⁿᵘˡˡ"

Thanks for the bug report. I expect this is because GNU 'uniq' uses the
equivalent of strcoll (locale-dependent comparison) to compare lines, whereas
macOS 'uniq' uses the equivalent of strcmp (byte comparison). Since the two
lines compare equal in your locale, GNU 'uniq' says there's just one line.

The GNU 'uniq' behavior appears to be a consequence of this commit:

commit 545c2323d493c7ed9c770d9b8e45a15db6f615bc
Author: Jim Meyering 
Date:   Fri Aug 2 14:42:37 2002 +

with a change noted this way in NEWS:

* uniq now obeys the LC_COLLATE locale, as per POSIX 1003.1-2001 TC1.

However, the 2016 edition of POSIX removed mention of LC_COLLATE from 'uniq',
and I expect this means that the 2002 commit should be reverted so that GNU
'uniq' behaves like macOS 'uniq' (a behavior that I think makes more sense 
anyway).

I'll CC: this email to Jim Meyering to see whether he has an opinion about this.

In the meantime you can work around the problem by using 'LC_ALL=C uniq' instead
of plain 'uniq' in your shell script.
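
As a sketch of that workaround, using the two lines from the report (uniq_bytes is a hypothetical wrapper name, not an existing command):

```shell
#!/bin/sh
# Sketch: force byte-wise comparison so that lines differing in bytes
# are never merged, whatever the locale's collation rules say.
# uniq_bytes is a hypothetical wrapper name.
uniq_bytes() {
  LC_ALL=C uniq "$@"
}

printf '%s\n' '"ⁿᵘˡˡ"' '"ܥܝܪܐܩ"' | uniq_bytes -c
```

With LC_ALL=C each of the two lines should be reported separately with a count of 1, since the comparison no longer depends on the collation rules of the user's locale.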





bug#38299: A bug while trying to decode a non encode base64

2019-11-20 Thread Paul Eggert

On 11/20/19 6:22 AM, Martin Schulte wrote:

vardhamanbn1 is a valid encoding


Thanks for explaining; closing the bug report.





bug#38168: shred vs. SSD

2019-11-11 Thread Paul Eggert
Thanks for mentioning this. I installed the attached patch to fix the problems 
that you mentioned, except that I didn't add a section on storage media, data 
remanence, and data forensics (partly because a lot of this stuff is secret).


If someone would like to contribute text in that area, it would be a good thing 
to have (if only to discourage even more users from using 'shred' :-). In the 
meantime I'll take the liberty of closing the bug report.
From adf41d7c1e8adf11857ee53d51419e218dcd8804 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 11 Nov 2019 16:52:47 -0800
Subject: [PATCH] shred: modernize documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* doc/coreutils.texi (shred invocation):
Modernize discussion to today’s technology (Bug#38168).
* src/shred.c (usage): Omit lengthy duplication of the manual’s
discussion of file systems and storage devices, as that became out
of sync with the manual.  Instead, just cite the manual.
---
 doc/coreutils.texi | 152 ++---
 src/shred.c|  42 ++---
 2 files changed, 93 insertions(+), 101 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index b552cc105..32ddba597 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9877,7 +9877,7 @@ by POSIX.
 
 @emph{Warning}: If you use @command{rm} to remove a file, it is usually
 possible to recover the contents of that file.  If you want more assurance
-that the contents are truly unrecoverable, consider using @command{shred}.
+that the contents are unrecoverable, consider using @command{shred}.
 
 The program accepts the following options.  Also see @ref{Common options}.
 
@@ -10019,51 +10019,46 @@ predates the development of the @code{getopt} standard syntax.
 @cindex erasing data
 
 @command{shred} overwrites devices or files, to help prevent even
-very expensive hardware from recovering the data.
-
-Ordinarily when you remove a file (@pxref{rm invocation}), the data is
-not actually destroyed.  Only the index listing where the file is
-stored is destroyed, and the storage is made available for reuse.
-There are undelete utilities that will attempt to reconstruct the index
-and can bring the file back if the parts were not reused.
-
-On a busy system with a nearly-full drive, space can get reused in a few
-seconds.  But there is no way to know for sure.  If you have sensitive
-data, you may want to be sure that recovery is not possible by actually
-overwriting the file with non-sensitive data.
-
-However, even after doing that, it is possible to take the disk back
-to a laboratory and use a lot of sensitive (and expensive) equipment
-to look for the faint ``echoes'' of the original data underneath the
-overwritten data.  If the data has only been overwritten once, it's not
-even that hard.
+extensive forensics from recovering the data.
+
+Ordinarily when you remove a file (@pxref{rm invocation}), its data
+and metadata are not actually destroyed.  Only the file's directory
+entry is removed, and the file's storage is reclaimed only when no
+process has the file open and no other directory entry links to the
+file.  And even if file's data and metadata's storage space is freed
+for further reuse, there are undelete utilities that will attempt to
+reconstruct the file from the data in freed storage, and that can
+bring the file back if the storage was not rewritten.
+
+On a busy system with a nearly-full device, space can get reused in a few
+seconds.  But there is no way to know for sure.  And although the
+undelete utilities and already-existing processes require insider or
+superuser access, you may be wary of the superuser,
+of processes running on your behalf, or of attackers
+that can physically access the storage device.  So if you have sensitive
+data, you may want to be sure that recovery is not possible
+by plausible attacks like these.
 
 The best way to remove something irretrievably is to destroy the media
 it's on with acid, melt it down, or the like.  For cheap removable media
-like floppy disks, this is the preferred method.  However, hard drives
-are expensive and hard to melt, so the @command{shred} utility tries
-to achieve a similar effect non-destructively.
-
-This uses many overwrite passes, with the data patterns chosen to
-maximize the damage they do to the old data.  While this will work on
-floppies, the patterns are designed for best effect on hard drives.
-For more details, see the source code and Peter Gutmann's paper
-@uref{https://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html,
-@cite{Secure Deletion of Data from Magnetic and Solid-State Memory}},
-from the proceedings of the Sixth USENIX Security Symposium (San Jose,
-California, July 22--25, 1996).
-
-@strong{Please note} that @command{shred} relies on a very important assumption:
-that the file system overwrites data in place.  This is the traditional
+this is often the preferred met

bug#37961: Bug report of date commond

2019-10-28 Thread Paul Eggert

On 10/28/19 12:34 AM, zhangzhi...@mail.iap.ac.cn wrote:

~>date  -d "1940-06-01" +"%Y-%m-%d"
date: invalid date ‘1940-06-01’


Presumably your TZ setting is Asia/Shanghai, as I see the symptoms as 
follows:


$ TZ=Asia/Shanghai date  -d "1940-06-01" +"%Y-%m-%d"
date: invalid date ‘1940-06-01’

This is because there is no instant of time 1940-06-01 00:00:00 in 
Shanghai, as the clock ticked over from 1940-05-31 23:59:59 to 
1940-06-01 01:00:00 due to a daylight-saving time transition.


For this particular case, you'll have better luck with:

$ date -d "1940-06-01 12:00" +"%Y-%m-%d"

but this sort of approach does not work in general, because 12:00 does 
not always exist either. In other words, the 'date' command is not 
suited for calendrical arithmetic in general, only for time arithmetic.
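
When only calendar dates matter, one possible way around the problem is to do the arithmetic in UTC, which has no daylight-saving transitions and so skips no local times. This is a suggestion under that assumption, not something proposed in the thread; it assumes GNU 'date', and add_days is a hypothetical helper that takes a non-negative day count.

```shell
#!/bin/sh
# Sketch: do day arithmetic in UTC, where no local time is skipped.
# Assumes GNU date; add_days is a hypothetical helper.
# Usage: add_days YYYY-MM-DD N      (N is a non-negative integer)
add_days() {
  date -u -d "$1 +$2 days" '+%Y-%m-%d'
}

add_days 1940-05-31 1     # 1940-06-01
```

This does not help when the answer has to be a local time of day, because the local time in question may still not exist.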






bug#37893: fixes for 'shuf -n 0x' and similar issues

2019-10-23 Thread Paul Eggert
I installed the attached patches to fix some minor glitches with 
programs like 'shuf' failing to reject invalid arguments like '-n 0x' 
where the trailing 'x' is ignored. The last two patches do the real 
work; the others are minor cleanups.
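
The principle behind the fix is that a numeric argument is accepted only if the whole string is a number. The actual change is in C; the following is merely a shell analogue, and is_count is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: accept a count only if the whole string is decimal digits,
# so that input like '0x' is rejected instead of being read as 0.
# is_count is a hypothetical helper; the real fix is in the C sources.
is_count() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    *) return 0 ;;
  esac
}

for arg in 0 42 0x ''; do
  if is_count "$arg"; then
    echo "'$arg': valid"
  else
    echo "'$arg': invalid"
  fi
done
```

Here '0' and '42' are reported as valid, while '0x' and the empty string are reported as invalid.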
>From d4cbfaeca1e838a0d0373adfbd133a9f5eaa8e87 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 22 Oct 2019 11:34:56 -0700
Subject: [PATCH 1/5] build: re-enable type-limits checking

* configure.ac: When --enable-gcc-warnings is used, omit
-Wno-type-limits.  The need for -Wno-type-limits has passed, now
that intprops.h uses builtin primitives for GCC 5 and later, given
that recent GCCs issue type-limits warnings only for non-constant
expressions.  --enable-gcc-warnings is not intended for use with
old compilers, so we can drop -Wno-type-limits now.
---
 configure.ac | 2 --
 1 file changed, 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index d90c710e3..292ae0bf2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -134,7 +134,6 @@ if test "$gl_gcc_warnings" = yes; then
   nw="$nw -Wswitch-enum"# Too many warnings for now
   nw="$nw -Wswitch-default" # Too many warnings for now
   nw="$nw -Wstack-protector"# not worth working around
-  nw="$nw -Wtype-limits"# False alarms for portable code
   nw="$nw -Wformat-overflow=2"  # False alarms due to GCC bug 80776
   nw="$nw -Wformat-truncation=2"# False alarm in ls.c, probably related
   # things I might fix soon:
@@ -155,7 +154,6 @@ if test "$gl_gcc_warnings" = yes; then
 gl_WARN_ADD([$w])
   done
   gl_WARN_ADD([-Wno-sign-compare]) # Too many warnings for now
-  gl_WARN_ADD([-Wno-type-limits])  # False alarms for portable code
   gl_WARN_ADD([-Wno-unused-parameter]) # Too many warnings for now
   gl_WARN_ADD([-Wno-format-nonliteral])
 
-- 
2.21.0

From 6778871f67c6a66aacc76b0a63ff26c8b72dce87 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 22 Oct 2019 12:52:52 -0700
Subject: [PATCH 2/5] =?UTF-8?q?build:=20don=E2=80=99t=20worry=20about=20lo?=
 =?UTF-8?q?gical-op=20checking?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* configure.ac: Remove code tailoring --enable-gcc-warnings
to GCC 4.7 and earlier, as developers no longer need to worry
about GCCs that old.
---
 configure.ac | 6 --
 1 file changed, 6 deletions(-)

diff --git a/configure.ac b/configure.ac
index 292ae0bf2..18c5a99bd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -128,7 +128,6 @@ if test "$gl_gcc_warnings" = yes; then
   nw="$nw -Wunreachable-code"   # Too many warnings for now
   nw="$nw -Wpadded" # Our structs are not padded
   nw="$nw -Wredundant-decls"# openat.h declares e.g., mkdirat
-  nw="$nw -Wlogical-op" # Too many warnings until GCC 4.8.0
   nw="$nw -Wformat-nonliteral"  # who.c and pinky.c strftime uses
   nw="$nw -Wnested-externs" # use of XARGMATCH/verify_function__
   nw="$nw -Wswitch-enum"# Too many warnings for now
@@ -157,11 +156,6 @@ if test "$gl_gcc_warnings" = yes; then
   gl_WARN_ADD([-Wno-unused-parameter]) # Too many warnings for now
   gl_WARN_ADD([-Wno-format-nonliteral])
 
-  # Enable this warning only with gcc-4.8 and newer.  Before that
-  # bounds checking as done in truncate.c was incorrectly flagged.
-  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43772
-  gl_GCC_VERSION_IFELSE([4], [8], [gl_WARN_ADD([-Wlogical-op])])
-
   # clang is unduly picky about some things.
   AC_CACHE_CHECK([whether the compiler is clang], [utils_cv_clang],
 [AC_COMPILE_IFELSE(
-- 
2.21.0

From 95c705bdc03c89cdf774f90ee68452ebd46b400a Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 22 Oct 2019 12:58:07 -0700
Subject: [PATCH 3/5] shuf: improve randperm overflow checking
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* gl/lib/randperm.c: Include randperm.h first, since it’s the API.
Include stdint.h, count-leading-zeros.h, verify.h.
(floor_lg): Rename from ceil_log (which was not actually
implementing the ceiling!) and implement the floor using
count_leading_zeros.
(randperm_bound): Use floor_lg, not ceil_log.  Use uintmax_t
instead of size_t in case the size gets large on a 32-bit host.
* gl/modules/randperm (Depends-on): Add count-leading-zeros, stdint.
---
 gl/lib/randperm.c   | 27 ---
 gl/modules/randperm |  2 ++
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/gl/lib/randperm.c b/gl/lib/randperm.c
index ce69222e9..b079aba33 100644
--- a/gl/lib/randperm.c
+++ b/gl/lib/randperm.c
@@ -19,24 +19,29 @@
 
 #include 
 
-#include "hash.h"
 #include "randperm.h"
 
 #include 
+#include 
 #include 
 
+#include "count-leading-zeros.h"
+#include "hash.h

bug#37877: Fix 'shuf -n 0x' and similar problems

2019-10-22 Thread Paul Eggert
I installed into GNU coreutils the attached series of patches, to fix 
problems like 'shuf -n 0x' where shuf did not diagnose the trailing 'x', 
along with some other stuff I noticed while looking into the problem.
>From d4cbfaeca1e838a0d0373adfbd133a9f5eaa8e87 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 22 Oct 2019 11:34:56 -0700
Subject: [PATCH 1/5] build: re-enable type-limits checking

* configure.ac: When --enable-gcc-warnings is used, omit
-Wno-type-limits.  The need for -Wno-type-limits has passed, now
that intprops.h uses builtin primitives for GCC 5 and later, given
that recent GCCs issue type-limits warnings only for non-constant
expressions.  --enable-gcc-warnings is not intended for use with
old compilers, so we can drop -Wno-type-limits now.
---
 configure.ac | 2 --
 1 file changed, 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index d90c710e3..292ae0bf2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -134,7 +134,6 @@ if test "$gl_gcc_warnings" = yes; then
   nw="$nw -Wswitch-enum"# Too many warnings for now
   nw="$nw -Wswitch-default" # Too many warnings for now
   nw="$nw -Wstack-protector"# not worth working around
-  nw="$nw -Wtype-limits"# False alarms for portable code
   nw="$nw -Wformat-overflow=2"  # False alarms due to GCC bug 80776
   nw="$nw -Wformat-truncation=2"# False alarm in ls.c, probably related
   # things I might fix soon:
@@ -155,7 +154,6 @@ if test "$gl_gcc_warnings" = yes; then
 gl_WARN_ADD([$w])
   done
   gl_WARN_ADD([-Wno-sign-compare]) # Too many warnings for now
-  gl_WARN_ADD([-Wno-type-limits])  # False alarms for portable code
   gl_WARN_ADD([-Wno-unused-parameter]) # Too many warnings for now
   gl_WARN_ADD([-Wno-format-nonliteral])
 
-- 
2.21.0

>From 6778871f67c6a66aacc76b0a63ff26c8b72dce87 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 22 Oct 2019 12:52:52 -0700
Subject: [PATCH 2/5] build: don’t worry about logical-op checking
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* configure.ac: Remove code tailoring --enable-gcc-warnings
to GCC 4.7 and earlier, as developers no longer need to worry
about GCCs that old.
---
 configure.ac | 6 --
 1 file changed, 6 deletions(-)

diff --git a/configure.ac b/configure.ac
index 292ae0bf2..18c5a99bd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -128,7 +128,6 @@ if test "$gl_gcc_warnings" = yes; then
   nw="$nw -Wunreachable-code"   # Too many warnings for now
   nw="$nw -Wpadded" # Our structs are not padded
   nw="$nw -Wredundant-decls"# openat.h declares e.g., mkdirat
-  nw="$nw -Wlogical-op" # Too many warnings until GCC 4.8.0
   nw="$nw -Wformat-nonliteral"  # who.c and pinky.c strftime uses
   nw="$nw -Wnested-externs" # use of XARGMATCH/verify_function__
   nw="$nw -Wswitch-enum"# Too many warnings for now
@@ -157,11 +156,6 @@ if test "$gl_gcc_warnings" = yes; then
   gl_WARN_ADD([-Wno-unused-parameter]) # Too many warnings for now
   gl_WARN_ADD([-Wno-format-nonliteral])
 
-  # Enable this warning only with gcc-4.8 and newer.  Before that
-  # bounds checking as done in truncate.c was incorrectly flagged.
-  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43772
-  gl_GCC_VERSION_IFELSE([4], [8], [gl_WARN_ADD([-Wlogical-op])])
-
   # clang is unduly picky about some things.
   AC_CACHE_CHECK([whether the compiler is clang], [utils_cv_clang],
 [AC_COMPILE_IFELSE(
-- 
2.21.0

>From 95c705bdc03c89cdf774f90ee68452ebd46b400a Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 22 Oct 2019 12:58:07 -0700
Subject: [PATCH 3/5] shuf: improve randperm overflow checking
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* gl/lib/randperm.c: Include randperm.h first, since it’s the API.
Include stdint.h, count-leading-zeros.h, verify.h.
(floor_lg): Rename from ceil_log (which was not actually
implementing the ceiling!) and implement the floor using
count_leading_zeros.
(randperm_bound): Use floor_lg, not ceil_log.  Use uintmax_t
instead of size_t in case the size gets large on a 32-bit host.
* gl/modules/randperm (Depends-on): Add count-leading-zeros, stdint.
---
 gl/lib/randperm.c   | 27 ---
 gl/modules/randperm |  2 ++
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/gl/lib/randperm.c b/gl/lib/randperm.c
index ce69222e9..b079aba33 100644
--- a/gl/lib/randperm.c
+++ b/gl/lib/randperm.c
@@ -19,24 +19,29 @@
 
 #include 
 
-#include "hash.h"
 #include "randperm.h"
 
 #include 
+#include 
 #include 
 
+#include "count-leading-zeros.h"
+#include "hash.h"
+#include "
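
For readers who want to see what "floor of the base-2 logarithm" means in the randperm change described above, here is a minimal sketch. It is not the code from the patch: it uses a plain shift loop instead of Gnulib's count_leading_zeros, the name floor_lg_sketch is hypothetical, and returning 0 for an input of 0 is a choice made only for this illustration.

```c
#include <stdint.h>

/* Return floor (log2 (N)).  As a convention of this sketch only,
   return 0 when N is 0.  */
static int
floor_lg_sketch (uintmax_t n)
{
  int lg = 0;
  /* Each right shift discards one low-order bit; the number of shifts
     needed to reach zero, minus one, is the index of the top set bit.  */
  while (n >>= 1)
    lg++;
  return lg;
}
```

A ceiling would differ from this for every N that is not a power of two, which is why the distinction matters when sizing a random-bit budget.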

bug#37859: [PATCH] shuf: fix bug with ‘-r -n 0’

2019-10-21 Thread Paul Eggert
‘shuf -r -n 0 file’ would mistakenly read from standard input.
Problem reported by my student Jingnong Qu while reimplementing a
shuf subset in Python as an exercise in UCLA Computer Science 35L:
https://web.cs.ucla.edu/classes/fall19/cs35L/assign/assign3.html
* NEWS: Mention the fix.  Also, ASCIIfy a previous item.
* src/shuf.c (main): Fix bug.
* tests/misc/shuf.sh: Add a test case for the bug.
---
 NEWS   |  5 -
 src/shuf.c | 53 +-
 tests/misc/shuf.sh |  4 
 3 files changed, 37 insertions(+), 25 deletions(-)

diff --git a/NEWS b/NEWS
index fe38e80d4..476c02aed 100644
--- a/NEWS
+++ b/NEWS
@@ -29,6 +29,9 @@ GNU coreutils NEWS -*- outline -*-
   (like Solaris 10 and Solaris 11).
   [bug introduced in coreutils-8.31]
 
+  'shuf -r -n 0 file' no longer mistakenly reads from standard input.
+  [bug introduced with the --repeat feature in coreutils-8.22]
+
   split no longer reports a "output file suffixes exhausted" error
   when the specified number of files is evenly divisible by 10, 16, 26,
   for --numeric, --hex, or default alphabetic suffixes respectively.
@@ -210,7 +213,7 @@ GNU coreutils NEWS -*- outline -*-
   'mv -n A B' no longer suffers from a race condition that can
   overwrite a simultaneously-created B.  This bug fix requires
   platform support for the renameat2 or renameatx_np syscalls, found
-  in recent Linux and macOS kernels.  As a side effect, ‘mv -n A A’
+  in recent Linux and macOS kernels.  As a side effect, 'mv -n A A'
   now silently does nothing if A exists.
   [bug introduced with coreutils-7.1]
 
diff --git a/src/shuf.c b/src/shuf.c
index 968d3641c..6a1aa0158 100644
--- a/src/shuf.c
+++ b/src/shuf.c
@@ -493,7 +493,12 @@ main (int argc, char **argv)
 }
 
   /* Prepare input.  */
-  if (echo)
+  if (head_lines == 0)
+{
+  n_lines = 0;
+  line = NULL;
+}
+  else if (echo)
 {
   input_from_argv (operand, n_operands, eolbyte);
   n_lines = n_operands;
@@ -507,54 +512,54 @@ main (int argc, char **argv)
   else
 {
   /* If an input file is specified, re-open it as stdin.  */
-  if (n_operands == 1)
-if (! (STREQ (operand[0], "-") || ! head_lines
-   || freopen (operand[0], "r", stdin)))
-  die (EXIT_FAILURE, errno, "%s", quotef (operand[0]));
+  if (n_operands == 1
+  && ! (STREQ (operand[0], "-")
+|| freopen (operand[0], "r", stdin)))
+die (EXIT_FAILURE, errno, "%s", quotef (operand[0]));
 
   fadvise (stdin, FADVISE_SEQUENTIAL);
 
-  if (! repeat && head_lines != SIZE_MAX
-  && (! head_lines || input_size () > RESERVOIR_MIN_INPUT))
+  if (repeat || head_lines == SIZE_MAX
+  || input_size () <= RESERVOIR_MIN_INPUT)
 {
-  use_reservoir_sampling = true;
-  n_lines = SIZE_MAX;   /* unknown number of input lines, for now.  */
+  n_lines = read_input (stdin, eolbyte, &input_lines);
+  line = input_lines;
 }
   else
 {
-  n_lines = read_input (stdin, eolbyte, &input_lines);
-  line = input_lines;
+  use_reservoir_sampling = true;
+  n_lines = SIZE_MAX;   /* unknown number of input lines, for now.  */
 }
 }
 
-  if (! repeat)
-head_lines = MIN (head_lines, n_lines);
+  /* The adjusted head line count; can be less than HEAD_LINES if the
+ input is small and if not repeating.  */
+  size_t ahead_lines = repeat || head_lines < n_lines ? head_lines : n_lines;
 
   randint_source = randint_all_new (random_source,
 (use_reservoir_sampling || repeat
  ? SIZE_MAX
- : randperm_bound (head_lines, n_lines)));
+ : randperm_bound (ahead_lines, n_lines)));
   if (! randint_source)
 die (EXIT_FAILURE, errno, "%s", quotef (random_source));
 
   if (use_reservoir_sampling)
 {
   /* Instead of reading the entire file into 'line',
- use reservoir-sampling to store just "head_lines" random lines.  */
-  n_lines = read_input_reservoir_sampling (stdin, eolbyte, head_lines,
+ use reservoir-sampling to store just AHEAD_LINES random lines.  */
+  n_lines = read_input_reservoir_sampling (stdin, eolbyte, ahead_lines,
randint_source, );
-  head_lines = n_lines;
+  ahead_lines = n_lines;
 }
 
   /* Close stdin now, rather than earlier, so that randint_all_new
  doesn't have to worry about opening something other than
  stdin.  */
-  if (! (echo || input_range)
-  && (fclose (stdin) != 0))
+  if (! (head_lines == 0 || echo || input_range || fclose (stdin) == 0))
 die (EXIT_FAILURE, errno, _("read error"));
 
   if (!repeat)
-permutation = randperm_new (randint_source, head_lines, n_lines);
+  

bug#37702: Suggestion for 'df' utility

2019-10-14 Thread Paul Eggert

On 10/14/19 1:01 AM, Kamil Dudka wrote:

This is not an excuse to introduce new problems.

I'm not looking for an "excuse". df (through no fault of its own) has 
evolved into a bad program that needs fixing. Backward compatibility 
concerns are real and we should take them into account, but they should 
not be an "excuse" for refusing to fix a bad program.






bug#37702: Suggestion for 'df' utility

2019-10-14 Thread Paul Eggert

On 10/13/19 3:00 PM, Assaf Gordon wrote:


I'm not sure if it's easy to find a set of criteria
that would work well while having minimal unexpected side effects of hiding 
entries people in other systems do expect to see.


No matter what we do (even if we do nothing), there will be problems. But doing 
nothing is clearly a bad idea, as the output of plain df is quite bad right now 
in typical use. We can do better than that, even if we cannot be perfect and we 
cause problems by changing defaults.



Out of curiosity,
can you share the output of the following commands on the same system?


Sure, here it is:

$ lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0           7:0    0 140.7M  1 loop  /snap/gnome-3-26-1604/90
loop1           7:1    0  44.2M  1 loop  /snap/gtk-common-themes/1353
loop2           7:2    0 149.9M  1 loop  /snap/gnome-3-28-1804/71
loop3           7:3    0   3.7M  1 loop  /snap/gnome-system-monitor/100
loop4           7:4    0  14.8M  1 loop  /snap/gnome-characters/317
loop5           7:5    0   956K  1 loop  /snap/gnome-logs/73
loop6           7:6    0 149.9M  1 loop  /snap/gnome-3-28-1804/67
loop7           7:7    0   3.7M  1 loop  /snap/gnome-system-monitor/95
loop8           7:8    0     4M  1 loop  /snap/gnome-calculator/406
loop9           7:9    0  54.5M  1 loop  /snap/core18/1192
loop10          7:10   0    89M  1 loop  /snap/core/7713
loop11          7:11   0  42.8M  1 loop  /snap/gtk-common-themes/1313
loop12          7:12   0  14.8M  1 loop  /snap/gnome-characters/296
loop13          7:13   0 140.7M  1 loop  /snap/gnome-3-26-1604/92
loop14          7:14   0  89.1M  1 loop  /snap/core/7917
loop15          7:15   0   956K  1 loop  /snap/gnome-logs/81
loop16          7:16   0   4.2M  1 loop  /snap/gnome-calculator/501
loop17          7:17   0  54.4M  1 loop  /snap/core18/1144
sda             8:0    0 111.8G  0 disk
├─sda1          8:1    0  96.9G  0 part  /
├─sda2          8:2    0     1K  0 part
└─sda5          8:5    0    15G  0 part  [SWAP]
sdb             8:16   0   2.7T  0 disk
└─sdb1          8:17   0   2.7T  0 part
  └─md127       9:127  0   2.7T  0 raid1
    └─md127p1 259:0    0   2.7T  0 md    /home
sdc             8:32   0   2.7T  0 disk
└─sdc1          8:33   0   2.7T  0 part
  └─md127       9:127  0   2.7T  0 raid1
    └─md127p1 259:0    0   2.7T  0 md    /home
sdd             8:48   1   7.5G  0 disk
└─sdd1          8:49   1   7.5G  0 part  /media/eggert/B827-D456
sr0            11:0    1  1024M  0 rom
$ df -x tmpfs -x devtmpfs -x squashfs
Filesystem      1K-blocks       Used  Available Use% Mounted on
/dev/sda1        99431552   11740452   82597212  13% /
/dev/md127p1   2884021472 1326329584 1411168908  49% /home
/dev/sdd1         7812864    4705136    3107728  61% /media/eggert/B827-D456





bug#37702: Suggestion for 'df' utility

2019-10-13 Thread Paul Eggert

On 10/13/19 2:11 PM, Assaf Gordon wrote:


This thread originated by a request to "clean up" the output on newer
ubuntu machines which use "snap" packages as /dev/loopN .

Let's not turn that into a drastic change


It could certainly be multiple sets of patches. But let's face it, df's utility 
for ordinary interactive use has degraded significantly with time due to all the 
random filesystems people have been adding, and we shouldn't keep our heads in 
the sand about this. df's default needs to change, one way or another.


I mean c'mon, here's the output of 'df' on the Ubuntu 18.04.3 LTS workstation 
I'm typing this particular message on. In any sane system there would be only 
four lines of non-header output (for tmpfs etc, /, /home, and 
/media/eggert/B827-D456), but df is outputting 28 lines. This is ridiculous.


Filesystem      1K-blocks       Used  Available Use% Mounted on
udev              7644704          0    7644704   0% /dev
tmpfs             1533620       1924    1531696   1% /run
/dev/sda1        99431552   11740340   82597324  13% /
tmpfs             7668096      33212    7634884   1% /dev/shm
tmpfs                5120          8       5112   1% /run/lock
tmpfs             7668096          0    7668096   0% /sys/fs/cgroup
/dev/loop0         144128     144128          0 100% /snap/gnome-3-26-1604/90
/dev/loop1          45312      45312          0 100% /snap/gtk-common-themes/1353
/dev/loop2         153600     153600          0 100% /snap/gnome-3-28-1804/71
/dev/loop4          15104      15104          0 100% /snap/gnome-characters/317
/dev/loop6         153600     153600          0 100% /snap/gnome-3-28-1804/67
/dev/loop5           1024       1024          0 100% /snap/gnome-logs/73
/dev/loop3           3840       3840          0 100% /snap/gnome-system-monitor/100
/dev/loop10         91264      91264          0 100% /snap/core/7713
/dev/loop7           3840       3840          0 100% /snap/gnome-system-monitor/95
/dev/loop8           4224       4224          0 100% /snap/gnome-calculator/406
/dev/loop9          55808      55808          0 100% /snap/core18/1192
/dev/loop11         43904      43904          0 100% /snap/gtk-common-themes/1313
/dev/loop12         15104      15104          0 100% /snap/gnome-characters/296
/dev/loop13        144128     144128          0 100% /snap/gnome-3-26-1604/92
/dev/loop14         91264      91264          0 100% /snap/core/7917
/dev/loop15          1024       1024          0 100% /snap/gnome-logs/81
/dev/loop17         55808      55808          0 100% /snap/core18/1144
/dev/loop16          4352       4352          0 100% /snap/gnome-calculator/501
/dev/md127p1   2884021472 1326255744 1411242748  49% /home
tmpfs             1533616         16    1533600   1% /run/user/121
tmpfs             1533616         60    1533556   1% /run/user/1000
/dev/sdd1         7812864    4705136    3107728  61% /media/eggert/B827-D456





bug#37702: Suggestion for 'df' utility

2019-10-13 Thread Paul Eggert

On 10/13/19 2:41 AM, Pádraig Brady wrote:

I wonder could we key (also) on used==0||available==0.


Yes, looking at the sample output I gave earlier, I'd say we could by default 
drop filesystems where usage is 1% or less. That would solve the problem for my 
workstation. This is roughly akin to the "used==0" test you're suggesting.


(I don't know if this would address the problem of small snap loop devices. I 
haven't seen sample df output from that.)


What do you mean by "available==0"? Actual zero-size filesystems, or filesystems 
whose size is less than some epsilon? What would the epsilon be? Can you give an 
example?
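
To make the "1% or less" idea concrete, here is a rough sketch of such a filter. It is only an illustration under stated assumptions: the struct and function names are hypothetical, df contains no such code, and the percentage is rounded up on the assumption that this matches how the Use% column is displayed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one line of df output, in 1K blocks.  */
struct fs_usage_sketch
{
  uintmax_t used;
  uintmax_t available;
};

/* Return the usage percentage of FS, rounded up, or 0 if FS has
   neither used nor available blocks.  This sketch assumes that
   USED * 100 does not overflow uintmax_t.  */
static int
use_percent (struct fs_usage_sketch const *fs)
{
  uintmax_t total = fs->used + fs->available;
  if (total == 0)
    return 0;
  return (int) ((fs->used * 100 + total - 1) / total);
}

/* Return true if FS would be dropped from the default listing under
   the "usage is 1% or less" rule.  */
static bool
hidden_by_default (struct fs_usage_sketch const *fs)
{
  return use_percent (fs) <= 1;
}
```

Note that a completely full read-only image reports 100% and so would not be hidden by this rule alone, which is one reason a second criterion may be needed.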






bug#37702: Suggestion for 'df' utility

2019-10-11 Thread Paul Eggert

On 10/11/19 11:20 AM, Pádraig Brady wrote:


if you want to exclude nested file systems like that,
you could try:

   alias df='df -x squashfs'


On my Fedora 30 workstation that option doesn't make any difference. 
Regardless of whether '-x squashfs' is used, I see this output from 'df':


Filesystem      1K-blocks       Used  Available Use% Mounted on
devtmpfs          4065704          0    4065704   0% /dev
tmpfs             4081560      36616    4044944   1% /dev/shm
tmpfs             4081560       1696    4079864   1% /run
tmpfs             4081560          0    4081560   0% /sys/fs/cgroup
/dev/sda5        59614116   16910684   39645412  30% /
tmpfs             4081560        124    4081436   1% /tmp
/dev/sda2      1849433716  207781976 1547682948  12% /home
/dev/sda1         5095040     244468    4572044   6% /boot
tmpfs              816312         60     816252   1% /run/user/1000

and most of these lines are useless.

For many years we've put up with the problem of too many filesystems in 
the default plain 'df' output, and now's as good a time as any to fix 
that. On my workstation there should be only four lines of information, 
one each for /, /home, /boot, and the shared tmpfs area.


Presumably readonly filesystems should also be omitted by default, since 
they're not something people ordinarily care about.


We can add a flag or two for the rare people who want to see these 
normally-useless lines.






bug#37696: Compile Coreutils without xattr but i installed

2019-10-10 Thread Paul Eggert

On 10/10/19 11:57 AM, Wei MA wrote:

I compiled the source code. When I ran tests/cp/capability.sh, cp failed 
to preserve attributes because it was built without xattr support. 
I then installed xattr, deleted coreutils, and downloaded it again. 
The problem still exists. I use Ubuntu 18. When I run Ubuntu's own cp, the same 
commands work without problems.


That's a little vague. Can you send us a complete, self-contained test 
case (e.g., a shell script) illustrating the problem?






bug#37650: Fw: Possible bug.

2019-10-08 Thread Paul Eggert

On 10/7/19 9:12 PM, George R Goffe wrote:


The intent of the configure directive "--enable-gcc-warnings" that states, "turn on 
many GCC warnings (for developers; best with GNU make)" is, I guess, to warn developers of 
situations they need to be aware of.


Sorry, it looks like I misunderstood your bug report.

--enable-gcc-warnings is intended for developers, that is, people who are 
maintaining or modifying coreutils. It tends to generate so many false alarms 
that it is not intended for builders (people who are compiling and installing 
coreutils). This particular diagnostic appears to be a false alarm, as coreutils 
should configure and build just fine if sys/sysctl.h is missing. So I suggest 
either ignoring the warning, or omitting --enable-gcc-warnings, or (and this is 
the hardest suggestion, and meant only if you want to become a developer) 
developing a patch that pacifies your compiler.






bug#37650: Fw: Possible bug.

2019-10-07 Thread Paul Eggert

On 10/7/19 2:31 PM, George R Goffe via GNU coreutils Bug Reports wrote:

I guess the builds would succeed if it weren't for the "-Werror=cpp" flag. I 
can try removing this flag if you'd like.


Yes, please do that. It's generally not a good idea to build with 
different flags than you configured with.






bug#37585: Undefined behavior in nl, print_lineno

2019-10-03 Thread Paul Eggert

On 10/2/19 7:50 AM, Roland Illig wrote:

The current code says:

   next_line_no = line_no + page_incr;
   if (next_line_no < line_no)
 die (EXIT_FAILURE, 0, _("line number overflow"));

Since intmax_t is a regular integer type, overflow invokes undefined
behavior and must therefore be checked using other means.


Thanks for the bug report. I looked for similar problems involving 
integer-overflow diagnostics in coreutils and installed the attached 
patches. The second patch should fix the bug you mentioned.
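
For readers unfamiliar with the issue, here is a minimal sketch of testing a signed addition before performing it, instead of inspecting a result that may already have overflowed. The helper name add_checked is hypothetical and is not part of coreutils. Unlike Gnulib's INT_ADD_WRAPV, this sketch leaves the destination untouched on overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not from coreutils.  If A + B is representable,
   store it in *SUM and return false.  Otherwise return true and leave
   *SUM alone.  The range test happens before the addition, so no
   signed overflow (undefined behavior) can occur.  */
static bool
add_checked (intmax_t a, intmax_t b, intmax_t *sum)
{
  if (b > 0 ? a > INTMAX_MAX - b : a < INTMAX_MIN - b)
    return true;
  *sum = a + b;
  return false;
}
```

The return convention (true means overflow) is chosen to resemble the *_WRAPV style, so a caller can write "if (add_checked (x, y, &x)) report the error".
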
>From 1316620e81daf91317560226b2b63cbbf548c09d Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 3 Oct 2019 12:35:44 -0700
Subject: [PATCH 1/4] cp: simplify integer overflow checking

* src/copy.c (sparse_copy): Use INT_ADD_WRAPV instead
of doing overflow checking by hand.
---
 src/copy.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index 65cf65895..cd6104c7a 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -335,9 +335,7 @@ sparse_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
 }
   else  /* Coalesce writes/seeks.  */
 {
-  if (psize <= OFF_T_MAX - csize)
-psize += csize;
-  else
+  if (INT_ADD_WRAPV (psize, csize, &psize))
 {
   error (0, 0, _("overflow reading %s"), quoteaf (src_name));
   return false;
-- 
2.21.0

>From 89af2b307b455b53869bc9cf79af0272f7d8a1a2 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 3 Oct 2019 12:37:12 -0700
Subject: [PATCH 2/4] nl: fix integer-overflow bug
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Roland Illig (Bug#37585)
* src/nl.c (print_lineno): Don’t rely on undefined behavior when
checking for integer overflow.
---
 src/nl.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/nl.c b/src/nl.c
index 43092b4fe..d85408c8c 100644
--- a/src/nl.c
+++ b/src/nl.c
@@ -275,14 +275,10 @@ build_type_arg (char const **typep,
 static void
 print_lineno (void)
 {
-  intmax_t next_line_no;
-
   printf (lineno_format, lineno_width, line_no, separator_str);
 
-  next_line_no = line_no + page_incr;
-  if (next_line_no < line_no)
+  if (INT_ADD_WRAPV (line_no, page_incr, &line_no))
 die (EXIT_FAILURE, 0, _("line number overflow"));
-  line_no = next_line_no;
 }
 
 /* Switch to a header section. */
-- 
2.21.0

>From 72a348cc2d6160aa24bca93c23b1a17ffb5b1366 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 3 Oct 2019 12:38:15 -0700
Subject: [PATCH 3/4] numfmt: avoid unlikely integer overflow
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/numfmt.c (parse_format_string): Report overflow if
pad < -LONG_MAX, since that can’t be negated.
---
 src/numfmt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/numfmt.c b/src/numfmt.c
index 305a88603..c56641cfd 100644
--- a/src/numfmt.c
+++ b/src/numfmt.c
@@ -1081,7 +1081,7 @@ parse_format_string (char const *fmt)
 
   errno = 0;
   pad = strtol (fmt + i, , 10);
-  if (errno == ERANGE)
+  if (errno == ERANGE || pad < -LONG_MAX)
 die (EXIT_FAILURE, 0,
  _("invalid format %s (width overflow)"), quote (fmt));
 
-- 
2.21.0
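
As a small illustration of why the numfmt check above compares against -LONG_MAX, the following sketch tests whether a long can be negated safely. The function name can_negate is hypothetical and does not appear in numfmt.c.

```c
#include <limits.h>
#include <stdbool.h>

/* Return true if -PAD is representable as a long.  On two's complement
   hosts LONG_MIN equals -LONG_MAX - 1, so LONG_MIN is the one value
   whose negation would overflow.  */
static bool
can_negate (long pad)
{
  return -LONG_MAX <= pad;
}
```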

>From d267ba04a6b4ad43e5a1311885f8ad9685502a5e Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 3 Oct 2019 12:41:22 -0700
Subject: [PATCH 4/4] truncate: avoid integer-overflow assumptions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/truncate.c (do_ftruncate): Simplify overflow checking,
and don’t rely on theoretically-nonportable assumptions
like assuming that OFF_MAX < UINTMAX_MAX.
---
 src/truncate.c | 49 +++--
 1 file changed, 19 insertions(+), 30 deletions(-)

diff --git a/src/truncate.c b/src/truncate.c
index 4494ab51a..e7fb8543a 100644
--- a/src/truncate.c
+++ b/src/truncate.c
@@ -116,31 +116,29 @@ do_ftruncate (int fd, char const *fname, off_t ssize, off_t rsize,
 }
   if (block_mode)
 {
-  off_t const blksize = ST_BLKSIZE (sb);
-  if (ssize < OFF_T_MIN / blksize || ssize > OFF_T_MAX / blksize)
+  ptrdiff_t blksize = ST_BLKSIZE (sb);
+  intmax_t ssize0 = ssize;
+  if (INT_MULTIPLY_WRAPV (ssize, blksize, &ssize))
 {
   error (0, 0,
  _("overflow in %" PRIdMAX
-   " * %" PRIdMAX " byte blocks for file %s"),
- (intmax_t) ssize, (intmax_t) blksize,
- quoteaf (fname));
+   " * %" PRIdPTR " byte blocks for file %s"),
+ ssize0, blksize, quoteaf (fname));
   return false;
 }
-  ssize *= blksize;
 }
   if (rel_mode)
 {
-  uintmax_t fsize;
+  off_t fsize;
 
 

bug#37060: bug in date / coreutils

2019-08-16 Thread Paul Eggert
I think that bug was fixed in 2017, and your coreutils version 8.23 predates the 
fix. I suggest upgrading to the current version (8.31) of coreutils, as 8.23 is 
pretty old anyway.






bug#36739: [PATCH] maint: fix issues in syntax-check

2019-08-06 Thread Paul Eggert

Thanks, I installed that.





bug#36739: maint: fix issues in syntax-check

2019-08-04 Thread Paul Eggert

Thanks, I installed that.





bug#36887: coreutils-8.31: printf chokes on \u0041

2019-08-01 Thread Paul Eggert

Ulrich Mueller wrote:

Except for the surrogates
U+D800...U+DFFF, it looks like an arbitrary restriction


It's not entirely arbitrary. Because of the restriction, coreutils printf 
doesn't have to worry about what this command should do:


  printf '\u0025d\n' 1 2

Does this print a single line "%d", or two lines "1" and "2"? There are good 
arguments either way, and one can easily construct even-stranger examples.






bug#36831: enhance 'directory not empty' message

2019-07-31 Thread Paul Eggert

Assaf Gordon wrote:

An explicit error that says "cannot move", mentions the source and
destination, and also "blames" the target directory seems the most
user-friendly and least ambiguous.


Sure, but that handles only the ENOTEMPTY/EEXIST case. How would you handle the 
EDQUOT, EISDIR, and ENOSPC cases? Will you invent a separate diagnostic for each 
case, or just treat them as in my proposed patch? I assume the latter, but 
either way I'd like to see a patch that handles these properly too. Also, please 
handle ETXTBSY while you're at it (sorry, I missed that one).



For the second and third cases,
"No space" and "Quota exceeded" seem to me to always relate to the
destination, and I don't think users get confused about those
(other opinions of course welcomed).


What's obvious to experts like us is not always obvious to users. If users get 
confused by the current diagnostic for ENOTEMPTY/EEXIST, I don't see why they 
wouldn't also get confused for ETXTBSY etc.



Your patch also added "EISDIR", for which rename(2) says:
 "newpath is an existing directory, but oldpath is not a directory."

But I don't think this error can happen with gnu mv.


It can, as a result of a race condition if some other process is mutating the 
file system while 'mv' is running. Admittedly unlikely, but we might as well 
improve this errno value while we're improving the others.






bug#36831: enhance 'directory not empty' message

2019-07-29 Thread Paul Eggert

On 7/29/19 1:28 AM, Assaf Gordon wrote:

+  if (rename_errno == ENOTEMPTY || rename_errno == EEXIST)
+{
+  error (0, 0, _("cannot move %s to %s: Target directory not empty"),
+ quoteaf_n (0, src_name), quoteaf_n (1, dst_name));


Although this is an improvement, it is not general enough, as other 
errno values are relevant only for the destination. Better would be to 
have a special case for errno values that matter only for the 
destination, and use the existing code for errno values where we don't 
know whether the problem is the source or the destination. Something 
like the attached, say.



diff --git a/src/copy.c b/src/copy.c
index 65cf65895..b1e4557e4 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -2477,9 +2477,18 @@ copy_internal (char const *src_name, char const *dst_name,
  If the permissions on the directory containing the source or
  destination file are made too restrictive, the rename will
  fail.  Etc.  */
-  error (0, rename_errno,
- _("cannot move %s to %s"),
- quoteaf_n (0, src_name), quoteaf_n (1, dst_name));
+  switch (rename_errno)
+{
+case EDQUOT: case EEXIST: case EISDIR: case ENOSPC: case ENOTEMPTY:
+  error (0, rename_errno, "%s", quotearg_colon (dst_name));
+  break;
+
+default:
+  error (0, rename_errno,
+ _("cannot move %s to %s"),
+ quoteaf_n (0, src_name), quoteaf_n (1, dst_name));
+  break;
+}
   forget_created (src_sb.st_ino, src_sb.st_dev);
   return false;
 }
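
Separately from the patch above, the classification idea can be sketched as a standalone predicate. This is an illustration only: the function name is hypothetical, the list of errno values is the one discussed in this thread (including ETXTBSY), and an || chain is used rather than a switch on the assumption that some hosts may define EEXIST and ENOTEMPTY to the same value, which would make duplicate case labels.

```c
#include <errno.h>
#include <stdbool.h>

/* Return true if ERR, the errno value of a failed rename, describes a
   problem that can only concern the destination, so that a diagnostic
   naming just the destination is less confusing.  */
static bool
rename_errno_blames_destination (int err)
{
  return (err == EDQUOT || err == EEXIST || err == EISDIR
          || err == ENOSPC || err == ENOTEMPTY || err == ETXTBSY);
}
```

A caller would test the saved errno from rename, not the current value of errno, since intervening library calls may have changed it.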


bug#36718: uniq treats distinct Korean characters equal

2019-07-18 Thread Paul Eggert
uniq just calls strcoll, and if strcoll (A, B) returns 0 then uniq assumes the 
lines are equal. So my guess is that your problem has something to do with 
strcoll, not with coreutils per se.
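
The comparison in question can be sketched as follows. This is a simplified illustration with a hypothetical function name, not the code in uniq.c, which also handles line lengths, field skipping, and case folding.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if A and B collate equal under the current LC_COLLATE
   locale.  Strings whose bytes differ can still compare equal here if
   the locale's collation data does not distinguish them.  */
static bool
lines_collate_equal (char const *a, char const *b)
{
  return strcoll (a, b) == 0;
}
```

In the "C" locale strcoll behaves like strcmp, so running with LC_ALL=C is a quick way to check whether the locale's collation rules are what make two distinct lines look equal.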






bug#35939: version sort is incorrect with hyphen-minus

2019-06-26 Thread Paul Eggert
Thanks for looking into this. Sorry about my confusion between 
strverscmp and filevercmp. As this bug report appears to be about 
filevercmp, glibc is not involved; it's only Gnulib and the utilities 
using Gnulib's filevercmp module.


As I now understand it, Gnulib filevercmp is intended to be consistent 
with Debian's version comparison (this is documented in filevercmp.c), 
so GNU Bug#35939 is therefore based on a misunderstanding, as Gnulib 
filevercmp is implementing the Debian spec correctly for this test case.


Perhaps the coreutils manual could be improved to make this all clearer, 
and perhaps it should refer to the Debian manual if it doesn't already.







bug#35939: version sort is incorrect with hyphen-minus

2019-06-26 Thread Paul Eggert
GNU sort uses the same algorithm as glibc strverscmp, and this algorithm has 
changed only once since strverscmp was added to glibc in 1997. The change was 
made in 2009, to fix this bug:


https://sourceware.org/bugzilla/show_bug.cgi?id=9913

Has the Debian version-comparison algorithm changed since 1997? If so, could you 
give details about the changes to the Debian algorithm? Perhaps glibc should be 
changed to stay consistent with Debian.






bug#36291: od --skip-bytes reads everything from the very beginning

2019-06-19 Thread Paul Eggert

On 6/19/19 12:42 PM, Pádraig Brady wrote:

Maybe we should relax the cases we do read() for,
and try to seek in block/character special files,
falling back to read() where that fails?


Sure, that's easy enough. I installed the attached patch and am marking 
this bug report as done.


From ea353844d8642b5533cbc713e0cec46addbf3907 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 19 Jun 2019 18:46:57 -0700
Subject: [PATCH] od: use fseek on non-regular files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Szőts Ákos (Bug#36291).
* NEWS: Mention this.
* src/od.c (skip): Try fseek even on files that do not have usable
sizes, falling back on fread if fseek fails.
---
 NEWS | 3 +++
 src/od.c | 8 ++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index d30711bc6..fd0543351 100644
--- a/NEWS
+++ b/NEWS
@@ -36,6 +36,9 @@ GNU coreutils NEWS -*- outline -*-
 
 ** New Features
 
+  od --skip-bytes now can use lseek even if the input is not a regular
+  file, greatly improving performance in some cases.
+
   stat(1) now uses the statx() system call where available, which can
   operate more efficiently by only retrieving requested attributes.
   stat(1) also supports a new --cached= option to control cache
diff --git a/src/od.c b/src/od.c
index 1a89542ee..75a402004 100644
--- a/src/od.c
+++ b/src/od.c
@@ -1033,6 +1033,8 @@ skip (uintmax_t n_skip)
 
   if (fstat (fileno (in_stream), &file_stats) == 0)
 {
+  bool usable_size = usable_st_size (&file_stats);
+
   /* The st_size field is valid for regular files.
  If the number of bytes left to skip is larger than
  the size of the current file, we can decrement n_skip
@@ -1040,8 +1042,7 @@ skip (uintmax_t n_skip)
  when st_size is no greater than the block size, because
  some kernels report nonsense small file sizes for
  proc-like file systems.  */
-  if (usable_st_size (&file_stats)
-  && ST_BLKSIZE (file_stats) < file_stats.st_size)
+  if (usable_size && ST_BLKSIZE (file_stats) < file_stats.st_size)
 {
   if ((uintmax_t) file_stats.st_size < n_skip)
 n_skip -= file_stats.st_size;
@@ -1056,6 +1057,9 @@ skip (uintmax_t n_skip)
 }
 }
 
+  else if (!usable_size && fseeko (in_stream, n_skip, SEEK_CUR) == 0)
+n_skip = 0;
+
   /* If it's not a regular file with nonnegative size,
  or if it's so small that it might be in a proc-like file system,
  position the file pointer by reading.  */
-- 
2.21.0
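
The "seek first, read as a fallback" approach in the patch above can be sketched in isolation like this. It is a simplified illustration, not od's skip function: the name skip_bytes_sketch is hypothetical, it uses ISO C fseek with a long offset rather than fseeko, and it omits the file-size checks that od performs for regular files.

```c
#include <stdbool.h>
#include <stdio.h>

/* Skip N bytes of STREAM.  Try seeking first; if the stream cannot
   seek (a pipe, say), read and discard the bytes instead.  Return true
   on success, false on a read error or premature end of input.
   Caveat: seeking past the end of a regular file succeeds, so a real
   implementation needs a size check for that case.  */
static bool
skip_bytes_sketch (FILE *stream, long n)
{
  if (fseek (stream, n, SEEK_CUR) == 0)
    return true;

  char buf[BUFSIZ];
  while (0 < n)
    {
      size_t want = n < (long) sizeof buf ? (size_t) n : sizeof buf;
      size_t got = fread (buf, 1, want, stream);
      n -= (long) got;
      if (got < want)
        return false;
    }
  return true;
}
```

The performance benefit comes from the first branch: a successful seek costs one call regardless of N, whereas the fallback reads every skipped byte.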



bug#36220: ls -l: maddening mixed left right justifications with numeric ids

2019-06-18 Thread Paul Eggert

積丹尼 Dan Jacobson wrote:

Indeed, all you would need to do is mention the logic there in INFO "ls",
and we would realize that this was actually a feature.


OK, I installed the attached and am marking this bug as done.
>From 9ebb1c06ce13407ed321bc67685d3d90c20b62de Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 18 Jun 2019 00:31:43 -0700
Subject: [PATCH] doc: mention ls -l user/group justification

* doc/coreutils.texi (What information is listed):
Document justification of user and group columns in ls -l output
(Bug#36220).
---
 doc/coreutils.texi | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 0b71bedb4..3c2eb9750 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -7662,7 +7662,10 @@ In addition to the name of each file, print the file type, file mode bits,
 number of hard links, owner name, group name, size, and
 timestamp (@pxref{Formatting file timestamps}), normally
 the modification timestamp (the mtime, @pxref{File timestamps}).
-Print question marks for information that
+If the owner or group name cannot be determined, print
+the owner or group ID instead, right-justified as a cue
+that it is a number rather than a textual name.
+Print question marks for other information that
 cannot be determined.
 
 Normally the size is printed as a byte count without punctuation, but
@@ -7772,7 +7775,8 @@ is marked with a @samp{+} character.
 @cindex numeric uid and gid
 @cindex numeric user and group IDs
 Produce long format directory listings, but
-display numeric user and group IDs instead of the owner and group names.
+display right-justified numeric user and group IDs
+instead of left-justified owner and group names.
 
 @item -o
 @opindex -o
-- 
2.17.1



bug#36220: ls -l: maddening mixed left right justifications with numeric ids

2019-06-17 Thread Paul Eggert

On 6/17/19 8:12 AM, Pádraig Brady wrote:

Patch attached to do as described above


I prefer the current ("maddening") behavior, as it gives the reader a 
useful signal that the user is numeric rather than textual. This is 
particularly important when a user name consists entirely of digits, 
which is allowed on some systems. In that case, left alignment of a 
string like "" means it's a textual user name, whereas right 
alignment means it's a numeric user ID. Admittedly the disambiguation is 
not perfect (as there is no cue when the user name has the maximum 
width) but it's helpful in common practice.


I thought this was all documented somewhere - at least, it was a 
conscious decision when I wrote that code long ago.
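
To make the cue concrete, here is a minimal sketch of the two alignments using printf field widths. It is not the ls source: the helper format_owner, its parameters, and the sample values used with it are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from ls): format an owner column WIDTH
   characters wide into BUF.  A known NAME is left-justified; when
   NAME is NULL the numeric ID is printed right-justified instead.  */
static int
format_owner (char *buf, size_t size, const char *name,
              unsigned long id, int width)
{
  if (name)
    return snprintf (buf, size, "%-*s", width, name);
  return snprintf (buf, size, "%*lu", width, id);
}
```

With a width of 8, a user whose name is the digit string "1000" comes out padded on the right, while an unresolved numeric ID 1000 comes out padded on the left, so the two are distinguishable at a glance unless the string already fills the column.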







bug#35291: [PATCH] split: fix incorrect suffix length computation

2019-06-08 Thread Paul Eggert

Johannes Altmanninger wrote:

Does anyone have time to review this? I think it's an evident bug.
I can try to improve the clarity of the patch if needed.


It's not just clarity that needs fixing, it's also correctness. A quick look suggests 
that the proposed fix can go into an infinite loop due to unsigned integer 
overflow. This is why the current code uses division and not multiplication.
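
As a generic illustration of the concern (this is neither the proposed patch nor the code in split.c; the name suffix_length_div is an assumption for the example), compare a loop that grows a power by multiplication with one that shrinks a remainder by division. The multiplying loop can wrap around modulo 2^N and never reach its bound, whereas the dividing loop always terminates.

```c
#include <stdint.h>

/* Hypothetical helper: return how many base-BASE digits are needed
   to give N_FILES files distinct suffixes (numbered 0 .. N_FILES-1).
   BASE must be at least 2.

   A multiplying loop such as
       for (power = base; power < n_files; power *= base) len++;
   is unsafe: with base 2 and a huge n_files, power reaches 2^63,
   then wraps to 0, and 0 * 2 stays 0 forever.

   Dividing instead only ever makes the value smaller, so the loop
   must stop.  */
static int
suffix_length_div (uintmax_t n_files, unsigned int base)
{
  if (n_files <= 1)
    return 1;

  int len = 1;
  for (uintmax_t rest = (n_files - 1) / base; rest != 0; rest /= base)
    len++;
  return len;
}
```

For example, with base 26 the function returns 1 for 26 files and 2 for 27 files, and it still terminates for n_files equal to UINTMAX_MAX.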





