bug#59732: Problem unable to enter command

2022-12-01 Thread Paul Eggert

On 2022-12-01 01:06, human.id...@simplelogin.com wrote:

I think the issue is related to lightdm so it would be better to report the 
issue to the lightdm developers.


Thanks, I'm closing the coreutils bug report.





bug#59732: Problem unable to enter command

2022-11-30 Thread Paul Eggert

On 2022-11-30 14:04, human.idt50--- via GNU coreutils Bug Reports wrote:

Hello tty Team,

I am using an Arch based distribution. The tty1 screen was opened with the 
Ctrl+Alt+F1 key combination. When I pressed the Alt+F1 key on this screen, I 
saw the commands I saw while opening the distribution on the screen. I forcibly 
shut down the computer so as not to wait any longer. When I opened it again and 
pressed the Ctrl+Alt+F1 keys again, there was no screen to enter the username 
and password on the screen. The screen with the commands I saw while opening 
the distribution was stuck.

Can you bring an update that will fix this issue?


I think you need to report this elsewhere, as bug-coreutils is about the 
coreutils package, not about tty1 screens or Ctrl+Alt+F1. (Yes, coreutils 
has a 'tty' program, but it's a very simple program that you weren't 
running and almost surely is unrelated to your problem.)


You might ask on an Arch mailing list where to report bugs of the form 
that you discovered.






bug#59382: cp(1) tries to allocate too much memory if filesystem blocksizes are unusual

2022-11-20 Thread Paul Eggert

On 2022-11-19 22:43, Korn Andras wrote:

the same file can contain records of different
sizes. Reductio ad absurdum: the "optimal" blocksize for reading may in fact
depend on the position within the file (and only apply to the next read).


This sort of problem exists on traditional devices as well. A tape drive 
can have records of different sizes. For these devices, the best 
approach is to allocate a buffer of the maximum blocksize the drive 
supports.


For the file you describe the situation is different, since ZFS will 
straddle small blocks during I/O. Although there's no single "best" I 
would guess that it'd typically be better to report the blocksize 
currently in use for creating new blocks (which would be a power of two 
for ZFS), as that will map better to how programs like cp deal with 
blocksizes. This may not be perfect but it'd be better than what ZFS 
does now, at least for the instances of 'cp' that are already out there.







bug#59382: cp(1) tries to allocate too much memory if filesystem blocksizes are unusual

2022-11-19 Thread Paul Eggert
The block size for filesystems can also be quite large (currently, up 
to 16M).


It seems ZFS tries to "help" apps by reporting misinformation (namely a 
smaller block size than actually preferred) when the file is small. This 
is unfortunate, since it messes up cp and similar programs that need to 
juggle multiple block sizes. Plus, it messes up any program that assumes 
st_blksize is constant for the life of a file descriptor, which "cp" 
does assume elsewhere.


GNU cp doesn't need ZFS's "help", as it's already smart enough to not 
over-allocate a buffer when the input file is small but its blocksize is 
large. Instead, this "help" from ZFS causes GNU cp to over-allocate 
because it naively trusts the blocksize that ZFS reports.




The proposed patch attached removes the use of buffer_lcm()
and just picks the largest st_blksize, which would be 4MiB in your case.
It also limits the max buffer size to 32MiB in the edge case
where st_blksize returns a larger value than this.


I suppose this could break cp if st_blksize is not a power of 2, and if 
the file is not a regular file, and reads must be a multiple of the 
block size. POSIX allows such things though I expect nowadays it'd be 
limited to weird devices.


Although we inadvertently removed support for weird devices in 2009 by 
commit 55efc5f3ee485b3e31a91c331f07c89aeccc4e89, and nobody seems to 
care (because people use dd or whatever to deal with weird devices), I 
think it'd be better to limit the fix to regular files. And while we're 
at it we might as well resurrect support for weird devices.
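
The constraint mentioned above (reads that must be a multiple of the block size) can be kept while still growing small block sizes to a useful minimum. Here is a minimal sketch using the rounding expression from the patch later in this message. The function name `round_up_blocksize` is illustrative only and is not a coreutils function; `IO_BUFSIZE` is given the 128 KiB value shown in src/ioblksize.h.

```c
#include <assert.h>
#include <stdint.h>

/* Minimum buffer size; src/ioblksize.h defines IO_BUFSIZE as 128 KiB.  */
enum { IO_BUFSIZE = 128 * 1024 };

/* Hypothetical helper: return the smallest multiple of BLOCKSIZE that
   is at least IO_BUFSIZE.  BLOCKSIZE must be positive.  */
static int64_t
round_up_blocksize (int64_t blocksize)
{
  assert (0 < blocksize);
  /* Equivalent to BLOCKSIZE * ceil (IO_BUFSIZE / BLOCKSIZE).  */
  return blocksize + (IO_BUFSIZE - 1) - (IO_BUFSIZE - 1) % blocksize;
}
```

With this sketch a 4096-byte block size grows to 131072, a 3000-byte block size grows to 132000 (44 whole blocks), and a 4 MiB block size is returned unchanged.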




+#include <assert.h>


No need for this, as static_assert works without <assert.h> in C23, and 
Gnulib's assert-h module supports this.




+/* Set a max constraint to avoid excessive mem usage or type overflow.  */
+enum { IO_BUFSIZE_MAX = 128 * IO_BUFSIZE };
+static_assert (IO_BUFSIZE_MAX <= MIN (IDX_MAX, SIZE_MAX) / 2 + 1);


I'm leery of putting in a maximum as low as 16 MiB. Although that's OK 
now (it matches OpenZFS's current maximum), cp in the future will surely 
deal with bigger block sizes. Instead, how about if we stick with GNU's 
"no arbitrary limits" policy and work around the ZFS bug instead?


Something like the attached patch, perhaps?

From 551f3f55180669ab0bfd6c5d9e3e0f38cb035172 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 19 Nov 2022 19:04:36 -0800
Subject: [PATCH] cp: work around ZFS misinformation

Problem reported by Korn Andras (Bug#59382).
* bootstrap.conf (gnulib_modules): Add count-leading-zeros,
which was already an indirect dependency, since ioblksize.h
now uses it directly.
* src/ioblksize.h: Include count-leading-zeros.h.
(io_blksize): Treat impossible blocksizes as IO_BUFSIZE.
When growing a blocksize to IO_BUFSIZE, keep it a multiple of the
stated blocksize.  Work around the ZFS performance bug.
---
 NEWS            |  3 +++
 bootstrap.conf  |  1 +
 src/ioblksize.h | 28 +++++++++++++++++++++++++++-
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index b6b5201e7..9282352c8 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,9 @@ GNU coreutils NEWS-*- outline -*-
   'cp -rx / /mnt' no longer complains "cannot create directory /mnt/".
   [bug introduced in coreutils-9.1]
 
+  cp, mv, and install no longer use overly large I/O buffers when ZFS
+  misinforms them about IO block sizes.
+
   'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
   [bug introduced in coreutils-9.1]
 
diff --git a/bootstrap.conf b/bootstrap.conf
index 8e257a254..f8715068e 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -59,6 +59,7 @@ gnulib_modules="
   config-h
   configmake
   copy-file-range
+  count-leading-zeros
   crypto/md5
   crypto/sha1
   crypto/sha256
diff --git a/src/ioblksize.h b/src/ioblksize.h
index 8bd18ba05..aa367aa4e 100644
--- a/src/ioblksize.h
+++ b/src/ioblksize.h
@@ -18,6 +18,7 @@
 
 /* sys/stat.h and minmax.h will already have been included by system.h. */
 #include "idx.h"
+#include "count-leading-zeros.h"
 #include "stat-size.h"
 
 
@@ -75,8 +76,33 @@ enum { IO_BUFSIZE = 128 * 1024 };
 static inline idx_t
 io_blksize (struct stat sb)
 {
+  /* Treat impossible blocksizes as if they were IO_BUFSIZE.  */
+  idx_t blocksize = ST_BLKSIZE (sb) <= 0 ? IO_BUFSIZE : ST_BLKSIZE (sb);
+
+  /* Use a blocksize of at least IO_BUFSIZE bytes, keeping it a
+ multiple of the original blocksize.  */
+  blocksize += (IO_BUFSIZE - 1) - (IO_BUFSIZE - 1) % blocksize;
+
+  /* For regular files we can ignore the blocksize if we think we know better.
+ ZFS sometimes understates the blocksize, because it thinks
+ apps stupidly allocate a block that large even for small files.
+ This misinformation can cause coreutils to use wrong-sized blocks.
+ Work around some of the performance bug by substituting the next
+ power of two when the reported blocksize is not a p

bug#59262: Dash instead of two hyphens in manual

2022-11-15 Thread Paul Eggert

On 2022-11-15 05:36, Pádraig Brady wrote:

A few  more instances are fixed in the attached,
and a new syntax check to avoid future occurrences.


Thanks, I installed the attached to fix a comment, and fix some more 
hyphen-vs-dash issues that I noticed while looking at your patch. These 
are harder to automate, unfortunately.

From 4f43143ab17b3b7646b75838ebcc769854eb7906 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 15 Nov 2022 10:51:47 -0800
Subject: [PATCH 1/2] maint: fix cfg.mk comment

* cfg.mk (sc_texi_long_option_escaped): Fix comment.
---
 cfg.mk | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/cfg.mk b/cfg.mk
index 992aabc86..4040d6846 100644
--- a/cfg.mk
+++ b/cfg.mk
@@ -366,10 +366,7 @@ sc_option_desc_uppercase: $(ALL_MANS)
 	@grep '^\\fB\\-' -A1 man/*.1 | LC_ALL=C grep '\.1.[A-Z][a-z]'	\
 	  && { echo 1>&2 '$@: found initial capitals in --help'; exit 1; } || :
 
-# Option descriptions should not start with a capital letter.
-# One could grep source directly as follows:
-# grep -E " {2,6}-.*[^.]  [A-Z][a-z]" $$($(VC_LIST_EXCEPT) | grep '\.c$$')
-# but that would miss descriptions not on the same line as the -option.
+# '--' should not be treated as '–' (U+2013 EN DASH) in long option names.
 sc_texi_long_option_escaped: doc/coreutils.info
 	@grep ' –[^ ]' '$<'		\
 	  && { echo 1>&2 '$@: found unquoted --long-option'; exit 1; } || :
-- 
2.37.2

From 7bb940ccedb722dedf8a8f9fd1c7dd2225252824 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 15 Nov 2022 10:55:23 -0800
Subject: [PATCH 2/2] doc: more dash fixes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* doc/coreutils.texi, doc/sort-version.texi: Prefer on "x -- y" to
"x---y" in prose, as the result is more readable in Emacs.
Fix some instances of unescaped ‘-’ that should be minus, not
hyphen. Fix some other instances that should be en dash.  No
spaces around en dash when it’s a range.
---
 doc/coreutils.texi| 150 +-
 doc/sort-version.texi |   4 +-
 2 files changed, 78 insertions(+), 76 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 9121f48b7..fca7f6961 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -1171,7 +1171,7 @@ This is troublesome when you want to specify a numeric ID, say 42,
 and it must work even in a pathological situation where
 @samp{42} is a user name that maps to some other user ID, say 1000.
 Simply invoking @code{chown 42 F}, will set @file{F}s owner ID to
-1000---not what you intended.
+1000 -- not what you intended.
 
 GNU @command{chown}, @command{chgrp}, @command{chroot}, and @command{id}
 provide a way to work around this, that at the same time may result in a
@@ -1491,7 +1491,7 @@ and a nonzero value indicates failure.
 Nearly every command invocation yields an integral @dfn{exit status}
 that can be used to change how other commands work.
 For the vast majority of commands, an exit status of zero indicates
-success.  Failure is indicated by a nonzero value---typically
+success.  Failure is indicated by a nonzero value -- typically
 @samp{1}, though it may differ on unusual platforms as POSIX
 requires only that it be nonzero.
 
@@ -2465,7 +2465,7 @@ spaces or end of line, ignoring any intervening parentheses or quotes.
 Like @TeX{}, @command{fmt} reads entire ``paragraphs'' before choosing line
 breaks; the algorithm is a variant of that given by Donald E. Knuth
 and Michael F. Plass in ``Breaking Paragraphs Into Lines'',
-@cite{Software---Practice & Experience} @b{11}, 11 (November 1981),
+@cite{Software: Practice & Experience} @b{11}, 11 (November 1981),
 1119--1184.
 
 The program accepts the following options.  Also see @ref{Common options}.
@@ -3130,8 +3130,8 @@ operand specified as @samp{-}, when standard input is a FIFO or a pipe.
 
 With kernel inotify support, output is triggered by file changes
 and is generally very prompt.
-Otherwise, @command{tail} sleeps for one second between checks---
-use @option{--sleep-interval=@var{n}} to change that default---which can
+Otherwise, @command{tail} sleeps for one second between checks --
+use @option{--sleep-interval=@var{n}} to change that default -- which can
 make the output appear slightly less responsive or bursty.
 When using tail without inotify support, you can make it more responsive
 by using a sub-second sleep interval, e.g., via an alias like this:
@@ -3974,7 +3974,7 @@ next section) is preferable in new applications.
 for each given @var{file}, or standard input if none are given or for a
 @var{file} of @samp{-}.
 
-cksum also supports the @option{-a,--algorithm} option to select the
+cksum also supports the @option{-a/--algorithm} option to select the
 digest algorithm to use. @command{cksum} is the preferred interface
 to these digests, subsuming the other standalone checksumming utilities,
 which can be emul

bug#59262: Dash instead of two hyphens in manual

2022-11-14 Thread Paul Eggert
Thanks for reporting that. Fixed by installing the attached. This should 
propagate to the web pages after the next release.

From b73888b12caa359c93d05aa7ff7c3a66a74b5f7b Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 14 Nov 2022 19:00:06 -0800
Subject: [PATCH 1/2] build: update gnulib submodule to latest

---
 gnulib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gnulib b/gnulib
index e441260ea..08ba9aaeb 16
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit e441260eab816e9b6d202fe7ac288ec2a7b72f34
+Subproject commit 08ba9aaebff69a02cbb794c6213314fd09dd5ec5
-- 
2.38.1

From 2fce39eb3a720009edc0e85ddff5b879ac599e16 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 14 Nov 2022 19:08:19 -0800
Subject: [PATCH 2/2] doc: fix markup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Antonio Diaz Diaz (bug#59262).
* doc/coreutils.texi: Use markup in menus to prevent
‘--’ from turning into an em dash, and to be more
consistent.
---
 doc/coreutils.texi | 56 +++---
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index d82c86709..ebd096cda 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -379,9 +379,9 @@ Conditions
 
 @command{expr}: Evaluate expression
 
-* String expressions::   + : match substr index length
-* Numeric expressions::  + - * / %
-* Relations for expr::   | & < <= = == != >= >
+* String expressions::   @code{+ : match substr index length}
+* Numeric expressions::  @code{+ - * / %}
+* Relations for expr::   @code{| & < <= = == != >= >}
 * Examples of expr:: Examples of using @command{expr}
 
 Redirection
@@ -486,15 +486,15 @@ File permissions
 Date input formats
 
 * General date syntax::  Common rules
-* Calendar date items::  21 Jul 2020
-* Time of day items::9:20pm
-* Time zone items::  UTC, -0700, +0900, @dots{}
-* Combined date and time of day items:: 2020-07-21T20:02:00,00-0400
-* Day of week items::Monday and others
-* Relative items in date strings:: next tuesday, 2 years ago
-* Pure numbers in date strings:: 20200721, 1440
-* Seconds since the Epoch::  @@1595289600
-* Specifying time zone rules::   TZ="America/New_York", TZ="UTC0"
+* Calendar date items::  @samp{14 Nov 2022}
+* Time of day items::@samp{9:02pm}
+* Time zone items::  @samp{UTC}, @samp{-0700}, @samp{+0900}, @dots{}
+* Combined date and time of day items:: @samp{2022-11-14T21:02:42,00-0500}
+* Day of week items::@samp{Monday} and others
+* Relative items in date strings:: @samp{next tuesday, 2 years ago}
+* Pure numbers in date strings:: @samp{20221114}, @samp{2102}
+* Seconds since the Epoch::  @samp{@@1668477762}
+* Specifying time zone rules::   @samp{TZ="America/New_York"}, @samp{TZ="UTC0"}
 * Authors of parse_datetime::Bellovin, Eggert, Salz, Berets, et al.
 
 Version sorting order
@@ -793,16 +793,16 @@ name.
 
 @menu
 * Exit status:: Indicating program success or failure.
-* Backup options::  -b -S, in some programs.
-* Block size::  BLOCK_SIZE and --block-size, in some programs.
+* Backup options::  @option{-b} @option{-S}, in some programs.
+* Block size::  BLOCK_SIZE and @option{--block-size}, in some programs.
 * Floating point::  Floating point number representation.
-* Signal specifications::   Specifying signals using the --signal option.
+* Signal specifications::   Specifying signals using @option{--signal}.
 * Disambiguating names and IDs:: chgrp, chown, chroot, id: user and group syntax
-* Random sources::  --random-source, in some programs.
+* Random sources::  @option{--random-source}, in some programs.
 * Target directory::Specifying a target directory, in some programs.
-* Trailing slashes::--strip-trailing-slashes, in some programs.
-* Traversing symlinks:: -H, -L, or -P, in some programs.
-* Treating / specially::--preserve-root and --no-preserve-root.
+* Trailing slashes::@option{--strip-trailing-slashes}, in some programs.
+* Traversing symlinks:: @option{-H}, @option{-L}, or @option{-P}, in some programs.
+* Treating / specially::@option{--preserve-root} and @option{--no-preserve-root}.
 * Special built-in utilities::  @command{break}, @command{:}, @dots{}
 * Standards conformance::   Conformance to the POSIX standard.
 * Multi-call invocation::   Multi-call program invocation.
@@ -13428,12 +13428,12 @@ Exit status:
 @end display
 
 @menu
-* File type tests:: -[bcdfhLpSt]
-* Access permission tests:: -[gkruwxOG]
-* File characteristic tests::   -e 

bug#58881: Question: df Size

2022-10-29 Thread Paul Eggert

On 2022-10-29 12:31, linux wrote:

Can you write in man why df shows different result than lsblk?


Not easily, as that depends on the internals of the filesystem, which is 
out of coreutils's control and/or view. df is simply repeating what the 
kernel reports about the filesystem. If the filesystem has some internal 
overhead, df will report fewer blocks than what's physically present on 
the underlying device.
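
To make "what the kernel reports" concrete, here is a minimal sketch that queries a filesystem with POSIX `statvfs`. The helper name `fs_usage_bytes` is hypothetical, and this is not df's actual code. It only illustrates that the totals come from the mounted filesystem rather than from the block device that lsblk inspects.

```c
#include <assert.h>
#include <sys/statvfs.h>

/* Hypothetical helper: store the total size of the filesystem
   containing PATH, and the space available to unprivileged users,
   both in bytes.  Return 0 on success, -1 (with errno set by
   statvfs) on failure.  */
static int
fs_usage_bytes (char const *path, unsigned long long *total,
                unsigned long long *avail)
{
  assert (path && total && avail);
  struct statvfs vfs;
  if (statvfs (path, &vfs) != 0)
    return -1;
  /* f_blocks and f_bavail are counted in units of f_frsize.  */
  *total = (unsigned long long) vfs.f_blocks * vfs.f_frsize;
  *avail = (unsigned long long) vfs.f_bavail * vfs.f_frsize;
  return 0;
}
```

Any space the filesystem keeps for its own metadata is already excluded from these numbers, which is one reason they can be smaller than the raw device size.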







bug#58599: `date -d $(date)` error for non en_* locale

2022-10-17 Thread Paul Eggert

On 10/17/22 07:44, Ruslan Kovtun wrote:

According to "do one thing and do it well" and to the fact of '-d/--date'
option existence, `date` should be able to parse its default output in any
locale.


Patches would be welcome. Good luck getting it to work, though. Many 
date formats are ambiguous, and I don't see how you'd address that.


In the meantime, I suggest sticking to ISO format dates and times with 
UTC, e.g.:


date -d "$(date -u +'%Y-%m-%d %H:%M:%S.%NZ')"





bug#58494: touch resists date 2022-09-11T00:00:00

2022-10-13 Thread Paul Eggert

On 2022-10-12 18:58, Felix Freeman via GNU coreutils Bug Reports wrote:

 $ touch -t 20220911 algo
 touch: invalid date format ‘20220911’
 $ touch -d 2022-09-11T00:00:00 algo
 touch: invalid date format ‘2022-09-11T00:00:00’


In Santiago, Chile, that timestamp does not exist. Is your timezone set 
to Santiago time? That would explain your symptoms.
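
One way to check whether a wall-clock time exists in a given time zone is to pass it through `mktime` and see whether it comes back unchanged. The following is a minimal sketch under that assumption; the helper name `local_time_exists` is hypothetical and is not part of touch or Gnulib. It changes the process's TZ setting, so it is only suitable for a small test program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: return true if the given local wall-clock time
   exists in time zone TZ (a value for the TZ environment variable).
   A time skipped by a spring-forward transition does not exist, and
   mktime normalizes it to a different wall-clock time; that change is
   what this function detects.  Side effect: sets TZ for the process.  */
static bool
local_time_exists (char const *tz, int year, int mon, int mday,
                   int hour, int min, int sec)
{
  assert (tz);
  if (setenv ("TZ", tz, 1) != 0)
    return false;
  tzset ();

  struct tm want = {0};
  want.tm_year = year - 1900;
  want.tm_mon = mon - 1;
  want.tm_mday = mday;
  want.tm_hour = hour;
  want.tm_min = min;
  want.tm_sec = sec;
  want.tm_isdst = -1;   /* let mktime decide whether DST applies */

  struct tm got = want;
  if (mktime (&got) == (time_t) -1)
    return false;       /* treated as "no such time" in this sketch */
  return (got.tm_year == want.tm_year && got.tm_mon == want.tm_mon
          && got.tm_mday == want.tm_mday && got.tm_hour == want.tm_hour
          && got.tm_min == want.tm_min && got.tm_sec == want.tm_sec);
}
```

On a system with current tzdata, `local_time_exists ("America/Santiago", 2022, 9, 11, 0, 0, 0)` would be expected to return false, which matches the diagnostics from touch quoted above.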






bug#58163: coreutils instalation failure x86_64

2022-10-01 Thread Paul Eggert
Thanks for sending the extra data. Oh, I see you are running GCC 5.4. 
You have run into a GCC bug that I just now filed here:


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107116

This bug was in GCC 5 and is still present in GCC 12.

To work around the GCC bug, please use './configure 
--disable-gcc-warnings' instead of plain './configure'. Or (and this may 
be simpler) please use the latest coreutils release from 
 rather than trying to bootstrap. 
Another possibility is to upgrade to GCC 7 (2017) or later, as these GCC 
versions have __builtin_sub_overflow which means the code in question 
will not be compiled.


I doubt whether we should modify Gnulib or coreutils to work around the 
GCC bug, as we expect developers who are bootstrapping to use up-to-date 
tools and GCC 6 is not up-to-date. I'm therefore closing the coreutils 
bug report.
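
For readers unfamiliar with the builtin mentioned above, here is a minimal sketch of overflow-checked subtraction, assuming a compiler that provides `__builtin_sub_overflow` (a GCC and Clang extension). The helper name `checked_sub` is hypothetical, and this is not the Gnulib code that triggers the GCC bug.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: store A - B into *RESULT and return true, or
   return false if the subtraction would overflow a long.  */
static bool
checked_sub (long a, long b, long *result)
{
  assert (result);
  /* The builtin returns true when the exact result does not fit.  */
  return !__builtin_sub_overflow (a, b, result);
}
```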






bug#58163: coreutils instalation failure x86_64

2022-09-30 Thread Paul Eggert

On 9/29/22 04:38, ripspin-004--- via GNU coreutils Bug Reports wrote:

  git submodule foreach git pull origin master

  git commit -m 'build: update gnulib submodule to latest' gnulib

  ./bootstrap

./configure -quiet

make


That's a bit too terse, unfortunately. What distro and compiler are you 
using? Since I cannot reproduce the problem on Fedora 36 x86-64, we'll 
need more details.


I suggest sending all the output, interleaved rather than separating 
stdout from stderr, and using the following three commands instead of 
the last two commands mentioned above:


   ./configure
   make
   make V=1

The last "make" should retry the failed compilation, but more verbosely.

For comparison I am attaching the compressed output of the above 
commands on Fedora 36, which has gcc (GCC) 12.2.1 20220819 (Red Hat 
12.2.1-2), and where the build is successful.

coreutils-build.txt.gz
Description: application/gzip


bug#58050: [INSTALLED] rm: fix diagnostics on I/O error

2022-09-25 Thread Paul Eggert

On 9/25/22 07:25, Pádraig Brady wrote:

How about the attached to add a NEWS entry,
and add DS_EMPTY, DS_NONEMPTY enums to make the code easier to read?


Sure, that looks good; thanks.

Oh, I forgot that via code inspection I found a theoretical portability 
bug in fts while I was looking into Bug#58050. I fixed that by 
installing the attached into Gnulib.

From e00de604fd7012fd912f7580cd658ed9363ed6ad Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sun, 25 Sep 2022 18:33:49 -0700
Subject: [PATCH] fts: fix errno handling if dirfd fails
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lib/fts.c (fts_build): Use proper errno if dirfd failed.
Although I don’t know of any platform where dirfd can fail here,
we might as well get it right.
---
 ChangeLog | 7 +++++++
 lib/fts.c | 3 ++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 24553445f6..6027e5ed94 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-09-25  Paul Eggert  
+
+	fts: fix errno handling if dirfd fails
+	* lib/fts.c (fts_build): Use proper errno if dirfd failed.
+	Although I don’t know of any platform where dirfd can fail here,
+	we might as well get it right.
+
 2022-09-25  Bruno Haible  
 
 	stdbool: Mostly revert last patch.
diff --git a/lib/fts.c b/lib/fts.c
index 954cbb7b40..5811f6ea20 100644
--- a/lib/fts.c
+++ b/lib/fts.c
@@ -1290,11 +1290,12 @@ fts_build (register FTS *sp, int type)
 dir_fd = dirfd (dp);
 if (dir_fd < 0)
   {
+int dirfd_errno = errno;
 closedir_and_clear (cur->fts_dirp);
 if (type == BREAD)
   {
 cur->fts_info = FTS_DNR;
-cur->fts_errno = errno;
+cur->fts_errno = dirfd_errno;
   }
 return NULL;
   }
-- 
2.37.3
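
The pattern behind the fts fix is general: save errno immediately after the failing call, before any cleanup function gets a chance to overwrite it. Here is a minimal self-contained sketch. Both function names are hypothetical, and the cleanup function is a stand-in that deliberately clobbers errno the way a call like closedir might.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for a cleanup call that may overwrite errno.  */
static void
cleanup_that_clobbers_errno (void)
{
  errno = 0;
}

/* Hypothetical failure path: ERR is the errno value left by the call
   that failed.  Return the error number that should be reported.  */
static int
fail_and_clean_up (int err)
{
  assert (err != 0);
  errno = err;                  /* pretend a call just failed with ERR */
  int saved_errno = errno;      /* save before cleanup can clobber it */
  cleanup_that_clobbers_errno ();
  return saved_errno;           /* errno itself is no longer reliable */
}
```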



bug#58050: [INSTALLED] rm: fix diagnostics on I/O error

2022-09-24 Thread Paul Eggert
I ran into this problem when attempting to recursively
remove a directory in a filesystem on flaky hardware.
Although the underlying readdir syscall failed with errno == EIO,
rm issued no diagnostic about the I/O error.

Without this patch I see this behavior:

  $ rm -fr baddir
  rm: cannot remove 'baddir': Directory not empty
  $ rm -ir baddir
  rm: descend into directory 'baddir'? y
  rm: remove directory 'baddir'? y
  rm: cannot remove 'baddir': Directory not empty

With this patch I see the following behavior, which
lets the user know about the I/O error when rm tries
to read baddir's directory entries:

  $ rm -fr baddir
  rm: cannot remove 'baddir': Input/output error
  $ rm -ir baddir
  rm: cannot remove 'baddir': Input/output error

* src/remove.c (Ternary): Remove.  All uses removed.
(get_dir_status): New static function.
(prompt): Last arg is now directory status, not ternary.
Return RM_USER_ACCEPTED if user explicitly accepted.
All uses changed.
Report any significant error in directory status right away.
(prompt, rm_fts): Use get_dir_status to get directory status lazily.
(excise): Treat any FTS_DNR errno as being more descriptive, not
just EPERM and EACCESS.  For example, EIO is more descriptive.
(rm_fts): Distinguish more clearly between explicit and implied
user OK.
* src/remove.h (RM_USER_ACCEPTED): New constant.
(VALID_STATUS): Treat it as valid.
* src/system.h (is_empty_dir): Remove, replacing with ...
(directory_status): ... this more-general function.
All uses changed.  Avoid undefined behavior of looking at
a non-null readdir pointer after corresponding closedir.
* tests/rm/rm-readdir-fail.sh: Adjust test of internals
to match current behavior.
---
 src/remove.c| 80 +++--
 src/remove.h|  4 +-
 src/rmdir.c |  3 +-
 src/system.h| 25 ++--
 tests/rm/rm-readdir-fail.sh |  1 +
 5 files changed, 58 insertions(+), 55 deletions(-)

diff --git a/src/remove.c b/src/remove.c
index 6756c409d..0b6754bf7 100644
--- a/src/remove.c
+++ b/src/remove.c
@@ -33,14 +33,6 @@
 #include "xfts.h"
 #include "yesno.h"
 
-enum Ternary
-  {
-T_UNKNOWN = 2,
-T_NO,
-T_YES
-  };
-typedef enum Ternary Ternary;
-
 /* The prompt function may be called twice for a given directory.
The first time, we ask whether to descend into it, and the
second time, we ask whether to remove it.  */
@@ -168,9 +160,23 @@ write_protected_non_symlink (int fd_cwd,
   }
 }
 
-/* Prompt whether to remove FILENAME (ent->, if required via a combination of
+/* Return the status of the directory identified by FTS and ENT.
+   This is -1 if the directory is empty, 0 if it is nonempty,
+   and a positive error number if there was trouble determining the status,
+   e.g., it is not a directory, or permissions problems, or I/O errors.
+   Use *DIR_STATUS is a cache for the status.  */
+static int
+get_dir_status (FTS const *fts, FTSENT const *ent, int *dir_status)
+{
+  if (*dir_status < -1)
+*dir_status = directory_status (fts->fts_cwd_fd, ent->fts_accpath);
+  return *dir_status;
+}
+
+/* Prompt whether to remove FILENAME, if required via a combination of
the options specified by X and/or file attributes.  If the file may
-   be removed, return RM_OK.  If the user declines to remove the file,
+   be removed, return RM_OK or RM_USER_ACCEPTED, the latter if the user
+   was prompted and accepted.  If the user declines to remove the file,
return RM_USER_DECLINED.  If not ignoring missing files and we
cannot lstat FILENAME, then return RM_ERROR.
 
@@ -178,20 +184,16 @@ write_protected_non_symlink (int fd_cwd,
 
Depending on MODE, ask whether to 'descend into' or to 'remove' the
directory FILENAME.  MODE is ignored when FILENAME is not a directory.
-   Set *IS_EMPTY_P to T_YES if FILENAME is an empty directory, and it is
-   appropriate to try to remove it with rmdir (e.g. recursive mode).
-   Don't even try to set *IS_EMPTY_P when MODE == PA_REMOVE_DIR.  */
+   Use and update *DIR_STATUS as needed, via the conventions of
+   get_dir_status.  */
 static enum RM_status
 prompt (FTS const *fts, FTSENT const *ent, bool is_dir,
 struct rm_options const *x, enum Prompt_action mode,
-Ternary *is_empty_p)
+int *dir_status)
 {
   int fd_cwd = fts->fts_cwd_fd;
   char const *full_name = ent->fts_path;
   char const *filename = ent->fts_accpath;
-  if (is_empty_p)
-*is_empty_p = T_UNKNOWN;
-
   struct stat st;
   struct stat *sbuf = &st;
   cache_stat_init (sbuf);
@@ -199,13 +201,6 @@ prompt (FTS const *fts, FTSENT const *ent, bool is_dir,
   int dirent_type = is_dir ? DT_DIR : DT_UNKNOWN;
   int write_protected = 0;
 
-  bool is_empty = false;
-  if (is_empty_p)
-{
-  is_empty = is_empty_dir (fd_cwd, filename);
-  *is_empty_p = is_empty ? T_YES : T_NO;
-}
-
   /* When nonzero, this indicates that we failed to remove a child entry,
  either because the user declined an 

bug#57946: ls indenting broken if executed without color flag after i set tabs to 4

2022-09-20 Thread Paul Eggert via GNU coreutils Bug Reports

On 9/19/22 20:19, galih surya wrote:

Actually, I don't know if this is a bug.


It's not something 'ls' can easily fix, because 'ls' can't deduce from 
the operating system that you have installed nonstandard tab stops.


I installed the attached to try to document the issue.

Messing with hardware tab stops is typically more trouble than it's 
worth. I think the last time I did it was back in the 1970s, with an IBM 
029 keypunch drum card. Back then it sort of made sense, if you were 
programming in assembler or FORTRAN 66. Nowadays, not so much.

From 4cbe227fa0b1bfd05b10245a3466ed99413e3a15 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 20 Sep 2022 00:09:42 -0700
Subject: [PATCH] doc: warn about tabs command (bug#57946)

---
 doc/coreutils.texi | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index e6eae44dc..adf957e61 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -8295,6 +8295,15 @@ TAB following a non-ASCII byte.  You can avoid that issue by using the
 @option{-T0} option or put @code{TABSIZE=0} in your environment, to tell
 @command{ls} to align using spaces, not tabs.
 
+If set a terminal's hardware tabs to anything other than the default,
+you should also use a @command{--tabsize} option or @env{TABSIZE}
+environment variable either to match the hardware tabs, or to disable
+the use of hardware tabs.  Otherwise, the output of @command{ls} may
+not line up.  For example, if you run the shell command @samp{tabs -4}
+to set hardware tabs to every four columns, you should also run
+@samp{export TABSIZE=4} or @samp{export TABSIZE=0}, or use the
+corresponding @option{--tabsize} options.
+
 @item -w @var{cols}
 @itemx --width=@var{cols}
 @opindex -w
-- 
2.37.3



bug#56512: URLs in coreutils manuals documentation should use HTTPS

2022-09-18 Thread Paul Eggert via GNU coreutils Bug Reports

Thanks, I installed that.





bug#57785: [PATCH] doc: minor grammar correction

2022-09-13 Thread Paul Eggert via GNU coreutils Bug Reports

Thanks for the fix; I installed it.





bug#57631: Coreutils 9.1 build error with glibc 2.23

2022-09-06 Thread Paul Eggert via GNU coreutils Bug Reports
Thanks for the bug report. Please try this Gnulib patch, which should 
appear in the next Coreutils release:


https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=84863a1c4dc8cca8fb0f6f670f67779cdd2d543b






bug#56710: ls vs. stat display of st_size

2022-07-24 Thread Paul Eggert

On 7/24/22 01:48, Pádraig Brady wrote:


Well ls(1) was explicitly changed to assuming only positive,
citing POSIX (though I can't see it in POSIX myself):
https://github.com/coreutils/coreutils/commit/67ba4ac01


I vaguely recall being involved with that decades-old change. The POSIX 
requirement is here:


https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html#tag_20_73_10

(look for "%u").



Also ls(1) can sort by size, which gives a little more
credence to assuming positive only size.


I don't see why; negative sizes sort just as well as positive ones do.



For these reasons I would keep ls(1) as is (assuming positive).

As for stat(1), it's now consistent with ls(1) which has some benefit.
It is lower level though, so in my mind it might be better
to output the raw value, especially since it's such an edge case.

So I'd leave ls(1) as is, and I'll leave it up to you
how to handle stat(1) given the above points.


Consistency is reasonably important here (as per the original bug 
report), so if those are the choices let's leave things as-is.






bug#56710: ls vs. stat display of st_size

2022-07-23 Thread Paul Eggert

On 7/23/22 05:17, Pádraig Brady wrote:


BTW I see we've code in cache_fstatat() that assumes
st_size can't have such large values, which contradicts a bit.


Good catch. I installed the first attached patch.


> This is only a real consideration for virtual files I think
> since off_t is signed, and so impractical for a real file system
> to support files > OFF_T_MAX.

Yes, that sounds right.

You've convinced me that 'ls' should switch to the way 'stat' behaves 
rather than vice versa; that's more useful anyway. How about the 
attached second patch, which I haven't installed? (I was actually 
inclined this way originally but got lazy.)

From c2056a320b38126bf5566c2ce94e2c2b25243f66 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 23 Jul 2022 12:11:49 -0700
Subject: [PATCH 1/2] =?UTF-8?q?rm:=20don=E2=80=99t=20assume=20st=5Fsize=20?=
 =?UTF-8?q?is=20nonnegative?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/remove.c: Include stat-time.h.
(cache_fstatat, cache_stat_init): Use negative st->st_atim.tv_sec to
determine whether the stat is cached, not negative st->st_size.
On non-POSIX platforms that lack st_atim.tv_sec, don’t bother to cache.
---
 src/remove.c | 29 +++--
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/remove.c b/src/remove.c
index b5d1ea8a2..e2f27ca4f 100644
--- a/src/remove.c
+++ b/src/remove.c
@@ -28,6 +28,7 @@
 #include "ignore-value.h"
 #include "remove.h"
 #include "root-dev-ino.h"
+#include "stat-time.h"
 #include "write-any-file.h"
 #include "xfts.h"
 #include "yesno.h"
@@ -62,29 +63,37 @@ enum Prompt_action
 # define DT_LNK 2
 #endif
 
-/* Like fstatat, but cache the result.  If ST->st_size is -1, the
-   status has not been gotten yet.  If less than -1, fstatat failed
-   with errno == ST->st_ino.  Otherwise, the status has already
-   been gotten, so return 0.  */
+/* Like fstatat, but cache on POSIX-compatible systems.  */
 static int
 cache_fstatat (int fd, char const *file, struct stat *st, int flag)
 {
-  if (st->st_size == -1 && fstatat (fd, file, st, flag) != 0)
+#if HAVE_STRUCT_STAT_ST_ATIM_TV_NSEC
+  /* If ST->st_atim.tv_nsec is -1, the status has not been gotten yet.
+ If less than -1, fstatat failed with errno == ST->st_ino.
+ Otherwise, the status has already been gotten, so return 0.  */
+  if (0 <= st->st_atim.tv_nsec)
+return 0;
+  if (st->st_atim.tv_nsec == -1)
 {
-  st->st_size = -2;
+  if (fstatat (fd, file, st, flag) == 0)
+return 0;
+  st->st_atim.tv_nsec = -2;
   st->st_ino = errno;
 }
-  if (0 <= st->st_size)
-return 0;
-  errno = (int) st->st_ino;
+  errno = st->st_ino;
   return -1;
+#else
+  return fstatat (fd, file, st, flag);
+#endif
 }
 
 /* Initialize a fstatat cache *ST.  Return ST for convenience.  */
 static inline struct stat *
 cache_stat_init (struct stat *st)
 {
-  st->st_size = -1;
+#if HAVE_STRUCT_STAT_ST_ATIM_TV_NSEC
+  st->st_atim.tv_nsec = -1;
+#endif
   return st;
 }
 
-- 
2.34.1

From 03cc716cb1d6d69dfdb9038a6889035ab957f201 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 23 Jul 2022 11:00:33 -0700
Subject: [PATCH 2/2] ls: print negative file sizes as negative

This is more useful in practice (Bug#56710).
However, if POSIXLY_CORRECT is set, print them as positive.
* src/ls.c (abs_file_size, human_file_size): New functions.
(gobble_file, print_long_format): Use them.
* src/stat.c: Revert previous change, so that stat and ls agree.
---
 NEWS   |  6 --
 src/ls.c   | 53 +++--
 src/stat.c | 10 +-
 3 files changed, 48 insertions(+), 21 deletions(-)

diff --git a/NEWS b/NEWS
index 816025255..d76946eb8 100644
--- a/NEWS
+++ b/NEWS
@@ -21,12 +21,14 @@ GNU coreutils NEWS-*- outline -*-
   'cp --reflink=always A B' no longer leaves behind a newly created
   empty file B merely because copy-on-write clones are not supported.
 
+  Unless POSIXLY_CORRECT is set, 'ls -l' no longer prints negative
+  file sizes as huge positive numbers.  This is more consistent with
+  how 'stat -c %s' treats virtual files like /proc/kcore.
+
   'ls -v' and 'sort -V' go back to sorting ".0" before ".A",
   reverting to the behavior in coreutils-9.0 and earlier.
   This behavior is now documented.
 
-  ’stat -c %s' now prints sizes as unsigned, consistent with 'ls'.
-
 ** New Features
 
   factor now accepts the --exponents (-h) option to print factors
diff --git a/src/ls.c b/src/ls.c
index d48892be7..475fb2719 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -3142,6 +3142,19 @@ file_ignored (char const *name)
   || patterns_match (ignore_patterns, name));
 }
 
+/* The following functions assumes typical implementations
+   where off_t is no wider than uintm

bug#56710: ls vs. stat display of st_size

2022-07-22 Thread Paul Eggert

Thanks for reporting that. I installed the attached.

From 34a93b971dd68ab8ff96aa20bf2f39374ab3a443 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 22 Jul 2022 13:50:31 -0700
Subject: [PATCH] stat: -c %s now prints unsigned

* src/stat.c (unsigned_file_size): New static function,
copied from src/ls.c.
(print_stat): %s prints an unsigned value now (Bug#56710).
---
 NEWS   |  2 ++
 src/stat.c | 10 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index b4e3cf83a..816025255 100644
--- a/NEWS
+++ b/NEWS
@@ -25,6 +25,8 @@ GNU coreutils NEWS-*- outline -*-
   reverting to the behavior in coreutils-9.0 and earlier.
   This behavior is now documented.
 
+  ’stat -c %s' now prints sizes as unsigned, consistent with 'ls'.
+
 ** New Features
 
   factor now accepts the --exponents (-h) option to print factors
diff --git a/src/stat.c b/src/stat.c
index 3765a8f65..549762aba 100644
--- a/src/stat.c
+++ b/src/stat.c
@@ -1492,6 +1492,14 @@ do_stat (char const *filename, char const *format,
 }
 #endif /* USE_STATX */
 
+/* POSIX requires 'ls' to print file sizes without a sign, even
+   when negative.  Be consistent with that.  */
+
+static uintmax_t
+unsigned_file_size (off_t size)
+{
+  return size + (size < 0) * ((uintmax_t) OFF_T_MAX - OFF_T_MIN + 1);
+}
 
 /* Print stat info.  Return zero upon success, nonzero upon failure.  */
 static bool
@@ -1575,7 +1583,7 @@ print_stat (char *pformat, size_t prefix_len, char mod, char m,
   fail |= out_mount_point (filename, pformat, prefix_len, statbuf);
   break;
 case 's':
-  out_int (pformat, prefix_len, statbuf->st_size);
+  out_uint (pformat, prefix_len, unsigned_file_size (statbuf->st_size));
   break;
 case 'r':
   if (mod == 'H')
-- 
2.37.1
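
The arithmetic in unsigned_file_size can be tried in isolation. The sketch below is an illustration under stated assumptions rather than the coreutils code: it uses int64_t as a stand-in for off_t (the hypothetical typedef file_off), and INT64_MAX and INT64_MIN in place of the OFF_T_MAX and OFF_T_MIN macros used in the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for off_t in this sketch; assumed to be 64 bits wide.  */
typedef int64_t file_off;

/* Map a possibly negative size to the value an unsigned print shows.
   A negative size wraps around by the number of distinct file_off
   values, i.e., by 2**64 here.  */
static uintmax_t
as_unsigned_size (file_off size)
{
  return size + (size < 0) * ((uintmax_t) INT64_MAX - INT64_MIN + 1);
}

/* Print SIZE both as a signed and as an unsigned number.  */
static void
print_size_both_ways (file_off size)
{
  printf ("signed: %jd  unsigned: %ju\n",
          (intmax_t) size, as_unsigned_size (size));
}
```

On a platform where uintmax_t is 64 bits wide, a size of -1 comes out as 18446744073709551615 in the unsigned column, which is the kind of huge positive number discussed in this bug report.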



bug#56524: doc: timezone offset conversion/info

2022-07-13 Thread Paul Eggert

On 7/13/22 14:31, Karl Berry wrote:

 +Simple POSIX rules like this can also specify nonzero Greenwich offsets.

Nothing about this seems "simple" to me :).


I meant "simple" in comparison to the rules like 
TZ="<-05>+5<-04>,M3.2.0/2,M11.1.0/2".
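
To see what such a rule does, here is a small sketch that asks the C library which abbreviation is in effect at local noon on a given date. The helper name abbreviation_at is hypothetical, and the sketch assumes a C library that accepts the POSIX angle-bracket syntax (glibc does). Note that it changes the process-wide TZ setting.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Store in BUF the time zone abbreviation in effect at local noon on
   YEAR-MONTH-DAY under the TZ setting RULE.  Return 0 on success,
   -1 on failure.  This changes the TZ environment variable.  */
static int
abbreviation_at (char const *rule, int year, int month, int day,
                 char *buf, size_t bufsize)
{
  if (setenv ("TZ", rule, 1) != 0)
    return -1;
  tzset ();

  struct tm tm = {0};
  tm.tm_year = year - 1900;
  tm.tm_mon = month - 1;
  tm.tm_mday = day;
  tm.tm_hour = 12;
  tm.tm_isdst = -1;             /* Let the rule decide about DST.  */
  if (mktime (&tm) == (time_t) -1)
    return -1;
  return strftime (buf, bufsize, "%Z", &tm) ? 0 : -1;
}
```

With the rule quoted above, a date in January should yield the standard-time abbreviation -05 and a date in July the daylight-saving abbreviation -04.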


Fixed by installing the attached further patch, which also omits that 
hyphen - though it keeps another similar hyphen that you didn't mention. 
"Most style guides do advise against linking 'more' to an adjective with 
a hyphen, but most also recognize that sometimes a hyphen may be 
necessary for clarity." 
<https://www.dailywritingtips.com/hyphenating-more-adjective/>

From 5336cb27ab42f27b8b8ac31982e8215fe5af6f34 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 13 Jul 2022 18:54:56 -0700
Subject: [PATCH] * doc/parse-datetime.texi: Tweak wording again.

---
 doc/parse-datetime.texi | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/doc/parse-datetime.texi b/doc/parse-datetime.texi
index 7939273691..e1ce97220a 100644
--- a/doc/parse-datetime.texi
+++ b/doc/parse-datetime.texi
@@ -551,31 +551,34 @@ location name in a @env{TZ} setting, e.g.,
 @samp{TZ=":America/New_York"}.
 
 The @samp{tz} database includes a wide variety of locations ranging
-from @samp{Arctic/Longyearbyen} to @samp{Antarctica/South_Pole}, but
+from @samp{Africa/Abidjan} to @samp{Pacific/Tongatapu}, but
 if you are at sea and have your own private time zone, or if you are
 using a non-GNU host that does not support the @samp{tz}
 database, you may need to use a POSIX rule instead.
 The previously-mentioned POSIX rule @samp{UTC0} says that the time zone
 abbreviation is @samp{UTC}, the zone is zero hours away from
 Greenwich, and there is no daylight saving time.
-Simple POSIX rules like this can also specify nonzero Greenwich offsets.
+POSIX rules can also specify nonzero Greenwich offsets.
 For example, the following shell transcript answers the question
 ``What time is it five and a half hours east of Greenwich when a clock
 seven hours west of Greenwich shows 9:50pm on July 12, 2022?''
 
 @example
-$ TZ="<+0530>-5:30" date --date='TZ="<-07>7" 2022-07-12 21:50'
+$ TZ="<+0530>-5:30" date --date='TZ="<-07>+7" 2022-07-12 21:50'
 Wed Jul 13 10:20:00 +0530 2022
 @end example
 
 @noindent
-This example uses the somewhat-confusing POSIX convention for TZ strings.
-@samp{TZ="<-07>7"} says that the time zone abbreviation is @samp{-07}
+This example uses the somewhat-confusing POSIX convention for rules.
+@samp{TZ="<-07>+7"} says that the time zone abbreviation is @samp{-07}
 and the time zone is 7 hours west of Greenwich, and
 @samp{TZ="<+0530>-5:30"} says that the time zone abbreviation is @samp{+0530}
 and the time zone is 5 hours 30 minutes east of Greenwich.
-More-complex POSIX TZ strings can specify simple daylight saving
-regimes.  @xref{TZ Variable,, Specifying the Time Zone with @code{TZ},
+Although trickier POSIX @env{TZ} settings like
+@samp{TZ="<-05>+5<-04>,M3.2.0/2,M11.1.0/2"} can specify some daylight
+saving regimes, location-based settings like
+@samp{TZ="America/New_York"} are typically simpler and more accurate
+historically.  @xref{TZ Variable,, Specifying the Time Zone with @code{TZ},
 libc, The GNU C Library}.
 
 @node Authors of parse_datetime
-- 
2.34.1



bug#56524: doc: timezone offset conversion/info

2022-07-12 Thread Paul Eggert

On 7/12/22 15:57, Karl Berry wrote:


$ TZ=UTC-4 date -d 'TZ="UTC" 2022-07-24 15:00'


This doesn't mean what you want, because TZ=UTC-4 means "My time zone is 
abbreviated 'UTC', and it's four hours east of Greenwich" which is not a 
useful setting.
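
The sign convention is easier to check with a little code than to remember. The following sketch does in C roughly what the date command does with a TZ="..." prefix in its --date argument. The helper name convert_wall_clock is hypothetical, and the sketch changes the process-wide TZ setting, so treat it as an illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Interpret the broken-down time *LOCAL as wall-clock time under the
   TZ setting FROM, and replace it with the corresponding wall-clock
   time under the TZ setting TO.  Return 0 on success, -1 on failure.
   This changes the TZ environment variable.  */
static int
convert_wall_clock (char const *from, char const *to, struct tm *local)
{
  if (setenv ("TZ", from, 1) != 0)
    return -1;
  tzset ();
  local->tm_isdst = -1;         /* Let the TZ setting decide about DST.  */
  time_t t = mktime (local);
  if (t == (time_t) -1)
    return -1;

  if (setenv ("TZ", to, 1) != 0)
    return -1;
  tzset ();
  return localtime_r (&t, local) ? 0 : -1;
}
```

For example, converting 2022-07-12 21:50 from "<-07>+7" (seven hours west of Greenwich) to "<+0530>-5:30" (five and a half hours east) should give 10:20 on July 13. In a POSIX TZ string a positive offset means west of Greenwich, which is why TZ=UTC-4 lands four hours east.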


You're not the first person to run afoul of POSIX TZ strings, which are 
poorly designed. I installed the attached patch to Gnulib to give 
another example, which I hope clarifies things a bit. I'll cc this email 
to bug-gnulib since the problem is in Gnulib not Coreutils proper.



If the offset syntax is documented anywhere, I couldn't find it. Sorry.


It's documented in the glibc manual, and this part of the Coreutils 
manual (actually, taken from Gnulib) has a cross-reference to that.



BTW, in neither case did --debug clarify anything for me. In fact, it
confused me more, because the output seemingly did not include anything
about the offset at all, just reporting "UTC".


It'd be nice if --debug could diagnose invalid TZ settings. However, 
this would likely require glibc support along the lines of what's in 
tzcode and NetBSD (the tzalloc function).

From f65d00ebacc891e57cca729041d028d07d1883bb Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 12 Jul 2022 17:11:26 -0700
Subject: [PATCH] parse-datetime: improve doc for TZ="<-07>7" etc.

* doc/parse-datetime.texi (Specifying time zone rules):
Give examples of POSIX TZ strings that specify UTC offsets (Bug#56524).
---
 ChangeLog   |  6 ++
 doc/parse-datetime.texi | 24 +---
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index cd01e0208e..f245082aa6 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2022-07-12  Paul Eggert  
+
+	parse-datetime: improve doc for TZ="<-07>7" etc.
+	* doc/parse-datetime.texi (Specifying time zone rules):
+	Give examples of POSIX TZ strings that specify UTC offsets (Bug#56524).
+
 2022-07-10  Bruno Haible  
 
 	sigsegv: Optimize stackvma implementation for AIX 7.
diff --git a/doc/parse-datetime.texi b/doc/parse-datetime.texi
index 44305d136c..7939273691 100644
--- a/doc/parse-datetime.texi
+++ b/doc/parse-datetime.texi
@@ -554,9 +554,27 @@ The @samp{tz} database includes a wide variety of locations ranging
 from @samp{Arctic/Longyearbyen} to @samp{Antarctica/South_Pole}, but
 if you are at sea and have your own private time zone, or if you are
 using a non-GNU host that does not support the @samp{tz}
-database, you may need to use a POSIX rule instead.  Simple
-POSIX rules like @samp{UTC0} specify a time zone without
-daylight saving time; other rules can specify simple daylight saving
+database, you may need to use a POSIX rule instead.
+The previously-mentioned POSIX rule @samp{UTC0} says that the time zone
+abbreviation is @samp{UTC}, the zone is zero hours away from
+Greenwich, and there is no daylight saving time.
+Simple POSIX rules like this can also specify nonzero Greenwich offsets.
+For example, the following shell transcript answers the question
+``What time is it five and a half hours east of Greenwich when a clock
+seven hours west of Greenwich shows 9:50pm on July 12, 2022?''
+
+@example
+$ TZ="<+0530>-5:30" date --date='TZ="<-07>7" 2022-07-12 21:50'
+Wed Jul 13 10:20:00 +0530 2022
+@end example
+
+@noindent
+This example uses the somewhat-confusing POSIX convention for TZ strings.
+@samp{TZ="<-07>7"} says that the time zone abbreviation is @samp{-07}
+and the time zone is 7 hours west of Greenwich, and
+@samp{TZ="<+0530>-5:30"} says that the time zone abbreviation is @samp{+0530}
+and the time zone is 5 hours 30 minutes east of Greenwich.
+More-complex POSIX TZ strings can specify simple daylight saving
 regimes.  @xref{TZ Variable,, Specifying the Time Zone with @code{TZ},
 libc, The GNU C Library}.
 
-- 
2.34.1



bug#56520: Security vulnerabilities at coreutils version for CentOS 7.9

2022-07-12 Thread Paul Eggert

On 7/12/22 05:43, Meirav Rath via GNU coreutils Bug Reports wrote:

It looks like coreutils available rpm for CentOS 7.9 (8.22) has the vulnerability 
CVE-2017-18018.

When can we expect an updated RPM of a more advanced version with fixes for 
this issues, aimed for CentOS7.9?


CentOS is downstream from the Coreutils project, so I suggest asking the 
CentOS maintainers instead of this mailing list.






bug#54586: dd conv options doc

2022-07-06 Thread Paul Eggert

On 3/26/22 15:29, Karl Berry wrote:

why would I want data to be synced and not metadata?


Performance, in apps that don't care about the metadata. Admittedly for 
dd the use case is rare; it's mostly present so that dd exports all the 
open flags to the user.
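
For anyone who wants to see the difference at the system-call level, here is a minimal sketch. It is not how dd is implemented; the helper name write_synced is hypothetical, and error handling is kept to the bare minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Write LEN bytes from BUF to FILE, creating or truncating it, then
   force the output to stable storage before returning.  If
   WITH_METADATA, use fsync, as with dd conv=fsync; otherwise use
   fdatasync, as with dd conv=fdatasync, which need not flush metadata
   such as the last-modified time.  Return 0 on success, -1 on failure.  */
static int
write_synced (char const *file, void const *buf, size_t len,
              bool with_metadata)
{
  int fd = open (file, O_WRONLY | O_CREAT | O_TRUNC, 0666);
  if (fd < 0)
    return -1;

  char const *p = buf;
  while (len > 0)
    {
      ssize_t n = write (fd, p, len);
      if (n < 0)
        {
          close (fd);
          return -1;
        }
      p += n;
      len -= n;
    }

  int sync_result = with_metadata ? fsync (fd) : fdatasync (fd);
  int close_result = close (fd);
  return sync_result == 0 && close_result == 0 ? 0 : -1;
}
```

An application that only needs the bytes to survive a power failure can pass false for WITH_METADATA; one that also cares about timestamps and the like would pass true.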


I installed the attached to try to document this better.
From 1efce5663554619db34d2722be7d6e5a14404065 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 6 Jul 2022 23:42:19 -0500
Subject: [PATCH] dd: doc improvement (Bug#54586)

* doc/coreutils.texi (dd invocation): Explain
fdatasync and fsync better.
---
 doc/coreutils.texi | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 7bca37b71..e0c87d1ad 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9466,7 +9466,13 @@ Continue after read errors.
 @cindex synchronized data writes, before finishing
 Synchronize output data just before finishing,
 even if there were write errors.
-This forces a physical write of output data.
+This forces a physical write of output data,
+so that even if power is lost the output data will be preserved.
+If neither this nor @samp{fsync} are specified, output is treated as
+usual with file systems, i.e., output data and metadata may be cached
+in primary memory for some time before the operating system physically
+writes it, and thus output data and metadata may be lost if power is lost.
+@xref{sync invocation}.
 This conversion is a GNU extension to POSIX.
 
 @item fsync
@@ -9474,7 +9480,10 @@ This conversion is a GNU extension to POSIX.
 @cindex synchronized data and metadata writes, before finishing
 Synchronize output data and metadata just before finishing,
 even if there were write errors.
-This forces a physical write of output data and metadata.
+This acts like @samp{fdatasync} except it also preserves output metadata,
+such as the last-modified time of the output file; for this reason it
+may be a bit slower than @samp{fdatasync} although the performance
+difference is typically insignificant for @command{dd}.
 This conversion is a GNU extension to POSIX.
 
 @end table
-- 
2.36.1



bug#56391: `cp --reflink=always` creates empty file on failure

2022-07-06 Thread Paul Eggert

On 7/6/22 06:17, Pádraig Brady wrote:

This will usually work, but there are cases where this may lose data,
as previously discussed at:

https://bugzilla.redhat.com/show_bug.cgi?id=921708
http://lists.gnu.org/archive/html/coreutils/2013-03/msg00056.html

I'm not sure cp can robustly clean up in this situation? 


Thanks for pointing me to those old discussions. As I understand it, the 
worry is that FICLONE will only partly succeed, causing the destination 
file to contain some (but not all) of the input data, and then if we remove 
the output file we'll lose the newly-made partial clone. I don't know 
whether FICLONE can do that, but it sounds like a reasonable worry.
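
To make the worry concrete, here is a Linux-only sketch of a clone attempt that cleans up after itself only when the destination is still empty. It is not the code in src/copy.c; the helper name clone_or_clean_up is hypothetical, and it assumes the FICLONE ioctl from <linux/fs.h> is available.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>           /* FICLONE; Linux-specific.  */

/* Try to make DST a copy-on-write clone of SRC.  DST must not exist.
   Return 0 on success.  On failure return -1 with errno set, after
   removing DST only if it is still empty, so that a partial clone is
   never thrown away.  */
static int
clone_or_clean_up (char const *src, char const *dst)
{
  int src_fd = open (src, O_RDONLY);
  if (src_fd < 0)
    return -1;
  int dst_fd = open (dst, O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (dst_fd < 0)
    {
      int open_errno = errno;
      close (src_fd);
      errno = open_errno;
      return -1;
    }

  int result = ioctl (dst_fd, FICLONE, src_fd);
  int clone_errno = errno;
  if (result != 0 && lseek (dst_fd, 0, SEEK_END) == 0)
    unlink (dst);               /* Nothing was cloned; remove DST.  */
  close (dst_fd);
  close (src_fd);
  errno = clone_errno;
  return result == 0 ? 0 : -1;
}
```

On a file system without clone support the ioctl fails, the destination is still empty, and it is removed; if a failed clone ever left data behind, the destination would be kept.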


If that understanding is correct, then the attached further patch should 
suffice, so I boldly installed it.
From 123ed2df4c23e12b08e1d18245f3a0b47508496f Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 6 Jul 2022 14:29:12 -0500
Subject: [PATCH] =?UTF-8?q?cp:=20don=E2=80=99t=20remove=20nonempty=20clone?=
 =?UTF-8?q?d=20dest?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This follows up on comments by Pádraig Brady (bug#56391).
* src/copy.c (copy_reg): When --reflink=always removes a file
due to an FICLONE failure, do not remove a nonempty file.
---
 NEWS   |  3 +++
 src/copy.c | 12 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index a3a55541e..b4e3cf83a 100644
--- a/NEWS
+++ b/NEWS
@@ -18,6 +18,9 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Changes in behavior
 
+  'cp --reflink=always A B' no longer leaves behind a newly created
+  empty file B merely because copy-on-write clones are not supported.
+
   'ls -v' and 'sort -V' go back to sorting ".0" before ".A",
   reverting to the behavior in coreutils-9.0 and earlier.
   This behavior is now documented.
diff --git a/src/copy.c b/src/copy.c
index eaed148b4..e465271ef 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1279,9 +1279,17 @@ copy_reg (char const *src_name, char const *dst_name,
 {
   error (0, errno, _("failed to clone %s from %s"),
  quoteaf_n (0, dst_name), quoteaf_n (1, src_name));
-  if (*new_dst && unlinkat (dst_dirfd, dst_relname, 0) != 0
-  && errno != ENOENT)
+
+  /* Remove the destination if cp --reflink=always created it
+ but cloned no data.  If clone_file failed with
+ EOPNOTSUPP, EXDEV or EINVAL no data were copied so do not
+ go to the expense of lseeking.  */
+  if (*new_dst
+  && (is_ENOTSUP (errno) || errno == EXDEV || errno == EINVAL
+  || lseek (dest_desc, 0, SEEK_END) == 0)
+  && unlinkat (dst_dirfd, dst_relname, 0) != 0 && errno != ENOENT)
 error (0, errno, _("cannot remove %s"), quoteaf (dst_name));
+
   return_val = false;
   goto close_src_and_dst_desc;
 }
-- 
2.36.1



bug#56391: `cp --reflink=always` creates empty file on failure

2022-07-05 Thread Paul Eggert

Thanks for reporting that. I installed the attached patch.

From 08f14d9492e35188b7ed85eb59b7e605285d8b09 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 5 Jul 2022 09:34:17 -0500
Subject: [PATCH] =?UTF-8?q?cp:=20don=E2=80=99t=20create=20empty=20file=20i?=
 =?UTF-8?q?f=20cannot=20clone?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/copy.c (copy_reg): With --reflink=always, if FICLONE fails
on a file we just created, clean up by removing the file (Bug#56391).
---
 src/copy.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/copy.c b/src/copy.c
index 0c368d0e4..eaed148b4 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1279,6 +1279,9 @@ copy_reg (char const *src_name, char const *dst_name,
 {
   error (0, errno, _("failed to clone %s from %s"),
  quoteaf_n (0, dst_name), quoteaf_n (1, src_name));
+  if (*new_dst && unlinkat (dst_dirfd, dst_relname, 0) != 0
+  && errno != ENOENT)
+error (0, errno, _("cannot remove %s"), quoteaf (dst_name));
   return_val = false;
   goto close_src_and_dst_desc;
 }
-- 
2.36.1



bug#56017: [gnu.org #1845594] coreutils POC?

2022-06-16 Thread Paul Eggert

Thanks for the proposal. You've obviously spent some time writing it up. 
However, I'm not entirely sold on the idea being worth the effort. The 
point of the currently-supported approach is that one can and should 
communicate checksums by a different (and hopefully more reliable) means 
than what's used for the checksummed data. That advantage is lost if 
checksums are communicated as part of the data. The proposed passphrases 
attempt to work around this, but if they're evanescent (as in the 
proposal) then they're unsuitable for archival data, and if they're 
permanent they take on the role of the checksums so we're no better off 
than before.






bug#55937: [PATCH] touch: create parent directories if needed

2022-06-14 Thread Paul Eggert

On 6/14/22 19:20, Alan Rosenthal wrote:

`touch -p a/b/c/d/e` will now be the same as running:
`mkdir -p a/b/c/d && touch a/b/c/d/e`.


I don't see how this is useful enough to merit a change, since one can 
achieve the effect of the proposed "touch -p" with the already-existing 
"mkdir -p" followed by plain "touch". mkdir -p already exists and should 
work everywhere that's POSIX-compatible. We don't need -p for other 
commands that create files (e.g., cp, mv, ln); what's special about 'touch'?







bug#55895: [PATCH] maint: Fix ptr_align signature to silence -Wmaybe-uninitialized

2022-06-11 Thread Paul Eggert

On 6/11/22 14:18, Anders Kaseorg wrote:


if you align the mutable pointer, you don’t need to separately align a const 
pointer.


Of course one can alter code by aligning a mutable pointer first and 
then converting it to a const pointer, instead of first converting it to 
const and then aligning the result. But that might not be convenient. 
The part of the code that needs alignment might have access to only the 
const pointer, and should be able to align the const pointer on its own 
without bothering the part of the code that doesn't need alignment.


Anyway, we're getting a long way from the original bug report. I'll let 
you have the last word on this tangential topic, if you like.






bug#55910: cp error

2022-06-11 Thread Paul Eggert

Thanks for the bug report. I installed the attached patch, which should 
fix it. Please give it a try.

From b54da709a1f3a6f10ed3150b0ae5269002a1053c Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 11 Jun 2022 10:49:18 -0700
Subject: [PATCH] =?UTF-8?q?cp:=20fix=20=E2=80=98cp=20-rx=20/=20/mnt?=
 =?UTF-8?q?=E2=80=99?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by pkor...@gmail.com (Bug#55910).
* src/copy.c (copy_internal): Treat a relative destination name ""
as if it were "." for the purpose of directory-relative syscalls
like fstatat that might refer to the destination directory.
---
 NEWS   |  3 +++
 src/copy.c | 50 +++---
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/NEWS b/NEWS
index dd37e1525..a3a55541e 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,9 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Bug fixes
 
+  'cp -rx / /mnt' no longer complains "cannot create directory /mnt/".
+  [bug introduced in coreutils-9.1]
+
   'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
   [bug introduced in coreutils-9.1]
 
diff --git a/src/copy.c b/src/copy.c
index b15d91990..edc822134 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1954,6 +1954,7 @@ copy_internal (char const *src_name, char const *dst_name,
   bool restore_dst_mode = false;
   char *earlier_file = NULL;
   char *dst_backup = NULL;
+  char const *drelname = *dst_relname ? dst_relname : ".";
   bool delayed_ok;
   bool copied_as_regular = false;
   bool dest_is_symlink = false;
@@ -1971,7 +1972,7 @@ copy_internal (char const *src_name, char const *dst_name,
   if (x->move_mode)
 {
   if (rename_errno < 0)
-rename_errno = (renameatu (AT_FDCWD, src_name, dst_dirfd, dst_relname,
+rename_errno = (renameatu (AT_FDCWD, src_name, dst_dirfd, drelname,
RENAME_NOREPLACE)
 ? errno : 0);
   nonexistent_dst = *rename_succeeded = new_dst = rename_errno == 0;
@@ -1983,7 +1984,7 @@ copy_internal (char const *src_name, char const *dst_name,
 {
   char const *name = rename_errno == 0 ? dst_name : src_name;
   int dirfd = rename_errno == 0 ? dst_dirfd : AT_FDCWD;
-  char const *relname = rename_errno == 0 ? dst_relname : src_name;
+  char const *relname = rename_errno == 0 ? drelname : src_name;
   int fstatat_flags
 = x->dereference == DEREF_NEVER ? AT_SYMLINK_NOFOLLOW : 0;
   if (follow_fstatat (dirfd, relname, _sb, fstatat_flags) != 0)
@@ -2051,8 +2052,7 @@ copy_internal (char const *src_name, char const *dst_name,
   int fstatat_flags = use_lstat ? AT_SYMLINK_NOFOLLOW : 0;
   if (!use_lstat && nonexistent_dst < 0)
 new_dst = true;
-  else if (follow_fstatat (dst_dirfd, dst_relname, _sb,
-   fstatat_flags)
+  else if (follow_fstatat (dst_dirfd, drelname, _sb, fstatat_flags)
== 0)
 {
   have_dst_lstat = use_lstat;
@@ -2077,7 +2077,7 @@ copy_internal (char const *src_name, char const *dst_name,
   bool return_now = false;
 
   if (x->interactive != I_ALWAYS_NO
-  && ! same_file_ok (src_name, _sb, dst_dirfd, dst_relname,
+  && ! same_file_ok (src_name, _sb, dst_dirfd, drelname,
  _sb, x, _now))
 {
   error (0, 0, _("%s and %s are the same file"),
@@ -2140,7 +2140,7 @@ copy_internal (char const *src_name, char const *dst_name,
  cp and mv treat -i and -f differently.  */
   if (x->move_mode)
 {
-  if (abandon_move (x, dst_name, dst_dirfd, dst_relname, _sb))
+  if (abandon_move (x, dst_name, dst_dirfd, drelname, _sb))
 {
   /* Pretend the rename succeeded, so the caller (mv)
  doesn't end up removing the source file.  */
@@ -2321,14 +2321,11 @@ copy_internal (char const *src_name, char const *dst_name,
  Otherwise, use AT_SYMLINK_NOFOLLOW, in case dst_name is a symlink.  */
   if (have_dst_lstat)
 dst_lstat_sb = _sb;
+  else if (fstatat (dst_dirfd, drelname, _buf, AT_SYMLINK_NOFOLLOW)
+   == 0)
+dst_lstat_sb = _buf;
   else
-{
-  if (fstatat (dst_dirfd, dst_relname, _buf,
-   AT_SYMLINK_NOFOLLOW) == 0)
-dst_lstat_sb = _buf;
-  else
-lstat_ok = false;
-}
+lstat_ok = false;
 
   /* Never copy through a symlink we've just created.  */
   if (lstat_ok
@@ -2475,8 +2472,7 @@ copy_internal (char const *src_name, char const *dst_name,
   if (x->move_mode)
 {
   if (rename_errno == EEXIST)
-rename

bug#55895: [PATCH] maint: Fix ptr_align signature to silence -Wmaybe-uninitialized

2022-06-11 Thread Paul Eggert

On 6/11/22 09:30, Anders Kaseorg wrote:

A pointer to uninitialized (or zero-initialized) memory that won’t be 
written is valid but not _useful_.


But in the example I gave, the memory *is* written to later.

A const * pointer lets a C program have a read-only window into memory 
that other parts of the program can write to, which can be a useful 
thing to have. In C and C++, "const *" doesn't mean a pointer to storage 
that does not change; it merely means a pointer that can't be used to 
write the referenced storage.
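
A tiny sketch of that point, with a hypothetical function name:

```c
#include <string.h>

/* Return the byte seen through a read-only view of BUF after BUF has
   been modified through a different, non-const pointer.  */
static int
read_through_const_view (void)
{
  char buf[4] = {0, 0, 0, 0};
  char const *view = buf;        /* VIEW cannot be used to write buf ...  */
  memset (buf, 127, sizeof buf); /* ... but buf itself can still change.  */
  return view[0];
}
```

The function returns 127, not 0: the const qualifier restricts what can be done through VIEW, not what can happen to the storage it points to.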






bug#55895: [PATCH] maint: Fix ptr_align signature to silence -Wmaybe-uninitialized

2022-06-11 Thread Paul Eggert

On 6/10/22 21:11, Anders Kaseorg wrote:

It seems the important step I should 
have included was CFLAGS=-O0.


Ah, OK. Since you're building from Git, I can refer you to 
README-hacking which is intended for that. It says, "If you get warnings 
with other configurations, you can run

 './configure --disable-gcc-warnings' or 'make WERROR_CFLAGS='
 to build quietly or verbosely, respectively.
" Here, "other configurations" refers to what you're doing.

(With GCC 12.1.1 I get the same error and also additional errors that might merit further investigation.) 


Like most static analysis tools, GCC generates a bunch of false 
positives unless you baby it just right. We do the babying only for the 
latest GCC with the default configuration; otherwise, it's typically not 
worth the trouble. Feel free to investigate the other warnings, but 
they're important only if they're true positives (and most likely 
they're not, because gcc -O0 is dumber than gcc -O2).



there’s never a reason to call ptr_align with a const pointer, because if the 
memory is initialized the pointer would have already been aligned


First, a const pointer can point to uninitialized storage. Second, even 
if the referenced memory is initialized the pointer need not be aligned 
already. For example, this is valid:


char *p = malloc (1024);
if (!p) return;
char const *q = p; // q points to uninitialized storage
char const *r = ptr_align (q, 512); // q is not aligned already
memset (p, 127, 1024);
...

Replacing 'malloc (1024)' with 'calloc (1024, 1)' (thus initializing the 
storage before aligning the pointer) wouldn't affect the validity of the 
code.
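
For reference, a function with the same kind of signature can be sketched as follows. This is an illustration, not the ptr_align definition in coreutils; the name align_up is hypothetical, and the sketch assumes that converting a pointer to uintptr_t preserves its address modulo ALIGNMENT, which holds on ordinary flat-address platforms.

```c
#include <stddef.h>
#include <stdint.h>

/* Return PTR rounded up to the next multiple of ALIGNMENT, which must
   be nonzero.  Like strchr, this accepts a pointer-to-const and
   returns a plain pointer; the function itself never reads or writes
   the referenced storage.  */
static void *
align_up (void const *ptr, size_t alignment)
{
  uintptr_t address = (uintptr_t) ptr;
  uintptr_t remainder = address % alignment;
  size_t padding = remainder ? alignment - remainder : 0;
  return (char *) ptr + padding;
}
```

Because the function only does address arithmetic, it makes no difference whether the storage is initialized, or whether the caller holds a const or a non-const pointer. The caller does need to allocate enough slack (up to ALIGNMENT - 1 extra bytes) for the aligned region to fit.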



Also, the current signature converts a const pointer to a mutable pointer.


Yes, it's like strchr which is annoying but that's the best C can do.

You're right that changing it from void const * to void * won't hurt 
coreutils' current callers but I'd rather not massage the code merely to 
pacify nondefault configurations. There are too many nondefault 
configurations to worry about and massaging the code to pacify them all 
would waste our time and confuse the code. Instead, we pacify only 
default configurations with current GCC.






bug#55895: [PATCH] maint: Fix ptr_align signature to silence -Wmaybe-uninitialized

2022-06-10 Thread Paul Eggert

On 6/10/22 15:24, Anders Kaseorg wrote:

ptr_align is always called with a pointer to uninitialized memory, so
it does not make sense for that pointer to be const.


I don't see why not. ptr_align does not modify the referenced storage so 
"void const *" is appropriate. If we changed the type to "void *" we'd 
unnecessarily limit ptr_align's applicability.




This change avoids -Wmaybe-uninitialized warnings from GCC 11.


I can't reproduce the problem with the latest coreutils commit 
93e099e4c3b659b2e329f655fbdc73fdf594a66e on Savannah master on Ubuntu 
22.04 LTS x86-64, I don't get warnings with either gcc (Ubuntu 
11.2.0-19ubuntu1) 11.2.0 or with gcc-12 (Ubuntu 12-20220319-1ubuntu1) 
12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f]. I also 
don't see a problem on Fedora 36 x86-64 with gcc (GCC) 12.1.1 20220507 
(Red Hat 12.1.1-1). In all cases I configured and built with "./configure 
--enable-gcc-warnings; make".


Perhaps the problem (whatever it was) has already been fixed in 
coreutils. If not, I guess that the issue has something to do with the 
particular platform and options you configured with. If it's an older 
compiler like GCC 11.3.0 I wouldn't worry too much about it, as older 
GCCs are notorious for false alarms and not worth the trouble of pacifying.


Most likely the warnings are false positives (though I haven't checked 
this).






bug#55724: cp --reflink=always failing when --reflink=auto reflinks successfully on OpenZFS

2022-05-30 Thread Paul Eggert

On 5/30/22 08:04, Pádraig Brady wrote:

Really the kernel has to behave appropriately there
and not do the blanket assumption with EXDEV.


I agree. VFS should be willing to try a cross-filesystem FICLONE. Not 
only does copy_file_range not guarantee cloning; it is less efficient 
even when it does clone, due to the need to find the holes in the source 
file.






bug#55622: Bug in sort with keys and reverse, and version-sort and reverse

2022-05-25 Thread Paul Eggert

I installed the attached spelling fix to a comment in my previous patch.From 8c1a447a3790ec74ef919c60d46673e7be061c72 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 25 May 2022 11:49:13 -0700
Subject: [PATCH] maint: spelling fix

---
 src/sort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/sort.c b/src/sort.c
index 0a6b557ac..c850656ef 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2601,7 +2601,7 @@ key_warnings (struct keyfield const *gkey, bool gkey_only)
 error (0, 0, _("option '-r' only applies to last-resort comparison"));
 }
 
-/* Return either the sense of DIFF or its reverse, depnding on REVERSED.
+/* Return either the sense of DIFF or its reverse, depending on REVERSED.
If REVERSED, do not simply negate DIFF as that can mishandle INT_MIN.  */
 
 static int
-- 
2.34.1



bug#55622: Bug in sort with keys and reverse, and version-sort and reverse

2022-05-25 Thread Paul Eggert

Thanks, Pádraig, for fixing that. And thanks, Larry, for reporting that.


The existing tests are sufficient to catch this.


Yes, evidently I forgot to run 'make check', which I usually do. I'll 
try to not forget next time.


I installed the attached further patches to (1) coalesce duplicate code 
and explain why it's needed and (2) tweak performance a tiny bit.
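
As a standalone illustration of why the reversal is written this way, here is the function from the first patch next to the naive alternative. The name naive_reversed is hypothetical and is included only for contrast.

```c
#include <limits.h>
#include <stdbool.h>

/* Naive reversal.  Negating INT_MIN overflows, which is undefined
   behavior in C, so this must not be called with INT_MIN.  */
static int
naive_reversed (int diff)
{
  return -diff;
}

/* Reversal as in the patch: any negative DIFF, including INT_MIN,
   becomes 1; zero and positive values can be negated safely.  */
static int
diff_reversed (int diff, bool reversed)
{
  return reversed ? (diff < 0 ? 1 : -diff) : diff;
}
```

Only the sign of a comparison result matters to the caller, so mapping every negative value to 1 loses nothing, and it keeps diff_reversed (INT_MIN, true) well defined.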
From 15627794459933d293547c2bf7d77ab196ae73a3 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 25 May 2022 11:19:08 -0700
Subject: [PATCH 1/2] sort: refactor tricky diff reversal

* src/sort.c (diff_reversed): New function, to make the intent clearer.
(keycompare, compare): Use it.
---
 src/sort.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index dbe456038..0cd22f931 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2601,6 +2601,15 @@ key_warnings (struct keyfield const *gkey, bool gkey_only)
 error (0, 0, _("option '-r' only applies to last-resort comparison"));
 }
 
+/* Return either the sense of DIFF or its reverse, depnding on REVERSED.
+   If REVERSED, do not simply negate DIFF as that can mishandle INT_MIN.  */
+
+static int
+diff_reversed (int diff, bool reversed)
+{
+  return reversed ? (diff < 0 ? 1 : -diff) : diff;
+}
+
 /* Compare two lines A and B trying every key in sequence until there
are no more keys or a difference is found. */
 
@@ -2793,9 +2802,7 @@ keycompare (struct line const *a, struct line const *b)
 }
 }
 
-  if (key->reverse)
-diff = diff < 0 ? 1 : -diff;
-  return diff;
+  return diff_reversed (diff, key->reverse);
 }
 
 /* Compare two lines A and B, returning negative, zero, or positive
@@ -2840,9 +2847,7 @@ compare (struct line const *a, struct line const *b)
 diff = (alen > blen) - (alen < blen);
 }
 
-  if (reverse)
-diff = diff < 0 ? 1 : -diff;
-  return diff;
+  return diff_reversed (diff, reverse);
 }
 
 /* Write LINE to output stream FP; the output file's name is
-- 
2.34.1

From 85ddde23116e578768f90bad6899340da5394b75 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 25 May 2022 11:23:39 -0700
Subject: [PATCH 2/2] sort: tune diff_reversed

* src/sort.c (diff_reversed): Tune.  On x86-64 with GCC, this
saves a conditional branch and shortens the generated machine code.
---
 src/sort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/sort.c b/src/sort.c
index 0cd22f931..0a6b557ac 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2607,7 +2607,7 @@ key_warnings (struct keyfield const *gkey, bool gkey_only)
 static int
 diff_reversed (int diff, bool reversed)
 {
-  return reversed ? (diff < 0 ? 1 : -diff) : diff;
+  return reversed ? (diff < 0) - (diff > 0) : diff;
 }
 
 /* Compare two lines A and B trying every key in sequence until there
-- 
2.34.1
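To illustrate why the patches above avoid a plain negation, here is a small standalone sketch of the two variants of diff_reversed. The function bodies are taken from the patches; the names diff_reversed_v1, diff_reversed_v2 and the helper sign are made up for this example and are not in sort.c.

```c
#include <limits.h>
#include <stdbool.h>

/* First version from the patch: any negative DIFF becomes 1 and other
   values are negated, so INT_MIN is never negated (negating INT_MIN
   overflows, which is undefined behavior in C).  */
int
diff_reversed_v1 (int diff, bool reversed)
{
  return reversed ? (diff < 0 ? 1 : -diff) : diff;
}

/* Tuned version from the second patch: when reversing it yields
   -1, 0 or 1 and uses no negation at all.  */
int
diff_reversed_v2 (int diff, bool reversed)
{
  return reversed ? (diff < 0) - (diff > 0) : diff;
}

/* Hypothetical helper for comparing the two versions: reduce a
   comparison result to -1, 0 or 1.  */
int
sign (int x)
{
  return (x > 0) - (x < 0);
}
```

Both versions agree on the sign of the result, which is all a comparison function's callers should rely on; only the magnitude differs when reversing.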



bug#55487: chmod to +w is not defaulting to ALL target in Debian 11.3

2022-05-17 Thread Paul Eggert

On 5/17/22 10:51, Corey H wrote:

sudo chmod +w /etc/whatever/whatever.conf #doesn't work
sudo chmod a+w /etc/whatever/whatever.conf #does work


It sounds like you're misunderstanding what "chmod +w" means. It doesn't 
mean "turn on all the w bits". It means "turn on the w bits enabled by 
the current umask". So, for example, this is expected behavior:


$ umask
0022
$ touch foo
$ ls -l foo
-rw-r--r--. 1 eggert eggert 0 May 17 14:37 foo
$ chmod +w foo
$ ls -l foo
-rw-r--r--. 1 eggert eggert 0 May 17 14:37 foo
$ umask 0
$ chmod +w foo
$ ls -l foo
-rw-rw-rw-. 1 eggert eggert 0 May 17 14:37 foo
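The arithmetic behind the transcript above can be sketched in C as follows. This is only an illustration of the rule, not chmod's actual code, and the function names plus_w and a_plus_w are invented for the example.

```c
#include <sys/types.h>

/* Mode bits that "chmod +w" would produce from CURRENT when the
   process umask is MASK.  With no "who" letter (u, g, o, a) given,
   write bits that are set in the umask are left alone.  */
mode_t
plus_w (mode_t current, mode_t mask)
{
  mode_t write_bits = 0222;     /* u+w, g+w, o+w */
  return current | (write_bits & ~mask);
}

/* "chmod a+w" names everybody explicitly, so the umask is not
   consulted.  */
mode_t
a_plus_w (mode_t current)
{
  return current | 0222;
}
```

With umask 0022 and a file of mode 0644, plus_w leaves the mode at 0644, whereas with umask 0 it yields 0666, matching the 'ls -l' output shown above.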





bug#55212: GNU Linux "sort -g" can hang indefinitely when run on standard input if NaNs are involved

2022-05-10 Thread Paul Eggert

On 5/10/22 12:08, coreut...@tlinx.org wrote:

    Unless there is some magic about -n1238095,


The test is random and there's no magic, just luck. The larger the 
random test, the more likely you'll run into the unlucky situation where 
the unpatched 'sort' infloops.






bug#55225: GNU Linux "sort -g" can hang indefinitely when run on standard input (on Ubuntu)

2022-05-02 Thread Paul Eggert

Thanks for reporting that. Fixed as described here:

https://bugs.gnu.org/55212





bug#55226: Bug Found at Uname at Red hat linux

2022-05-02 Thread Paul Eggert

On 5/2/22 04:43, Sasi Kiran wrote:

Respected GNU Team
*Iam K.sasi kiran*
While iam executing uname -v itself shows the Date of the linux
But has i seen on uname --help it shows the *uname -v* gives the* kernel
version..*...


The kernel version contains the date, so there's no bug here.





bug#55212: GNU Linux "sort -g" can hang indefinitely when run on standard input if NaNs are involved

2022-05-02 Thread Paul Eggert

On 5/2/22 06:31, Pádraig Brady wrote:

This is a bit slower of course, but since an edge case not a big concern:


Yes, my thoughts too. There are ways to speed up common lots-o-NaN cases 
portably (I toyed with the idea of using ieee754.h), but I went with the 
simple approach for now.


A nit: 'time' needs to be at the end of the pipeline:

$ yes nan | head -n128095 | bash -c 'time src/sort -g' >/dev/null

real    0m0.552s
user    0m0.551s
sys     0m0.001s
$ yes nan | head -n128095 | bash -c 'time sort -g' >/dev/null

real    0m0.392s
user    0m0.382s
sys     0m0.009s
512-day $ yes nan | head -n128095 | bash -c 'time sort -g' >/dev/null
[Here I had to control-C since 'sort' inflooped.]





bug#55212: GNU Linux "sort -g" can hang indefinitely when run on standard input if NaNs are involved

2022-05-02 Thread Paul Eggert
Thanks for the bug report. This bug is entertaining, as it comes from 
GCC now being so smart that it optimizes away a memset that cleared 
padding bits. We added the memset in coreutils 8.14 (2011) to try to fix 
the sort -g infinite loop bug (introduced in 1999), but the memset isn't 
guaranteed to fix the bug because the memset can be optimized away.


If the padding bits happen to be clear already sort is OK, but if not 
the results can be inconsistent when you compare two NaNs to each other, 
and inconsistent results can make sort infloop.


The C standard allows this level of intelligence in the compiler, so 
it's a bug in GNU 'sort'.


I installed the attached patch; please give it a try. For now I'll 
boldly close the bug report; we can easily reopen it if this patch 
doesn't actually fix the problem.
From 2f56f5a42033dc6db15d8963e54566f01fa0d61d Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sun, 1 May 2022 22:46:21 -0700
Subject: [PATCH] sort: fix sort -g infloop again

Problem reported by Giulio Genovese (Bug#55212).
* src/sort.c (nan_compare): To compare NaNs, simply printf+strcmp.
This avoids the problem of padding bits and unspecified behavior.
Args are now long double instead of char *; caller changed.
---
 NEWS   |  6 ++
 src/sort.c | 21 ++---
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/NEWS b/NEWS
index 26eb52ca0..3a9148637 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,12 @@ GNU coreutils NEWS-*- outline -*-
   'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
   [bug introduced in coreutils-9.1]
 
+  'sort -g' no longer infloops when given multiple NaNs on platforms
+  like x86-64 where 'long double' has padding bits in memory.
+  Although the fix alters sort -g's NaN ordering, that ordering has
+  long been documented to be platform-dependent.
+  [bug introduced 1999-05-02 and only partly fixed in coreutils-8.14]
+
 
 * Noteworthy changes in release 9.1 (2022-04-15) [stable]
 
diff --git a/src/sort.c b/src/sort.c
index 3b775d6bb..b2a465cf5 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2003,22 +2003,13 @@ numcompare (char const *a, char const *b)
   return strnumcmp (a, b, decimal_point, thousands_sep);
 }
 
-/* Work around a problem whereby the long double value returned by glibc's
-   strtold ("NaN", ...) contains uninitialized bits: clear all bytes of
-   A and B before calling strtold.  FIXME: remove this function if
-   gnulib guarantees that strtold's result is always well defined.  */
 static int
-nan_compare (char const *sa, char const *sb)
+nan_compare (long double a, long double b)
 {
-  long double a;
-  memset (&a, 0, sizeof a);
-  a = strtold (sa, NULL);
-
-  long double b;
-  memset (&b, 0, sizeof b);
-  b = strtold (sb, NULL);
-
-  return memcmp (&a, &b, sizeof a);
+  char buf[2][sizeof "-nan()" + CHAR_BIT * sizeof a];
+  snprintf (buf[0], sizeof buf[0], "%Lf", a);
+  snprintf (buf[1], sizeof buf[1], "%Lf", b);
+  return strcmp (buf[0], buf[1]);
 }
 
 static int
@@ -2046,7 +2037,7 @@ general_numcompare (char const *sa, char const *sb)
   : a == b ? 0
   : b == b ? -1
   : a == a ? 1
-  : nan_compare (sa, sb));
+  : nan_compare (a, b));
 }
 
 /* Return an integer in 1..12 of the month name MONTH.
-- 
2.34.1
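For readers who want to try the idea outside of sort.c, here is a standalone sketch of the approach in the patch above: order NaNs by their printed form instead of by their bytes. The names nan_order and total_order are made up for this example, and the result is normalized to -1, 0 or 1, which the patch itself does not do.

```c
#include <limits.h>
#include <math.h>               /* NAN, for callers */
#include <stdio.h>
#include <string.h>

/* Order two NaNs by their printed form, so that padding bits in the
   in-memory representation cannot affect the result.  Call this only
   when both A and B are NaNs.  */
int
nan_order (long double a, long double b)
{
  char buf[2][sizeof "-nan()" + CHAR_BIT * sizeof a];
  snprintf (buf[0], sizeof buf[0], "%Lf", a);
  snprintf (buf[1], sizeof buf[1], "%Lf", b);
  int c = strcmp (buf[0], buf[1]);
  return (c > 0) - (c < 0);
}

/* A consistent order over all long doubles, in the style of
   general_numcompare: NaNs sort before numbers, and NaNs are ordered
   among themselves by nan_order.  */
int
total_order (long double a, long double b)
{
  return (a < b ? -1
          : a > b ? 1
          : a == b ? 0
          : b == b ? -1         /* only A is a NaN */
          : a == a ? 1          /* only B is a NaN */
          : nan_order (a, b));
}
```

The property that matters for a sorting routine is that comparing X with Y and Y with X never disagree; comparing printed strings gives that, whereas memcmp over bytes that include uninitialized padding does not.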



bug#54785: for floating point, printf should use double like in C instead of long double

2022-04-30 Thread Paul Eggert

On 4/30/22 05:48, Vincent Lefevre wrote:

Yes, but to be clear, POSIX says:

   shall be evaluated as if by the strtod() function if the
   corresponding conversion specifier is a, A, e, E, f, F, g, or G

so the number should be regarded as a double-precision number
(type double).


Yes, but POSIX does not require the C type 'double' and the C function 
strtod to be implemented via IEEE-754 64-bit floating point. POSIX 
allows 'double' and 'strtod' to be implemented via x86-64 
extended-precision (80-bit) floating point, or by any other 
floating-point type that satisfies some (weak) properties. I see no 
requirement that the shell must be implemented as if by the standard c99 
command with the default options.


The POSIX requirements on the implementation of 'double' and 'strtod' 
are so lax that Bash 'printf' could even use IEEE-754 32-bit floating 
point, if it wanted to. One could build Bash with 'gcc -mlong-double=32 
-mdouble=32' assuming these options work, and the result would conform 
to POSIX. (Not that I'm suggesting this!)




Concerning the compatibility, the question is: with what?


I agree that it'd be a net win for Bash to use plain 'double' here; your 
discussion of the various compatibility plusses of doing that is 
compelling to me.






bug#54785: for floating point, printf should use double like in C instead of long double

2022-04-29 Thread Paul Eggert

On 4/29/22 13:04, Chet Ramey wrote:


I think I'm going to stick with the behavior I proposed, fixing the POSIX
conformance issue and preserving backwards compatibility, until I hear more
about whether backwards compatibility is an issue here.


Come to think of it, as far as POSIX is concerned Bash doesn't need to 
change what it does. POSIX doesn't require that the shell printf 
command be compiled with any particular environment. It would conform to 
POSIX, for example, if Bash's printf were compiled with an IBM floating 
point implementation rather than with an IEEE floating point 
implementation, so long as Bash's printf parses floating-point strings 
the way strtod is supposed to parse strings on an IBM mainframe. 
Similarly, Bash's printf can use an 80-bit floating point format if 
available; it will still conform to POSIX.


So this isn't a POSIX conformance issue; only a compatibility issue. Is 
it more important for the Bash printf to behave like most other shells 
and other programs, or is it more important for Bash printf to behave 
like it has for the last 18 years or so?






bug#55023: Issue with CP empty folder after y2038 on 32-bits Kernel

2022-04-28 Thread Paul Eggert

On 4/27/22 09:42, Pádraig Brady wrote:

Marking this as done in the coreutils bug tracker,
now that this is being tracked in glibc.


This could also be worked around in Gnulib for the benefit of 32-bit apps 
running with unpatched glibc 2.34 and 2.35, or glibc older than 2.34. 
Not sure it's worth the trouble, though, as the fixed glibc should be 
universal pretty much everywhere before the year 2038 rolls around.






bug#54785: for floating point, printf should use double like in C instead of long double

2022-04-27 Thread Paul Eggert

On 4/27/22 05:10, Glenn Golden wrote:

Ok, I see what you mean, thanks for the explanation. I'll pose the question (or 
maybe file a bug report) on the glibc list.


By the way I now think I see a reason for why glibc does things the way 
it does: it minimizes output size.


'double' has 53 bits counting the hidden bit, and with 53/4 you have 13 
hex digits plus one leading digit that is either 0 (unnormalized) or 1 
(normalized).


'long double' has 64 bits and with 64/4 you have 16 hex digits, where 
the leading digit is 0-7 (unnormalized), 8-f (normalized).


Any proposal to change 'long double' to always output leading 0 or 1 
needs to deal with the fact that this'd lengthen the output string.
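As a rough way to experiment with this, the sketch below formats a value with %a (or %La) and parses the text back. The function names are invented for the example. Whatever leading hex digit the C library chooses, the round trip should reproduce the value exactly, since %a with no precision is meant to give an exact representation on binary floating-point platforms.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format D with %a and parse the text back as a double.  */
double
hex_round_trip (double d)
{
  char buf[64];
  snprintf (buf, sizeof buf, "%a", d);
  return strtod (buf, NULL);
}

/* Likewise for long double, using %La and strtold.  */
long double
hex_round_trip_long (long double ld)
{
  char buf[64];
  snprintf (buf, sizeof buf, "%La", ld);
  return strtold (buf, NULL);
}
```

Printing buf in each function is an easy way to see the different leading digits that the two types produce on a given platform.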






bug#54785: for floating point, printf should use double like in C instead of long double

2022-04-25 Thread Paul Eggert

On 4/25/22 16:50, Glenn Golden wrote:



On Mon, Apr 25, 2022, at 13:06, Paul Eggert wrote:


I'd like coreutils printf to stay compatible with Bash printf. Thanks.



Is there any interest/motivation for consistentizing {coreutils printf, bash 
printf} with glibc printf? There's a minor but notable inconsistency between 
them for %a format. See

 https://lists.gnu.org/archive/html/coreutils/2022-04/msg00020.html

I asked about this on the coreutils list, but no response.


To some extent it's the same problem. If Bash and coreutils printf 
change to use 'double', they'll output the same thing that C printf outputs.


But to some extent it's a different problem, as the Bash and coreutils 
printf use glibc printf with long double, and the latter isn't working 
consistently with double. I suppose filing a glibc bug report might 
address this different problem.






bug#54785: for floating point, printf should use double like in C instead of long double

2022-04-25 Thread Paul Eggert

On 4/25/22 11:22, Chet Ramey wrote:

Thanks for the input.


You're welcome. Whenever you decide what to do about this, could you 
please let us know? I'd like coreutils printf to stay compatible with 
Bash printf. Thanks.






bug#54785: for floating point, printf should use double like in C instead of long double

2022-04-25 Thread Paul Eggert

On 4/11/22 11:52, Chet Ramey wrote:

On 4/9/22 3:31 PM, Paul Eggert wrote:



It sounds like there are three cases.

1. If the `L' modifier is supplied, as an extension (POSIX doesn't allow
    length modifiers for the printf utility), use long double. This would
    work in both default and posix modes.

2. In posix mode, use strtod() and double.

3. In default mode, use the existing code to get the highest possible
    precision, as the code has done for over 20 years.


That'll fix the POSIX compatibility bug. However, it may be better for 
Bash to just do (1) if 'L' is supplied and (2) otherwise, even if this 
is less precise than (3). Doing it this simpler way will likely be more 
useful for the small number of people who care whether 'printf' uses 
'double' or 'long double' internally (and nobody else will care).


Doing it this way is what BSD sh does, and it's good to be compatible 
with BSD sh. Similarly for dash and for other shells.


It's also what other GNU programs do. For example, on x86-64:

  $ awk 'BEGIN {printf "%.100g\n", 0.1}'
  0.1000000000000000055511151231257827021181583404541015625
  $ emacs -batch -eval '(message "%.100g" 0.1)'
  0.1000000000000000055511151231257827021181583404541015625
  $ printf "%.100g\n" 0.1
  0.1000000000000000000013552527156068805425093160010874271392822265625

printf is the outlier here, and although its answer is closer to the 
mathematical value, that's not as useful as being closer to what most 
other apps do.
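The same comparison can be made from C. The following is a minimal sketch with invented function names; it parses a string with strtod or strtold and formats the result with up to 100 significant digits, returning the length that snprintf reports.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse S as a double, as POSIX describes for the printf utility,
   and format it with up to 100 significant digits.  Returns the
   number of characters the full output needs.  */
int
format_as_double (char *buf, size_t size, char const *s)
{
  double d = strtod (s, NULL);
  return snprintf (buf, size, "%.100g", d);
}

/* Parse S as a long double and format it the same way.  */
int
format_as_long_double (char *buf, size_t size, char const *s)
{
  long double ld = strtold (s, NULL);
  return snprintf (buf, size, "%.100Lg", ld);
}
```

Passing "0.1" to each function and printing the buffers should show the shorter 'double' expansion and, on platforms where 'long double' is wider, a longer expansion that is closer to one tenth.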


Perhaps it was OK for sh printf to use long double 20 years ago. I even 
had a hand in implementing that.[1] But nowadays it feels like a 
misfire. The overwhelming majority of apps developed over the past 20 
years in newer languages like JavaScript and Java do not support 'long 
double'; when interoperating with these apps, using 'long double' in 
Bash printf likely causes more trouble than it cures.


[1]: 
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=830de2708207b7e48464e4778b55e582bac49832






bug#55093: "split -n K/N " BUG: Last Chunk incomplete if input file >= 262144 bytes

2022-04-24 Thread Paul Eggert

On 4/24/22 07:40, Adam Holt wrote:


split (GNU coreutils) 8.32


That's an old version, dated 2020. Please try the current version 
coreutils 9.1, which has bug fixes in this area.


Also, there's no need to cc. rms and tg; they're not working on 'split' 
any more.


Thanks.





bug#55029: Simple backup swaps source and destination files

2022-04-20 Thread Paul Eggert

On 4/19/22 16:05, Steve Ward wrote:

When doing mv or cp with --backup=simple, if an existing file in
DIRECTORY has the same name as SOURCE, the files appear to be swapped
instead of an in-place backup of the original file in DIRECTORY being
made.


Thanks for the bug report. That's new to coreutils 9.1, and is a big 
enough fail that it suggests we'll need a 9.2 sooner rather than later. 
I introduced the bug when fixing an earlier bug (sorry).


I installed the attached Gnulib patch, which should fix the bug in 
Coreutils, with the attached two Coreutils patches to update to the 
latest Gnulib, and to add a test case for the bug.
From 7347caeb9d902d3fca2c11f69a55a3e578d93bfe Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 20 Apr 2022 19:34:57 -0700
Subject: [PATCH] backupfile: fix bug when renaming simple backups

* lib/backupfile.c (backupfile_internal): Fix bug when RENAME
and when doing simple backups.  Problem reported by Steve Ward in:
https://bugs.gnu.org/55029
---
 ChangeLog| 5 +
 lib/backupfile.c | 7 +++
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4b39a6a443..cd16bbe0cd 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2022-04-20  Paul Eggert  
 
+	backupfile: fix bug when renaming simple backups
+	* lib/backupfile.c (backupfile_internal): Fix bug when RENAME
+	and when doing simple backups.  Problem reported by Steve Ward in:
+	https://bugs.gnu.org/55029
+
 	gettime-res: more-robust sampling
 	* lib/gettime-res.c (gettime_res): If adjacent timestamps are
 	identical search for a differing timestamp.  Also, stop collecting
diff --git a/lib/backupfile.c b/lib/backupfile.c
index 1e9290a187..d9f465a3e0 100644
--- a/lib/backupfile.c
+++ b/lib/backupfile.c
@@ -332,7 +332,7 @@ backupfile_internal (int dir_fd, char const *file,
 return s;
 
   DIR *dirp = NULL;
-  int sdir = AT_FDCWD;
+  int sdir = dir_fd;
   idx_t base_max = 0;
   while (true)
 {
@@ -371,10 +371,9 @@ backupfile_internal (int dir_fd, char const *file,
   if (! rename)
 break;
 
-  int olddirfd = sdir < 0 ? dir_fd : sdir;
-  idx_t offset = sdir < 0 ? 0 : base_offset;
+  idx_t offset = backup_type == simple_backups ? 0 : base_offset;
   unsigned flags = backup_type == simple_backups ? 0 : RENAME_NOREPLACE;
-  if (renameatu (olddirfd, file + offset, sdir, s + offset, flags) == 0)
+  if (renameatu (sdir, file + offset, sdir, s + offset, flags) == 0)
 break;
   int e = errno;
   if (! (e == EEXIST && extended))
-- 
2.35.1

From d1be566b18b9df34a22d61c9aa92bde00a4a6f0e Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 20 Apr 2022 19:36:44 -0700
Subject: [PATCH 1/2] build: update gnulib submodule to latest

---
 gnulib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gnulib b/gnulib
index 58c597d13..7347caeb9 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit 58c597d13bc57dce3e97ea97856573f2d68ccb8c
+Subproject commit 7347caeb9d902d3fca2c11f69a55a3e578d93bfe
-- 
2.35.1

From 56b314b384192ab75c23c281968a38ac2cb31617 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 20 Apr 2022 19:44:56 -0700
Subject: [PATCH 2/2] mv: test Bug#55029

* tests/mv/backup-dir.sh: New test for Bug#55029,
reported by Steve Ward.
---
 NEWS   | 5 +
 tests/mv/backup-dir.sh | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/NEWS b/NEWS
index 7bedb0617..26eb52ca0 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ GNU coreutils NEWS-*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
+  [bug introduced in coreutils-9.1]
+
 
 * Noteworthy changes in release 9.1 (2022-04-15) [stable]
 
diff --git a/tests/mv/backup-dir.sh b/tests/mv/backup-dir.sh
index 84c51afc8..2f708b5b6 100755
--- a/tests/mv/backup-dir.sh
+++ b/tests/mv/backup-dir.sh
@@ -36,4 +36,10 @@ mkdir C D E || framework_failure_
 mv -T --backup=numbered C E/ || fail=1
 mv -T --backup=numbered D E/ || fail=1
 
+# Bug#55029
+mkdir F && echo 1 >1 && echo 2 >2 && cp 1 F/X && cp 2 X || framework_failure_
+mv --backup=simple X F/ || fail=1
+compare 1 F/X~ || fail=1
+compare 2 F/X || fail=1
+
 Exit $fail
-- 
2.35.1



bug#55010: Compiling from git clone

2022-04-20 Thread Paul Eggert

On 4/19/22 22:54, Ken Ingram wrote:

So I guess I'm on an "old" compiler compared to 11.2.1


Yes, so let's not worry about the warning: it suggests adding clutter 
that is unnecessary with modern compilers, it's easy to ignore the 
warnings with older compilers, and this is an issue only when using 
--enable-gcc-warnings or when building from git and not using 
--disable-gcc-warnings.






bug#55010: Compiling from git clone

2022-04-19 Thread Paul Eggert

On 4/18/22 14:47, Ken Ingram wrote:

Making all in .
make[2]: Entering directory '/home/kingram/src/coreutils'
   CC   lib/libcoreutils_a-randperm.o
lib/randperm.c: In function 'sparse_new':
lib/randperm.c:111:1: error: function might be candidate for attribute
'malloc' if it is known to return normally
[-Werror=suggest-attribute=malloc]
  sparse_new (size_t size_hint)
  ^~


I'm not seeing that on Fedora 35 x86-64, with GCC 11.2.1 20220127 (Red 
Hat 11.2.1-9). If you're using an older compiler, I suggest configuring 
with --disable-gcc-warnings, or building with "make WERROR_CFLAGS=", so 
that the unnecessary warnings don't break the build. If you're not, 
please specify the platform and GCC you're using, and how you ran 
'configure' and 'make'. Thanks.






bug#40586: date and '%-N' does not appear to remove leading zeros anymore, but trailing zeros.

2022-04-14 Thread Paul Eggert

On 4/14/22 09:48, joerg.boeh...@snafu.de wrote:


%N nanoseconds (000000000..999999999)

The current description gives the impression that nanoseconds are an 
integral quantity like seconds and minutes. This leads the user to 
assume that leading zeros are being removed.


Similar wording is used elsewhere:

  %M   minute (00..59)
  %m   month (01..12)
  %H   hour (00..23)
  %W   week number of year, with Monday as first day of week (00..53)

It's true that nanoseconds are more complicated than the others. 
However, it's not clear whether all the little details need to be in the 
man page, or how to summarize those details concisely.






bug#54916: 4 invalid dates reported by "date"

2022-04-13 Thread Paul Eggert

On 4/13/22 06:30, Martins Ozolins via GNU coreutils Bug Reports wrote:

ozoma@ozoma-ThinkPad-X250:$ date +%s --date="1981-04-01"
date: invalid date ‘1981-04-01’


This is because your invocation is equivalent to:

date +%s --date="1981-04-01 00:00:00"

and there was no midnight at that date in Latvia.
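One way to see this kind of gap from C is to hand the broken-down time to mktime and check whether it comes back unchanged. The sketch below is only an illustration under that assumption; the function name local_time_exists is made up, and this is not claimed to be the exact check that 'date' performs.

```c
#include <stdbool.h>
#include <time.h>

/* Return true if the given local date and time exists in the current
   time zone.  mktime normalizes its argument, so a time that falls in
   a gap (for example, one skipped by a clock change) or that is out
   of range (such as February 30) comes back with different fields.
   For simplicity this treats a (time_t) -1 result as failure.  */
bool
local_time_exists (int year, int month, int day,
                   int hour, int min, int sec)
{
  struct tm want = { 0 };
  want.tm_year = year - 1900;
  want.tm_mon = month - 1;
  want.tm_mday = day;
  want.tm_hour = hour;
  want.tm_min = min;
  want.tm_sec = sec;
  want.tm_isdst = -1;           /* let the library decide */

  struct tm got = want;
  if (mktime (&got) == (time_t) -1)
    return false;

  return (got.tm_year == want.tm_year && got.tm_mon == want.tm_mon
          && got.tm_mday == want.tm_mday && got.tm_hour == want.tm_hour
          && got.tm_min == want.tm_min && got.tm_sec == want.tm_sec);
}
```

Run with TZ set to a zone whose clocks jumped forward at midnight on the date in question, a call for 00:00:00 on that date would be expected to return false, while 01:00:00 would return true.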





bug#54785: for floating point, printf should use double like in C instead of long double

2022-04-09 Thread Paul Eggert

Vincent Lefevre wrote in :


$ zsh -fc '/usr/bin/printf "%a\n" $((43./2**22))'
0xa.c0000000000025cp-20

instead of

0xa.cp-20


To summarize, this test case is:

printf '%a\n' 1.0251998901367188e-05

and the problem is that converting 1.0251998901367188e-05 to long double 
prints the too-precise "0xa.c0000000000025cp-20", whereas you want it to 
convert to double (which matches what most other programs do) and to 
print "0xa.cp-20" or equivalent.



(Note that ksh uses long double internally, but does not ensure the
round trip back to long double


Yes, ksh messes up here. However, it's more important for Coreutils 
printf to be compatible with the GNU shell, and Bash uses long double:


$ echo $BASH_VERSION
5.1.8(1)-release
$ /usr/bin/printf --version | head -n1
printf (GNU coreutils) 8.32
$ printf '%a\n' 1.0251998901367188e-05
0xa.c0000000000025cp-20
$ /usr/bin/printf '%a\n' 1.0251998901367188e-05
0xa.c0000000000025cp-20



I suggest to parse the argument as a "long double" only if the "L"
length modifier is provided, like in C.

Thanks, good idea.

I checked, and this also appears to be a POSIX conformance issue. POSIX 
 says that floating point operands "shall be evaluated as if by the 
strtod() function". This means double, not long double.


Whatever decision we make here, we should be consistent with Bash so 
I'll cc this email to bug-bash.


I propose that we change both coreutils and Bash to use 'double' rather 
than 'long double' here, unless the user specifies the L modifier (e.g., 
"printf '%La\n' ..."). I've written up a patch (attached) to Bash 5.2 
alpha to do that. Assuming the Bash maintainer likes this proposal, I 
plan to implement something similar for Coreutils printf.
diff '-x*~' -pru bash-5.2-alpha/builtins/printf.def bash-5.2-alpha-double/builtins/printf.def
--- bash-5.2-alpha/builtins/printf.def	2021-12-29 13:09:20.0 -0800
+++ bash-5.2-alpha-double/builtins/printf.def	2022-04-09 12:02:35.330476097 -0700
@@ -215,13 +215,14 @@ static uintmax_t getuintmax PARAMS((void
 
 #if defined (HAVE_LONG_DOUBLE) && HAVE_DECL_STRTOLD && !defined(STRTOLD_BROKEN)
 typedef long double floatmax_t;
-#  define FLOATMAX_CONV	"L"
+#  define USE_LONG_DOUBLE 1
 #  define strtofltmax	strtold
 #else
 typedef double floatmax_t;
-#  define FLOATMAX_CONV	""
+#  define USE_LONG_DOUBLE 0
 #  define strtofltmax	strtod
 #endif
+static double getdouble PARAMS((void));
 static floatmax_t getfloatmax PARAMS((void));
 
 static intmax_t asciicode PARAMS((void));
@@ -247,7 +248,7 @@ printf_builtin (list)
  WORD_LIST *list;
 {
   int ch, fieldwidth, precision;
-  int have_fieldwidth, have_precision;
+  int have_fieldwidth, have_precision, use_Lmod;
   char convch, thisch, nextch, *format, *modstart, *precstart, *fmt, *start;
 #if defined (HANDLE_MULTIBYTE)
   char mbch[25];		/* 25 > MB_LEN_MAX, plus can handle 4-byte UTF-8 and large Unicode characters*/
@@ -422,8 +423,12 @@ printf_builtin (list)
 
 	  /* skip possible format modifiers */
 	  modstart = fmt;
+	  use_Lmod = 0;
 	  while (*fmt && strchr (LENMODS, *fmt))
-	fmt++;
+	{
+	  use_Lmod |= USE_LONG_DOUBLE && *fmt == 'L';
+	  fmt++;
+	}
 	
 	  if (*fmt == 0)
 	{
@@ -694,11 +699,24 @@ printf_builtin (list)
 #endif
 	  {
 		char *f;
-		floatmax_t p;
 
-		p = getfloatmax ();
-		f = mklong (start, FLOATMAX_CONV, sizeof(FLOATMAX_CONV) - 1);
-		PF (f, p);
+		if (use_Lmod)
+		  {
+		floatmax_t p;
+
+		p = getfloatmax ();
+		f = mklong (start, "L", 1);
+		PF (f, p);
+		  }
+		else
+		  {
+		double p;
+
+		p = getdouble ();
+		f = mklong (start, "", 0);
+		PF (f, p);
+		  }
+
 		break;
 	  }
 
@@ -1248,35 +1266,40 @@ getuintmax ()
   return (ret);
 }
 
+#define getfloat(ret, convert) \
+  char *ep; \
+  if (garglist == 0) \
+return 0; \
+  if (garglist->word->word[0] == '\'' || garglist->word->word[0] == '"') \
+return asciicode (); \
+  errno = 0; \
+  (ret) = (convert) (garglist->word->word, &ep); \
+  if (*ep) \
+{ \
+  sh_invalidnum (garglist->word->word); \
+  /* Same thing about POSIX.2 conversion error requirements. */ \
+  if (0) \
+(ret) = 0; \
+  conversion_error = 1; \
+} \
+  else if (errno == ERANGE) \
+printf_erange (garglist->word->word); \
+  garglist = garglist->next
+
+static double
+getdouble ()
+{
+  double ret;
+  getfloat (ret, strtod);
+  return ret;
+}
+
 static floatmax_t
 getfloatmax ()
 {
   floatmax_t ret;
-  char *ep;
-
-  if (garglist == 0)
-return (0);
-
-  if (garglist->word->word[0] == '\'' || garglist->word->word[0] == '"')
-return asciicode ();
-
-  errno = 0;
-  ret = strtofltmax (garglist->word->word, &ep);
-
-  if (*ep)
-{
-  sh_invalidnum (garglist->word->word);
-#if 0
-  /* Same thing about POSIX.2 conversion error requirements. */
-  ret = 0;
-#endif
-  conversion_error = 1;
-}
-  else if (errno == ERANGE)
-printf_erange (garglist->word->word);
-
-  

bug#54587: chroot: incorrectly reporting ": no such file or directory"

2022-03-26 Thread Paul Eggert

On 3/26/22 18:54, Kyle Glaws wrote:

When the shared library is missing, execvp will set errno to ENOENT. In
that event, it might be possible to stat the path to the command to confirm
that the command does not exist.


That would lead to a race condition, in case the file in question is 
removed or created between the time that execvp fails and stat is called.


Plus: why should chroot be any different from other commands that 
invoke execvp and then rely on errno to say why execvp failed? Will we 
need to modify every command that invokes execvp? If so, this would 
indicate a bug in execvp rather than in every command that uses execvp.
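To make the point concrete, here is a small sketch of what a caller of execvp has to go on: just the errno value. The function name exec_errno is invented for the example. It forks, tries execvp in the child, and passes the child's errno back over a close-on-exec pipe.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and try to exec FILE (with no arguments) in the child.
   Return 0 if the exec succeeded, the errno value that execvp
   reported if it failed, or -1 if the pipe or fork could not be
   set up.  */
int
exec_errno (char const *file)
{
  int fds[2];
  if (pipe (fds) != 0)
    return -1;

  /* The write end is closed automatically by a successful exec, so
     the parent then sees end-of-file instead of an errno value.  */
  if (fcntl (fds[1], F_SETFD, FD_CLOEXEC) != 0)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }

  pid_t pid = fork ();
  if (pid < 0)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }

  if (pid == 0)
    {
      char *argv[] = { (char *) file, NULL };
      close (fds[0]);
      execvp (file, argv);
      int e = errno;            /* all that execvp tells us */
      if (write (fds[1], &e, sizeof e) != (ssize_t) sizeof e)
        _exit (126);
      _exit (127);
    }

  close (fds[1]);
  int e = 0;
  ssize_t n = read (fds[0], &e, sizeof e);
  close (fds[0]);
  waitpid (pid, NULL, 0);
  return n == (ssize_t) sizeof e ? e : 0;
}
```

A missing executable yields ENOENT here, and nothing in the value returned distinguishes that from the other ENOENT cases discussed in this thread; any follow-up stat call would be a separate, racy observation.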









bug#54587: chroot: incorrectly reporting ": no such file or directory"

2022-03-26 Thread Paul Eggert

On 3/26/22 16:16, Kyle Glaws wrote:

Looking at the source code in chroot.c, it doesn't seem impossible to add
some logic that makes this error message more accurate (i.e. that a shared
library is missing, not the executable itself).


How? More details, please.






bug#54519: YES

2022-03-22 Thread Paul Eggert

On 3/22/22 06:16, bdmalex--- via GNU coreutils Bug Reports wrote:

yes | zypper up;
  
yes y | zypper up;
  
yes 'y' | zypper up;


Surely this is a problem with zypper, not with 'yes'.

If it is a problem with 'yes', please give a way to reproduce the 
problem that doesn't involve zypper (a program the bug-report reader 
probably doesn't have).






bug#54286: [PATCH] fcntl-h: add AT_NO_AUTOMOUNT

2022-03-07 Thread Paul Eggert

On 3/7/22 06:08, Pádraig Brady wrote:

* lib/fcntl.in.h: Define AT_NO_AUTOMOUNT to 0 where not defined.
This is available on Linux since 2.6.38.


Looks good.

Please feel free to install this sort of thing without waiting for review.





bug#44770: chown: warn when encountering deprecated dot separator

2022-02-24 Thread Paul Eggert

Thanks for the suggestion. I installed the attached patches to do that.
From 320b3f8c96fc69670475c7a39d5818c5b4755912 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 24 Feb 2022 18:05:03 -0800
Subject: [PATCH 1/2] build: update gnulib submodule to latest

---
 gnulib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gnulib b/gnulib
index 06b2e943b..23cca8268 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit 06b2e943be39284783ff81ac6c9503200f41dba3
+Subproject commit 23cca8268d21f5d58ed0209002d5673d0518c426
-- 
2.35.1

From aac2a3cff4f94df899483da7963f4fc983c7c6b0 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 24 Feb 2022 18:17:23 -0800
Subject: [PATCH 2/2] chown: warn about USER.GROUP

Suggested by Dan Jacobson (Bug#44770).
* src/chown.c, src/chroot.c (main):
Issue warnings if obsolete USER.GROUP notation is present.
---
 NEWS   |  5 +
 doc/coreutils.texi |  3 ++-
 src/chown.c| 17 ++---
 src/chroot.c   |  9 +
 4 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/NEWS b/NEWS
index eec705b2f..af6596b06 100644
--- a/NEWS
+++ b/NEWS
@@ -51,6 +51,11 @@ GNU coreutils NEWS-*- outline -*-
   simple copies between regular files.  This may be more efficient, by avoiding
   user space copies, and possibly employing copy offloading or reflinking.
 
+  chown and chroot now warn about usages like "chown root.root f",
+  which have the nonstandard and long-obsolete "." separator that
+  causes problems on platforms where user names contain ".".
+  Applications should use ":" instead of ".".
+
   cksum no longer allows abbreviated algorithm names,
   so that forward compatibility and robustness is improved.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 641680e11..e9be0993a 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -11318,7 +11318,8 @@ or group ID, then you may specify it with a leading @samp{+}.
 Some older scripts may still use @samp{.} in place of the @samp{:} separator.
 POSIX 1003.1-2001 (@pxref{Standards conformance}) does not
 require support for that, but for backward compatibility GNU
-@command{chown} supports @samp{.} so long as no ambiguity results.
+@command{chown} supports @samp{.} so long as no ambiguity results,
+although it issues a warning and support may be removed in future versions.
 New scripts should avoid the use of @samp{.} because it is not
 portable, and because it has undesirable results if the entire
 @var{owner@samp{.}group} happens to identify a user whose name
diff --git a/src/chown.c b/src/chown.c
index 329b0f4dc..07cc907a4 100644
--- a/src/chown.c
+++ b/src/chown.c
@@ -227,11 +227,12 @@ main (int argc, char **argv)
 
 case FROM_OPTION:
   {
-char const *e = parse_user_spec (optarg,
- _uid, _gid,
- NULL, NULL);
+bool warn;
+char const *e = parse_user_spec_warn (optarg,
+  _uid, _gid,
+  NULL, NULL, );
 if (e)
-  die (EXIT_FAILURE, 0, "%s: %s", e, quote (optarg));
+  error (warn ? 0 : EXIT_FAILURE, 0, "%s: %s", e, quote (optarg));
 break;
   }
 
@@ -297,10 +298,12 @@ main (int argc, char **argv)
 }
   else
 {
-  char const *e = parse_user_spec (argv[optind], , ,
-   _name, _name);
+  bool warn;
+  char const *e = parse_user_spec_warn (argv[optind], , ,
+_name,
+_name, );
   if (e)
-die (EXIT_FAILURE, 0, "%s: %s", e, quote (argv[optind]));
+error (warn ? 0 : EXIT_FAILURE, 0, "%s: %s", e, quote (argv[optind]));
 
   /* If a group is specified but no user, set the user name to the
  empty string so that diagnostics say "ownership :GROUP"
diff --git a/src/chroot.c b/src/chroot.c
index 1cd04300c..be9601304 100644
--- a/src/chroot.c
+++ b/src/chroot.c
@@ -354,10 +354,11 @@ main (int argc, char **argv)
  Diagnose any failures.  If any have failed, exit before execvp.  */
   if (userspec)
 {
-  char const *err = parse_user_spec (userspec, , , NULL, NULL);
-
-  if (err && uid_unset (uid) && gid_unset (gid))
-die (EXIT_CANCELED, errno, "%s", (err));
+  bool warn;
+  char const *err = parse_user_spec_warn (userspec, , ,
+  NULL, NULL, );
+  if (err)
+error (warn ? 0 : EXIT_CANCELED, 0, "%s", err);
 }
 
   /* If no gid is supplied or looked up, do so now.
-- 
2.35.1
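
As a small illustration of the NEWS advice in the patch above (use ":" rather than "." between owner and group), here is a minimal sketch. It sets a scratch file's owner and group to the invoking user's own numeric IDs, so it should not need root. The scratch file and the variable names are assumptions made for this example and are not part of the patch.

```shell
# Create a scratch file owned by the current user (hypothetical example file).
f=$(mktemp)

# Use numeric IDs so the example does not depend on particular account names.
uid=$(id -u)
gid=$(id -g)

# Portable OWNER:GROUP form, with ":" as the separator.
chown "$uid:$gid" "$f"
chown_status=$?

echo "chown exit status: $chown_status"
rm -f "$f"
```

With account names instead of numeric IDs the same ":" form applies, and it stays unambiguous even when a user name itself contains ".".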



bug#54124: fmt inserts garbage in certain cases?

2022-02-23 Thread Paul Eggert

On 2/23/22 17:29, Pádraig Brady wrote:
Given isspace('\n') returns true, then it makes some sense that
isspace("Next Line") would return true,


POSIX says that the application must ensure that the argument to isspace is 
either EOF or "a character representable as an unsigned char", and 
arguably since 0x85 is neither of those things the behavior of 
isspace(0x85) is undefined.


However, the C standard does not have this wording, and since POSIX is 
supposed to defer to the C standard here, this appears to be a bug in 
POSIX (as well as a bug in macOS). It's understandable if the Apple C 
library's developers got confused by the POSIX wording.






bug#46808: Man page of "tail"

2022-02-23 Thread Paul Eggert

On 2/26/21 23:32, Reuti wrote:


I noticed some formatting issues in the man page of `tail`, and I wonder 
whether they are intentional as they occur at some places. They happen up to 
version 8.32:

line: "-c, --bytes=[+]NUM" the "[" is in italic: \fB\-c\fR, 
\fB\-\-bytes\fR=\fI\,[\/\fR+]NUM
line: "-f, --follow[={name|descriptor}]" the "[=" is in bold: \fB\-f\fR, 
\fB\-\-follow[=\fR{name|descriptor}]
line: "-n, --lines=[+]NUM" the "[" is in italic: \fB\-n\fR, 
\fB\-\-lines\fR=\fI\,[\/\fR+]NUM

I'm not a `groff` expert, but the sequence "\,some-text\/" appears a couple of times. What effect 
does it have for the formatting as the "," and "/" are not output?


Thanks for the bug report. Those lines are automatically generated by 
help2man, so I'm cc'ing this to bug-help2...@gnu.org and will close the 
coreutils bug report.


The problem is that "tail --help" outputs this:

  -c, --bytes=[+]NUM       output the last NUM bytes; or use -c +NUM to
                             output starting with byte NUM of each file
  -f, --follow[={name|descriptor}]
                           output appended data as the file grows;
                             an absent option argument means 'descriptor'
  -F                       same as --follow=name --retry
  -n, --lines=[+]NUM       output the last NUM lines, instead of the last 10;
                             or use -n +NUM to output starting with line NUM



and help2man transforms that into:

.TP
\fB\-c\fR, \fB\-\-bytes\fR=\fI\,[\/\fR+]NUM
output the last NUM bytes; or use \fB\-c\fR +NUM to
output starting with byte NUM of each file
.TP
\fB\-f\fR, \fB\-\-follow[=\fR{name|descriptor}]
output appended data as the file grows;
.IP
an absent option argument means 'descriptor'
.TP
\fB\-F\fR
same as \fB\-\-follow\fR=\fI\,name\/\fR \fB\-\-retry\fR
.TP
\fB\-n\fR, \fB\-\-lines\fR=\fI\,[\/\fR+]NUM
output the last NUM lines, instead of the last 10;
or use \fB\-n\fR +NUM to output starting with line NUM





bug#54112: dd seek_bytes etc. is confusing

2022-02-22 Thread Paul Eggert

On 2/22/22 09:29, Pádraig Brady wrote:

That is a more concise and direct way to achieve the same functionality.
+1

I guess we should remove docs for the other options,
but leave support there for backwards compat.


Sounds good, I installed the attached and am closing the bug report.

From 155cc945db54ab541594f3a59cfe808bc9aea3fd Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 22 Feb 2022 18:27:09 -0800
Subject: [PATCH] dd: counts ending in "B" now count bytes

This implements my suggestion in Bug#54112.
* src/dd.c (usage): Document the change.
(parse_integer, scanargs): Implement the change.
Omit some now-obsolete checks for invalid flags.
* tests/dd/bytes.sh: Test the new behavior, while retaining
checks for the now-obsolete usage.
* tests/dd/nocache_eof.sh: Avoid now-obsolete usage.
---
 NEWS                    |   6 +++
 doc/coreutils.texi      |  53 ++-
 src/dd.c                | 114 
 tests/dd/bytes.sh       |  67 ---
 tests/dd/nocache_eof.sh |   2 +-
 5 files changed, 116 insertions(+), 126 deletions(-)

diff --git a/NEWS b/NEWS
index de03f0d47..b6713bfc5 100644
--- a/NEWS
+++ b/NEWS
@@ -60,6 +60,12 @@ GNU coreutils NEWS-*- outline -*-
   dd now supports the aliases iseek=N for skip=N, and oseek=N for seek=N,
   like FreeBSD and other operating systems.
 
+  dd now counts bytes instead of blocks if a block count ends in "B".
+  For example, 'dd count=100KiB' now copies 100 KiB of data, not
+  102,400 blocks of data.  The flags count_bytes, skip_bytes and
+  seek_bytes are therefore obsolescent and are no longer documented,
+  though they still work.
+
   timeout --foreground --kill-after=... will now exit with status 137
   if the kill signal was sent, which is consistent with the behavior
   when the --foreground option is not specified.  This allows users to
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 5419c61ef..641680e11 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9268,9 +9268,9 @@ use @var{bytes} as the fixed record length.
 @opindex skip
 @opindex iseek
 Skip @var{n} @samp{ibs}-byte blocks in the input file before copying.
-With @samp{iflag=skip_bytes}, interpret @var{n}
+If @var{n} ends in the letter @samp{B}, interpret @var{n}
 as a byte count rather than a block count.
-(The @samp{iseek=} spelling is an extension to POSIX.)
+(@samp{B} and the @samp{iseek=} spelling are GNU extensions to POSIX.)
 
 @item seek=@var{n}
 @itemx oseek=@var{n}
@@ -9278,16 +9278,17 @@ as a byte count rather than a block count.
 @opindex oseek
 Skip @var{n} @samp{obs}-byte blocks in the output file before
 truncating or copying.
-With @samp{oflag=seek_bytes}, interpret @var{n}
+If @var{n} ends in the letter @samp{B}, interpret @var{n}
 as a byte count rather than a block count.
-(The @samp{oseek=} spelling is an extension to POSIX.)
+(@samp{B} and the @samp{oseek=} spelling are GNU extensions to POSIX.)
 
 @item count=@var{n}
 @opindex count
 Copy @var{n} @samp{ibs}-byte blocks from the input file, instead
 of everything until the end of the file.
-With @samp{iflag=count_bytes}, interpret @var{n}
-as a byte count rather than a block count.
+If @var{n} ends in the letter @samp{B},
+interpret @var{n} as a byte count rather than a block count;
+this is a GNU extension to POSIX.
 If short reads occur, as could be the case
 when reading from a pipe for example, @samp{iflag=fullblock}
 ensures that @samp{count=} counts complete input blocks
@@ -9627,27 +9628,6 @@ as they may return short reads. In that case,
 this flag is needed to ensure that a @samp{count=} argument is
 interpreted as a block count rather than a count of read operations.
 
-@item count_bytes
-@opindex count_bytes
-Interpret the @samp{count=} operand as a byte count,
-rather than a block count, which allows specifying
-a length that is not a multiple of the I/O block size.
-This flag can be used only with @code{iflag}.
-
-@item skip_bytes
-@opindex skip_bytes
-Interpret the @samp{skip=} or @samp{iseek=} operand as a byte count,
-rather than a block count, which allows specifying
-an offset that is not a multiple of the I/O block size.
-This flag can be used only with @code{iflag}.
-
-@item seek_bytes
-@opindex seek_bytes
-Interpret the @samp{seek=} or @samp{oseek=} operand as a byte count,
-rather than a block count, which allows specifying
-an offset that is not a multiple of the I/O block size.
-This flag can be used only with @code{oflag}.
-
 @end table
 
 These flags are all GNU extensions to POSIX.
@@ -9680,23 +9660,22 @@ should not be too large---values larger than a few megabytes
 are generally wasteful or (as in the gigabyte..exabyte case) downright
 counterproductive or error-inducing.
 
-To process data that is at an offset or size that is not a
-multiple of the I/O@ block size, you can use the @samp{skip_bytes},
-@samp{seek_bytes} and @samp{count_bytes} flags.  Alternativ

bug#45648: `dd` seek/skip which way is up?

2022-02-22 Thread Paul Eggert

On 1/4/21 20:08, Paul Eggert wrote:

On 1/4/21 7:44 PM, Bela Lubkin wrote:

TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document
them as pure synonyms for 'skip' and 'seek'.


Thanks for doing all that research. It's compelling, and I think your 
patch (or something like it) should go in. I'll wait for a bit to hear 
other opinions.


After thinking about the patch a bit more, let's omit the part about 
adding new conversions iseek_bytes etc., as I think there's a better way 
to address that issue. I proposed something in <https://bugs.gnu.org/54112>.


So instead of your patch, I installed the attached patches. The first 
one adds the iseek and oseek operands that you suggested; the second one 
clarifies dd documentation, as I found several things were confusing 
when rereading it carefully. Something like these patches should appear 
in the next coreutils release.

From 6ad981900cc170258d4914197e2796fc94a37863 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 21 Feb 2022 11:23:02 -0800
Subject: [PATCH 1/2] dd: support iseek= and oseek=

Alias iseek=N to skip=N, oseek=N to seek=N (Bug#45648).
* src/dd.c (scanargs): Parse iseek= and oseek=.
* tests/dd/skip-seek.pl (sk-seek5): New test case.
---
 NEWS                  |  3 +++
 doc/coreutils.texi    | 16 ++--
 src/dd.c              |  8 
 tests/dd/skip-seek.pl | 10 ++
 4 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/NEWS b/NEWS
index ef65b4ab8..de03f0d47 100644
--- a/NEWS
+++ b/NEWS
@@ -57,6 +57,9 @@ GNU coreutils NEWS-*- outline -*-
   dd conv=fsync now synchronizes output even after a write error,
   and similarly for dd conv=fdatasync.
 
+  dd now supports the aliases iseek=N for skip=N, and oseek=N for seek=N,
+  like FreeBSD and other operating systems.
+
   timeout --foreground --kill-after=... will now exit with status 137
   if the kill signal was sent, which is consistent with the behavior
   when the --foreground option is not specified.  This allows users to
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 8d2974bde..4ec998802 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9189,8 +9189,7 @@ Read from @var{file} instead of standard input.
 @item of=@var{file}
 @opindex of
 Write to @var{file} instead of standard output.  Unless
-@samp{conv=notrunc} is given, @command{dd} truncates @var{file} to zero
-bytes (or the size specified with @samp{seek=}).
+@samp{conv=notrunc} is given, truncate @var{file} before writing it.
 
 @item ibs=@var{bytes}
 @opindex ibs
@@ -9230,15 +9229,20 @@ When converting variable-length records to fixed-length ones
 use @var{bytes} as the fixed record length.
 
 @item skip=@var{n}
+@itemx iseek=@var{n}
 @opindex skip
+@opindex iseek
 Skip @var{n} @samp{ibs}-byte blocks in the input file before copying.
 If @samp{iflag=skip_bytes} is specified, @var{n} is interpreted
 as a byte count rather than a block count.
 
 @item seek=@var{n}
+@itemx oseek=@var{n}
 @opindex seek
-Skip @var{n} @samp{obs}-byte blocks in the output file before copying.
-if @samp{oflag=seek_bytes} is specified, @var{n} is interpreted
+@opindex oseek
+Skip @var{n} @samp{obs}-byte blocks in the output file before
+truncating or copying.
+If @samp{oflag=seek_bytes} is specified, @var{n} is interpreted
 as a byte count rather than a block count.
 
 @item count=@var{n}
@@ -9588,14 +9592,14 @@ This flag can be used only with @code{iflag}.
 
 @item skip_bytes
 @opindex skip_bytes
-Interpret the @samp{skip=} operand as a byte count,
+Interpret the @samp{skip=} or @samp{iseek=} operand as a byte count,
 rather than a block count, which allows specifying
 an offset that is not a multiple of the I/O block size.
 This flag can be used only with @code{iflag}.
 
 @item seek_bytes
 @opindex seek_bytes
-Interpret the @samp{seek=} operand as a byte count,
+Interpret the @samp{seek=} or @samp{oseek=} operand as a byte count,
 rather than a block count, which allows specifying
 an offset that is not a multiple of the I/O block size.
 This flag can be used only with @code{oflag}.
diff --git a/src/dd.c b/src/dd.c
index 7360a4973..1c30e414d 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -562,8 +562,8 @@ Copy a file, converting and formatting according to the operands.\n\
   obs=BYTES   write BYTES bytes at a time (default: 512)\n\
   of=FILE write to FILE instead of stdout\n\
   oflag=FLAGS write as per the comma separated symbol list\n\
-  seek=N  skip N obs-sized blocks at start of output\n\
-  skip=N  skip N ibs-sized blocks at start of input\n\
+  seek=N  (or oseek=N) skip N obs-sized output blocks\n\
+  skip=N  (or iseek=N) skip N ibs-sized input blocks\n\
   status=LEVELThe LEVEL of information to print to stderr;\n\
   'none' suppresses everything but error messages,\n\
   'noxfer' suppresses the final transfer statistics,\n\
@@ -1564,9 +1564,9 @@ scanargs (in

bug#54112: dd seek_bytes etc. is confusing

2022-02-22 Thread Paul Eggert
While looking into Bug#45648 I noticed that the GNU extensions 
count_bytes, seek_bytes, and skip_bytes are confusing, and the proposed 
fix to bug#45648 would make them even more confusing. To fix this 
confusion, we should deprecate these options, and instead say that if 
you want to use byte counts you should use a number string ending in "B".


Here's another way to put it.  Currently this:

   dd oseek=100KiB

means "seek 102,400 blocks". It should simply mean "seek 102,400 bytes", 
which is what it says. And if we change oseek's meaning this way, we 
don't need "oseek_bytes".


Although this is an incompatible change to GNU dd, I don't think it'll 
affect real-world uses (who would use oseek in such a confusing way 
now?) and overall it will be a win.
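
To make the proposal concrete, here is a rough sketch that compares the existing flag spelling with the proposed suffix spelling, using count= rather than oseek= so the result is easy to measure. The input file /dev/zero, the block size, and the 10000-byte figure are arbitrary choices for illustration. The second command only does what the comment says on a dd that implements the "B" suffix; other versions are expected to reject the operand.

```shell
# Existing spelling: count= is a byte count because of iflag=count_bytes.
old_size=$(dd if=/dev/zero bs=4096 count=10000 iflag=count_bytes 2>/dev/null | wc -c)

# Proposed spelling: a count ending in "B" is a byte count.
# A dd without this change is expected to reject the operand,
# so treat this line as illustrative only.
new_size=$(dd if=/dev/zero bs=4096 count=10000B 2>/dev/null | wc -c)

echo "count_bytes flag: $old_size bytes"
echo "B suffix:         $new_size bytes"
```

If both spellings are supported, the two sizes should agree, which is the point of the proposal: the number says what it means without a separate flag.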






bug#47151: closed (Re: bug#47151: cp --recursive funky behaviour)

2022-02-21 Thread Paul Eggert

On 2/21/22 10:49, Tomas wrote:


I found this, I am not sure whether it's the right specs.

https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/


Yes, or more precisely for 'cp':

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/cp.html


"2) f) The files in the directory source_file shall be copied to the directory 
dest_file, taking the four steps (1 to 4)
listed here with the files as source_files."

Is this the relevant part? It seems to me that this would support copying src/a 
and src/b (the files in the directory source_file) to dest/a and dest/b (copied 
to the directory dest_file) rather than to dest/src/a and dest/src/b (which 
would be copying the directory, not the files in the directory). But maybe I'm 
missing a part of the spec or am interpreting it differently.


Yes, I think the part you're missing is below. Your first "cp -r src 
dest" uses the 3rd paragraph quoted below; your second "cp -r src dest" 
uses the 2nd paragraph.


=

The third synopsis form is denoted by two or more operands where the -R 
option is specified. The cp utility shall copy each file in the file 
hierarchy rooted in each source_file to a destination path named as follows:


If target exists and names an existing directory, the name of the 
corresponding destination path for each file in the file hierarchy shall 
be the concatenation of target, a single <slash> character if target did 
not end in a <slash>, and the pathname of the file relative to the 
directory containing source_file.


If target does not exist and two operands are specified, the name 
of the corresponding destination path for source_file shall be target; 
the name of the corresponding destination path for all other files in 
the file hierarchy shall be the concatenation of target, a <slash> 
character, and the pathname of the file relative to source_file.


=





bug#47151: cp --recursive funky behaviour

2022-02-21 Thread Paul Eggert

On 3/14/21 17:47, Tomas wrote:

cd /tmp
mkdir src
touch src/a
cp -r src dest

#damn, I forgot a file
touch src/b
cp -r src dest
ls dest
# a src


cp has behaved that way for ages, and is required to behave that way by 
POSIX, so this is not a bug. To sidestep the issue, you can use the -T 
option of GNU cp. Closing the bug report.
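
For anyone wanting to see the difference side by side, here is a small sketch of the same sequence with and without -T. The scratch directory and the variable names are made up for this example; -T is a GNU extension, so this assumes GNU cp.

```shell
# Scratch area; all paths below are made up for this example.
work=$(mktemp -d)
mkdir "$work/src"
touch "$work/src/a"
cp -r "$work/src" "$work/dest"     # dest does not exist: creates dest/a

touch "$work/src/b"
cp -r "$work/src" "$work/dest"     # dest exists: copies src into dest/src
without_T=$(ls "$work/dest" | xargs)

rm -r "$work/dest/src"
cp -rT "$work/src" "$work/dest"    # -T: treat dest as the destination itself
with_T=$(ls "$work/dest" | xargs)

echo "without -T: $without_T"
echo "with -T:    $with_T"
rm -r "$work"
```

The first listing shows the nested src directory that surprised the reporter; the second shows the new file landing directly in dest.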






bug#47891: sort --numeric-sort-Extra-Strength

2022-02-21 Thread Paul Eggert

On 4/19/21 06:15, 積丹尼 Dan Jacobson wrote:

Let's face it, sort, no matter what --option, or LC_... value,
just can't achieve this order:

3-1號邊
3號之1
3號之2
30


Plain 'sort -n' works for me, in my en_US.utf8 locale. :-)

I do take your point, though, that GNU 'sort' does not support sorting 
by Taiwanese house address. The usual way to handle this sort of thing 
is to transform the house address into a form that GNU 'sort' can 
handle, sort, then transform back. I doubt whether we'll be adding a 
--taiwanese-address-sort option any time soon so I'm taking the liberty 
of closing the bug report.
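
The transform, sort, transform-back approach can be sketched roughly as follows. The key rule used here (the first two runs of digits on each line, with the original bytes as a tie-breaker) is only an assumption chosen to reproduce the order in the report; it is not a real address-collation rule.

```shell
tab=$(printf '\t')

addresses='30
3號之2
3號之1
3-1號邊'

# Decorate each line with two numeric keys (hypothetical rule: the first
# two runs of digits), sort on the keys, then strip the keys again.
sorted=$(printf '%s\n' "$addresses" |
  LC_ALL=C sed -E 's/^([0-9]+)[^0-9]*([0-9]*).*/\1'"$tab"'\2'"$tab"'&/' |
  LC_ALL=C sort -t "$tab" -k1,1n -k2,2n -k3 |
  cut -f3-)

printf '%s\n' "$sorted"
```

The sorting itself stays plain 'sort'; all the address-specific knowledge lives in the decorate step, which can be replaced without touching the rest of the pipeline.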






bug#47059: bug in cp removing destination file when it can't be replaced due to cross-volume linking

2022-02-21 Thread Paul Eggert
I can't reproduce the problem with either coreutils 8.23 or 9.0. 
Unfortunately, the original bug report does not have a recipe for 
reproducing the problem from scratch, without having access to your 
system. If you could come up with a self-contained way to reproduce 
the problem with current coreutils, that would be helpful.



When creating a link to a local file, I
first create the link to a temporary name to ensure I have
appropriate access (or that it's not cross-linked in this
case).  Apparently 'cp' doesn't exercise the same caution.


Actually, cp -l is even more cautious than the procedure you describe. 
If the destination already exists, cp -l fails without altering the 
destination.


$ echo a >abc
$ echo bb >/tmp/def
$ cp -l abc /tmp/def
cp: cannot create hard link '/tmp/def' to 'abc': File exists
$ ls -l abc /tmp/def
-rw-rw-r-- 1 eggert eggert 3 Feb 21 01:13 /tmp/def
-rw-rw-r-- 1 eggert eggert 2 Feb 21 01:13 abc

Hence the symptoms you reported are mysterious; I don't see how they 
could have happened.






bug#48002: unmerging separate bug reports about cp etc.

2022-02-21 Thread Paul Eggert
A while back I merged GNU Coreutils bug reports 47059, 47883, and 48002. 
I now see that that was a mistake as they're about three different 
issues. So, I'm unmerging the bug reports and will look at each separately.


The main topic of bug#48002 is not a bug in Coreutils; it's about the 
bug-reporting process, and this would better be addressed in another 
forum. One possible forum would be gnu-misc-disc...@gnu.org.






bug#48085: date -d greater than 23 years ago gives error invalid date

2022-02-19 Thread Paul Eggert

On 4/28/21 16:23, Mark Krenz wrote:

So I'm not sure if this is a problem with coreutils or a change in the
zoneinfo database. Any ideas?


This appears to be a problem in the GNU C library, when its mktime 
deciphers the relatively unusual time zone history of Indiana.
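
A quick way to check whether a given system shows the symptom is to hand 'date' the timestamp cited in the commit message of the fix. This is only a sketch: the output format is arbitrary, and whether the timestamp is accepted depends on the installed coreutils and time zone data.

```shell
# Timestamp and zone are taken from the commit message; the format is arbitrary.
if out=$(TZ=America/Indianapolis date -d '1986-04-28 00:00 EDT' '+%Y-%m-%d %H:%M %Z' 2>&1)
then
  result="accepted: $out"
else
  result="rejected: $out"
fi
echo "$result"
```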


I installed the attached patch into Gnulib and propagated it into 
Coreutils, so the issue should be fixed in the next release of GNU 
Coreutils. Eventually this patch should migrate from Gnulib to glibc so 
that other apps get the fix. Thanks for reporting the issue.

From 06b2e943be39284783ff81ac6c9503200f41dba3 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 19 Feb 2022 15:04:43 -0800
Subject: [PATCH] mktime: improve heuristic for ca-1986 Indiana DST

Problem reported by Mark Krenz <https://bugs.gnu.org/48085>.
* lib/mktime.c (__mktime_internal): Be more generous about
accepting arguments with the wrong value of tm_isdst, by falling
back to a one-hour DST difference if we find no nearby DST that is
unusual.  This fixes a problem where "1986-04-28 00:00 EDT" was
rejected when TZ="America/Indianapolis" because the nearest DST
timestamp occurred in 1970, a temporal distance too great for the
old heuristic.  This also narrows the search a bit, which
is a minor performance win.
* m4/mktime.m4 (gl_FUNC_MKTIME_WORKS):
Check for putenv failures and for Bug#48085.
* tests/test-parse-datetime.c (main):
Test for setenv failures and for Bug#48085.
---
 ChangeLog                   | 17 +
 lib/mktime.c                | 28 
 m4/mktime.m4                | 29 +
 tests/test-parse-datetime.c | 21 +++--
 4 files changed, 81 insertions(+), 14 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4bf0cec7f0..4d56be83d4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,20 @@
+2022-02-19  Paul Eggert  
+
+	mktime: improve heuristic for ca-1986 Indiana DST
+	Problem reported by Mark Krenz <https://bugs.gnu.org/48085>.
+	* lib/mktime.c (__mktime_internal): Be more generous about
+	accepting arguments with the wrong value of tm_isdst, by falling
+	back to a one-hour DST difference if we find no nearby DST that is
+	unusual.  This fixes a problem where "1986-04-28 00:00 EDT" was
+	rejected when TZ="America/Indianapolis" because the nearest DST
+	timestamp occurred in 1970, a temporal distance too great for the
+	old heuristic.  This also narrows the search a bit, which
+	is a minor performance win.
+	* m4/mktime.m4 (gl_FUNC_MKTIME_WORKS):
+	Check for putenv failures and for Bug#48085.
+	* tests/test-parse-datetime.c (main):
+	Test for setenv failures and for Bug#48085.
+
 2022-02-12  Paul Eggert  
 
 	filevercmp: fix several unexpected results
diff --git a/lib/mktime.c b/lib/mktime.c
index aa12e28e16..7dc9d67ef9 100644
--- a/lib/mktime.c
+++ b/lib/mktime.c
@@ -429,8 +429,13 @@ __mktime_internal (struct tm *tp,
 	 time with the right value, and use its UTC offset.
 
 	 Heuristic: probe the adjacent timestamps in both directions,
-	 looking for the desired isdst.  This should work for all real
-	 time zone histories in the tz database.  */
+	 looking for the desired isdst.  If none is found within a
+	 reasonable duration bound, assume a one-hour DST difference.
+	 This should work for all real time zone histories in the tz
+	 database.  */
+
+  /* +1 if we wanted standard time but got DST, -1 if the reverse.  */
+  int dst_difference = (isdst == 0) - (tm.tm_isdst == 0);
 
   /* Distance between probes when looking for a DST boundary.  In
 	 tzdata2003a, the shortest period of DST is 601200 seconds
@@ -441,12 +446,14 @@ __mktime_internal (struct tm *tp,
 	 periods when probing.  */
   int stride = 601200;
 
-  /* The longest period of DST in tzdata2003a is 536454000 seconds
-	 (e.g., America/Jujuy starting 1946-10-01 01:00).  The longest
-	 period of non-DST is much longer, but it makes no real sense
-	 to search for more than a year of non-DST, so use the DST
-	 max.  */
-  int duration_max = 536454000;
+  /* In TZDB 2021e, the longest period of DST (or of non-DST), in
+	 which the DST (or adjacent DST) difference is not one hour,
+	 is 457243209 seconds: e.g., America/Cambridge_Bay with leap
+	 seconds, starting 1965-10-31 00:00 in a switch from
+	 double-daylight time (-05) to standard time (-07), and
+	 continuing to 1980-04-27 02:00 in a switch from standard time
+	 (-07) to daylight time (-06).  */
+  int duration_max = 457243209;
 
   /* Search in both directions, so the maximum distance is half
 	 the duration; add the stride to avoid off-by-1 problems.  */
@@ -483,6 +490,11 @@ __mktime_internal (struct tm *tp,
 	  }
 	  }
 
+  /* No unusual DST offset was found nearby.  Assume one-hour DST.  */
+  t += 60 * 60 * dst_difference;
+  if (mktime_min <= t && t <= mktime_max && convert_time (convert, t, ))
+	goto offset_found;

bug#53977: Improve markup in man pages

2022-02-14 Thread Paul Eggert

On 2/14/22 15:00, Pádraig Brady wrote:


I see Paul added the grep markup recently in a seemingly unrelated change:
https://git.savannah.gnu.org/gitweb/?p=grep.git;a=commit;h=fe630c9f

In the old days man pages' SEE ALSO sections mostly didn't use markup 
for references to other man pages. I see only one exception in 7th 
edition UNIX (1979): its man page for yacc used ".IR lex (1)" instead of 
plain "lex(1)".


Nowadays it seems that ".BR lex (1)" is what's preferred for this sort 
of thing, so I've been switching to this style desultorily in man pages 
when someone points it out, most recently in diffutils today:


https://git.savannah.gnu.org/gitweb/?p=diffutils.git;a=commitdiff;h=dd9deb765548679e821be565229bb2e142d93573

As usual man pages are low priority for the GNU project. That being 
said, this sort of thing is an easy change.






bug#48248: tr docs: mention what to expect now vs. future

2022-02-14 Thread Paul Eggert
Thanks for the bug report. This stuff about "the future" has been there 
for years so it's time to remove any predictions. I installed the 
attached revamp of the tr documentation to try to address the problems 
you mentioned, plus some others I noticed while in the neighborhood.

From 8d3dce9861c15f06a014c91fa29c15143fd27127 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 14 Feb 2022 12:00:16 -0800
Subject: [PATCH 1/2] tr: improve multibyte etc. doc

Problem reported by Dan Jacobson (Bug#48248).
* doc/coreutils.texi (tr invocation): Improve documentation for
tr's failure to support multibyte characters POSIX-style.
* doc/coreutils.texi (tr invocation), src/tr.c (usage):
Use terminology closer to POSIX's.
---
 doc/coreutils.texi | 205 -
 src/tr.c           |  30 +++
 2 files changed, 123 insertions(+), 112 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 7ae5ab8e3..8d2974bde 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -300,7 +300,7 @@ Operating on characters
 
 @command{tr}: Translate, squeeze, and/or delete characters
 
-* Character sets::   Specifying sets of characters
+* Character arrays:: Specifying arrays of characters
 * Translating::  Changing one set of characters to another
 * Squeezing and deleting::   Removing characters
 
@@ -6888,7 +6888,7 @@ These commands operate on individual characters.
 Synopsis:
 
 @example
-tr [@var{option}]@dots{} @var{set1} [@var{set2}]
+tr [@var{option}]@dots{} @var{string1} [@var{string2}]
 @end example
 
 @command{tr} copies standard input to standard output, performing
@@ -6905,9 +6905,11 @@ delete characters,
 delete characters, then squeeze repeated characters from the result.
 @end itemize
 
-The @var{set1} and (if given) @var{set2} arguments define ordered
-sets of characters, referred to below as @var{set1} and @var{set2}.  These
-sets are the characters of the input that @command{tr} operates on.
+The @var{string1} and @var{string2} operands define arrays of
+characters @var{array1} and @var{array2}.  By default @var{array1}
+lists input characters that @command{tr} operates on, and @var{array2}
+lists corresponding translations.  In some cases the second operand is
+omitted.
 
 The program accepts the following options.  Also see @ref{Common options}.
 Options must precede operands.
@@ -6920,34 +6922,29 @@ Options must precede operands.
 @opindex -c
 @opindex -C
 @opindex --complement
-This option replaces @var{set1} with its
-complement (all of the characters that are not in @var{set1}).
-Currently @command{tr} fully supports only single-byte characters.
-Eventually it will support multibyte characters; when it does, the
-@option{-C} option will cause it to complement the set of characters,
-whereas @option{-c} will cause it to complement the set of values.
-This distinction will matter only when some values are not characters,
-and this is possible only in locales using multibyte encodings when
-the input contains encoding errors.
+Instead of @var{array1}, use its complement (all characters not
+specified by @var{string1}), in ascending order.  Use this option with
+caution in multibyte locales where its meaning is not always clear
+or portable; see @ref{Character arrays}.
 
 @item -d
 @itemx --delete
 @opindex -d
 @opindex --delete
-Delete characters in @var{set1}, do not translate
+Delete characters in @var{array1}; do not translate.
 
 @item -s
 @itemx --squeeze-repeats
 @opindex -s
 @opindex --squeeze-repeats
 Replace each sequence of a repeated character that is listed in
-the last specified @var{set}, with a single occurrence of that character.
+the last specified @var{array}, with a single occurrence of that character.
 
 @item -t
 @itemx --truncate-set1
 @opindex -t
 @opindex --truncate-set1
-First truncate @var{set1} to length of @var{set2}.
+Truncate @var{array1} to the length of @var{array2}.
 
 @end table
 
@@ -6955,23 +6952,41 @@ First truncate @var{set1} to length of @var{set2}.
 @exitstatus
 
 @menu
-* Character sets::  Specifying sets of characters.
-* Translating:: Changing one set of characters to another.
+* Character arrays::Specifying arrays of characters.
+* Translating:: Changing characters to other characters.
 * Squeezing and deleting::  Removing characters.
 @end menu
 
 
-@node Character sets
-@subsection Specifying sets of characters
-
-@cindex specifying sets of characters
-
-The format of the @var{set1} and @var{set2} arguments resembles
-the format of regular expressions; however, they are not regular
-expressions, only lists of characters.  Most characters simply
-represent themselves in these strings, but the strings can contain
-the shorthands listed below, for convenience.  Some of them can be
-used only in @var{set1} or @var{set2}, as noted below.
+@node Character arrays
+@subsection Specifying arrays of 

bug#53982: date (GNU coreutils) 8.30 bug report "17 april 2022 + 37 week 5pm"

2022-02-14 Thread Paul Eggert

On 2/14/22 01:41, Stéphane Archer wrote:

Does +'%Y-%m-%dT%H:%M:%S.0Z' do what I want?


To format an arbitrary timestamp you want "+%Y-%m-%dT%H:%M:%S.%1NZ", 
unless you always want a zero after the period.
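
A short sketch of the difference, assuming GNU date. The sample timestamp is made up for this example; it has a nonzero fractional second so that the two formats visibly differ.

```shell
# Sample timestamp with a fractional second (made up for this example).
sample='2022-04-17 17:00:00.5 UTC'

# Literal ".0": always prints a zero after the period.
fixed=$(date -u -d "$sample" '+%Y-%m-%dT%H:%M:%S.0Z')

# "%1N": prints the first digit of the fractional second.
tenths=$(date -u -d "$sample" '+%Y-%m-%dT%H:%M:%S.%1NZ')

echo "$fixed"
echo "$tenths"
```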


Closing the bug report as there's no bug here.





bug#49239: Unexpected results with sort -V

2022-02-12 Thread Paul Eggert

On 6/28/21 10:54, Kamil Dudka wrote:

You are right.  The matching algorithm was not implemented correctly and
the patch you attached fixes it.


I looked into Bug#49239 and found some more places where the 
documentation disagreed with the code. I installed the attached patches 
into Gnulib and Coreutils, respectively, which should bring the two into 
agreement and should fix the bugs that Michael reported albeit in a 
different way than his proposed patch. Briefly:


* The code didn't allow file name suffixes to be the entire file name, 
but the documentation did. Here I went with the documentation. I could 
be talked into the other way; it shouldn't matter much either way.


* The code did the preliminary test (without suffixes) using strcmp, the 
documentation said it should use version comparison. Here I went with 
the documentation.


* As Michael mentioned, sort -V mishandled NUL. I fixed this by adding a 
Gnulib function filenvercmp that treats NUL as just another character.


* As Michael also mentioned, filevercmp fell back on strcmp if version 
sort found no difference, which meant sort's --stable flag was 
ineffective. I fixed this by not having filevercmp fall back on strcmp.


* I fixed the two-consecutive dot and trailing-dot bugs Michael 
mentioned, by rewriting the suffix finder to not have that confusing 
READ_ALPHA state variable, and to instead implement the regular 
expression's nested * operators in the usual way with nested loops.
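
Two of the points above (treating NUL as just another byte, and returning
zero instead of falling back on strcmp) can be illustrated with a small
C sketch. This is NOT the Gnulib filevercmp algorithm: it ignores suffixes
and the special ordering of '~' and letters, and the name natural_ncmp is
hypothetical. It only shows a length-delimited comparison in which digit
runs compare numerically.

```c
#include <assert.h>
#include <stddef.h>

static int
is_digit (unsigned char c)
{
  return '0' <= c && c <= '9';
}

/* Hypothetical, simplified comparison (not Gnulib's filenvercmp).
   Compare the ALEN bytes at A with the BLEN bytes at B.  Digit runs
   compare by numeric value; all other bytes, including NUL, compare
   as unsigned chars.  Return -1, 0 or 1.  */
int
natural_ncmp (const char *a, size_t alen, const char *b, size_t blen)
{
  assert (a != NULL && b != NULL);
  size_t i = 0, j = 0;

  while (i < alen && j < blen)
    {
      unsigned char ca = a[i], cb = b[j];
      if (is_digit (ca) && is_digit (cb))
        {
          /* Skip leading zeros, then compare the lengths of the
             remaining digit runs, then the digits themselves.  */
          while (i < alen && a[i] == '0')
            i++;
          while (j < blen && b[j] == '0')
            j++;
          size_t si = i, sj = j;
          while (i < alen && is_digit ((unsigned char) a[i]))
            i++;
          while (j < blen && is_digit ((unsigned char) b[j]))
            j++;
          size_t la = i - si, lb = j - sj;
          if (la != lb)
            return la < lb ? -1 : 1;
          for (size_t k = 0; k < la; k++)
            if (a[si + k] != b[sj + k])
              return a[si + k] < b[sj + k] ? -1 : 1;
        }
      else
        {
          if (ca != cb)
            return ca < cb ? -1 : 1;
          i++;
          j++;
        }
    }

  /* No strcmp fallback: inputs that differ only in leading zeros
     compare equal, so a stable sort keeps their input order.  */
  if (i < alen)
    return 1;
  if (j < blen)
    return -1;
  return 0;
}
```

Because the lengths are passed explicitly, "a\0b" and "a\0c" compare as
different three-byte strings, whereas a strcmp-based comparison would stop
at the NUL and call them equal.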


Thanks, Michael, for reporting the problem. I'm boldly closing the 
Coreutils bug report as fixed.

From 9f48fb992a3d7e96610c4ce8be969cff2d61a01b Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 12 Feb 2022 16:27:05 -0800
Subject: [PATCH] filevercmp: fix several unexpected results

Problems reported by Michael Debertol in <https://bugs.gnu.org/49239>.
While looking into this, I spotted some more areas where the
code and documentation did not agree, or where the documentation
was unclear.  The biggest change needed by coreutils is a new
function filenvercmp that can compare byte strings containing NUL.
* lib/filevercmp.c: Do not include sys/types.h, stdlib.h, string.h.
Include idx.h, verify.h.
(match_suffix): Remove, replacing all uses with calls to ...
(file_prefixlen): ... this new function.  Simplify it by
avoiding the need for a confusing READ_ALPHA state variable.
Change its API to something more useful, with a *LEN arg.
(file_prefixlen, verrevcmp):
Prefer idx_t to size_t where either will do.
(order): Change args to S, POS, LEN instead of just S[POS].
This lets us handle NUL bytes correctly.  Callers changed.
Verify that ints are sufficiently wide for its API.
(verrevcmp): Don't assume that S1[S1_LEN] is a non-digit,
and likewise for S2[S2_LEN].  The byte might not be accessible
if filenvercmp is being called.
(filevercmp): Reimplement by calling filenvercmp.
(filenvercmp): New function, rewritten without the assumption
that the inputs are null-terminated.
Remove "easy comparison to see if strings are identical", as the
use of it later (a) was undocumented, and (b) caused sort -V to be
unstable.  When both strings start with ".", do not skip past
the "."s before looking for suffixes, as this disagreed
with the documentation.
* lib/filevercmp.h: Fix comments, which had many mistakes.
(filenvercmp): New decl.
* modules/filevercmp (Depends-on): Add idx, verify.  Remove string.
* tests/test-filevercmp.c: Include string.h.
(examples): Reorder examples ".0" and ".9" that matched the code
but not the documentation.  The code has been fixed to match the
documentation.  Add some examples involving \1 so that they
can be tried with both \1 and \0.  Add some other examples
taken from the bug report.
(equals): New set of test cases.
(sign, test_filevercmp): New functions.
(main): Remove test case where the fixed filevercmp disagrees with
strverscmp.  Use test_filevercmp instead of filevercmp, so that
we also test filenvercmp.  Test the newly-introduced EQUALS cases.
---
 ChangeLog   |  46 ++
 lib/filevercmp.c| 187 +---
 lib/filevercmp.h|  66 ++
 modules/filevercmp  |   3 +-
 tests/test-filevercmp.c |  94 +++-
 5 files changed, 284 insertions(+), 112 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 62162cbfce..4bf0cec7f0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,49 @@
+2022-02-12  Paul Eggert  
+
+	filevercmp: fix several unexpected results
+	Problems reported by Michael Debertol in <https://bugs.gnu.org/49239>.
+	While looking into this, I spotted some more areas where the
+	code and documentation did not agree, or where the documentation
+	was unclear.  The biggest change needed by coreutils is a new
+	function filenvercmp that can compare byte strings containing NUL.
+	* lib/filevercmp.c: Do not include sys/types.h, stdlib.h, string.h.
+	Include

bug#49503: Mention workarounds, so one could achieve the Debian version sorting algorithm

2022-02-10 Thread Paul Eggert

On 7/10/21 02:23, 積丹尼 Dan Jacobson wrote:

(info "(coreutils) Differences from the official Debian Algorithm") and
(info "(coreutils) Minus/Hyphen and Colon characters")
could mention workarounds, so one could indeed achieve the Debian
Algorithm.
Or mention the only way is to use
dpkg --compare-versions
(on pairs only.)


Thanks for mentioning that. I looked over the version-sort doc and fixed 
that problem along with some other stuff I noticed while in the 
neighborhood, by installing the attached patch.

From cedf627a901e067ad3a63f0fc20f3376ed59786e Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 8 Feb 2022 10:52:10 -0800
Subject: [PATCH] doc: improve version-sort doc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* doc/coreutils.texi, doc/sort-version.texi:
Capitalize “Coreutils”.
* doc/sort-version.texi: Don’t emphasize natural sort so much,
since Coreutils has just version sort.
Use the term “lexicographic” instead of “alphabetic” or “standard”.
Suggest combining ‘V’ with ‘b’, and show why ‘b’ is needed.
Use shorter titles for sections, as GNU Emacs displays info poorly
when titles are too long to fit in a line.
Use @samp instead of @code for samples of data.
Do not use @samp{@code{...}}; @samp{...} should suffice and
double-nesting looks bad with Emacs.
Omit blank lines in examples that would not be present
in actual shell sessions.
Quote with `` and '', not with " or with '.
Mention dpkg --compare-versions more prominently.
Don’t rely on "\n" being equivalent to "\\n" in shell args.
Prefer Unicode name for hyphen-minus.
---
 doc/coreutils.texi|  14 +-
 doc/sort-version.texi | 402 +-
 2 files changed, 204 insertions(+), 212 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 75b868219..d1ad85865 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -498,9 +498,9 @@ Date input formats
 Version sorting order
 
 * Version sort overview::
-* Implementation Details::
-* Differences from the official Debian Algorithm::
-* Advanced Topics::
+* Version sort implementation::
+* Differences from Debian version sort::
+* Advanced version sort topics::
 
 Opening the software toolbox
 
@@ -3991,7 +3991,7 @@ Output extra information to stderr, like the checksum implementation being used.
 
 @item --untagged
 @opindex --untagged
-Output using the original coreutils format used by the other
+Output using the original Coreutils format used by the other
 standalone checksum utilities like @command{md5sum} for example.
 This format has the checksum at the start of the line, and may be
 more amenable to further processing by other utilities,
@@ -13922,11 +13922,11 @@ If a file being written to does not already exist, it is created.  If a
 file being written to already exists, the data it previously contained
 is overwritten unless the @option{-a} option is used.
 
-In previous versions of GNU coreutils (v5.3.0 - v8.23), a @var{file} of @samp{-}
+In previous versions of GNU Coreutils (v5.3.0 -- v8.23),
+a @var{file} of @samp{-}
 caused @command{tee} to send another copy of input to standard output.
 However, as the interleaved output was not very useful, @command{tee} now
-conforms to POSIX which explicitly mandates it to treat @samp{-} as a file
-with such name.
+conforms to POSIX and treats @samp{-} as a file name.
 
 The program accepts the following options.  Also see @ref{Common options}.
 
diff --git a/doc/sort-version.texi b/doc/sort-version.texi
index 18ddaa94a..7f76ac5bb 100644
--- a/doc/sort-version.texi
+++ b/doc/sort-version.texi
@@ -19,18 +19,17 @@
 @node Version sort overview
 @section Version sort overview
 
-@dfn{version sort} ordering (and similarly, @dfn{natural sort}
-ordering) is a method to sort items such as file names and lines of
-text in an order that feels more natural to people, when the text
+@dfn{Version sort} puts items such as file names and lines of
+text in an order that feels natural to people, when the text
 contains a mixture of letters and digits.
 
-Standard sorting usually does not produce the order that one expects
+Lexicographic sorting usually does not produce the order that one expects
 because comparisons are made on a character-by-character basis.
 
 Compare the sorting of the following items:
 
 @example
-Alphabetical sort:   Version Sort:
+Lexicographic sort:  Version Sort:
 
 a1   a1
 a120 a2
@@ -38,18 +37,19 @@ a13  a13
 a2   a120
 @end example
 
-version sort functionality in GNU coreutils is available in the @samp{ls -v},
-@samp{ls --sort=version}, @samp{sort -V}, @samp{sort --version-sort} commands.
+Version sort functionality in GNU Coreutils is available in the @samp{ls -v},
+@samp{ls --sort=version}, @samp{sort -V}, and
+@samp{sort --version-sort} commands.
 
 
 
-@node Using version sort in 

bug#50694: ls and cpio's idea of "six months ago" are slightly different

2022-02-06 Thread Paul Eggert

On 9/19/21 22:06 in <https://bugs.gnu.org/50694>, 積丹尼 Dan Jacobson wrote:


What a headache.
"Six months ago" means slightly different things to cpio and ls.
The ls documentation does say exactly what it means,
and the cpio documentation doesn't even mention six months.


Thanks for the bug report. Since the behavior is documented for ls but 
not cpio and lots more people use ls, let's change cpio to behave like 
ls. Proposed patches to cpio attached. The last patch does the actual 
change; the earlier ones are issues I noticed on the way.
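
For reference, the 'ls' notion of "recent" can be sketched in C as follows.
This is a minimal sketch, assuming the definition of six months as half of
an average Gregorian year and working at one-second granularity; the name
is_recent is made up here and the real code in either program may differ
in detail.

```c
#include <stdbool.h>
#include <time.h>

/* An average Gregorian year has 365.2425 * 24 * 60 * 60 seconds.  */
enum { SECONDS_PER_YEAR = 31556952 };
enum { HALF_YEAR = SECONDS_PER_YEAR / 2 };	/* 15778476 seconds */

/* Hypothetical helper: return true if WHEN is less than six months
   older than NOW and is not in the future.  */
bool
is_recent (time_t when, time_t now)
{
  return now - HALF_YEAR < when && when <= now;
}
```

A program that instead counts calendar months, or 180 days, will disagree
with this test for timestamps close to the boundary.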


(Sergey, I don't have commit privileges for cpio on Savannah. If you 
give me privileges I can install these patches; otherwise, please take a 
look and install if you like. Thanks.)


In the meantime I'll close the coreutils bug report, as I don't think we 
need to change GNU 'ls'.

From d2e015a718edb00dfe35d641354c5adb85fb5a49 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 28 Jan 2022 08:35:59 -0800
Subject: [PATCH 1/4] Remove trailing white space and empty lines

---
 ChangeLog.cvs  | 140 -
 NEWS   |   8 +-
 README-alpha   |   3 -
 README-hacking |   8 +-
 TODO   |  23 +++---
 am/ax_compile_check_rettype.m4 |   2 +-
 am/pack.m4 |   2 +-
 doc/Makefile.am|   2 +-
 doc/cpio.1 |   9 +--
 doc/cpio.texi  |  28 +++
 doc/mt.1   |  31 
 lib/Makefile.am|   2 +-
 po/POTFILES.in |   2 -
 src/Makefile.am|   1 -
 src/copyin.c   |  71 +
 src/copyout.c  |  54 ++---
 src/copypass.c |  30 +++
 src/dstring.c  |   2 +-
 src/dstring.h  |   1 -
 src/extern.h   |   7 +-
 src/fatal.c|   1 -
 src/filemode.c |   1 -
 src/global.c   |   1 -
 src/main.c |  90 ++---
 src/makepath.c |   8 +-
 src/mt.c   |   9 +--
 src/tar.c  |   6 +-
 src/util.c |  52 ++--
 sysdep.m4  |   1 -
 tests/CVE-2015-1197.at |   1 -
 tests/CVE-2019-14866.at|   2 +-
 tests/Makefile.am  |   1 -
 tests/atlocal.in   |   1 -
 tests/big-block-size.at|   4 +-
 tests/inout.at |   2 +-
 tests/interdir.at  |  10 +--
 tests/setstat01.at |   2 -
 tests/setstat02.at |   2 -
 tests/setstat03.at |   2 -
 tests/setstat04.at |   2 -
 tests/setstat05.at |   2 -
 tests/symlink-bad-length.at|   2 +-
 tests/version.at   |   1 -
 43 files changed, 296 insertions(+), 333 deletions(-)

diff --git a/ChangeLog.cvs b/ChangeLog.cvs
index 51ae5f8..4bc8ed8 100644
--- a/ChangeLog.cvs
+++ b/ChangeLog.cvs
@@ -10,17 +10,17 @@
 
 	* NEWS, configure.ac: Raise the patchlevel number.
 	* THANKS: Update
-	
+
 	* doc/cpio.texi: Fix a typo.
 	* src/extern.h (warn_if_file_changed): Fix type of the 2nd
 	argument.
 	* src/tar.c (write_out_tar_header): Stylistic change.
 	* src/util.c (copy_files_disk_to_disk): Fix types of automatic
-	variables. 
+	variables.
 	(warn_if_file_changed): Fix type of the 2nd argument.
-	
+
 	Patches supplied by Ladislav Michnovic.
-	
+
 2008-02-08  Sergey Poznyakoff  
 
 	* po/POTFILES.in: Add missing files.
@@ -34,7 +34,7 @@
 2007-12-05  Sergey Poznyakoff  
 
 	Fix mingw build. Thanks to Robert Millan.
-	
+
 	* NEWS, THANKS: Update.
 	* bootstrap: Create lib/system.c, m4/sysdep.m4, update lib/system.h.
 	* mingw.m4, sysdep.m4: New files.
@@ -56,7 +56,7 @@
 2007-06-28  Sergey Poznyakoff  
 
 	* bootstrap: Update for the change of the TP URL
-	
+
 	* NEWS: Update
 	* src/extern.h, src/makepath.c (make_path): Remove mode
 	argument. All callers updated.
@@ -70,7 +70,7 @@
 	* src/extern.h (newdir_umask): New global
 	(delay_set_stat,repair_delayed_set_stat)
 	(apply_delayed_set_stat): New functions
-	
+
 	* src/global.c (newdir_umask): New global
 	* src/idcache.c: Include xalloc.h
 	* src/main.c: New warning control option -W interdir
@@ -107,12 +107,12 @@
 	* src/copyin.c, src/copyout.c, src/copypass.c: Update calls to
 	set_perms.
 	* src/makepath.c: Remove useless includes.
-	
+
 	* src/util.c (set_perms, stat_to_cpio): Use CPIO_UID and CPIO_GID
 	macros to set uid and gid
 	* src/main.c (process_args): Allow to use --owner in copy-out mode.
 	* THANKS: Add Mike Frysinger
-	
+
 2007-05-18  Sergey Poznyakoff  
 
 	* bootstrap: Update from tar repository
@@ -138,7 +138,7 @@
 	* src/Makefile.am: Update
 	* src/main.c, src/mt.c: Include rmt-command.h instead of localedir.h
 	* .cvsignore, doc/.cvsignore: Sort
-	
+
 	* src/util.c (sparse_write): Static.  Provide a forward
 	declaration. Define enum sparse_wri

bug#50115: date command arithmetic involving the epoch produces "invalid date"

2022-02-05 Thread Paul Eggert
Thanks for the bug report. I installed the attached patches to Gnulib 
and to Coreutils, and the fix should be in the next Coreutils release.

From aa0d1e7800903f2d75432d78aa64a0e9770e83f2 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 5 Feb 2022 11:05:44 -0800
Subject: [PATCH] parse-datetime: allow calculations to yield -1

Problem reported by Jeremy Cantrell <https://bugs.gnu.org/50115>.
* lib/parse-datetime.y (parse_datetime_body): When calling mktime,
use an unmodified and negative tm_wday or tm_yday to detect an error,
as a (time_t) -1 return value is valid on most hosts.
* tests/test-parse-datetime.c (main): Add a test for the bug.
---
 ChangeLog   |  9 +
 lib/parse-datetime.y| 22 +++---
 tests/test-parse-datetime.c |  8 
 3 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 5445802ea2..18dcb3fe3f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2022-02-05  Paul Eggert  
+
+	parse-datetime: allow calculations to yield -1
+	Problem reported by Jeremy Cantrell <https://bugs.gnu.org/50115>.
+	* lib/parse-datetime.y (parse_datetime_body): When calling mktime,
+	use an unmodified and negative tm_wday or tm_yday to detect an error,
+	as a (time_t) -1 return value is valid on most hosts.
+	* tests/test-parse-datetime.c (main): Add a test for the bug.
+
 2022-02-04  Paul Eggert  
 
 	userspec: help fix GNU ‘id’ incompatibility
diff --git a/lib/parse-datetime.y b/lib/parse-datetime.y
index c40fdcef7f..9fc14c9d46 100644
--- a/lib/parse-datetime.y
+++ b/lib/parse-datetime.y
@@ -2076,21 +2076,20 @@ parse_datetime_body (struct timespec *result, char const *p,
   if (pc.days_seen && ! pc.dates_seen)
 {
   intmax_t dayincr;
-  if (INT_MULTIPLY_WRAPV ((pc.day_ordinal
-   - (0 < pc.day_ordinal
-  && tm.tm_wday != pc.day_number)),
-  7, &dayincr)
-  || INT_ADD_WRAPV ((pc.day_number - tm.tm_wday + 7) % 7,
-dayincr, &dayincr)
-  || INT_ADD_WRAPV (dayincr, tm.tm_mday, &tm.tm_mday))
-Start = -1;
-  else
+  tm.tm_yday = -1;
+  if (! (INT_MULTIPLY_WRAPV ((pc.day_ordinal
+  - (0 < pc.day_ordinal
+ && tm.tm_wday != pc.day_number)),
+ 7, &dayincr)
+ || INT_ADD_WRAPV ((pc.day_number - tm.tm_wday + 7) % 7,
+   dayincr, &dayincr)
+ || INT_ADD_WRAPV (dayincr, tm.tm_mday, &tm.tm_mday)))
 {
   tm.tm_isdst = -1;
   Start = mktime_z (tz, &tm);
 }
 
-  if (Start == (time_t) -1)
+  if (tm.tm_yday < 0)
 {
   if (debugging ())
 dbg_printf (_("error: day '%s' "
@@ -2156,8 +2155,9 @@ parse_datetime_body (struct timespec *result, char const *p,
   tm.tm_min = tm0.tm_min;
   tm.tm_sec = tm0.tm_sec;
   tm.tm_isdst = tm0.tm_isdst;
+  tm.tm_wday = -1;
   Start = mktime_z (tz, &tm);
-  if (Start == (time_t) -1)
+  if (tm.tm_wday < 0)
 {
   if (debugging ())
 dbg_printf (_("error: adding relative date resulted "
diff --git a/tests/test-parse-datetime.c b/tests/test-parse-datetime.c
index 059c810cd1..1e7955bc96 100644
--- a/tests/test-parse-datetime.c
+++ b/tests/test-parse-datetime.c
@@ -398,6 +398,14 @@ main (_GL_UNUSED int argc, char **argv)
   ASSERT (result.tv_sec == thur2 + ((i + 3) % 7 - 7) * 24 * 3600);
 }
 
+  p = "1970-12-31T23:59:59+00:00 - 1 year";  /* Bug#50115 */
+  now.tv_sec = -1;
+  now.tv_nsec = 0;
+  ASSERT (parse_datetime (&result, p, &now));
+  LOG (p, now, result);
+  ASSERT (result.tv_sec == now.tv_sec
+  && result.tv_nsec == now.tv_nsec);
+
   p = "THURSDAY UTC+00";  /* The epoch was on Thursday.  */
   now.tv_sec = 0;
   now.tv_nsec = 0;
-- 
2.32.0
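
The technique named in the commit message above can be shown in isolation.
The following is a minimal sketch, not the parse-datetime code: it plants a
negative tm_wday before calling mktime and treats a still-negative value
afterwards as the error indication, so that a legitimate result of
(time_t) -1 is not mistaken for failure. It relies on mktime leaving the
field untouched when it fails. The function names are hypothetical, and the
second helper forces TZ to UTC0 only to make the example reproducible.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical wrapper: convert *TM with mktime and store the result
   in *RESULT.  Return false on failure, detected through the tm_wday
   sentinel rather than through the return value.  */
bool
mktime_checked (struct tm *tm, time_t *result)
{
  assert (tm != NULL && result != NULL);
  tm->tm_wday = -1;		/* Sentinel: mktime sets this on success.  */
  time_t t = mktime (tm);
  if (tm->tm_wday < 0)
    return false;
  *result = t;
  return true;
}

/* Convert 1969-12-31 23:59:59 UTC, one second before the epoch.
   A check of the form "t == (time_t) -1" would wrongly reject it.  */
bool
last_second_of_1969_utc (time_t *result)
{
  setenv ("TZ", "UTC0", 1);
  tzset ();
  struct tm tm = { .tm_year = 69, .tm_mon = 11, .tm_mday = 31,
		   .tm_hour = 23, .tm_min = 59, .tm_sec = 59,
		   .tm_isdst = 0 };
  return mktime_checked (&tm, result);
}
```

On hosts where time_t can represent times before the epoch, the second
function is expected to succeed and yield -1, which is the case that the
old "Start == (time_t) -1" test misreported as an invalid date.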

From cf6c84989968c5081c683bbef77825fc35e03c9d Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 5 Feb 2022 11:08:45 -0800
Subject: [PATCH 1/2] build: update gnulib submodule to latest

---
 gnulib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gnulib b/gnulib
index ff208d546..aa0d1e780 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit ff208d546a26fee39a0191297c11560da74b5dee
+Subproject commit aa0d1e7800903f2d75432d78aa64a0e9770e83f2
-- 
2.32.0

From 8a3dedfef9479c53cd9016139ce00d58a6006ba2 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 5 Feb 2022 13:46:44 -0800
Subject: [PATCH 2/2] date: test against bug#50115

* tests/misc/date.pl: Add test.
---
 tests/misc/date.pl | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/misc/date.pl b/tests/misc/date.pl
index e9de8e453..

bug#51288: Break date SYNOPSIS into two sections

2022-02-04 Thread Paul Eggert
Thanks for reporting the problem. It'd be a bit of a pain to implement 
your suggestion exactly since the synopsis is generated automatically. 
However, I installed the attached patch to address the confusion 
that you reported.


It's been years since I set the date by hand but we might as well be 
clear about it, if only for nostalgia's sake.

From 45f8f2dd2b54c7f4745c277e3f52f3c99cea5b57 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 4 Feb 2022 18:21:06 -0800
Subject: [PATCH] date: improve doc

Problem reported by Dan Jacobson (Bug#51288).
* doc/coreutils.texi (date invocation, Setting the time)
(Options for date):
* src/date.c (usage): Improve doc.
---
 doc/coreutils.texi | 23 +++
 src/date.c |  3 ++-
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 088d1764c..d3bbf5768 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -15976,9 +15976,14 @@ Synopses:
 @example
 date [@var{option}]@dots{} [+@var{format}]
 date [-u|--utc|--universal] @c this avoids a newline in the output
-[ MMDDhhmm[[CC]YY][.ss] ]
+[@var{MMDDhhmm}[[@var{CC}]@var{YY}][.@var{ss}]]
 @end example
 
+The @command{date} command displays the date and time.
+With the @option{--set} (@option{-s}) option, or with
+@samp{[@var{MMDDhhmm}[[@var{CC}]@var{YY}][.@var{ss}]]},
+it sets the date and time.
+
 @vindex LC_TIME
 Invoking @command{date} with no @var{format} argument is equivalent to invoking
 it with a default format that depends on the @env{LC_TIME} locale category.
@@ -16312,17 +16317,18 @@ modifiers are GNU extensions.
 @cindex time setting
 @cindex appropriate privileges
 
-If given an argument that does not start with @samp{+}, @command{date} sets
-the system clock to the date and time specified by that argument (as
-described below).  You must have appropriate privileges to set the
-system clock.  Note for changes to persist across a reboot, the
+You must have appropriate privileges to set the
+system clock.  For changes to persist across a reboot, the
 hardware clock may need to be updated from the system clock, which
 might not happen automatically on your system.
 
-The argument must consist entirely of digits, which have the following
-meaning:
+To set the clock, you can use the @option{--set} (@option{-s}) option
+(@pxref{Options for date}).  To set the clock without using GNU
+extensions, you can give @command{date} an argument of the form
+@samp{@var{MMDDhhmm}[[@var{CC}]@var{YY}][.@var{ss}]} where each two-letter
+component stands for two digits with the following meanings:
 
-@table @samp
+@table @var
 @item MM
 month
 @item DD
@@ -16352,6 +16358,7 @@ relative to Universal Time rather than to the local time zone.
 @cindex options for @command{date}
 
 The program accepts the following options.  Also see @ref{Common options}.
+Except for @option{-u}, these options are all GNU extensions to POSIX.
 
 @table @samp
 
diff --git a/src/date.c b/src/date.c
index 0915d7c64..163141adc 100644
--- a/src/date.c
+++ b/src/date.c
@@ -135,7 +135,8 @@ Usage: %s [OPTION]... [+FORMAT]\n\
 "),
   program_name, program_name);
   fputs (_("\
-Display the current time in the given FORMAT, or set the system date.\n\
+Display date and time in the given FORMAT.\n\
+With -s, or with [MMDDhhmm[[CC]YY][.ss]], set the date and time.\n\
 "), stdout);
 
   emit_mandatory_arg_note ();
-- 
2.34.1
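
To make the MMDDhhmm[[CC]YY][.ss] form concrete, here is a minimal C sketch
that splits such an argument into its two-digit components. It is not the
code in src/date.c; the struct and function names are made up for
illustration, the range checks are deliberately crude (no month lengths),
and it does not decide which century a bare YY belongs to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed components.  */
struct date_fields
{
  int month, day, hour, minute;
  int century;			/* -1 if CC is absent */
  int year;			/* two-digit YY, -1 if absent */
  int second;			/* -1 if .ss is absent */
};

static int
two_digits (const char *p)
{
  return (p[0] - '0') * 10 + (p[1] - '0');
}

/* Parse S of the form MMDDhhmm[[CC]YY][.ss] into *F.
   Return true on success.  On failure *F may be partly filled in.  */
bool
parse_date_fields (const char *s, struct date_fields *f)
{
  assert (s != NULL && f != NULL);

  /* 8 digits: MMDDhhmm; 10: plus YY; 12: plus CCYY.  */
  size_t ndigits = strspn (s, "0123456789");
  if (ndigits != 8 && ndigits != 10 && ndigits != 12)
    return false;

  const char *rest = s + ndigits;
  int second = -1;
  if (*rest == '.')
    {
      if (strspn (rest + 1, "0123456789") != 2 || rest[3] != '\0')
	return false;
      second = two_digits (rest + 1);
    }
  else if (*rest != '\0')
    return false;

  f->month = two_digits (s);
  f->day = two_digits (s + 2);
  f->hour = two_digits (s + 4);
  f->minute = two_digits (s + 6);
  f->century = ndigits == 12 ? two_digits (s + 8) : -1;
  f->year = ndigits == 8 ? -1 : two_digits (s + ndigits - 2);
  f->second = second;

  return (1 <= f->month && f->month <= 12
	  && 1 <= f->day && f->day <= 31
	  && f->hour <= 23 && f->minute <= 59
	  && second <= 60);
}
```

For example, "020418212022.06" splits into month 02, day 04, 18:21,
century 20, year 22 and second 06, while "02041821" leaves the century,
year and second unset.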



bug#53631: coreutils id(1) incorrect behavior

2022-02-04 Thread Paul Eggert
Thanks for the bug report. I installed the attached patch, which I hope 
fixes things for you, and am boldly closing the bug report.


This fix depends on the latest lib/userspec.c from Gnulib; see 
<https://lists.gnu.org/r/bug-gnulib/2022-02/msg0.html>.

From 1204c5132d61efbb966fb2a94b4dc7463beddfe1 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 4 Feb 2022 14:43:31 -0800
Subject: [PATCH] id: print groups of listed name

Problem reported by Vladimir D. Seleznev (Bug#53631).
* src/id.c (main): Do not canonicalize user name before
deciding what groups the user belongs to.
---
 NEWS |  3 +++
 src/id.c | 30 ++
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/NEWS b/NEWS
index a4ba0fce6..fcf31fe39 100644
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,9 @@ GNU coreutils NEWS-*- outline -*-
   and B is in some other file system.
   [bug introduced in coreutils-9.0]
 
+  'id xyz' now uses the name 'xyz' to determine groups, instead of xyz's uid.
+  [bug introduced in coreutils-8.22]
+
   On macOS, 'mv A B' no longer fails with "Operation not supported"
   when A and B are in the same tmpfs file system.
   [bug introduced in coreutils-9.0]
diff --git a/src/id.c b/src/id.c
index 2d969cc1e..f7625b6fb 100644
--- a/src/id.c
+++ b/src/id.c
@@ -127,7 +127,6 @@ main (int argc, char **argv)
   int optc;
   int selinux_enabled = (is_selinux_enabled () > 0);
   bool smack_enabled = is_smack_enabled ();
-  char *pw_name = NULL;
 
   initialize_main (&argc, &argv);
   set_program_name (argv[0]);
@@ -235,6 +234,7 @@ main (int argc, char **argv)
   /* For each username/userid to get its pw_name field */
   for (; optind < n_ids; optind++)
 {
+  char *pw_name = NULL;
   struct passwd *pwd = NULL;
   char const *spec = argv[optind];
   /* Disallow an empty spec here as parse_user_spec() doesn't
@@ -242,24 +242,22 @@ main (int argc, char **argv)
  specify a noop or "reset special bits" depending on the system.  */
   if (*spec)
 {
-  if (parse_user_spec (spec, &euid, NULL, NULL, NULL) == NULL)
-{
-  /* parse_user_spec will only extract a numeric spec,
- so we lookup that here to verify and also retrieve
- the PW_NAME used subsequently in group lookup.  */
-  pwd = getpwuid (euid);
-}
+  if (parse_user_spec (spec, &euid, NULL, &pw_name, NULL) == NULL)
+pwd = pw_name ? getpwnam (pw_name) : getpwuid (euid);
 }
   if (pwd == NULL)
 {
-  error (0, errno, _("%s: no such user"), quote (argv[optind]));
+  error (0, errno, _("%s: no such user"), quote (spec));
   ok &= false;
-  continue;
 }
-  pw_name = xstrdup (pwd->pw_name);
-  ruid = euid = pwd->pw_uid;
-  rgid = egid = pwd->pw_gid;
-  print_stuff (pw_name);
+  else
+{
+  if (!pw_name)
+pw_name = xstrdup (pwd->pw_name);
+  ruid = euid = pwd->pw_uid;
+  rgid = egid = pwd->pw_gid;
+  print_stuff (pw_name);
+}
   free (pw_name);
 }
 }
@@ -301,7 +299,7 @@ main (int argc, char **argv)
   if (rgid == NO_GID && errno)
 die (EXIT_FAILURE, errno, _("cannot get real GID"));
 }
-print_stuff (pw_name);
+print_stuff (NULL);
 }
 
   return ok ? EXIT_SUCCESS : EXIT_FAILURE;
@@ -434,7 +432,7 @@ print_stuff (char const *pw_name)
   if (just_user)
   print_user (use_real ? ruid : euid);
 
-  /* print_group and print_group_lists functions return true on successful
+  /* print_group and print_group_list return true on successful
  execution but false if something goes wrong. We then AND this value with
  the current value of 'ok' because we want to know if one of the previous
  users faced a problem in these functions. This value of 'ok' is later used
-- 
2.34.1



bug#50745: coreutils-8.32 gnulib test results on hppa HP-UX 11.11

2022-01-28 Thread Paul Eggert

Thanks, that's a Gnulib report so I've forwarded it to bug-gnulib here:

https://lists.gnu.org/r/bug-gnulib/2022-01/msg00177.html

and am closing the Coreutils bug.





bug#51345: dd with conv=fsync sometimes returns when its writes are still cached

2022-01-28 Thread Paul Eggert
I found a bit of time to work on this and installed the attached patch, 
which should address the issue. Thanks for reporting it.

From 3368b8745046aeaa89f418f560e714b374f1a560 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 28 Jan 2022 00:01:07 -0800
Subject: [PATCH] dd: synchronize output after write errors

Problem reported by Sworddragon (Bug#51345).
* src/dd.c (cleanup): Synchronize output unless dd has been interrupted.
(synchronize_output): New function, split out from dd_copy.
Update conversions_mask so synchronization is done at most once.
(main): Do not die with the output file open, since we want to be
able to synchronize it before exiting.  Synchronize output before
exiting.
---
 NEWS   |  3 ++
 doc/coreutils.texi | 10 +++---
 src/dd.c   | 76 +-
 3 files changed, 64 insertions(+), 25 deletions(-)

diff --git a/NEWS b/NEWS
index 15c9428bd..757abee15 100644
--- a/NEWS
+++ b/NEWS
@@ -41,6 +41,9 @@ GNU coreutils NEWS-*- outline -*-
   padding them with zeros to 9 digits.  It uses clock_getres and
   clock_gettime to infer the clock resolution.
 
+  dd conv=fsync now synchronizes output even after a write error,
+  and similarly for dd conv=fdatasync.
+
   timeout --foreground --kill-after=... will now exit with status 137
   if the kill signal was sent, which is consistent with the behavior
   when the --foreground option is not specified.  This allows users to
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index c17406550..088d1764c 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9397,14 +9397,16 @@ Continue after read errors.
 @item fdatasync
 @opindex fdatasync
 @cindex synchronized data writes, before finishing
-Synchronize output data just before finishing.  This forces a physical
-write of output data.
+Synchronize output data just before finishing,
+even if there were write errors.
+This forces a physical write of output data.
 
 @item fsync
 @opindex fsync
 @cindex synchronized data and metadata writes, before finishing
-Synchronize output data and metadata just before finishing.  This
-forces a physical write of output data and metadata.
+Synchronize output data and metadata just before finishing,
+even if there were write errors.
+This forces a physical write of output data and metadata.
 
 @end table
 
diff --git a/src/dd.c b/src/dd.c
index 957ad129e..4ddc6db12 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -939,6 +939,8 @@ iclose (int fd)
   return 0;
 }
 
+static int synchronize_output (void);
+
 static void
 cleanup (void)
 {
@@ -948,6 +950,13 @@ cleanup (void)
   alignfree (obuf);
 #endif
 
+  if (!interrupt_signal)
+{
+  int sync_status = synchronize_output ();
+  if (sync_status)
+exit (sync_status);
+}
+
   if (iclose (STDIN_FILENO) != 0)
 die (EXIT_FAILURE, errno, _("closing input file %s"), quoteaf (input_file));
 
@@ -2377,17 +2386,33 @@ dd_copy (void)
   && 0 <= reported_w_bytes && reported_w_bytes < w_bytes)
 print_xfer_stats (0);
 
-  if ((conversions_mask & C_FDATASYNC) && ifdatasync (STDOUT_FILENO) != 0)
+  return exit_status;
+}
+
+/* Synchronize output according to conversions_mask.
+   Do this even if w_bytes is zero, as fsync and fdatasync
+   flush out write requests from other processes too.
+   Clear bits in conversions_mask so that synchronization is done only once.
+   Return zero if successful, an exit status otherwise.  */
+
+static int
+synchronize_output (void)
+{
+  int exit_status = 0;
+  int mask = conversions_mask;
+  conversions_mask &= ~ (C_FDATASYNC | C_FSYNC);
+
+  if ((mask & C_FDATASYNC) && ifdatasync (STDOUT_FILENO) != 0)
 {
   if (errno != ENOSYS && errno != EINVAL)
 {
   error (0, errno, _("fdatasync failed for %s"), quoteaf (output_file));
   exit_status = EXIT_FAILURE;
 }
-  conversions_mask |= C_FSYNC;
+  mask |= C_FSYNC;
 }
 
-  if ((conversions_mask & C_FSYNC) && ifsync (STDOUT_FILENO) != 0)
+  if ((mask & C_FSYNC) && ifsync (STDOUT_FILENO) != 0)
 {
   error (0, errno, _("fsync failed for %s"), quoteaf (output_file));
   return EXIT_FAILURE;
@@ -2460,6 +2485,16 @@ main (int argc, char **argv)
| (conversions_mask & C_EXCL ? O_EXCL : 0)
| (seek_records || (conversions_mask & C_NOTRUNC) ? 0 : O_TRUNC));
 
+  off_t size;
+  if ((INT_MULTIPLY_WRAPV (seek_records, output_blocksize, &size)
+   || INT_ADD_WRAPV (seek_bytes, size, &size))
+  && !(conversions_mask & C_NOTRUNC))
+die (EXIT_FAILURE, 0,
+ _("offset too large: "
+   "cannot truncate to a length of seek=%"PRIdMAX""
+   " (%td-byte) blocks"),
+ seek_records, output_blocksize);
+
   /* Open the output file w

bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

2022-01-27 Thread Paul Eggert

On 11/7/21 23:04, Paul Eggert wrote:


https://github.com/openzfs/zfs/issues/11900#issuecomment-962812974


Apparently the OpenZFS bug has been fixed, as behlendorf closed it 20 
days ago.


Since there doesn't seem to be a good way for coreutils to work around 
the bug, and the bug potentially affects all apps that use SEEK_DATA, 
I'm taking the liberty of closing the coreutils bug report. Thanks for 
reporting it.






bug#51482: dd with status=progress does not update its output when the main writing is finished

2022-01-27 Thread Paul Eggert

On 10/29/21 07:22, Sworddragon wrote:


When dd is being used with status=progress it appears to update the status
every second but does not do a final update when dd finished its main
writing task (e.g. when dd starts flushing via conv=fsync and it still
blocks for like over a minute) causing the output to be incorrect and
inconsistent across multiple tries.


Thanks for mentioning that. I installed the attached patch to implement 
your suggestion, and am boldly closing the bug report. Please give it a 
try when you have the chance.

From 4cda71156464d20789bac5b31f6a1ea36b183edd Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 27 Jan 2022 18:34:09 -0800
Subject: [PATCH] dd: output final progress before syncing

Problem reported by Sworddragon (Bug#51482).
* src/dd.c (reported_w_bytes): New var.
(print_xfer_stats): Set it.
(dd_copy): Print a final progress report if useful before
synchronizing output data.
---
 NEWS |  4 
 src/dd.c | 12 
 2 files changed, 16 insertions(+)

diff --git a/NEWS b/NEWS
index 561087ccc..15c9428bd 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,10 @@ GNU coreutils NEWS-*- outline -*-
 
   The new 'date' option --resolution outputs the timestamp resolution.
 
+  With conv=fdatasync or conv=fsync, dd status=progress now reports
+  any extra final progress just before synchronizing output data,
+  since synchronizing can take a long time.
+
   sort --debug now diagnoses issues with --field-separator characters
   that conflict with characters possibly used in numbers.
 
diff --git a/src/dd.c b/src/dd.c
index a6a3708f1..957ad129e 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -196,6 +196,9 @@ static intmax_t r_full = 0;
 /* Number of bytes written.  */
 static intmax_t w_bytes = 0;
 
+/* Last-reported number of bytes written, or negative if never reported.  */
+static intmax_t reported_w_bytes = -1;
+
 /* Time that dd started.  */
 static xtime_t start_time;
 
@@ -815,6 +818,8 @@ print_xfer_stats (xtime_t progress_time)
 }
   else
 fputc ('\n', stderr);
+
+  reported_w_bytes = w_bytes;
 }
 
 static void
@@ -2365,6 +2370,13 @@ dd_copy (void)
 }
 }
 
+  /* fdatasync/fsync can take a long time, so issue a final progress
+ indication now if progress has been made since the previous indication.  */
+  if (conversions_mask & (C_FDATASYNC | C_FSYNC)
+  && status_level == STATUS_PROGRESS
+  && 0 <= reported_w_bytes && reported_w_bytes < w_bytes)
+print_xfer_stats (0);
+
   if ((conversions_mask & C_FDATASYNC) && ifdatasync (STDOUT_FILENO) != 0)
 {
   if (errno != ENOSYS && errno != EINVAL)
-- 
2.32.0
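
For anyone wanting to try the change, here is a minimal sketch of an invocation that exercises the new code path. It assumes a GNU dd that supports status=progress; the scratch files and sizes are arbitrary choices for illustration, and a transfer this small will normally finish before any progress line is due.

```sh
# Scratch files; the names are arbitrary and only for this illustration.
out=$(mktemp)
log=$(mktemp)

# conv=fsync makes dd synchronize the output before exiting, and
# status=progress asks for periodic transfer statistics on stderr.
dd if=/dev/zero of="$out" bs=1024 count=64 conv=fsync status=progress 2>"$log"

# Size of what was written, in bytes.
size=$(wc -c <"$out")
echo "wrote $size bytes"

rm -f "$out" "$log"
```

With a much larger count and a slow output device, the last progress line should appear before dd blocks in the sync rather than after it.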



bug#53262: chmod 9.0 failing, where 8.29 succeeds - but no error message

2022-01-14 Thread Paul Eggert
Thanks for reporting that. This is a duplicate of bug#50784[1], which 
was fixed[2] in September.


Perhaps we should generate a new Coreutils soon, since this bug has been 
reported three times now.


[1]: https://bugs.gnu.org/50784
[2]: 
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=e8b56ebd536e82b15542a00c888109471936bfda






bug#53037: df/total-verify fail with cephfs

2022-01-05 Thread Paul Eggert

On 1/5/22 15:25, Dylan Simon wrote:


Hrm, no, with this patch it still fails, but differently (sorry so many
filesystems):


OK, then perhaps someone with a bit more free time will have to look at 
it - unless you can propose a patch that passes "make check".






bug#53037: df/total-verify fail with cephfs

2022-01-05 Thread Paul Eggert

On 1/5/22 14:11, Dylan Simon wrote:

Then it will look like this (I'm inferring, haven't actually tried it):


I'm still not quite following, but does the attached patch address the 
problem?
diff --git a/src/df.c b/src/df.c
index b803fc73b..8a0293ca9 100644
--- a/src/df.c
+++ b/src/df.c
@@ -127,6 +127,7 @@ static bool print_grand_total;
 
 /* Grand total data.  */
 static struct fs_usage grand_fsu;
+static bool grand_fsu_fsu_files_top_bit_set;
 
 /* Display modes.  */
 enum
@@ -993,8 +994,11 @@ get_field_values (struct field_values_t *bv,
 static void
 add_to_grand_total (struct field_values_t *bv, struct field_values_t *iv)
 {
-  if (known_value (iv->total))
-grand_fsu.fsu_files += iv->total;
+  if (known_value (iv->total) && known_value (iv->available_to_root))
+add_uint_with_neg_flag (&grand_fsu.fsu_files,
+&grand_fsu_fsu_files_top_bit_set,
+iv->total - iv->available_to_root,
+iv->total < iv->available_to_root);
   if (known_value (iv->available))
 grand_fsu.fsu_ffree += iv->available;
 
@@ -1860,6 +1864,9 @@ main (int argc, char **argv)
  NULL, NULL, NULL, false, false, &grand_fsu, false);
 
   print_table ();
+
+  if (print_grand_total & grand_fsu_fsu_files_top_bit_set)
+die (EXIT_FAILURE, 0, "iused < 0");
 }
   else
 {


bug#53037: df/total-verify fail with cephfs

2022-01-05 Thread Paul Eggert

On 1/5/22 11:27, Dylan Simon wrote:

Only adding rows with all known values
might make sense but would still break the test (wrong total total instead):

   if (known_value (iv->total) && known_value (iv->available)) {
 grand_fsu.fsu_files += iv->total;
 grand_fsu.fsu_ffree += iv->available;
   }


Sorry, I'm not quite following. If you make the above change, what will 
the output look like instead? And how will that break the test?






bug#53025: Encouragement to go back to *dis*abled quotation marks in ls output as *default* behavior

2022-01-05 Thread Paul Eggert

On 1/5/22 00:44, Joerg M. Sigle wrote:

"When this many people consider a thing a bug, then it's a bug whether maintainers 
disagree or not."


By that standard it'll be a bug no matter what the maintainers do, since 
feelings are strong on both sides of the issue. So by this reasoning, 
maintainers might as well do nothing.


In the meantime you can avoid the issue yourself by using this:

alias ls="ls -N"

in your .profile or whatever.





bug#17774: AIX and lbracket ([) program - will not install on AIX using installp

2021-12-31 Thread Paul Eggert

On 6/14/14 10:28, Paul Eggert wrote:

That part of POSIX has been standardized since POSIX.2 (IEEE Std 
1003.2-1992), and the wording hasn't changed since then if I recall 
correctly, so AIX has had this conformance bug for decades and nobody 
has cared


I just ran into a related problem when building bleeding-edge coreutils 
on AIX 7.1, and so I thought I'd document it here. On AIX 7.1, 'make 
check' fails with:


FAIL: tests/misc/help-version.sh

because 'make' never built a '[' program. And AIX 'make' doesn't build a 
'[' program because it mishandles '[' in 'make' macros, which means that 
with Makefiles like this:


  EXEEXT =
  bin_PROGRAMS = ... src/[$(EXEEXT) ...
  PROGRAMS = $(bin_PROGRAMS) ...
  all-am: ... $(PROGRAMS) ...

'make all-am' does not "see" the 'src/[' and so doesn't build it.

Because of this bug, 'make install' obviously will not work correctly, 
as there's no '[' command to install.


A simple workaround is to use GNU Make, which doesn't have this bug.

This bug in AIX 'make' is so obscure that I'm not going to bother 
documenting it in the Autoconf manual under its portability guidelines 
for 'make'. Anyway, nowadays "just use GNU Make" is a good recipe for 
just about every package other than GNU Make itself.
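
A quick way to notice the symptom before running 'make check' is to test for the program directly. The following is a sketch only: the function name is made up for illustration, and it assumes the layout shown above, with the program at src/[ under the build directory.

```sh
# Hypothetical helper: succeed if DIR/src/[ exists and is executable.
have_lbracket () {
  test -x "$1/src/["
}

# Report on the current build directory.
if have_lbracket .; then
  echo "src/[ was built"
else
  echo "src/[ is missing; try rebuilding with GNU Make"
fi
```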






bug#52873: expr unexpected syntax error

2021-12-29 Thread Paul Eggert

On 12/29/21 12:01, Martin Rixham wrote:

What nonsense. I want to parse source code. ')' is not an uncommon line of
source code. It should work.


Unfortunately, you're asking for what is in general impossible. If the 
left argument of ':' could be any string, then the grammar for 'expr' 
would be ambiguous. Consider the following shell command:


expr '(' : ')'

This outputs ':' because it evaluates the parenthesized string ':'; but 
if the operands of ':' could be any strings it could also be interpreted 
as matching '(' against ')', which means it should output the same thing 
as 'expr a : b', namely '0'.


Of course this means 'expr' was poorly designed in the 1970s, but we're 
stuck with that design now (it's standardized by POSIX), portable code 
must deal with this poor design, and for compatibility reasons it's 
better for GNU expr to support the design, poor as it is.


These days there are much better ways than 'expr' to parse code. For 
example, if you want to count the number of characters in a shell 
variable v, you can use this shell command:


nv=${#v}

This works even if v=')', whereas this:

nv=$(expr "$v" : '.*')

has the bug that you mentioned, plus it's harder to read and it's less 
efficient.
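
To make the comparison concrete, here is a small sketch that runs both forms on the problematic input. The names nv_param, nv_expr and status are introduced only for this illustration. With GNU expr the second form is expected to fail with a syntax error, as reported in this bug; other implementations may behave differently, so the sketch just records the exit status.

```sh
v=')'

# Parameter expansion: no external process and no parsing ambiguity.
nv_param=${#v}

# The expr form from the report; capture its exit status instead of
# letting a failure abort the script.
if nv_expr=$(expr "$v" : '.*' 2>/dev/null); then
  status=0
else
  status=$?
fi

echo "parameter expansion: $nv_param"
echo "expr exit status: $status (output: '$nv_expr')"
```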






bug#52873: expr unexpected syntax error

2021-12-29 Thread Paul Eggert

On 12/29/21 08:31, Davide Brini wrote:

I think you need to use '+' before the offending token


Yes. That's a GNU extension. If you want to be portable to any POSIX 
implementation, you can use this instead:


expr "X(" : '.*' - 1

A similar example is given in the POSIX spec for 'expr':

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/expr.html

As this is not a bug, I'm closing the bug report.
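
The portable idiom above can be wrapped up as follows. This is a sketch, and the function name expr_length is made up for illustration. The leading X guarantees that the left operand of ':' is an ordinary string, and subtracting 1 removes the X from the count.

```sh
# Hypothetical helper: print the length of $1 using only POSIX expr.
expr_length () {
  expr "X$1" : '.*' - 1
}

# Strings that expr could otherwise mistake for operators or keywords.
for s in '(' ')' ':' '+' 'match'; do
  printf '%s has length %s\n' "$s" "$(expr_length "$s")"
done
```

One caveat: for an empty argument the result is 0, and expr then exits with status 1, so callers that check the exit status need to allow for that.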





bug#52844: missing perl not recognized properly by configure

2021-12-28 Thread Paul Eggert

On 12/27/21 23:07, Serge Belyshev wrote:


-case $PERL in *"/missing "*) cu_have_perl=no;; esac
+case $PERL in */missing*) cu_have_perl=no;; esac


Thanks for the bug report and suggested fix. On the whole I think it'd 
be better to address the nearby FIXME instead, so I did that by 
installing this into Gnulib:


https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=8220e0f0b5f46ff61e1d19f8a1614508fa162abd

and the attached into Coreutils. Please give it a try. In the meantime 
I'll assume this will fix the bug for you and so am boldly closing the 
bug report; if that's wrong we can reopen it.
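
The difference between the two patterns in the suggested fix can be seen with a small sketch. The PERL value below is an assumption chosen for illustration (a quoted path to the 'missing' script), not necessarily what configure produced in the reported case.

```sh
# Illustrative value only: an assumed shape for $PERL when perl is absent.
PERL="/bin/sh '/src/build-aux/missing' perl"

# Old pattern: needs "/missing" followed directly by a space.
old_match=no
case $PERL in *"/missing "*) old_match=yes;; esac

# Suggested pattern: "/missing" followed by anything.
new_match=no
case $PERL in */missing*) new_match=yes;; esac

echo "old pattern: $old_match, new pattern: $new_match"
```

Here the closing quote sits between "missing" and the space, so only the looser pattern matches. Relying on gl_cv_prog_perl avoids pattern-matching on $PERL altogether.
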
From c7bbfeb80c4be8d55074214e98c236eed8015a15 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 28 Dec 2021 01:53:44 -0800
Subject: [PATCH 1/2] build: update gnulib submodule to latest

---
 gnulib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gnulib b/gnulib
index f67a7185e..8220e0f0b 16
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit f67a7185e8bee0becde9a6992755d2afa1ca6531
+Subproject commit 8220e0f0b5f46ff61e1d19f8a1614508fa162abd
-- 
2.32.0

From 91042c4d1e5b7b56bf3f7ae9f4c9abc45927809d Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 28 Dec 2021 02:03:21 -0800
Subject: [PATCH 2/2] build: be more careful about Perl

Problem reported by Serge Belyshev (Bug#52844).
* configure.ac (HAVE_PERL): Rely on latest Gnulib gl_PERL, which
sets gl_cv_prog_perl.
---
 configure.ac | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/configure.ac b/configure.ac
index 9b8ea0dde..b982d7e9a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -64,11 +64,7 @@ gl_INIT
 coreutils_MACROS
 
 # The test suite needs to know if we have a working perl.
-# FIXME: this is suboptimal.  Ideally, we would be able to call gl_PERL
-# with an ACTION-IF-NOT-FOUND argument ...
-cu_have_perl=yes
-case $PERL in *"/missing "*) cu_have_perl=no;; esac
-AM_CONDITIONAL([HAVE_PERL], [test $cu_have_perl = yes])
+AM_CONDITIONAL([HAVE_PERL], [test "$gl_cv_prog_perl" != no])
 
 # gl_GCC_VERSION_IFELSE([major], [minor], [run-if-found], [run-if-not-found])
 # 
-- 
2.32.0



bug#52782: Man Page: Incorrect Summary of --color Option

2021-12-24 Thread Paul Eggert

On 12/24/21 10:40, Pranab Lawrence Ekka Dasgupta wrote:

Perhaps it could be reworded to "default if WHEN is omitted", although
that does cause some repetition


Thanks, I gave that sort of thing a shot by installing the attached, and 
am closing the bug report.
From 2269ea5aa2b8579ed2ec22407f55de4fd7218e4b Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 24 Dec 2021 15:25:29 -0800
Subject: [PATCH] ls: improve doc for =WHEN

* src/ls.c (usage): Improve clarity of =WHEN args (Bug#52782).
---
 src/ls.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/ls.c b/src/ls.c
index c350787b6..d6c7302dc 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -5424,18 +5424,13 @@ Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.\n\
 "), stdout);
   fputs (_("\
   -C list entries by columns\n\
-  --color[=WHEN] color the output; WHEN can be 'always' (default\
-\n\
-   if omitted), 'auto', or 'never'; more info below\
-\n\
+  --color[=WHEN] color the output WHEN; more info below\n\
   -d, --directorylist directories themselves, not their contents\n\
   -D, --diredgenerate output designed for Emacs' dired mode\n\
 "), stdout);
   fputs (_("\
   -f list all entries in directory order\n\
-  -F, --classify[=WHEN]  append indicator (one of */=>@|) to entries;\n\
-   WHEN can be 'always' (default if omitted),\n\
-   'auto', or 'never'\n\
+  -F, --classify[=WHEN]  append indicator (one of */=>@|) to entries WHEN\n\
   --file-typelikewise, except do not append '*'\n\
   --format=WORD  across -x, commas -m, horizontal -x, long -l,\n\
single-column -1, verbose -l, vertical -C\n\
@@ -5468,8 +5463,7 @@ Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.\n\
(overridden by -a or -A)\n\
 "), stdout);
   fputs (_("\
-  --hyperlink[=WHEN] hyperlink file names; WHEN can be 'always'\n\
-   (default if omitted), 'auto', or 'never'\n\
+  --hyperlink[=WHEN] hyperlink file names WHEN\n\
 "), stdout);
   fputs (_("\
   --indicator-style=WORD  append indicator with style WORD to entry names:\
@@ -5563,6 +5557,10 @@ Also the TIME_STYLE environment variable sets the default style to use.\n\
 "), stdout);
   fputs (_("\
 \n\
+The WHEN argument defaults to 'always' and can also be 'auto' or 'never'.\n\
+"), stdout);
+  fputs (_("\
+\n\
 Using color to distinguish file types is disabled both by default and\n\
 with --color=never.  With --color=auto, ls emits color codes only when\n\
 standard output is connected to a terminal.  The LS_COLORS environment\n\
-- 
2.32.0



bug#52782: Man Page: Incorrect Summary of --color Option

2021-12-24 Thread Paul Eggert

On 12/24/21 07:11, Pranab Lawrence Ekka Dasgupta wrote:

The summary of the `--color` option incorrectly states that the default
option is 'always', whereas it functions otherwise


It sounds like you misunderstood the man page. It says that 
--color[=WHEN] means "colorize the output; WHEN can be 'always' (default 
if omitted), 'auto', or 'never'". The phrase "if omitted" refers to when 
you use plain "--color", not to when you don't use "--color" at 
all. The same wording is used to document --classify[=WHEN] and 
--hyperlink[=WHEN].
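
The three cases can be compared side by side. This is a sketch that assumes GNU ls with its default colors, under which directory names are colored; the variable names are only for illustration. Because the output is captured rather than sent to a terminal, only the explicit option produces color codes.

```sh
dir=$(mktemp -d)
mkdir "$dir/sub"

plain=$(ls --color=never "$dir")   # coloring explicitly disabled
forced=$(ls --color "$dir")        # WHEN omitted: same as --color=always
default=$(ls "$dir")               # no --color option at all: no coloring

if [ "$forced" != "$plain" ]; then
  echo "plain --color added color codes"
fi
if [ "$default" = "$plain" ]; then
  echo "no --color option means no color codes"
fi

rm -rf "$dir"
```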


I suppose we could reword the man page to avoid this potential confusion 
in --color, --classify and --hyperlink. However, I don't see how to do 
that without adding so much wording that the cost would likely exceed 
the benefit. Perhaps some other wordsmith could chip in.


In the meantime I noticed that the documentation uses the word 
"colorize" when it should say "color", so I installed the attached.
From 2b30312f77f99efc8c56804424c4a317e25953f1 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 24 Dec 2021 09:47:18 -0800
Subject: [PATCH] doc: colorize -> color

Living so close to Hollywood I know that "colorize"
means adding color to something that was already monochrome,
whereas "color" means to give color to something.
Coreutils apps color text instead of colorizing it.
---
 NEWS   | 2 +-
 cfg.mk | 2 +-
 doc/coreutils.texi | 2 +-
 src/dircolors.hin  | 6 +++---
 src/ls.c   | 4 ++--
 tests/ls/capability.sh | 2 +-
 tests/ls/color-ext.sh  | 2 +-
 tests/ls/color-norm.sh | 2 +-
 tests/ls/multihardlink.sh  | 2 +-
 tests/ls/stat-free-symlinks.sh | 2 +-
 10 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/NEWS b/NEWS
index c8e8bdc16..811e27e3a 100644
--- a/NEWS
+++ b/NEWS
@@ -3063,7 +3063,7 @@ GNU coreutils NEWS-*- outline -*-
   install accepts a new option --strip-program to specify the program used to
   strip binaries.
 
-  ls now colorizes files with capabilities if libcap is available
+  ls now colors names of files with capabilities if libcap is available.
 
   ls -v now uses filevercmp function as sort predicate (instead of strverscmp)
 
diff --git a/cfg.mk b/cfg.mk
index 6d6c37dc2..046f14167 100644
--- a/cfg.mk
+++ b/cfg.mk
@@ -49,7 +49,7 @@ export VERBOSE = yes
 # 4914152 9e
 export XZ_OPT = -8e
 
-old_NEWS_hash = 4d17651e2318a01687a1f0fdca9177e5
+old_NEWS_hash = 612bad626bf28b1847ad0114cb2cd6fe
 
 # Add an exemption for sc_makefile_at_at_check.
 _makefile_at_at_check_exceptions = ' && !/^cu_install_prog/ && !/dynamic-dep/'
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 6068d8b08..f7ce1654b 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -8109,7 +8109,7 @@ may be omitted, or one of:
 @end itemize
 Specifying @option{--color} and no @var{when} is equivalent to
 @option{--color=always}.
-If piping a colorized listing through a pager like @command{less},
+If piping a colored listing through a pager like @command{less},
 use the pager's @option{-R} option to pass the color codes to the terminal.
 
 @vindex LS_COLORS
diff --git a/src/dircolors.hin b/src/dircolors.hin
index b5d6452d7..d86e0088f 100644
--- a/src/dircolors.hin
+++ b/src/dircolors.hin
@@ -9,7 +9,7 @@
 # slackware version of dircolors) are recognized but ignored.
 
 # Below are TERM entries, which can be a glob patterns, to match
-# against the TERM environment variable to determine if it is colorizable.
+# against the TERM environment variable to determine if it is colorable.
 TERM Eterm
 TERM ansi
 TERM *color*
@@ -71,7 +71,7 @@ STICKY 37;44	# dir with the sticky bit set (+t) and not other-writable
 EXEC 01;32
 
 # List any file extensions like '.gz' or '.tar' that you would like ls
-# to colorize below. Put the extension, a space, and the color init string.
+# to color below. Put the extension, a space, and the color init string.
 # (and any comments you want to add after a '#')
 
 # If you use DOS-style suffixes, you may want to uncomment the following:
@@ -80,7 +80,7 @@ EXEC 01;32
 #.com 01;32
 #.btm 01;32
 #.bat 01;32
-# Or if you want to colorize scripts even if they do not have the
+# Or if you want to color scripts even if they do not have the
 # executable bit actually set.
 #.sh  01;32
 #.csh 01;32
diff --git a/src/ls.c b/src/ls.c
index 6e87af651..c350787b6 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -361,7 +361,7 @@ static bool color_symlink_as_referent;
 
 static char const *hostname;
 
-/* mode of appropriate file for colorization */
+/* Mode of appropriate file for coloring.  */
 static mode_t
 file_or_link_mode (struct fileinfo const *file)
 {
@@ -5424,7 +5424,7 @@ Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.\n\
 "), stdout);
   fputs (_("\
   -C 

bug#52656: (id) utility bug found

2021-12-19 Thread Paul Eggert

On 12/19/21 06:58, Glenn Golden wrote:

Possibly the man page
could be updated to say the same.


Thanks for the suggestion. I installed the attached documentation patch 
and am closing the bug report.
From d3beb53be64a958988a94ed86e6246a21dc95a9c Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sun, 19 Dec 2021 09:12:59 -0800
Subject: [PATCH] id: improve doc for when USER is omitted
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/id.c (usage): “current user” → “current process” (Bug#52656).
---
 src/id.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/id.c b/src/id.c
index a6c48cd50..b110048d3 100644
--- a/src/id.c
+++ b/src/id.c
@@ -96,7 +96,7 @@ usage (int status)
   printf (_("Usage: %s [OPTION]... [USER]...\n"), program_name);
   fputs (_("\
 Print user and group information for each specified USER,\n\
-or (when USER omitted) for the current user.\n\
+or (when USER omitted) for the current process.\n\
 \n"),
  stdout);
   fputs (_("\
-- 
2.32.0



bug#52525: wanted to add option to date command to handle pure numeric input in varying ways and output for invalid dates

2021-12-15 Thread Paul Eggert

On 12/15/21 14:24, Mike Marchywka wrote:


if date is
going to be a swiss army knife for date conversions
it makes some sense to allow user selection of
ambiguity resolution doesn't it?


There are thousands of possible date conversions and I'm not sure we 
want to head down the road of trying to handle them all.


That being said, this particular conversion might be worth the trouble. 
However, 'date' uses the same date parser that a lot of other GNU 
programs do. Surely if there's a change to be made to date parsing it 
should be made there, not just to 'date', so that all the other programs 
can use the new functionality.






bug#52525: wanted to add option to date command to handle pure numeric input in varying ways and output for invalid dates

2021-12-15 Thread Paul Eggert

On 12/15/21 12:39, Mike Marchywka wrote:

$echo 2000 | date +%Y -f-
2021


How about this instead? The idea is to avoid adding features if they can 
easily be implemented with some other standard utility. This way, you 
can write your shell scripts now rather than waiting for a future fix 
(plus, it keeps 'date' simpler).


echo 2000 | sed 's/$/-07-01/' | date +%Y -f-
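
The sed step can be checked on its own before handing the result to 'date'. This is a sketch; the function name year_to_midyear is made up for illustration, and July 1 is simply the date used in the command above.

```sh
# Hypothetical helper: turn each bare year on stdin into a full date
# by appending a month and day, so the date parser sees a calendar date.
year_to_midyear () {
  sed 's/$/-07-01/'
}

echo 2000 | year_to_midyear
```

This prints 2000-07-01. Piping that into "date +%Y -f-" with GNU date then yields the intended year.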






bug#52193: mv broken on non-APFS filesystems on macOS on coreutils >= 9.0

2021-12-14 Thread Paul Eggert

On 12/12/21 09:06, Sudhip Nashi wrote:



Thanks, I think I see the problem now. Please try the attached patch; I haven't 
tested myself as I lack access to macOS. Thanks.
<0001-renameatu-port-to-macOS-tmpfs.patch>


It looks like this patch solves the problem.



Thanks, I installed the attached to implement this fix, the first patch 
into Gnulib, the second into coreutils.
From dd474e50930ea00910631eb1b77ff4270d7b02c0 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 14 Dec 2021 12:32:30 -0800
Subject: [PATCH] renameatu: port to macOS tmpfs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Sudhip Nashi (Bug#52193).
* lib/renameatu.c (renameat2ish) [HAVE_RENAMEAT]: New function.
(renameatu): Use the new function, to avoid a bug when
renameatx_np fails with errno == ENOTSUP.  Don’t try to support
RENAME_EXCHANGE; the old code didn’t work and nobody using using
RENAME_EXCHANGE anyway.
---
 ChangeLog   | 10 +++
 lib/renameatu.c | 69 +
 2 files changed, 45 insertions(+), 34 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 0e20dcb58..370cd9839 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2021-12-14  Paul Eggert  
+
+	renameatu: port to macOS tmpfs
+	Problem reported by Sudhip Nashi (Bug#52193).
+	* lib/renameatu.c (renameat2ish) [HAVE_RENAMEAT]: New function.
+	(renameatu): Use the new function, to avoid a bug when
+	renameatx_np fails with errno == ENOTSUP.  Don’t try to support
+	RENAME_EXCHANGE; the old code didn’t work and nobody using using
+	RENAME_EXCHANGE anyway.
+
 2021-12-12  Bruno Haible  
 
 	gnulib-tool: Support non-recursive-gnulib-prefix-hack with tests.
diff --git a/lib/renameatu.c b/lib/renameatu.c
index 38438a4ef..b75f95269 100644
--- a/lib/renameatu.c
+++ b/lib/renameatu.c
@@ -61,6 +61,29 @@ rename_noreplace (char const *src, char const *dst)
 
 #undef renameat
 
+#if HAVE_RENAMEAT
+
+/* Act like renameat (FD1, SRC, FD2, DST), except fail with EEXIST if
+   FLAGS is nonzero and it is easy to fail atomically if DST already exists.
+   This lets renameatu be atomic when it can be implemented in terms
+   of renameatx_np.  */
+static int
+renameat2ish (int fd1, char const *src, int fd2, char const *dst,
+  unsigned int flags)
+{
+# ifdef RENAME_EXCL
+  if (flags)
+{
+  int r = renameatx_np (fd1, src, fd2, dst, RENAME_EXCL);
+  if (r == 0 || errno != ENOTSUP)
+return r;
+}
+# endif
+
+  return renameat (fd1, src, fd2, dst);
+}
+#endif
+
 /* Rename FILE1, in the directory open on descriptor FD1, to FILE2, in
the directory open on descriptor FD2.  If possible, do it without
changing the working directory.  Otherwise, resort to using
@@ -93,9 +116,6 @@ renameatu (int fd1, char const *src, int fd2, char const *dst,
 
 #if HAVE_RENAMEAT
   {
-# if defined RENAME_EXCL/* macOS */
-  unsigned int uflags;
-# endif
   size_t src_len;
   size_t dst_len;
   char *src_temp = (char *) src;
@@ -107,23 +127,12 @@ renameatu (int fd1, char const *src, int fd2, char const *dst,
   struct stat dst_st;
   bool dst_found_nonexistent = false;
 
-  /* Check the flags.  */
-# if defined RENAME_EXCL
-  /* We can support RENAME_EXCHANGE and RENAME_NOREPLACE.  */
-  if (flags & ~(RENAME_EXCHANGE | RENAME_NOREPLACE))
-# else
-  /* RENAME_NOREPLACE is the only flag currently supported.  */
-  if (flags & ~RENAME_NOREPLACE)
-# endif
-return errno_fail (ENOTSUP);
-
-# if defined RENAME_EXCL
-  uflags = ((flags & RENAME_EXCHANGE ? RENAME_SWAP : 0)
-| (flags & RENAME_NOREPLACE ? RENAME_EXCL : 0));
-# endif
-
-  if ((flags & RENAME_NOREPLACE) != 0)
+  switch (flags)
 {
+case 0:
+  break;
+
+case RENAME_NOREPLACE:
   /* This has a race between the call to lstatat and the calls to
  renameat below.  This lstatat is needed even if RENAME_EXCL
  is defined, because RENAME_EXCL is buggy on macOS 11.2:
@@ -134,26 +143,22 @@ renameatu (int fd1, char const *src, int fd2, char const *dst,
   if (errno != ENOENT)
 return -1;
   dst_found_nonexistent = true;
+  break;
+
+default:
+  return errno_fail (ENOTSUP);
 }
 
   /* Let strace see any ENOENT failure.  */
   src_len = strlen (src);
   dst_len = strlen (dst);
   if (!src_len || !dst_len)
-# if defined RENAME_EXCL
-return renameatx_np (fd1, src, fd2, dst, uflags);
-# else
-return renameat (fd1, src, fd2, dst);
-# endif
+return renameat2ish (fd1, src, fd2, dst, flags);
 
   src_slash = src[src_len - 1] == '/';
   dst_slash = dst[dst_len - 1] == '/';
   if (!src_slash && !dst_slash)
-# if defined RENAME_EXCL
-return renameatx_np (fd1, src, fd2, dst, uflags);
-# else
-return renameat (fd1, src, fd2, dst);
-# endif
+return renameat2ish (fd1, src, fd2, dst, flags);
 
   /* Presence of a trailing slash requires directory semantics.  If
  the source doe
