from:"Pádraig Brady"

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-13 Thread Pádraig Brady


On 13/09/2024 13:55, Pádraig Brady wrote:

On 12/09/2024 20:33, Paul Eggert wrote:

On 2024-09-12 12:03, Pádraig Brady wrote:



This is tricky enough, that we should be as restrictive as possible here,
so I may resort to strspn(f, "0123456789") to parse instead.
I'll think a bit about it.


The code's also assuming INT_MAX < INTMAX_MAX, which POSIX doesn't
require. You could put in a static_assert to that effect, I suppose, to
document the assumption.


Indeed. We would have incorrectly taken the INTMAX_MAX arg in the
(albeit unlikely) case where INT_MAX >= INTMAX_MAX,
and the provided number overflowed INTMAX_MAX.
To be explicit, strtol() doesn't return 0 in that case,
so we need to check overflow (like Colin suggested).


More important, though, if you're not in the C locale all bets are off
as far as what strtoimax will also parse.

When I ran into this problem with GNU tar, I ended by giving up on
strtoimax and did my own little integer parser. It does exactly what I
want and I don't have to fire up the strtoimax complexity+locale engine.


Right, it's best to preparse for the above reason,
and to avoid any confusion re leading spaces etc. like I previously mentioned.

The attached adjustment does the preparse with strspn(),
and only does the strtoimax() for appropriate strings.


I adjusted and pushed a further simplification that
clamps %$ to %INT_MAX$, which is equivalent
as argc can practically only be <= INT_MAX - 2.
That simplifies the strtoimax() error handling,
and removes any limits on valid decimal numbers.

cheers,
Pádraig

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-13 Thread Pádraig Brady


On 12/09/2024 20:33, Paul Eggert wrote:

On 2024-09-12 12:03, Pádraig Brady wrote:



This is tricky enough, that we should be as restrictive as possible here,
so I may resort to strspn(f, "0123456789") to parse instead.
I'll think a bit about it.


The code's also assuming INT_MAX < INTMAX_MAX, which POSIX doesn't
require. You could put in a static_assert to that effect, I suppose, to
document the assumption.


Indeed. We would have incorrectly taken the INTMAX_MAX arg in the
(albeit unlikely) case where INT_MAX >= INTMAX_MAX,
and the provided number overflowed INTMAX_MAX.
To be explicit, strtol() doesn't return 0 in that case,
so we need to check overflow (like Colin suggested).


More important, though, if you're not in the C locale all bets are off
as far as what strtoimax will also parse.

When I ran into this problem with GNU tar, I ended by giving up on
strtoimax and did my own little integer parser. It does exactly what I
want and I don't have to fire up the strtoimax complexity+locale engine.


Right, it's best to preparse for the above reason,
and to avoid any confusion re leading spaces etc. like I previously mentioned.

The attached adjustment does the preparse with strspn(),
and only does the strtoimax() for appropriate strings.

cheers,
Pádraigdiff --git a/src/printf.c b/src/printf.c
index 37844f14a..442935705 100644
--- a/src/printf.c
+++ b/src/printf.c
@@ -454,14 +454,23 @@ print_formatted (char const *format, int argc, char **argv)
 
 #define GET_CURR_ARG(POS)\
 do {			\
-  char *arge;		\
-  intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);	\
-  if (0 < arg && arg <= INT_MAX && *arge == '$')	\
+  intmax_t arg = 0;	\
+  size_t argl;		\
+  /* Check with strspn() first to avoid spaces etc.	\
+ This also avoids any locale ambiguities.  */	\
+  if (POS != 3 && (argl = strspn (f, "0123456789"))	\
+  && f[argl] == '$')\
+{			\
+  errno = 0;	\
+  arg = strtoimax (f, NULL, 10);			\
+  if (errno)	\
+arg = 0;	\
+}			\
+  if (1 <= arg && arg <= INT_MAX)			\
 /* Process indexed %i$ format.  */			\
-/* Note '$' comes before any flags.  */		\
 {			\
   SET_CURR_ARG (arg - 1);\
-  f = arge + 1;	\
+  f += argl + 1;	\
   if (POS == 0)	\
 direc_arg = arg - 1;\
 }			\
diff --git a/tests/printf/printf-indexed.sh b/tests/printf/printf-indexed.sh
index d0f799ab1..5d63d68f8 100755
--- a/tests/printf/printf-indexed.sh
+++ b/tests/printf/printf-indexed.sh
@@ -75,19 +75,28 @@ printf_check '   1' '%2$*1$d\n' 4 1
 # Indexed arg, and sequential width
 printf_check '   1' '%2$*d\n' 4 1
 
-# Flags come after $ (0 is not a flag here):
+# Flags come after $ (0 is not a flag here but allowed):
 printf_check '   1' '%01$4d\n' 1
 # Flags come after $ (0 is a flag here):
 printf_check '0001' '%1$0*2$d\n' 1 4
 # Flags come after $ (-2 not taken as a valid index here):
 printf_check_err 'printf: %-2$: invalid conversion specification' \
  '%-2$s %1$s\n' A B
+# Flags come after $ (' ' is not allowed as part of number here)
+printf_check_err 'printf: % 2$: invalid conversion specification' \
+ '% 2$s %1$s\n' A B
 
 # Ensure only base 10 numbers are accepted
 printf_check_err "printf: 'A': expected a numeric value" \
  '%0x2$s %2$s\n' A B
+# Ensure empty numbers are rejected
+printf_check_err 'printf: %$: invalid conversion specification' \
+ '%$d\n' 1
 # Verify int limits (avoiding comparisons with argc etc.)
 printf_check_err "printf: %${INT_OFLOW}\$: invalid conversion specification" \
  "%${INT_OFLOW}\$d\n" 1
+# Verify intmax limits, ensuring overflow detected
+printf_check_err "printf: %${INTMAX_OFLOW}\$: invalid conversion specification" \
+ "%${INTMAX_OFLOW}\$d\n" 1
 
 Exit $fail

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-12 Thread Pádraig Brady


On 12/09/2024 18:40, Collin Funk wrote:

Hi Pádraig,

Pádraig Brady  writes:


I'll apply the attached sometime tomorrow.

Marking this as done.


Patch looks good, thanks.

One small comment, though.


+#define GET_CURR_ARG(POS)  \
+do {   \
+  char *arge;  \
+  intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);\
+  if (0 < arg && arg <= INT_MAX && *arge == '$') \
+/* Process indexed %i$ format.  */ \
+/* Note '$' comes before any flags.  */\


Shouldn't you check errno here, like:

   char *arge;
   errno = 0;
   intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);
   if (errno == 0 && 0 < arg && arg <= INT_MAX && *arge == '$')
   [...]

I think that would handle all bad cases.

For example, I think "%$" might return 0 but set errno to EINVAL.


A fair point, but note we only accept 1 ... INT_MAX,
so that implicitly excludes any of the possible error returns.
I should at least add a comment.

Though it got me thinking that strtol() may be too lenient
in what it accepts, resulting in possible confusion for users.
For example some printf flags like ' ' or '+' would be accepted
as part of a number, when ideally they should not be.

For example, the user might do:

$ printf '[% 1$d]\n' 1234
[1234]

When they really intended:

$ printf '[%1$ d]\n' 1234
[ 1234]

This is tricky enough, that we should be as restrictive as possible here,
so I may resort to strspn(f, "0123456789") to parse instead.
I'll think a bit about it.

thanks!
Pádraig

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-12 Thread Pádraig Brady


On 12/09/2024 18:06, Bruno Haible wrote:

Pádraig Brady wrote:

I'll apply the attached sometime tomorrow.


Nice! Thank you.

There seems to be a typo in the unit test, though: It defines a shell
function 'printf_checki_err' but the function it then invokes is
'printf_check_err'.


Hah, good catch.
That hid other errors in the test.
Fixed up with the attached.

thanks for the review,
Pádraigdiff --git a/tests/printf/printf-indexed.sh b/tests/printf/printf-indexed.sh
index 1c3a6c380..d0f799ab1 100755
--- a/tests/printf/printf-indexed.sh
+++ b/tests/printf/printf-indexed.sh
@@ -34,14 +34,14 @@ EOF
   compare exp out || fail=1
 }
 
-printf_checki_err() {
+printf_check_err() {
   cat < exp || framework_failure_
 $1
 EOF
 
   shift
 
-  returns_1 $prog "$@" 2> out || fail=1
+  returns_ 1 $prog "$@" 2> out || fail=1
   compare exp out || fail=1
 }
 
@@ -85,9 +85,9 @@ printf_check_err 'printf: %-2$: invalid conversion specification' \
 
 # Ensure only base 10 numbers are accepted
 printf_check_err "printf: 'A': expected a numeric value" \
- '%0x2$s %1$s\n' A B
+ '%0x2$s %2$s\n' A B
 # Verify int limits (avoiding comparisons with argc etc.)
-printf_check_err "printf: %${INT_OFLOW}$: invalid conversion specification" \
- "%${INT_OFLOW}$d\n" 1
+printf_check_err "printf: %${INT_OFLOW}\$: invalid conversion specification" \
+ "%${INT_OFLOW}\$d\n" 1
 
 Exit $fail

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-12 Thread Pádraig Brady


On 09/09/2024 19:30, Pádraig Brady wrote:

On 06/09/2024 15:06, Bruno Haible wrote:

Hi,

POSIX:2024 specifies that printf(1) should support numbered conversion
specifications:
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html
https://austingroupbugs.net/view.php?id=1592

Could this support please be added to GNU coreutils? As of coreutils 9.5,
I still get:

$ /usr/bin/printf 'abc%2$sdef%1$sxxx\n' 1 2
abc/usr/bin/printf: %2$: invalid conversion specification



This make sense to implement.
I see ksh and FreeBSD at least, already have.
I'll have a look at doing this.


I'll apply the attached sometime tomorrow.

Marking this as done.

cheers,
PádraigFrom 97e55c7ace9e9e46a32faa0d592870983a14367b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Wed, 11 Sep 2024 16:07:48 +0100
Subject: [PATCH] printf: add indexed argument support

* src/printf.c (print_formatted): Add support for %i$ indexed args.
* tests/printf/printf-indexed.sh: Add a new file of test cases.
* tests/local.mk: Reference the new test file.
* doc/coreutils.texi (printf invocation): Mention how mixed
processing of indexed and sequential references are supported,
unlike the printf(2) library function.
* NEWS: Mention the new feature.
These are specified in POSIX:2024.
Addresses https://bugs.gnu.org/73068
---
 NEWS   |   4 +
 doc/coreutils.texi |   6 ++
 src/printf.c   | 179 -
 tests/local.mk |   1 +
 tests/printf/printf-indexed.sh |  93 +
 5 files changed, 215 insertions(+), 68 deletions(-)
 create mode 100755 tests/printf/printf-indexed.sh

diff --git a/NEWS b/NEWS
index e1d3f82d1..6094de8d2 100644
--- a/NEWS
+++ b/NEWS
@@ -40,6 +40,10 @@ GNU coreutils NEWS-*- outline -*-
   ls now supports the --sort=name option,
   to explicitly select the default operation of sorting by file name.
 
+  printf now supports indexed arguments, using the POSIX:2024 specified
+  %i$ format, where 'i' is an integer referencing a particular argument,
+  thus allowing repetition or reordering of printf arguments.
+
 ** Improvements
 
   'head -c NUM', 'head -n NUM', 'nl -l NUM', 'nproc --ignore NUM',
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 58b425779..9fe953587 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -13429,6 +13429,12 @@ Missing @var{argument}s are treated as null strings or as zeros,
 depending on whether the context expects a string or a number.  For
 example, the command @samp{printf %sx%d} prints @samp{x0}.
 
+@item
+Indexed arguments referenced with @samp{%i$} formats, can be
+mixed with standard sequential argument references,
+in which case both index types are independent.
+For example, the command @samp{printf '%1$s%s' A} prints @samp{AA},
+
 @item
 @kindex \c
 An additional escape, @samp{\c}, causes @command{printf} to produce no
diff --git a/src/printf.c b/src/printf.c
index de3507925..b1c7dd561 100644
--- a/src/printf.c
+++ b/src/printf.c
@@ -291,15 +291,13 @@ print_esc_string (char const *str)
 }
 
 /* Evaluate a printf conversion specification.  START is the start of
-   the directive, LENGTH is its length, and CONVERSION specifies the
-   type of conversion.  LENGTH does not include any length modifier or
-   the conversion specifier itself.  FIELD_WIDTH and PRECISION are the
-   field width and precision for '*' values, if HAVE_FIELD_WIDTH and
-   HAVE_PRECISION are true, respectively.  ARGUMENT is the argument to
-   be formatted.  */
+   the directive, and CONVERSION specifies the type of conversion.
+   FIELD_WIDTH and PRECISION are the field width and precision for '*'
+   values, if HAVE_FIELD_WIDTH and HAVE_PRECISION are true, respectively.
+   ARGUMENT is the argument to be formatted.  */
 
 static void
-print_direc (char const *start, size_t length, char conversion,
+print_direc (char const *start, char conversion,
  bool have_field_width, int field_width,
  bool have_precision, int precision,
  char const *argument)
@@ -333,6 +331,7 @@ print_direc (char const *start, size_t length, char conversion,
 break;
   }
 
+size_t length = strlen (start);
 p = xmalloc (length + length_modifier_len + 2);
 q = mempcpy (p, start, length);
 q = mempcpy (q, length_modifier, length_modifier_len);
@@ -448,50 +447,92 @@ print_direc (char const *start, size_t length, char conversion,
 static int
 print_formatted (char const *format, int argc, char **argv)
 {
-  int save_argc = argc;		/* Preserve original value.  */
+
+/* Set curr_arg from indexed %i$ or otherwise next in sequence.
+   POS can be 0,1,2,3 corresponding to
+   [%][width][.precision][conversion] respectively.  */
+
+#define GET_CURR_ARG(POS)\
+do {			\
+  char *arge;		\

bug#73194: ls command converts utf-8 character into escape sequences

2024-09-12 Thread Pádraig Brady


On 12/09/2024 11:16, Simon Wolfe wrote:

I have one file name that uses Unicode character U+318DF, which is in the 
tertiary pane, more precisely CJK Unified Ideographs Extension H.

touch 𱣟
ls

returns:

''$'\360\261\243\237'

Extension H was introduced in Unicode 15.0 in 2022.

I also notice that this bug occurs with any character with Extension I 
(introduced in 2023).

Extension G seems to works okay.


ls 9.4 works as expected for me with glibc-2.39 in a UTF-8 locale.
I.e. that file is displayed directly.
Now if I set the locale to non UTF-8 it will display the form above
(which works on all locales BTW).

  $ touch ''$'\360\261\243\237'
  $ ls ''$'\360\261\243\237'
  𱣟
  $ LC_ALL=C ls ''$'\360\261\243\237'
  ''$'\360\261\243\237'

So I suspect your system libs are not updated to recognize this character,
hence the fallback format is used.

cheers,
Pádraig.

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-09 Thread Pádraig Brady


On 06/09/2024 15:06, Bruno Haible wrote:

Hi,

POSIX:2024 specifies that printf(1) should support numbered conversion
specifications:
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html
https://austingroupbugs.net/view.php?id=1592

Could this support please be added to GNU coreutils? As of coreutils 9.5,
I still get:

   $ /usr/bin/printf 'abc%2$sdef%1$sxxx\n' 1 2
   abc/usr/bin/printf: %2$: invalid conversion specification



This make sense to implement.
I see ksh and FreeBSD at least, already have.
I'll have a look at doing this.

thank you,
Pádraig

bug#72842: Compile error due to lto warning on Arch Linux -Werror=maybe-uninitialized

2024-08-28 Thread Pádraig Brady


On 27/08/2024 22:34, Paul Eggert wrote:

I'm not seeing that false alarm when building coreutils v9.5 on Ubuntu
24.04 LTS with "./configure --enable-gcc-warnings CC=gcc-14".

What GCC version are you using? If it's not GCC 14, try upgrading. I'm
using gcc-14 (Ubuntu 14-20240412-0ubuntu1) 14.0.1 20240412
(experimental) [master r14-9935-g67e1433a94f].


I think LTO is key here, as GCC has more context then to produce warnings.
With `-march=native -O3 -flto` on GCC 14.2.1 I see,
2 false warnings in gnulib (patch attached).
1 correct warning in coreutils (patch attached).

Marking this as done.
I'll apply the patches later today.

cheers,
Pádraig.From 264edcf263bff02aebfe67f1d7343fbf0e96b5de Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Wed, 28 Aug 2024 12:10:43 +0100
Subject: [PATCH] avoid GCC -Wmaybe-uninitialized false positives with LTO

Avoids false warnings with GCC 14.2.1 with -flto

* lib/canonicalize.c: Initialize END_IDX.
* lib/getndelim2.c: Initicalise C.
---
 ChangeLog  | 8 
 lib/canonicalize.c | 9 -
 lib/getndelim2.c   | 8 +---
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8a7f812b67..fdcc79134e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2024-08-28  Pádraig Brady  
+
+	avoid GCC -Wmaybe-uninitialized false positives with LTO
+	Avoids false warnings with GCC 14.2.1 with -flto
+
+	* lib/canonicalize.c: Initialize END_IDX.
+	* lib/getndelim2.c: Initicalise C.
+
 2024-08-28  Bruno Haible  
 
 	doc: Add more details about O_EXEC and O_SEARCH.
diff --git a/lib/canonicalize.c b/lib/canonicalize.c
index 7d2a629024..2572b40558 100644
--- a/lib/canonicalize.c
+++ b/lib/canonicalize.c
@@ -34,6 +34,13 @@
 #include "hash-triple.h"
 #include "xalloc.h"
 
+/* Suppress bogus GCC -Wmaybe-uninitialized warnings.  */
+#if defined GCC_LINT || defined lint
+# define IF_LINT(Code) Code
+#else
+# define IF_LINT(Code) /* empty */
+#endif
+
 #ifndef DOUBLE_SLASH_IS_DISTINCT_ROOT
 # define DOUBLE_SLASH_IS_DISTINCT_ROOT false
 #endif
@@ -367,7 +374,7 @@ canonicalize_filename_mode_stk (const char *name, canonicalize_mode_t can_mode,
   buf[n] = '\0';
 
   char *extra_buf = bufs->extra.data;
-  idx_t end_idx;
+  idx_t end_idx IF_LINT (= 0);
   if (end_in_extra_buffer)
 end_idx = end - extra_buf;
   size_t len = strlen (end);
diff --git a/lib/getndelim2.c b/lib/getndelim2.c
index 89989aefdd..db61e2a5e6 100644
--- a/lib/getndelim2.c
+++ b/lib/getndelim2.c
@@ -47,8 +47,10 @@
 #include "memchr2.h"
 
 /* Avoid false GCC warning "'c' may be used uninitialized".  */
-#if _GL_GNUC_PREREQ (4, 7)
-# pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#if defined GCC_LINT || defined lint
+# define IF_LINT(Code) Code
+#else
+# define IF_LINT(Code) /* empty */
 #endif
 
 /* The maximum value that getndelim2 can return without suffering from
@@ -102,7 +104,7 @@ getndelim2 (char **lineptr, size_t *linesize, size_t offset, size_t nmax,
   /* Here always ptr + size == read_pos + nbytes_avail.
  Also nbytes_avail > 0 || size < nmax.  */
 
-  int c;
+  int c IF_LINT (= EOF);
   const char *buffer;
   size_t buffer_len;
 
-- 
2.46.0

From 9b9763e6a7499d78373fb42bac3dc2ee68a14b69 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Wed, 28 Aug 2024 12:33:17 +0100
Subject: [PATCH] all: fix error checking in gl/lib/xdectoint.c

This issue was noticed with -flto on GCC 14.2.1

* gl/lib/xdectoint.c (__xnumtoint): Only inspect the
returned value if LONGINT_INVALID is not set,
as the returned value is uninitialized in that case.
Fixes https://bugs.gnu.org/72842
---
 gl/lib/xdectoint.c | 37 -
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/gl/lib/xdectoint.c b/gl/lib/xdectoint.c
index bb38431f3..05534701c 100644
--- a/gl/lib/xdectoint.c
+++ b/gl/lib/xdectoint.c
@@ -49,24 +49,27 @@ __xnumtoint (char const *n_str, int base, __xdectoint_t min, __xdectoint_t max,
   /* Errno value to report if there is an overflow.  */
   int overflow_errno;
 
-  if (tnum < min)
+  if (s_err != LONGINT_INVALID)
 {
-  r = min;
-  overflow_errno = flags & XTOINT_MIN_RANGE ? ERANGE : EOVERFLOW;
-  if (s_err == LONGINT_OK)
-s_err = LONGINT_OVERFLOW;
-}
-  else if (max < tnum)
-{
-  r = max;
-  overflow_errno = flags & XTOINT_MAX_RANGE ? ERANGE : EOVERFLOW;
-  if (s_err == LONGINT_OK)
-s_err = LONGINT_OVERFLOW;
-}
-  else
-{
-  r = tnum;
-  overflow_errno = EOVERFLOW;
+  if (tnum < min)
+{
+  r = min;
+  overflow_errno = flags & XTOINT_MIN_RANGE ? ERANGE : EOVERFLOW;
+  if (s_err == LONGINT_OK)
+s_err = LONGINT_OVERFLOW;
+}
+

bug#72770: stat: filesystem ID regression in 9.5 with musl libc

2024-08-23 Thread Pádraig Brady


On 23/08/2024 12:39, Natanael Copa wrote:

On Fri, 23 Aug 2024 11:41:28 +0100
Pádraig Brady  wrote:


On 23/08/2024 08:14, Natanael Copa wrote:

With coreutils 9.5 stat -f -c %i returns wrong value with musl libc.

$ docker run --rm -it -v /tmp:/tmp alpine:3.20 sh -c "apk add --quiet coreutils && stat --version 
&& stat -f -c %i /tmp && busybox stat -f -c %i /tmp"
stat (GNU coreutils) 9.5
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Michael Meskes.
4dc59e89
4dc59e89680d9b7c

Coreutils returns 4dc59e89 instead of the expected 4dc59e89680d9b7c.

With 9.4 this works as expected:

$ docker run --rm -it -v /tmp:/tmp alpine:3.19 sh -c "apk add --quiet coreutils && stat --version 
&& stat -f -c %i /tmp && busybox stat -f -c %i /tmp"
stat (GNU coreutils) 9.4
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Michael Meskes.
4dc59e89680d9b7c
4dc59e89680d9b7c

Downstream report: https://gitlab.alpinelinux.org/alpine/aports/-/issues/16386


I can't see what might have changed on the coreutils side with a quick look.
It would help eliminate cases if you provided strace output from 9.4 and 9.5


I think the difference is that with 9.4 it ends up using statfs(2), and
with 9.5 it ends up using statvfs(3).

musl only uses 32 bits of the returned 64 bits from statfs(2) to match the 
POSIX interface.

https://git.musl-libc.org/cgit/musl/tree/src/stat/statvfs.c?id=dd1e63c3638d5f9afb857fccf6ce1415ca5f1b8b#n39

I believe they do so to have consistent fsid on 32 bit and 64 bit architectures.


Interesting. I see musl has been doing this truncation since 2011,
but I guess the coreutils build picked up statvfs() since musl
started returning f_type in it:
https://git.musl-libc.org/cgit/musl/commit/src/stat/statvfs.c?id=7291c6c6
I suppose we could adjust m4/stat-prog.m4 in coreutils to
avoid statvfs detection on musl

It would be worth suggesting to the musl folks to avoid this truncation.

thanks,
Pádraig

bug#72770: stat: filesystem ID regression in 9.5 with musl libc

2024-08-23 Thread Pádraig Brady


On 23/08/2024 08:14, Natanael Copa wrote:

With coreutils 9.5 stat -f -c %i returns wrong value with musl libc.

$ docker run --rm -it -v /tmp:/tmp alpine:3.20 sh -c "apk add --quiet coreutils && stat --version 
&& stat -f -c %i /tmp && busybox stat -f -c %i /tmp"
stat (GNU coreutils) 9.5
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later .
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Michael Meskes.
4dc59e89
4dc59e89680d9b7c

Coreutils returns 4dc59e89 instead of the expected 4dc59e89680d9b7c.

With 9.4 this works as expected:

$ docker run --rm -it -v /tmp:/tmp alpine:3.19 sh -c "apk add --quiet coreutils && stat --version 
&& stat -f -c %i /tmp && busybox stat -f -c %i /tmp"
stat (GNU coreutils) 9.4
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later .
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Michael Meskes.
4dc59e89680d9b7c
4dc59e89680d9b7c

Downstream report: https://gitlab.alpinelinux.org/alpine/aports/-/issues/16386


I can't see what might have changed on the coreutils side with a quick look.
It would help eliminate cases if you provided strace output from 9.4 and 9.5

thanks,
Pádraig

bug#72707: install: need_copy (for "-C") should use stat instead of lstat

2024-08-19 Thread Pádraig Brady


On 19/08/2024 03:44, Frank Heckenbach wrote:

install dereferences symlinks given as sources (and I think that's
good). The "-C" option, however, uses lstat rather than
stat in need_copy and returns true if !S_ISREG.

So the (dereferenced) file will always be copied despite "-C".

Even worse, since "-p" cannot be combined with "-C", the installed
file will always get a new timestamp which may trigger rebuilds
(e.g. if the file is a header) even if nothing was changed, e.g.:

% touch a
% ln -s a b
% install -C b c
% ls --time-style=+%T -l
total 0
-rw--- 1 frank frank 0 23:46:51 a
lrwxrwxrwx 1 frank frank 1 23:46:54 b -> a
-rwxr-xr-x 1 frank frank 0 23:46:58 c
% install -C b c
% ls --time-style=+%T -l
total 0
-rw--- 1 frank frank 0 23:46:51 a
lrwxrwxrwx 1 frank frank 1 23:46:54 b -> a
-rwxr-xr-x 1 frank frank 0 23:47:06 c
% install -C b c
% ls --time-style=+%T -l
total 0
-rw--- 1 frank frank 0 23:46:51 a
lrwxrwxrwx 1 frank frank 1 23:46:54 b -> a
-rwxr-xr-x 1 frank frank 0 23:47:08 c
%

I asked Kamil Dudka who implemented "-C" originally and he pointed
out that requiring both the files to be regular files was suggested
by Jim Meyering in the review:
https://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00106.html

However, I don't see a real reason there, only his question:
"Have you considered requiring that both files be `regular', too?"

I don't know if install didn't dereference symlinks back then, but
now that it does, it seems consistent to dereference them in the
check as well, i.e. use stat rather than lstat.


I agree with your arguments to change this.

I'm not sure if the suggestion above about regular files,
was meant to preclude symlinks.

Marking this as done,
and I'll apply the attached later.

thanks,
PádraigFrom 692ae740939cb356d7a16cb1e17d959e68d1fa91 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Mon, 19 Aug 2024 12:43:09 +0100
Subject: [PATCH] install: dereference source symlinks when comparing

* NEWS: Mention the change in behavior.
* src/install.c (need_copy): s/lstat/stat/ for the source.
* tests/install/install-C.sh: Add test cases
(and improve existing test case which wan't valid
due to the existing non standard modes on test files).
Addresses https://bugs.gnu.org/72707
---
 NEWS   |  3 +++
 src/install.c  |  2 +-
 tests/install/install-C.sh | 12 +++-
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 65cc3fde0..e1d3f82d1 100644
--- a/NEWS
+++ b/NEWS
@@ -24,6 +24,9 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Changes in behavior
 
+  install -C now dereferences symlink sources when comparing,
+  rather than always treating as different and performing the copy.
+
   ls's -f option now simply acts like -aU, instead of also ignoring
   some earlier options.  For example 'ls -fl' and 'ls -lf' are now
   equivalent because -f no longer ignores an earlier -l.  The new
diff --git a/src/install.c b/src/install.c
index 939db9cd4..85f2d9e11 100644
--- a/src/install.c
+++ b/src/install.c
@@ -177,7 +177,7 @@ need_copy (char const *src_name, char const *dest_name,
 return true;
 
   /* compare files using stat */
-  if (lstat (src_name, &src_sb) != 0)
+  if (stat (src_name, &src_sb) != 0)
 return true;
 
   if (fstatat (dest_dirfd, dest_relname, &dest_sb, AT_SYMLINK_NOFOLLOW) != 0)
diff --git a/tests/install/install-C.sh b/tests/install/install-C.sh
index a6b332e14..ee3ee2467 100755
--- a/tests/install/install-C.sh
+++ b/tests/install/install-C.sh
@@ -79,8 +79,18 @@ ginstall -Cv -m$mode3 a b > out || fail=1
 compare out out_installed_second || fail=1
 
 # files are not regular files
+ginstall -v -m$mode1 a b > out || fail=1  # reset to regular mode
+compare out out_installed_second || fail=1
+# symlink source is always dereferenced (and so regular here)
+ginstall -v -m$mode1 a d > out || fail=1  # create regular dest
+echo "'a' -> 'd'" > out_installed_first_ad || framework_failure_
+compare out out_installed_first_ad || fail=1
 ln -s a c || framework_failure_
-ln -s b d || framework_failure_
+ginstall -Cv -m$mode1 c d > out || fail=1
+compare out out_empty || fail=1
+# symlink (non regular) dest is always replaced
+ln -nsf b d || framework_failure_
+ls -l a b c d > /tmp/pb.out
 ginstall -Cv -m$mode1 c d > out || fail=1
 echo "removed 'd'
 'c' -> 'd'" > out_installed_second_cd
-- 
2.45.2

bug#72617: sort -n loses lines.

2024-08-14 Thread Pádraig Brady


On 14/08/2024 11:04, Simon B wrote:

Hi Pádraig

I am largely satisfied by your great explanation,

I am still confused why lines go "missing" though.




Even if the dot is being interpreted, it still should not lose the
line containing 172.169.6.164


This is actually well explained in the online documentation,
so I won't repeat here.  See the --unique description at:
https://www.gnu.org/software/coreutils/sort

cheers,
Pádraig

bug#72617: sort -n loses lines.

2024-08-14 Thread Pádraig Brady


tag 72617 notabug
close 72617
stop

On 14/08/2024 09:43, Simon B wrote:

Hallo,

The output of my grep command is:

# grep -i "sshd" /root/access.report | egrep -o
  
'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-
9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
64.227.127.122
172.169.5.249
172.169.6.164


Adding the --debug option shows the issue.
I.e. the '.' being considered as part of a number:

$ sort --debug -rbn -s ips
sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
sort: note numbers use ‘.’ as a decimal point in this locale
178.128.44.128
___
172.169.6.164
___
172.169.5.249
___


Taking the example for sorting IPv4 addresses from the manual,
shows the desired comparisons being performed:

$ sort --debug -t '.' -k 1,1rn -k 2,2rn -k 3,3rn -k 4,4rn -u ips
sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
sort: numbers use ‘.’ as a decimal point in this locale
178.128.44.128
___
___
__
   ___
172.169.6.164
___
___
_
  ___
172.169.5.249
___
___
_
  ___


cheers,
Pádraig

bug#72567: printf: Incorrect description of "%b"

2024-08-11 Thread Pádraig Brady


forcemerge 72567 72568
close 72567
stop

On 11/08/2024 04:05, Keith Thompson wrote:

There are three different descriptions of the printf(1) "%b" format:

- printf --help
- man printf.1
- info coreutils printf

As of release 9.5 and the latest version in git (Thu 2024-08-08
c5725c8c4), the first two incorrectly say that octal escapes are
of the form \0 or \0NNN. In fact the \0 can be followed by 0 to 3
octal digits.


Right. The main point there really is that a leading 0 is the exception.
(A leading 0 is required by ksh for example, but optional elsewhere).
I've pushed a change along those lines at:
https://github.com/coreutils/coreutils/commit/296cc3ed9

thanks,
Pádraig

bug#72446: [PATCH] stat: Fix memleak

2024-08-03 Thread Pádraig Brady


tag 72446 notabug
close 72446
stop

On 03/08/2024 17:10, Dmitry Chestnykh wrote:

format and format2 strings are allocated
by `malloc()` inside `xasprintf` so the memory
should be freed

* src/stat.c: Call `free()` on `format` and `format2`
---
  src/stat.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/stat.c b/src/stat.c
index 1513abfaa..47f3b5052 100644
--- a/src/stat.c
+++ b/src/stat.c
@@ -1974,5 +1974,7 @@ main (int argc, char *argv[])
 ? do_statfs (argv[i], format)
 : do_stat (argv[i], format, format2));
  
+  free(format);

+  free(format2);
main_exit (ok ? EXIT_SUCCESS : EXIT_FAILURE);
  }


Hi,

There are a could of problems with this.
1. It's redundant to free just before exit
2. It's invalid to free this memory with the -c format specified.

valgrind currently shows no leaks with or without -c
(it does show reachable blocks, but that's fine).

thanks,
Pádraig

bug#72232: "make dist" is not reproducible

2024-07-21 Thread Pádraig Brady


On 21/07/2024 18:58, Paul Eggert wrote:

On 2024-07-21 10:44, Pádraig Brady wrote:

We can just rely on the timestamp of the .tarball-version
to support reproducible _tarballs_.


Although this makes the distributed file contents reproducible, the
tarballs themselves are still not reproducible, since they contain
random timestamps and (I imagine) other metadata. I vaguely recall Bruno
having a general solution for that sort of problem but don't recall
where it got put. (I do something more complicated in TZDB.)


Fair enough.  That issue is discussed at:
https://www.gnu.org/software/tar/manual/html_section/Reproducibility.html

For now at least I'll adjust the summary
to say "tarball contents" rather than "tarballs".

thanks,
Pádraig

bug#72232: "make dist" is not reproducible

2024-07-21 Thread Pádraig Brady


On 21/07/2024 17:18, Bruno Haible wrote:

Hi,

Subsequent runs of "make dist" in the same environment produce tarballs with
different contents.

How to reproduce:

In a git checkout of coreutils, do:

$ echo snapshot > .tarball-version
$ ./configure; make -k maintainer-clean
$ ./bootstrap
$ ./configure; make dist
$ mkdir 1; (cd 1 && tar xf ../*.tar.gz)
$ make; make dist
$ mkdir 2; (cd 2 && tar xf ../*.tar.gz)
$ diff -r -q 1 2

The last command produces a difference:

$ diff -r -q 1 2
Files 1/coreutils-snapshot/.timestamp and 2/coreutils-snapshot/.timestamp differ
$ cat 1/coreutils-snapshot/.timestamp
1721577765
$ cat 2/coreutils-snapshot/.timestamp
1721577832

Can this be avoided? Can the contents of '.timestamp' always be the same?
(Wouldn't it be enough to give it a different modification time, each time?)


Right, we should be able to adjust this.
The .timestamp file was added to support reproducible _builds_:
https://github.com/coreutils/coreutils/commit/c1b3d6587
https://reproducible-builds.org/docs/source-date-epoch/

We can just rely on the timestamp of the .tarball-version
to support reproducible _tarballs_.
That's done in the attached, which I'll apply later.

Marking this as done.

thanks,
PádraigFrom ed1273b00b127cad9bb9867e0fd2be70d60bee1a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 21 Jul 2024 18:25:05 +0100
Subject: [PATCH] build: support reproducible tarball creation

We already support reproducible builds since commit v8.24-99-gc1b3d6587,
and this adjusts that change to also support reproducible tarballs with
subsequent runs of `make dist`.

* Makefile.am: Don't create a varying .timestamp file, instead ...
* man/local.mk: Rely on the timestamp of the .tarball-version file.
Fixes https://bugs.gnu.org/72232
---
 Makefile.am  | 2 --
 man/local.mk | 3 ++-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index bbbdc7829..3fd546599 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -82,14 +82,12 @@ BUILT_SOURCES = .version
 
 # Have no read-only files in the tarball to allow easy removal.
 # Have .tarball-version based versions only in tarball builds.
-# Have .timestamp based dates only in tarball builds.
 # The perl substitution is to change some key uses of "rm" to "/bin/rm".
 # See the rm_subst comment for details.
 # The touch avoids a subtle, spurious "make distcheck" failure.
 dist-hook: gen-ChangeLog
 	$(AM_V_GEN)chmod -R +rw $(distdir)
 	$(AM_V_GEN)echo $(VERSION) > $(distdir)/.tarball-version
-	$(AM_V_GEN)date +%s > $(distdir)/.timestamp
 	$(AM_V_at)perl -pi -e '$(rm_subst)' $(distdir)/Makefile.in
 	$(AM_V_at)touch $(distdir)/doc/constants.texi \
 	  $(distdir)/doc/coreutils.info
diff --git a/man/local.mk b/man/local.mk
index 24f88745a..1acd73fe7 100644
--- a/man/local.mk
+++ b/man/local.mk
@@ -194,7 +194,8 @@ endif
 	  && $(MKDIR_P) $$t		\
 	  && (cd $$t && $(LN_S) '$(abs_top_builddir)/src/'$$prog$(EXEEXT) \
 $$argv$(EXEEXT))			\
-	&& : $${SOURCE_DATE_EPOCH=`cat $(srcdir)/.timestamp 2>/dev/null || :`} \
+	&& : $${SOURCE_DATE_EPOCH=`date -r $(srcdir)/.tarball-version +%s \
+   2>/dev/null || :`} \
 	&& : $${TZ=UTC0} && export TZ	\
 	&& export SOURCE_DATE_EPOCH && $(run_help2man)			\
 		 --source='$(PACKAGE_STRING)'			\
-- 
2.45.2

bug#72024: documentation suggestion: make old versions of docs easily available

2024-07-10 Thread Pádraig Brady


On 09/07/2024 22:18, mark.yagnatinsky--- via GNU coreutils Bug Reports wrote:

If go to the website of say, Perl or Python, and I see there's a cool function I want to 
use, then often there will be a note "new in version X.Y".
Then I know that if I need to support versions older than that, I can't use it, 
and otherwise I can.
If there is no such note, then I also have the option of easily pulling up old 
versions of the docs.
So I can quickly grab the docs for the oldest version I care about, and see if 
the function exists there.

Most GNU software (e.g., bash, coreutils, etc.) makes the latest version easily 
available on the website, but not older versions.
(The older versions are indeed available, but not require downloading and 
unzipping the full source distribution, or a similar level of annoyance.)
This is sometimes fine, since if I need to support some ancient version, then 
presumably I have access to at least one machine where that version is 
installed, and that machine will often have docs installed too.
But it's still nice to know whether I'm "skating close to the edge" or whether 
I'm using stuff that was already supported last century.

Just my two cents.
Mark.


This is a fair point.

Now there should be less interface churn from version to version,
than a larger API surface like python etc., but it's still a valid point.
Currently one way to glean this info would be from the NEWS file:
https://raw.githubusercontent.com/coreutils/coreutils/master/NEWS
But that is awkward to search, and you're not sure did you miss
an item, or that it was always present.

Presenting this version info inline in the docs would be
too invasive I think, given it's not usually required,
but presenting versions to select would be a useful service.
Along the lines of the FreeBSD project for example:
https://man.freebsd.org/cgi/man.cgi?query=timeout

Now the above is more useful as it's a complete distro,
which suggests it may be more appropriate for each distro
to provide complete versioned online man pages.
Debian does in fact do this with selectable distro versions:
https://manpages.debian.org/

Actually I now see a distro agnostic versioned man page repo,
which would be the most general way to determine portability constraints:
https://manned.org/

cheers,
Pádraig.

bug#72012: [PATCH] id: Warn if user in more groups than `id` can reliably print

2024-07-10 Thread Pádraig Brady


tag 72012 notabug
close 72012
stop

On 10/07/2024 00:31, Paul Eggert wrote:

On 7/10/24 00:35, Pádraig Brady wrote:

OK so id(1) will always show all groups it knows about.
Then the warning message might be along the lines of:

    "warning: User '%s' is a member of more groups than the current
system limit"


I also am not seeing the point of the proposed diagnostic. I daresay
most users would be more annoyed than usefully warned by the diagnostic;
I know I would.

The rare user concerned about being in "too many" groups can run
'getconf NGROUPS_MAX' and 'id -G | wc -w' and compare.


Right.
Given that id can display all the groups,
it's not its responsibility to display potential limits from elsewhere.

cheers,
Pádraig

bug#72012: [PATCH] id: Warn if user in more groups than `id` can reliably print

2024-07-09 Thread Pádraig Brady


On 09/07/2024 21:15, Otto Kekäläinen wrote:

Hi!

The point is just to emit a warning when this happens. Sure it is rare but the 
fix is pretty safe to apply.


OK so id(1) will always show all groups it knows about.
Then the warning message might be along the lines of:

  "warning: User '%s' is a member of more groups than the current system limit"

For reference I see a summary of limits of various systems at:
https://www.j3e.de/ngroups.html

cheers,
Pádraig

bug#72012: [PATCH] id: Warn if user in more groups than `id` can reliably print

2024-07-09 Thread Pádraig Brady


On 09/07/2024 05:22, Otto Kekäläinen wrote:

While rare, it is possible for a user to be a member in more groups than
what the system limit allows (on Linux typically NGROUPS_MAX=65536) and
if that is the case, running `id` or `id user` will not print all of
them. This is a minor bug, but easily fixable by emitting a warning if
it happens.
---
  src/id.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/src/id.c b/src/id.c
index 38d5517bd..c572b2d99 100644
--- a/src/id.c
+++ b/src/id.c
@@ -401,6 +401,13 @@ print_full_info (char const *username)
  ok &= false;
  return;
}
+else if (sysconf(_SC_NGROUPS_MAX) > 0 && n_groups > 
sysconf(_SC_NGROUPS_MAX))
+  {
+fprintf (stderr,
+ _("Warning: User '%s' may be member of more groups than "\
+   "the system allows\n"),
+ (username != NULL) ? username : "");
+  }
  
  if (n_groups > 0)

fputs (_(" groups="), stdout);


I'm a bit confused with this patch.
If the n_groups is larger than NGROUPS_MAX what consequence will it have?
I.e. is there any point to id(1) warning about this edge case?
id will be able to show all of the n_groups in this case right?
I interpret NGROUPS_MAX as a static limit(ation),
which more dynamic interfaces (like getgrouplist) are not constrained to.

cheers,
Pádraig

bug#71986: RFC: date @ to support ms.

2024-07-08 Thread Pádraig Brady


On 07/07/2024 20:46, Richard Neill wrote:

Hello,

I've noticed a lot of systems now return the timestamp in milliseconds
since the epoch, rather than seconds. This means that e.g.

date --date='@1720378861258'

will do something rather unexpected!

May I suggest that it would be nice if date had an input format that
would let me specify that the value is in ms?

I know we can bodge it with bc, or by injecting the decimal, or trimming
off the last 3 chars, but that seems inelegant, and requires extra
thinking (and hence bugs) from the programmer.
date --date='@1720378861.258'

Perhaps one of these syntaxes might be suitable?

   date --date='@ms1720378861258'
   date --date='@@1720378861258'
   date --epoch-ms --date='@1720378861258'


Yes this has some merit, but given we can leverage numfmt
to convert / round, I'm not sure it's warranted.  Consider for e.g.:

  $ ms2date() { date --date=@$(numfmt --to-unit=1K --round=nearest "$1"); }  $ 
ms2date 1720378861999
  Sun 07 Jul 2024 20:01:02 IST

cheers,
Pádraig

bug#71895: Test failure in Coreutils 9.4 [During LFS Build]

2024-07-02 Thread Pádraig Brady


On 02/07/2024 02:05, 0xn00dle via GNU coreutils Bug Reports wrote:

Hello,

While running the test for Coreutils during my LFS build (Chapter 8.57 
Coreutils-9.4: 
https://www.linuxfromscratch.org/lfs/view/12.1/chapter08/coreutils.html), I 
encountered a potential bug and the test recommended I report it.

LFS 12.1 Build, on Pop_OS! Host System
Coreutils Verson: 9.4
Kernel: 6.8.0-76060800daily20240311-generic
Architecture: x86_64

It occurred after running:

su tester -c "PATH=$PATH make RUN_EXPENSIVE_TESTS=yes check"

The fail details:

FAIL: tests/tty/tty
===




--
tty: extra operand 'a'
Try 'tty --help' for more information.
+ returns 2 tty -s a
tty: extra operand 'a'
Try 'tty --help' for more information.
+ test -w /dev/full
+ test -c /dev/full
+ test -t 0
+ returns 3 tty
tty: write error: No space left on device
+ returns 3 tty
tty: write error: No space left on device
+ Exit 1
+ set +e
+ exit 1
+ exit 1
+ removetmp
+ _st=1
+ cleanup
+ :
+ test '' = yes
+ cd /sources/coreutils-9.4
+ chmod -R u+rwx /sources/coreutils-9.4/gt-tty.sh.xcFx
+ rm -rf /sources/coreutils-9.4/gt-tty.sh.xcFx
+ exit 1FAIL tests/tty/tty.sh (exit status: 1)

I applied coreutils-9.4-il8n-1.patch and a fix for a vulnerability in the split 
utility prior to the test.

Best,

Noelle



I can't repro.
Can you repro this test in isolation with:

  su tester -c 'make check VERBOSE=yes TESTS=tests/tty/tty.sh SUBDIRS=.'

Note some of the output of the test seemed to be elided,
so I wasn't able to see exactly the line that failed
(No space left on device is an expected case and not the issue here).

thanks,
Pádraig

bug#71803: ls --time=mtime is sorting by name instead of mtime

2024-06-27 Thread Pádraig Brady


On 27/06/2024 16:05, Dave wrote:

The ls command without the -l option and with the --time=mtime option,
is incorrectly sorting by the name rather than by the modification
time.

ls   # sorts by name (ok)
ls --time=mtime  # sorts by name (should sort by mtime)



// The current statement in ls.c (lines 2383-2387)
   sort_type = (0 <= sort_opt ? sort_opt
: (format != long_format
   && (time_type == time_ctime || time_type == time_atime
   || time_type == time_btime))
? sort_time : sort_name);

// Proposed correction (untested)
   sort_type = (0 <= sort_opt ? sort_opt
: (format != long_format
   && (time_type == time_ctime || time_type == time_atime
   || time_type == time_btime || time_type == time_mtime))
? sort_time : sort_name);


Right, we should be applying this GNU extension to --time=mtime also.
I.e. sorting when not displaying time (-l not specified),
and no other sorting specific option is used.

The proposed fix wouldn't work as time_mtime is the default,
so we'd be sorting by mtime rather than name by default.

I'll apply the attached later to address this.

Marking this as done.

thanks,
PádraigFrom b95c6ac7d20084d9167bdb9d759bcebcb0c2b766 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Thu, 27 Jun 2024 18:15:02 +0100
Subject: [PATCH] ls: treat --time=mtime consistently with other time selectors

* src/ls.c: Track if --time=mtime is explicitly specified,
so that we can apply the GNU extension of sorting by the
specified time, when not displaying (-l not specified),
and not explicitly sorting (-t not specified).
* tests/ls/ls-time.sh: Add / Update test cases.
Fixes https://bugs.gnu.org/71803
---
 src/ls.c| 18 --
 tests/ls/ls-time.sh | 30 --
 2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/src/ls.c b/src/ls.c
index 5e2f7c3e7..f7ffe1086 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -468,6 +468,7 @@ enum time_type
   };
 
 static enum time_type time_type;
+static bool explicit_time;
 
 /* The file characteristic to sort by.  Controlled by -t, -S, -U, -X, -v.
The values of each item of this enum are important since they are
@@ -1943,6 +1944,7 @@ decode_switches (int argc, char **argv)
 
 case 'c':
   time_type = time_ctime;
+  explicit_time = true;
   break;
 
 case 'd':
@@ -2017,6 +2019,7 @@ decode_switches (int argc, char **argv)
 
 case 'u':
   time_type = time_atime;
+  explicit_time = true;
   break;
 
 case 'v':
@@ -2146,6 +2149,7 @@ decode_switches (int argc, char **argv)
 
 case TIME_OPTION:
   time_type = XARGMATCH ("--time", optarg, time_args, time_types);
+  explicit_time = true;
   break;
 
 case FORMAT_OPTION:
@@ -2372,18 +2376,12 @@ decode_switches (int argc, char **argv)
   if (eolbyte < dired)
 error (LS_FAILURE, 0, _("--dired and --zero are incompatible"));
 
-  /* If -c or -u is specified and not -l (or any other option that implies -l),
- and no sort-type was specified, then sort by the ctime (-c) or atime (-u).
- The behavior of ls when using either -c or -u but with neither -l nor -t
- appears to be unspecified by POSIX.  So, with GNU ls, '-u' alone means
- sort by atime (this is the one that's not specified by the POSIX spec),
- -lu means show atime and sort by name, -lut means show atime and sort
- by atime.  */
+  /* If a time type is explicitly specified (with -c, -u, or --time=)
+ and we're not showing a time (-l not specified), then sort by that time,
+ rather than by name.  Note this behavior is unspecified by POSIX.  */
 
   sort_type = (0 <= sort_opt ? sort_opt
-   : (format != long_format
-  && (time_type == time_ctime || time_type == time_atime
-  || time_type == time_btime))
+   : (format != long_format && explicit_time)
? sort_time : sort_name);
 
   if (format == long_format)
diff --git a/tests/ls/ls-time.sh b/tests/ls/ls-time.sh
index f27a5dbbe..be8a83556 100755
--- a/tests/ls/ls-time.sh
+++ b/tests/ls/ls-time.sh
@@ -33,11 +33,15 @@ u2='1998-01-14 12:00'
 u3='1998-01-14 13:00'
 
 touch -m -d "$t3" a || framework_failure_
-touch -m -d "$t2" b || framework_failure_
+touch -m -d "$t2" B || framework_failure_  # Capital to distinguish name sort
 touch -m -d "$t1" c || framework_failure_
 
+# Check default name sorting works
+set $(ls a B c)
+test "$*" = 'B a c' || fail=1
+
 touch -a -d "$u3" c || framework_failure_
-touch -a -d "$u2" b || framework_failure_
+touch -a -d "$u2" B || framework_failure_
 # Make sure A has ctime at least 1 second more recent than C's.
 sleep 2
 touch -a -d "$u1" a || framework_failure_
@@ -47,7 +51,9 @@ touch -a -d "$u1" a || framework_failure_
 
 
 # A has ctime more recent

bug#71242: PR: src/stat.c - add magic for bcachefs

2024-05-28 Thread Pádraig Brady


On 28/05/2024 12:08, Paul Hedderly via GNU coreutils Bug Reports wrote:

Currently stat does not know bcachefs

$:~/gits/coreutils$ sudo stat -f -c %T /
UNKNOWN (0xca451a4e)

But it just needs to know that magic

$:~/gits/coreutils$ sudo ./src/stat -f -c %T /
bcachefs


Please pull this small commit:

https://github.com/coreutils/coreutils/commit/6da16c99ff0c6ec35943dd2db5f51c25efb8b508


I pushed the following before receiving this bug report:
https://github.com/coreutils/coreutils/commit/3ce31e6f1

Marking this as done.

thanks,
Pádraig

bug#71191: od does unnecessary reads instead of seeking

2024-05-25 Thread Pádraig Brady


On 25/05/2024 08:05, Joseph C. Sible wrote:

Consider running the following command, for looking at the pagemap
bits of a given memory page:

od -t x8 -N 8 -j 34359738368 /proc/PID/pagemap

That file supports seeking, but od will unnecessarily read and discard
32GB worth of data instead of doing so.

Looking at the skip function in od.c, it looks like this happens
because the file has a bogus st_size of 0 (as is typical for files in
/proc) despite usable_st_size returning true for it, which results in
the calls to fseeko never even being tried.


This is related to https://bugs.gnu.org/36291
Note the workaround with dd mentioned there ¹.
In that report I detailed four cases of /proc /sys and /dev files
that od needs to handle, and suggested we try the lseek() for
the non regular file case (where we don't worry about seeking past EOF).

As an aside we can actually _seek_ in /proc and /sys files
¹ dd used to report an error when seeking in such files,
when in fact seeking is actually supported. The warning was removed
(for all empty files) in https://github.com/coreutils/coreutils/commit/ac6b8d822
(as part of https://bugs.gnu.org/70231).
But this case is more complicated by the fact that od supports combining files,
so you need to avoid seeking beyond EOF.
I.e. `od --skip-bytes 1 empty nonempty` would give different results
if we blindly tried an lseek() if st_stize==0

I wonder could you efficiently seek to offset, avoiding seeking past EOF,
while ignoring st_size with something like:
  if (lseek(N-1))
if read(1)
  skip -= N
else
  lseek(-(N-1))
I'm not sure how general that would be.

cheers,
Pádraig

bug#71186: email address for patch

2024-05-25 Thread Pádraig Brady


tab 71186 notabug
close 71186
stop

On 24/05/2024 19:15, Данила Скачедубов wrote:

Hi! I have a few questions. Please tell me, I want to compile the main
files so that they are output by the man command. As far as I
understand, this is done by the help2man-1.48.5 command. But at the
same time, I can't compile correctly, let's say chown.x > chown.1, What
could be the problem? And the second question: I carefully studied the
documentation on the rules for preparing patches and it said that I
could get the address by running --help, but I did not see the email
address. Maybe it's him- [1]coreut...@gnu.org ? Thank you very much in
advance for the answers!

--
Danila Skachedubov

References

1. https://e.mail.ru/compose?To=coreut...@gnu.org


Proposed bug fixes go to bug-gnu...@gnu.org
General discussions go to coreut...@gnu.org

Usage for help2man is:

  help2man chown -i chown.x > chown.1

cheers,
Pádraig

bug#71029: single-binary build broken on systems with separate libintl

2024-05-21 Thread Pádraig Brady


On 18/05/2024 18:09, Audrey Dutcher wrote:

I presume you have the same issue with coreutils 9.4 ?


Correct.


I presume the problematic flags are propagated through LIBINTL or MBRTOWC_LIB. 
What are those set to for reference in your Makefile?


On plain FreeBSD it is set to /usr/local/lib/libintl.so -Wl,-rpath
-Wl,/usr/local/lib. Within a nix jail on FreeBSD (which is where I
encountered the issue first) it is simply -lintl.

Thanks for getting back to me!
- Audrey


Could you test with the attached,
which should reenable automake protections in this area.

cheers,
PádraigFrom 5691ff399e824c79090f2c7f3d08218fde4f9560 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Tue, 21 May 2024 13:08:45 +0100
Subject: [PATCH] build: fix build failure in --enable-single-binary mode

* src/local.mk: Avoid overriding automake generated DEPENDENCIES,
so that it applies its adjustments to LDADD to avoid propagating
flags (like -Wl,-rpath) into make targets.  This was seen on FreeBSD
where LIBINTL is set to:
/usr/local/lib/libintl.so -Wl,-rpath -Wl,/usr/local/lib
Instead let automake generate a sanitized src_coreutils_DEPENDENCIES
(based on LDADD), which we then augment with the EXTRA_... variable.
---
 src/local.mk | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/local.mk b/src/local.mk
index 3356f8a2a..8133925ac 100644
--- a/src/local.mk
+++ b/src/local.mk
@@ -482,13 +482,13 @@ if SINGLE_BINARY
 src_coreutils_CFLAGS = -DSINGLE_BINARY $(AM_CFLAGS)
 #src_coreutils_LDFLAGS = $(AM_LDFLAGS)
 src_coreutils_LDADD = $(single_binary_deps) $(LDADD) $(single_binary_libs)
-src_coreutils_DEPENDENCIES = $(LDADD) $(single_binary_deps)
+EXTRA_src_coreutils_DEPENDENCIES = $(single_binary_deps)
 
 include $(top_srcdir)/src/single-binary.mk
 
 # Creates symlinks or shebangs to the installed programs when building
 # coreutils single binary.
-EXTRA_src_coreutils_DEPENDENCIES = src/coreutils_$(single_binary_install_type)
+EXTRA_src_coreutils_DEPENDENCIES += src/coreutils_$(single_binary_install_type)
 endif SINGLE_BINARY
 
 CLEANFILES += src/coreutils_symlinks
-- 
2.44.0

bug#71029: single-binary build broken on systems with separate libintl

2024-05-18 Thread Pádraig Brady


On 17/05/2024 17:11, Audrey Dutcher wrote:

On my FreeBSD system, downloading coreutils-9.5.tar.xz, then building
with `./configure --enable-single-binary && make` does not succeed,
with the error message `don't know how to make -Wl,-rpath. Stop`

I believe the root cause of this is
https://github.com/coreutils/coreutils/blob/52e024b/src/local.mk#L485,
which mixes DEPENDENCIES and LDADD. This seems bad, for very relevant
reasons!


I presume you have the same issue with coreutils 9.4 ?

I presume the problematic flags are propagated through LIBINTL or MBRTOWC_LIB.
What are those set to for reference in your Makefile?

I'll have a look at cleaning this up.

thanks,
Pádraig

bug#70946: [PATCH] doc: improve the man page for sleep

2024-05-15 Thread Pádraig Brady


On 15/05/2024 03:46, Nikolaos Chatzikonstantinou wrote:

On Tue, May 14, 2024 at 4:03 PM Nikolaos Chatzikonstantinou
 wrote:


On Tue, May 14, 2024, 3:59 PM Pádraig Brady  wrote:


On 14/05/2024 17:36, Nikolaos Chatzikonstantinou wrote:

See attachment.


Well just above your new mention of floating-point, we have:
"NUMBER need not be an integer".
How about I adjust your patch to adjust that text to say:
"NUMBER can be an integer or floating-point".

Sounds good! Either I missed it or my system had an older man page.


If I may suggest, instead write it like this:

 Pause for NUMBER seconds, where NUMBER is an integer or floating-point.

Then the sentence "NUMBER need not be an integer" can be removed. Of
course the info manual shows more details, but I think that note in
the man page will do. Let me know if you'd rather that I write and
submit a patch myself instead.


OK I pushed a change in your name, changing from:

Pause for NUMBER seconds.  SUFFIX may be 's' for seconds (the default),'m' for 
minutes, 'h' for hours or 'd' for days.  NUMBER need not be an
integer.  Given two or more arguments, pause for the amount of time
specified by the sum of their values.

to:

Pause for NUMBER seconds, where NUMBER is an integer or floating-point.
SUFFIX may be 's','m','h', or 'd', for seconds, minutes, hours, days.
With multiple arguments, pause for the sum of their values.

Marking this as done.

thanks,
Pádraig

bug#70946: [PATCH] doc: improve the man page for sleep

2024-05-14 Thread Pádraig Brady


On 14/05/2024 17:36, Nikolaos Chatzikonstantinou wrote:

See attachment.


Well just above your new mention of floating-point, we have:
"NUMBER need not be an integer".
How about I adjust your patch to adjust that text to say:
"NUMBER can be an integer or floating-point".

cheers,
Pádraig.

bug#70887: In coreutils 9.5, "cp" command does not honor "--interactive" option

2024-05-12 Thread Pádraig Brady


On 12/05/2024 16:06, Paul Eggert wrote:

On 2024-05-12 04:49, Pádraig Brady wrote:


@@ -1151,7 +1151,8 @@ main (int argc, char **argv)
  {
/* Default cp operation.  */
x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
  }
else if (update_opt == UPDATE_NONE)
  {
@@ -1166,7 +1167,8 @@ main (int argc, char **argv)
else if (update_opt == UPDATE_OLDER)
  {
x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;


Thanks for looking into this messy area. Here is a comment from another
pair of eyes.

Could you elaborate a bit more about why these two bits of code change
x.interactive at all? That is, why doesn't update_opt simply affect
x.update? Why does update_opt bother to override a previous setting of
x.interactive to I_ALWAYS_YES, I_ALWAYS_NO, or I_ALWAYS_SKIP?

Another way to put it: shouldn't x.update simply reflect the value of
the --update option, whereas x.interactive reflects reflects whether -f,
-i, -n are used? Although this would require changes to copy.c, it'd
make the code easier to follow.


I agree that some refactoring would be good here.
At least x.update should be renamed to x.update_older.

As interactive selection, and file dates all relate
to selecting which files to update, it's tempting to conflate the settings.
However you're right that this introduces complexities when
trying to avoid all inconsistencies. Currently for example:
  $ cp -v -i --update=none new old  # Won't prompt as expected
  $ cp -v --update=none -i new old  # Unexpectedly ignores update option

So yes we should separate these things.


Another way to put it: why should, for example, --update=all override a
previous -f or (partly) -n but not a previous -i?


Right -f is significant for mv here (for completeness -f for cp is a separate 
thing).
I.e. we need to treat I_ALWAYS_YES specially in mv with the current scheme.

BTW -n is not overridden by any --update option currently,
and this change effectively applies the same logic to -i now.

thanks,
Pádraig

bug#70887: In coreutils 9.5, "cp" command does not honor "--interactive" option

2024-05-12 Thread Pádraig Brady


On 12/05/2024 00:03, Robert Hill wrote:

After upgrading coreutils from 9.0 to 9.5, the following change occurred:

In coreutils 9.0, the command "cp -Tipruvx /src-dir /dst-dir" requested
interactive confirmation before replacing an old destination file with a
newer source file, as expected.

In coreutils 9.5, the command "cp -Tipruvx /src-dir /dst-dir" no longer
requests interactive confirmation, but just goes ahead and replaces old
destination files with newer source files, which is not expected.

Thank you in advance for looking at this, Bob.


Right.

The thinking was for 9.3 that the new long form --update={older,all} options
would override a previous -i, especially as -i was commonly set in root
users' cp and mv aliases on Red Hat flavored distros.
Then in 9.5 we expanded this so -u behaved the same as --update=older.

In retrospect, users can avoid these aliases in various ways,
and the protective -i option should really combine with -u
rather than being overridden by it.

For completeness, -i following -u would always reinstate the protection.

The attached changes the behavior back to that -i is never overridden.

thanks,
PádraigFrom 2ec640a3d2719ac0204d4976baabe6082b85e502 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 12 May 2024 12:21:19 +0100
Subject: [PATCH] cp,mv: ensure -i is not overridden by -u

Have -i combine with -u instead.
In coreutils 9.3 we had --update={all,older} override -i
In coreutils 9.5 this was expanded to -u
(to make it consistent with --update=older).

* NEWS: Mention the bug fix.
* src/cp.c (main): Don't have -u disable prompting.
* src/mv.c (main): Likewise.
* tests/cp/cp-i.sh: Add a test case.
* tests/mv/update.sh: Likewise.
Fixes https://bugs.gnu.org/70887
---
 NEWS   |  3 +++
 src/cp.c   |  6 --
 src/mv.c   |  6 --
 tests/cp/cp-i.sh   | 13 +
 tests/mv/update.sh |  3 +++
 5 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index febb9ac68..430c2ec54 100644
--- a/NEWS
+++ b/NEWS
@@ -8,6 +8,9 @@ GNU coreutils NEWS-*- outline -*-
   rejected as an invalid option.
   [bug introduced in coreutils-9.5]
 
+  cp,mv --update no longer overrides --interactive.
+  [bug introduced in coreutils-9.3]
+
   ls and printf fix shell quoted output in the edge case of escaped
   first and last characters, and single quotes in the string.
   [bug introduced in coreutils-8.26]
diff --git a/src/cp.c b/src/cp.c
index 06dbad155..cae4563cb 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -1151,7 +1151,8 @@ main (int argc, char **argv)
 {
   /* Default cp operation.  */
   x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
   else if (update_opt == UPDATE_NONE)
 {
@@ -1166,7 +1167,8 @@ main (int argc, char **argv)
   else if (update_opt == UPDATE_OLDER)
 {
   x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
 }
   break;
diff --git a/src/mv.c b/src/mv.c
index 692943a70..0162cd728 100644
--- a/src/mv.c
+++ b/src/mv.c
@@ -394,7 +394,8 @@ main (int argc, char **argv)
 {
   /* Default mv operation.  */
   x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
   else if (update_opt == UPDATE_NONE)
 {
@@ -409,7 +410,8 @@ main (int argc, char **argv)
   else if (update_opt == UPDATE_OLDER)
 {
   x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
 }
   break;
diff --git a/tests/cp/cp-i.sh b/tests/cp/cp-i.sh
index f99f743dc..28969f9c8 100755
--- a/tests/cp/cp-i.sh
+++ b/tests/cp/cp-i.sh
@@ -70,4 +70,17 @@ returns_ 1 cp -bn c d 2>/dev/null || fail=1
 returns_ 1 cp -b --update=none c d 2>/dev/null || fail=1
 returns_ 1 cp -b --update=none-fail c d 2>/dev/null || fail=1
 
+# Verify -i combines with -u,
+echo old > old || framework_failure_
+touch -d yesterday old || framework_failure_
+echo new > new || framework_failure_
+# coreutils 9.3 had --update={all,older} ignore -i
+echo n | returns_ 1 cp -vi --update=older new old 2>/dev/null >out8 || fail=1
+compare /dev/null out8 || fail=1
+echo n | returns_ 1 cp -vi --update=all new old 2>/dev/null >out8 || fail=1
+compare /dev/null out8 || fail=1
+# coreutils 9.5 also had -u ignore -i
+echo n | returns_ 1 cp -vi -

bug#70873: join --return-error-if-any-unmatched-lines

2024-05-11 Thread Pádraig Brady


On 11/05/2024 10:14, Dan Jacobson wrote:

join should have an option to return an error value in the shell's $?
if any lines are not matched.

Currently the man page doesn't even mention a return value. So it is not
set in stone yet.

Currently one must save -v output in a file then use test -s do detect
if there were any non-matched lines. And then exit one script with
non-zero. Big hassle.

join (GNU coreutils) 9.4


This does seem to have some merit.
Perhaps --check-pairable similar to the existing --check-order option.

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-05-09 Thread Pádraig Brady


On 13/04/2024 18:42, Pádraig Brady wrote:

On 13/04/2024 17:39, Pádraig Brady wrote:

install(1) defaults to mode 600 for new files, and uses set_acl() with that
(since 2007 https://github.com/coreutils/coreutils/commit/f634e8844 )
The psuedo code that install(1) uses is:

copy_reg()
 if (x->set_mode) /* install */
   set_acl(dest, x->mode /* 600 */)
 ctx->acl = acl_from_mode ( /* 600 */)
 acl_set_fd (ctx->acl) /* fails EACCES */
 if (! acls_set)
must_chmod = true;
 if (must_chmod)
   saved_errno = EACCES;
   chmod (ctx->mode /* 600 */)
   if (save_errno)
 return -1;

This issue only only seems to be on CIFS.
I'm seeing lot of weird behavior with ACLs there:

 acl_set_fd (acl_from_mode (600)) -> EACCES
 acl_set_fd (acl_from_mode (755)) -> EINVAL
 getxattr ("system.posix_acl_access") -> EOPNOTSUPP

Note we ignore EINVAL and EOPNOTSUPP errors in set_acl(),
and it's just the EACCES that's problematic.
Note this is quite similar to https://debbugs.gnu.org/65599
where Paul also noticed EACCES with fsetxattr() (and others) on CIFS.

The attached is a potential solution which I tested as working
on the same matoro system that Bruno used.

I think I'll apply that after thinking a bit more about it.


Actually that probably isn't appropriate,
as fsetxattr() can validly return EACESS
even though not documented in the man page at least.
The following demonstrates that:
.
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main(void)
{
  int wfd;
  /* Note S_IWUSR is not set.  */
  if ((wfd=open("writable", O_CREAT|O_WRONLY|O_EXCL, S_IRUSR)) == -1)
  fprintf(stderr, "open('writable') error [%m]\n");
  if (write(wfd, "data", 1) == -1)
  fprintf(stderr, "write() error [%m]\n");
  if (fsetxattr(wfd, "user.test", "test", 4, 0) == -1)
  fprintf(stderr, "fsetxattr() error [%m]\n");
}

Another solution might be to file_has_acl() in copy.c
and skip if "not supported" or nothing is returned.
The following would do that, but I'm not sure about this
(as I'm not sure about ACLs in general TBH).
Note listxattr() returns 0 on CIFS here,
while getxattr ("system.posix_acl_access") returns EOPNOTSUPP,
and file_has_acl() uses listxattr() first.

diff --git a/src/copy.c b/src/copy.c
index d584a27eb..2145d89d5 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1673,8 +1673,13 @@ set_dest_mode:
   }
 else if (x->set_mode)
   {
-  if (set_acl (dst_name, dest_desc, x->mode) != 0)
-return_val = false;
+  errno = 0;
+  int n = file_has_acl (dst_name, &sb);
+  if (0 < n || (errno && ! (is_ENOTSUP (errno) || errno == ENOSYS)))
+{
+  if (set_acl (dst_name, dest_desc, x->mode) != 0)
+return_val = false;
+}
   }


BTW I'm surprised this wasn't reported previously for CIFS,
so I due to this bug and https://debbugs.gnu.org/65599
I suspect a recentish change in CIFS wrt EACCES.


Thinking more about this, the solution presented above wouldn't work,
and I think this needs to be addressed in CIFS.
I.e. we may still need to reset the mode even if the file has no ACLs,
as generally only dirs get default ACLs copied, as demonstrated below:

$ mkdir acl$ setfacl -d -m o::rw acl
$ getfacl acl
# file: acl
# owner: padraig
# group: padraig
user::rwx
group::r-x
other::r-x
default:user::rwx
default:group::r-x
default:other::rw-

$ touch acl/file
$ ls -l acl/file
-rw-r--rw-. 1 padraig padraig 0 May  9 13:11 acl/file
$ getfacl acl/file
# file: acl/file
# owner: padraig
# group: padraig
user::rw-
group::r--
other::rw-

cheers,
Pádraig.

bug#70814: pwd(1) man page: Unclear "avoid" wording

2024-05-07 Thread Pádraig Brady


On 07/05/2024 05:19, Bruce Jerrick wrote:

This wording in the pwd(1) man page is unclear:

     -P, --physical
     avoid all symlinks

"resolve all symlinks" would be much better wording.


Pushed that in your name.

Marking this as done.

thanks,
Pádraig

bug#70801: Weird behaviour when standard input is closed

2024-05-06 Thread Pádraig Brady


On 06/05/2024 08:38, Bernard Burette wrote:

Hi,



If I try:

$ cat <&-

cat: -: Bad file descriptor

cat: closing standard input: Bad file descriptor

$



The error on stdin beign closed is displayed twice plus "-" is for a FILE 
argument to replace standard input, It would make more sense to me to have someting like:

$ cat <&-

cat: standard input: Bad file descriptor

$



This is the "git diff" of my proposed change:

diff --git a/src/cat.c b/src/cat.c

index b33faeb35..4be189d85 100644

--- a/src/cat.c

+++ b/src/cat.c

@@ -50,6 +50,14 @@

/* Name of input file.  May be "-".  */

static char const *infile;



+/* Pretty name of input file */

+static char const *

+quotef_infile (void)

+{

+  if (STREQ (infile, "-")) return _("standard input");

+  return quotef (infile);

+}

+

/* Descriptor on which input file is open.  */

static int input_desc;



@@ -164,7 +172,7 @@ simple_cat (char *buf, idx_t bufsize)

    size_t n_read = safe_read (input_desc, buf, bufsize);

    if (n_read == SAFE_READ_ERROR)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    return false;

  }



@@ -313,7 +321,7 @@ cat (char *inbuf, idx_t insize, char *outbuf, idx_t outsize,

    size_t n_read = safe_read (input_desc, inbuf, insize);

    if (n_read == SAFE_READ_ERROR)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    write_pending (outbuf, &bpout);

    newlines2 = newlines;

    return false;

@@ -526,7 +534,7 @@ copy_cat (void)

  || errno == EBADF || errno == EXDEV || errno == ETXTBSY

  || errno == EPERM)

    return 0;

-    error (0, errno, "%s", quotef (infile));

+    error (0, errno, "%s", quotef_infile());

  return -1;

    }

}

@@ -684,7 +692,7 @@ main (int argc, char **argv)

    input_desc = open (infile, file_open_mode);

    if (input_desc < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

    continue;

  }

@@ -692,7 +700,7 @@ main (int argc, char **argv)



    if (fstat (input_desc, &stat_buf) < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

    goto contin;

  }

@@ -719,7 +727,7 @@ main (int argc, char **argv)

  }

    if (exhausting)

  {

-  error (0, 0, _("%s: input file is output file"), quotef 
(infile));

+  error (0, 0, _("%s: input file is output file"), 
quotef_infile());

    ok = false;

    goto contin;

  }

@@ -794,7 +802,7 @@ main (int argc, char **argv)

  contin:

    if (!reading_stdin && close (input_desc) < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

  }

  }

@@ -807,7 +815,8 @@ main (int argc, char **argv)

  }



    if (have_read_stdin && close (STDIN_FILENO) < 0)

-    error (EXIT_FAILURE, errno, _("closing standard input"));

+    if (ok)

+  error (EXIT_FAILURE, errno, _("closing standard input"));



    return ok ? EXIT_SUCCESS : EXIT_FAILURE;

}



There are a lot more of coreutil utilities that (could) behave weird on a 
closed standard input, here is my first list:

b2sum base32 base64 cat cksum comm csplit cut expand factor fmt fold head join 
md5sum nl nohup numfmt od paste pr ptx sha1sum sha224sum sha256sum sha384sum 
sha512sum shred shuf sort split stdbuf stty sum tac tail tee tr tsort unexpand 
uniq wc





I would like to work on those as well but it would help to have:

1) some kind of consensus on "this is a better way of displaying the error", 
for example other tools also go wild like:

  grep: (standard input): Bad file descriptor

  - why parentheses?

  sed: read error on stdin: Bad file descriptor

  - stdin is the C name for standard input, documented?

2) a better way of offering changes than sending a diff patch in a e-mail.


Yes it would be good to fix up at least the duplicated error.
We did the same on the write side of things with:
https://github.com/coreutils/coreutils/commit/0b2ff7637
Also we generally checked that readers diagnosed errors in:
https://github.com/coreutils/coreutils/blob/master/tests/misc/read-errors.sh

So something similar could probably be done here.

cheers,
Pádraig

bug#70727: cp doesn't recognize the new --update=none-fail option

2024-05-03 Thread Pádraig Brady


On 03/05/2024 05:12, Attila Fidan via GNU coreutils Bug Reports wrote:

Hi,

I wanted to use the new cp --update=none-fail option introduced in 9.5,
but it said "invalid argument ‘none-fail’ for ‘--update’". It turns out
that the commit (49912bac286eb3c0ef7d1567ae790193ad5eb2e8) adding it
forgot to add the new operation to update_type[] and
update_type_string[] in cp.c like it did for mv.c. After patching
coreutils locally the functionality works as expected.

It seems like the test suite didn't catch this because there's no
cp/update.sh test like there is for mv. There's a test for if using
--backup and --update=none-fail are mutually exclusive by checking if cp
returns 1, but an invalid argument also makes cp return 1 :)

I didn't include a patch in case a change to the test suite is wanted,
but the proposed change is tiny and rather obvious.


Well that's embarrassing.
I implemented in cp first, tested that manually,
then must have messed up that hunk when rebasing.

The attached should fix this.

Marking this as done.

thanks,
Pádraig.From de49e993ea8b6dcdf6cada3c0f44a6371514f952 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Fri, 3 May 2024 10:18:50 +0100
Subject: [PATCH] cp: actually support --update=none-fail

* src/cp.c: Add the entries for the --update=none-fail option.
* tests/mv/update.sh: Add a test case.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/70727
---
 NEWS   |  4 
 src/cp.c   |  4 ++--
 tests/mv/update.sh | 11 +++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 389f72516..7e8ccb34f 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,10 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Bug fixes
 
+  cp fixes support for --update=none-fail, which would have been
+  rejected as an invalid option.
+  [bug introduced in coreutils-9.5]
+
   ls and printf fix shell quoted output in the edge case of escaped
   first and last characters, and single quotes in the string.
   [bug introduced in coreutils-8.26]
diff --git a/src/cp.c b/src/cp.c
index 28b0217db..06dbad155 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -104,11 +104,11 @@ ARGMATCH_VERIFY (reflink_type_string, reflink_type);
 
 static char const *const update_type_string[] =
 {
-  "all", "none", "older", nullptr
+  "all", "none", "none-fail", "older", nullptr
 };
 static enum Update_type const update_type[] =
 {
-  UPDATE_ALL, UPDATE_NONE, UPDATE_OLDER,
+  UPDATE_ALL, UPDATE_NONE, UPDATE_NONE_FAIL, UPDATE_OLDER,
 };
 ARGMATCH_VERIFY (update_type_string, update_type);
 
diff --git a/tests/mv/update.sh b/tests/mv/update.sh
index 164357803..39ff677b9 100755
--- a/tests/mv/update.sh
+++ b/tests/mv/update.sh
@@ -38,6 +38,17 @@ for interactive in '' -i; do
   done
 done
 
+# These should accept all options
+for update_option in '--update' '--update=older' '--update=all' \
+ '--update=none' '--update=none-fail'; do
+
+  touch file1 || framework_failure_
+  mv $update_option file1 file2 || fail=1
+  test -f file1 && fail=1
+  cp $update_option file2 file1 || fail=1
+  rm file1 file2 || framework_failure_
+done
+
 # These should perform the rename / copy
 for update_option in '--update' '--update=older' '--update=all' \
  '--update=none --update=all'; do
-- 
2.44.0

bug#70714: realpath no error for unreadable-symlink

2024-05-02 Thread Pádraig Brady


On 02/05/2024 07:16, Nineteendo INC wrote:

coreutils version: stable 9.5 (bottled)
OS version: macOS 13.6.6 (22G630)

`realpath` doesn’t behave correctly for unreadable symlinks:

wannes@Stefans-iMac ~ % ln -s . src
wannes@Stefans-iMac ~ % grealpath -e src/..
/Users
wannes@Stefans-iMac ~ % chmod -h 000 src
wannes@Stefans-iMac ~ % grealpath -e src/..
/Users/wannes

Expected behaviour:

wannes@Stefans-iMac ~ % grealpath -e src/..
grealpath: src/..: Permission denied


Right, looks like we'll have to cater for EACCES on darwin.
I'll have a look

bug#70699: Possible bug in /usr/bin/paste

2024-05-01 Thread Pádraig Brady


On 01/05/2024 15:28, Art Shelest via GNU coreutils Bug Reports wrote:

Good morning,

I am seeing an aberrant behavior from the /usr/bin/paste utility when working 
with Windows-style CR/LF text files.
The repro is for Mint Mate (Virginia).

If I change the line endings in the first file to Unix format (LF), it works as 
expected.
If I change the line endings to Max (CR), it breaks even worse.

$ hexdump -C letters.txt
  61 61 09 41 41 0d 0a 62  62 09 42 42 0d 0a|aa.AA..bb.BB..|

$ cat letters.txt
aa   AA
bb   BB
$ cat numbers.txt
1
2
$ paste letters.txt numbers.txt
aa   1A
bb   2B
$


Expected:
$ paste letters.txt numbers.txt
aa   AA   1
bb   BB   2

Thank you.



paste(1) is treating the CR like a standard character,
and when outputting that back to the terminal it "messes up" the expected 
output.
I suggest you convert any such files to unix format before processing.

thanks,
Pádraig

bug#70532: sort: Mention counting fields from the end

2024-04-23 Thread Pádraig Brady


On 23/04/2024 11:14, Dan Jacobson wrote:

In (info "(coreutils) sort invocation") be sure to add an example of a
way or workaround for counting fields from the end of the line. E.g., we
want to sort on the last field, but don't know for sure how many fields
a line might contain. E.g., sort by surname, when lines consist of First
[Middle...] Surname. perl -a uses $F[-1]. so maybe sort(1) could also
use a negative field number. Same for character number.


All good suggestions. I'll at least add an example along the lines of:

  awk '{print $NF, $0}' | sort -k1,1 | cut -f2- -d' '

cheers,
Pádraig.

bug#70418: ls bug

2024-04-17 Thread Pádraig Brady


On 16/04/2024 23:17, Paul Eggert wrote:

On 4/16/24 14:30, Toby Kelsey wrote:

The man page doesn't explain this format conflict, while the info page
(info '(coreutils) ls invocation' or 'info ls') claims '-f' implies '-1'
which is also incorrect: 'ls -1f' gives different output to to 'ls -f'.


Yes, this area of GNU 'ls' a mess. Option order should not matter here.

Option order didn't matter in 7th Edition Unix, where -f overrode -l
regardless of whether -f came before or after -l. And option order
doesn't matter in FreeBSD, where -f and -l are orthogonal. GNU ls is an
odd hybrid of 7th Edition and FreeBSD and messes this up.

Rather than document the hybrid mess, let's bite the bullet and fix it.
FreeBSD behavior makes more sense, so let's do that. Proposed patch
attached.


+1

Related to this is the recent adjustment of usage() for -f:
https://bugs.gnu.org/67765

Patch looks good, and conforms to POSIX.

thanks!
Pádraig

bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady


On 16/04/2024 15:47, Alejandro Colomar wrote:

Hi Pádraig,

On Tue, Apr 16, 2024 at 03:25:22PM +0100, Pádraig Brady wrote:

What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently


I don't know.  The reporter didn't tell.  I see you also asked on the
Github original report.


Note that check originally came from:
https://github.com/coreutils/coreutils/commit/dea4262fa

I suppose we could relax the check as follows, for files of apparent size 0
which would cater for this, and others that may also have unstable inodes.


H.  Since you couldn't reprodude it in a recent Darwin, maybe it's
just a bug in an old Darwin.  And since noone else seems to have met
this Darwin's bug, maybe we can just ignore it.  (And if it were a
regression in a more recent Darwin, hopefully they should fix their
kernel.)

I'm not happy relaxing a security check, without making sure that there
are no implications at all.

I vote for claiming only limited support to such a Darwin system.  I
already workarounded it in the Linux man-pages, by not piping to
install(1) in a common task; and nobody else seems to be affected.

Unless you feel confident that it's perfectly fine to do it.  But I have
no sympathy for workarounding Darwin bugs here.


I agree if it's older Darwin only, we can ignore.
The version I tested on is 3 years old now though,
so I'm not sure whether the issue is on newer or older.

Note we had similar issue on Solaris,
where we used an fstat() wrapper to adjust things:
https://bugs.gnu.org/35713

A related suggestion was from Marc Chantreux (CC'd)
to support '-' to imply stdin, which would be more portable.
There is some merit to that suggestion too.

cheers,
Pádraig

bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady


On 16/04/2024 12:33, Pádraig Brady wrote:

On 16/04/2024 01:19, Alejandro Colomar wrote:

Hi!

I don't own a Darwin system, so I can't help much reproduce.  However,
I've received a bug report to the Linux man-pages, that our build
system (GNUmakefile-based), which ends up calling

... | install /dev/stdin $@

doesn't work on Darwin.  Here's the original bug report:
<https://github.com/NixOS/nixpkgs/pull/300797>.

Here are the reported error messages:

...
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addseverity.3
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/adjtime.3
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addmntent.3]
 Error 1
make: *** Waiting for unfinished jobs
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/acosh.3]
 Error 1
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
...

I don't see why install(1) should fail to read /dev/stdin under any
POSIX system


What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently


Note that check originally came from:
https://github.com/coreutils/coreutils/commit/dea4262fa

I suppose we could relax the check as follows, for files of apparent size 0
which would cater for this, and others that may also have unstable inodes.

cheers,
Pádraig.

diff --git a/src/copy.c b/src/copy.c
index 2145d89d5..fb5f0a1a0 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1263,8 +1263,11 @@ copy_reg (char const *src_name, char const *dst_name,
 }

   /* Compare the source dev/ino from the open file to the incoming,
- saved ones obtained via a previous call to stat.  */
-  if (! psame_inode (src_sb, &src_open_sb))
+ saved ones obtained via a previous call to stat.  Restrict
+ the check to files with an apparent size, to support "files"
+ with unstable inodes, like /dev/stdin on macOS.  */
+  if (! psame_inode (src_sb, &src_open_sb)
+  && (src_sb->st_size || src_open_sb.st_size))
 {
   error (0, 0,
  _("skipping file %s, as it was replaced while being copied"),

bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady


On 16/04/2024 01:19, Alejandro Colomar wrote:

Hi!

I don't own a Darwin system, so I can't help much reproduce.  However,
I've received a bug report to the Linux man-pages, that our build
system (GNUmakefile-based), which ends up calling

... | install /dev/stdin $@

doesn't work on Darwin.  Here's the original bug report:
.

Here are the reported error messages:

...
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addseverity.3
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/adjtime.3
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addmntent.3]
 Error 1
make: *** Waiting for unfinished jobs
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/acosh.3]
 Error 1
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
...

I don't see why install(1) should fail to read /dev/stdin under any
POSIX system


What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-15 Thread Pádraig Brady


On 15/04/2024 15:37, Andreas Grünbacher wrote:

Hello,

Am So., 14. Apr. 2024 um 00:43 Uhr schrieb Pádraig Brady :

On 13/04/2024 20:29, Bruno Haible wrote:

Hi Pádraig,

I wrote:

5) The same thing with 'cp -a' succeeds:

$ build-sparc64/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0
$ build-sparc64-no-acl/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0


You wrote:

The psuedo code that install(1) uses is:

copy_reg()
 if (x->set_mode) /* install */
   set_acl(dest, x->mode /* 600 */)
 ctx->acl = acl_from_mode ( /* 600 */)
 acl_set_fd (ctx->acl) /* fails EACCES */
 if (! acls_set)
must_chmod = true;
 if (must_chmod)
   saved_errno = EACCES;
   chmod (ctx->mode /* 600 */)
   if (save_errno)
 return -1;


And, for comparison, what is the pseudo-code that 'cp -a' uses?
I would guess that there must be a relevant difference between both.


The cp pseudo code is:

copy_reg()
if (preserve_xattr)
  copy_attr()
ret = attr_copy_fd()
if (ret == -1 && require_preserve_xattr /*false*/)
  return failure;
if (preserve_mode)
  copy_acl()
qcopy_acl()
  #if USE_XATTR /* true */
fchmod() /* chmod before setting ACLs as doing after may reset */
return attr_copy_fd() /* successful if no ACLs in source */
  #endif

If however you add ACLs in the source, you induce a similar failure:

$ setfacl -m u:nobody:r /var/tmp/foo3942
$ src/cp -a /var/tmp/foo3942 foo3942; echo $?
src/cp: preserving permissions for ‘foo3942’: Permission denied
1

The corresponding strace is:

fchmod(4, 0100640)  = 0
flistxattr(3, NULL, 0)  = 24
flistxattr(3, "system.posix_acl_access\0", 24) = 24
fgetxattr(3, "system.posix_acl_access", NULL, 0) = 44
fgetxattr(3, "system.posix_acl_access", "\2\0...\4", 44) = 44
fsetxattr(4, "system.posix_acl_access", "\2\0...\4", 44, 0) = -1 EACCES 
(Permission denied)


Why does CIFS think EACCES is an appropriate error to return here? The
fchmod() succeeds, so changing the file permissions via fsetxattr()
should really succeed as well.


Right, it seems like a CIFS bug (already CC'd)
Even if it returned ENOTSUP (like getxattr(...posix...) does) it would be ok as 
we handle that.
It would be good to avoid it though.

You confirmed privately to me that the set_acl() is to clear any default ACLs
that may have been added to the newly created file, so the posted solution in
https://bugs.gnu.org/70214#11 that only does the set_acl() iff file_has_acl()
should avoid the CIFS issue.  It would be a bit less efficient if there were
ACLs, but a bit more efficient in the common case where there weren't ACLs.

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady


On 13/04/2024 20:29, Bruno Haible wrote:

Hi Pádraig,

I wrote:

5) The same thing with 'cp -a' succeeds:

$ build-sparc64/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0
$ build-sparc64-no-acl/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0


You wrote:

The psuedo code that install(1) uses is:

copy_reg()
if (x->set_mode) /* install */
  set_acl(dest, x->mode /* 600 */)
ctx->acl = acl_from_mode ( /* 600 */)
acl_set_fd (ctx->acl) /* fails EACCES */
if (! acls_set)
   must_chmod = true;
if (must_chmod)
  saved_errno = EACCES;
  chmod (ctx->mode /* 600 */)
  if (save_errno)
return -1;


And, for comparison, what is the pseudo-code that 'cp -a' uses?
I would guess that there must be a relevant difference between both.


The cp pseudo code is:

copy_reg()
  if (preserve_xattr)
copy_attr()
  ret = attr_copy_fd()
  if (ret == -1 && require_preserve_xattr /*false*/)
return failure;
  if (preserve_mode)
copy_acl()
  qcopy_acl()
#if USE_XATTR /* true */
  fchmod() /* chmod before setting ACLs as doing after may reset */
  return attr_copy_fd() /* successful if no ACLs in source */
#endif

If however you add ACLs in the source, you induce a similar failure:

$ setfacl -m u:nobody:r /var/tmp/foo3942
$ src/cp -a /var/tmp/foo3942 foo3942; echo $?
src/cp: preserving permissions for ‘foo3942’: Permission denied
1

The corresponding strace is:

fchmod(4, 0100640)  = 0
flistxattr(3, NULL, 0)  = 24
flistxattr(3, "system.posix_acl_access\0", 24) = 24
fgetxattr(3, "system.posix_acl_access", NULL, 0) = 44
fgetxattr(3, "system.posix_acl_access", "\2\0...\4", 44) = 44
fsetxattr(4, "system.posix_acl_access", "\2\0...\4", 44, 0) = -1 EACCES 
(Permission denied)

BTW I was wondering about the need for install(1) to set_acl() at all,
rather than just using chmod.
The following comment in lib/set-permissions.c may be pertinent:

/* If we can't set an acl which we expect to be able to set, try setting
   the permissions to ctx->mode. Due to possible inherited permissions,
   we cannot simply chmod */

BTW this is all under kernel version:

$ uname -r
6.8.5-gentoo-sparc64

With these cifs options:

$ mount | grep cifs
//syslog.matoro.tk/guest-pixelbeat on /media/guest-homedirs/pixelbeat type cifs
(rw,nosuid,relatime,vers=1.0,cache=strict,username=nobody,uid=30017,forceuid,
gid=30017,forcegid,addr=fd05:::::::0001,
soft,unix,posixpaths,serverino,mapposix,acl,
rsize=1048576,wsize=65536,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady


On 13/04/2024 17:39, Pádraig Brady wrote:

install(1) defaults to mode 600 for new files, and uses set_acl() with that
(since 2007 https://github.com/coreutils/coreutils/commit/f634e8844 )
The psuedo code that install(1) uses is:

copy_reg()
if (x->set_mode) /* install */
  set_acl(dest, x->mode /* 600 */)
ctx->acl = acl_from_mode ( /* 600 */)
acl_set_fd (ctx->acl) /* fails EACCES */
if (! acls_set)
   must_chmod = true;
if (must_chmod)
  saved_errno = EACCES;
  chmod (ctx->mode /* 600 */)
  if (save_errno)
return -1;

This issue only only seems to be on CIFS.
I'm seeing lot of weird behavior with ACLs there:

acl_set_fd (acl_from_mode (600)) -> EACCES
acl_set_fd (acl_from_mode (755)) -> EINVAL
getxattr ("system.posix_acl_access") -> EOPNOTSUPP

Note we ignore EINVAL and EOPNOTSUPP errors in set_acl(),
and it's just the EACCES that's problematic.
Note this is quite similar to https://debbugs.gnu.org/65599
where Paul also noticed EACCES with fsetxattr() (and others) on CIFS.

The attached is a potential solution which I tested as working
on the same matoro system that Bruno used.

I think I'll apply that after thinking a bit more about it.


Actually that probably isn't appropriate,
as fsetxattr() can validly return EACESS
even though not documented in the man page at least.
The following demonstrates that:
.
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main(void)
{
int wfd;
/* Note S_IWUSR is not set.  */
if ((wfd=open("writable", O_CREAT|O_WRONLY|O_EXCL, S_IRUSR)) == -1)
fprintf(stderr, "open('writable') error [%m]\n");
if (write(wfd, "data", 1) == -1)
fprintf(stderr, "write() error [%m]\n");
if (fsetxattr(wfd, "user.test", "test", 4, 0) == -1)
fprintf(stderr, "fsetxattr() error [%m]\n");
}

Another solution might be to file_has_acl() in copy.c
and skip if "not supported" or nothing is returned.
The following would do that, but I'm not sure about this
(as I'm not sure about ACLs in general TBH).
Note listxattr() returns 0 on CIFS here,
while getxattr ("system.posix_acl_access") returns EOPNOTSUPP,
and file_has_acl() uses listxattr() first.

diff --git a/src/copy.c b/src/copy.c
index d584a27eb..2145d89d5 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1673,8 +1673,13 @@ set_dest_mode:
 }
   else if (x->set_mode)
 {
-  if (set_acl (dst_name, dest_desc, x->mode) != 0)
-return_val = false;
+  errno = 0;
+  int n = file_has_acl (dst_name, &sb);
+  if (0 < n || (errno && ! (is_ENOTSUP (errno) || errno == ENOSYS)))
+{
+  if (set_acl (dst_name, dest_desc, x->mode) != 0)
+return_val = false;
+}
 }


BTW I'm surprised this wasn't reported previously for CIFS,
so I due to this bug and https://debbugs.gnu.org/65599
I suspect a recentish change in CIFS wrt EACCES.

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady

in some cases.
---
 ChangeLog | 7 +++
 lib/set-permissions.c | 8 +++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index c72165e268..fd094d1091 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-04-13  Pádraig Brady  
+
+	acl-permissions: avoid erroneous failure on CIFS
+	* lib/set-permissions.c (set_acls): On Linux (and others), also discount
+	EACESS as a valid errno with FD operations, as CIFS was seen to
+	return that erroneously in some cases.
+
 2024-04-13  Bruno Haible  
 
 	gnulib-tool.py: Code tweak.
diff --git a/lib/set-permissions.c b/lib/set-permissions.c
index 83a355faa5..7f8e55f5cd 100644
--- a/lib/set-permissions.c
+++ b/lib/set-permissions.c
@@ -520,7 +520,13 @@ set_acls (struct permission_context *ctx, const char *name, int desc,
 ret = acl_set_file (name, ACL_TYPE_ACCESS, ctx->acl);
   if (ret != 0)
 {
-  if (! acl_errno_valid (errno))
+  if (! acl_errno_valid (errno)
+#   if defined __linux__
+  /* Special case EACCES for fd operations as CIFS
+ was seen to erroneously return that in some cases.  */
+  || (HAVE_ACL_SET_FD && desc != -1 && errno == EACCES)
+#   endif
+ )
 {
   ctx->acls_not_supported = true;
   if (from_mode || acl_access_nontrivial (ctx->acl) == 0)
-- 
2.44.0

bug#35247: [PATCH] Add alacritty to terminal list supporting colors

2024-04-10 Thread Pádraig Brady


Things have changed a little since the original request.
alacritty sets $COLORTERM, and dircolors now auto accepts that since:
https://github.com/coreutils/coreutils/commit/75c9fc674

There are some complications with remote shells, but
they should boil down to setting up ssh to send/accept COLORTERM.

For some background for setting colors based on env vars
from the Fedora side of things at least, see:
https://fedoraproject.org/wiki/Features/256_Color_Terminals
https://fedoraproject.org/wiki/Changes/Drop_256term.sh

cheers,
Pádraig.

bug#70298: uname -i returns unknown since fedora 38

2024-04-09 Thread Pádraig Brady

On 09/04/2024 10:17, Collin Funk wrote:

On 4/9/24 12:57 AM, Paul Eggert wrote:

Indeed there is, and I merged your bug report into that old one. It'd be nice
if someone could get to the bottom of that bug.

I decided to look into this a bit, since I also have an unknown
'uname -i' and 'uname -p'.

It seems that this option is a Solaris thing. I found the commit that
it was introduced [1]. It also adds some other Solaris compatibility
stuff, so everything seems to add up so far.

The first function it tries is sysinfo (SI_PLATFORM, ...) which seems
to be a Solaris thing [2]. I think Linux has sysinfo but not
SI_PLATFORM.

Then it tries sysctl (...) with a UNAME_HARDWARE_PLATFORM macro to deal
with the BSDs. I think OpenBSD might be missing that definition in the
#ifdef, but I have no way of testing it at the moment [3].

I'm not sure if 'uname -i', 'uname -p', or 'uname -m' are supposed to
be any different from each other. Maybe someone who knows Solaris can
help more?

Illumios defines 'sysinfo' as an alias to 'systeminfo' [4]. There it
returns an 'extern char *platform' in the given buffer when passed
SI_PLATFORM [5]. I've found the definition of the variable, but I
don't know where it is actually set to something useful [6]. Hopefully
that information helps someone...

[1]
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=aaeb7a61c4662fd28cf2bc161b740352711538d2
[2]
https://github.com/illumos/illumos-gate/blob/cf618897f43ea305e3a426f93bbcef4e8106829c/usr/src/uts/common/sys/systeminfo.h#L86
[3] https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/uname.c#n36
[4]
https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/common/sys/sysinfo.S
[5]
https://github.com/illumos/illumos-gate/blob/cf618897f43ea305e3a426f93bbcef4e8106829c/usr/src/uts/common/syscall/systeminfo.c#L124
[6]
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/conf/param.c#L533

Thanks for looking at this.
From the Fedora side, they dropped a Fedora specific patch for Fedora 38.

https://bugzilla.redhat.com/show_bug.cgi?id=548834
https://bugzilla.redhat.com/show_bug.cgi?id=2208048

cheers,
Pádraig.

bug#70231: Performance issue on sort with zero-sized pseudo files

2024-04-07 Thread Pádraig Brady


On 06/04/2024 23:22, Paul Eggert wrote:

On 2024-04-06 03:09, Pádraig Brady wrote:

I'll apply this.


Heh, I beat you to it by looking for similar errors elsewhere and
applying the attached patches to fix the issues I found. None of them
look like serious bugs.


Cool. I thought the sort(1) change worthy of a NEWS entry so pushed one.
Marking this as done.


BTW we should improve sort buffer handling in general


Oh yes.

PS. My current little task is to get i18n to work better with 'sort'.
Among other things I want Unicode-style full case folding.


Excellent, that will help keep the related uniq(1) and join(1)
commands more aligned in their ordering.

cheers,
Pádraig

bug#70219: Bug/Issue with timeout and signals

2024-04-06 Thread Pádraig Brady


tag 70219 notabug
close 70219
stop

On 06/04/2024 16:50, Branden R. Williams via GNU coreutils Bug Reports wrote:

   -k, --kill-after=DURATION
  also send a KILL signal if COMMAND is still running
this long after the initial signal was sent


If you read the above carefully, please note the words _also_ and _initial_.
I.e. -k will cause _another_ signal to be sent, iff the first doesn't terminate 
the command.

Hopefully this example with all pertinent options explains things.
I.e. the first signal sent after 1s is ignored,
and the second kill signal sent after another 2s forcefully kills the command.

  $ date +%s; timeout -k 2s -s0 1s sleep inf; date +%s
  1712427916
  Killed
  1712427919

cheers,
Pádraig

bug#70231: Performance issue on sort with zero-sized pseudo files

2024-04-06 Thread Pádraig Brady


On 06/04/2024 03:52, Takashi Kusumi wrote:

Hi,

I have found a performance issue with the sort command when used on
pseudo files with zero size. For instance, sorting `/proc/kallsyms`, as
demonstrated below, takes significantly longer than executing with
`cat`, generating numerous temporary files. I confirmed this issue on
v8.32 as well as on commit 8f3989d in the master branch.

$ time cat /proc/kallsyms | sort > /dev/null
real0m0.954s
user0m0.873s
sys 0m0.096s

$ time sort /proc/kallsyms > /dev/null
real0m8.555s
user0m3.367s
sys 0m5.064s

$ strace -e trace=openat sort /proc/kallsyms 2>&1 > /dev/null \
  | grep /tmp/sort | head -100
...
openat(AT_FDCWD, "/tmp/sortM6Y6Y1", ...
openat(AT_FDCWD, "/tmp/sortPrHKMG", ...

$ strace -e trace=openat -c sort /proc/kallsyms > /dev/null
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
100.006.419777  19333258 8 openat
-- --- --- - - 
100.006.419777  19333258 8 total

It appears that the buffer size allocated for pseudo files with zero
size is insufficient, likely because it is based on their file size,
which is zero. As seen in the attached patch, I think using
`INPUT_FILE_SIZE_GUESS` to calculate the buffer size when the file size
is zero would resolve this issue.


I'll apply this.

BTW we should improve sort buffer handling in general. From my TODO...

0. Have sort --debug output memory buffer sizes and space avail at $TMPDIR(s)
1. auto increase buffer when reading from pipe or zero sized files.
This will be more efficient and more importantly enable parallel operation.
See http://superuser.com/questions/938558/sort-parallel-isnt-parallelizing/
At least your more appropriate default buffer sizes in this case.
I.e. bigger mins and probably smaller maxs as half avail mem is too aggressive.
2. check() should not need full buffer size?
only merge buffer size or something small at least.
3. Look at minimizing the amount of mem used by default.
Hmm, sort auto adjusts down to avail mem in initbuf() (Test with ulimit -v)
4. Careful with too small buffers as that may initiate
an extra merge step (see section above).

If anyone wants to look at the above give me a heads up,
or I'll get to it sometime in the next release cycle.

thanks!
Pádraig.

bug#70126: Missing a full stop in a coreutils-9.5-pre2 message

2024-04-01 Thread Pádraig Brady


On 01/04/2024 16:30, Petr Pisar wrote:

Hello,

while translating coreutils-9.5-pre2 I noticed this message:

#: src/chown.c:123
msgid ""
"  --reference=RFILE  use RFILE's ownership rather than specifying values\n"
" RFILE is always dereferenced if a symbolic link.\n"

I believe that there is missing a full stop after "values" on the first line.


Fixed.
Marking this as done.

thanks!
Pádraig

bug#70104: "FAIL: test-canonicalize" coreutils v9.5 on musl libc

2024-03-31 Thread Pádraig Brady


On 31/03/2024 10:02, Adept's Lab wrote:

test-canonicalize.c:411: assertion 'strcmp (result1, "//") == 0' failed

^ the only error log message I get. Fail was not presented with previous
stable versions.


This is on musl 1.1.24 as detailed at:
https://github.com/coreutils/coreutils/issues/83

CC'ing bug-gnulib

cheers,
Pádraig

bug#70070: FAIL: test-localtime_r

2024-03-29 Thread Pádraig Brady


On 29/03/2024 12:40, Andreas Schwab wrote:

FAIL: test-localtime_r
==

test-localtime_r.c:58: assertion 'result->tm_hour == 18' failed
FAIL test-localtime_r (exit status: 134)

FAIL: test-localtime_r-mt
=

thread2 disturbed by thread1!
thread1 disturbed by thread2!
FAIL test-localtime_r-mt (exit status: 134)



CC'ing gnulib as those tests are from gnulib.
It may help to detail your platform.

thanks,
Pádraig.

bug#69995: Untranslatable string

2024-03-25 Thread Pádraig Brady


On 25/03/2024 14:02, Göran Uddeborg wrote:

While translating the new version of coreutils to Swedish, I came
across this code from the end of chown-core.h:

   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the %sgroup of each file only if\n\
  its current owner and/or group match those 
specified\n\
  here.  Either may be omitted, in which case a match\n\
  is not required for the omitted attribute\n\
"), user ? "owner and/or " : "");

As this stands, the part "owner and/or " will not be translated.

Simply marking that part too for translation is not a good solution, even
if it would happen to work for Swedish. See
https://www.gnu.org/savannah-checkouts/gnu/gettext/manual/gettext.html#Entire-sentences


Fixed with:
https://github.com/coreutils/coreutils/commit/26fd96a96

thanks,
Pádraig

bug#69985: Untranslatable message in src/chown-core.h:95

2024-03-24 Thread Pádraig Brady


On 24/03/2024 16:57, Frédéric Marchal wrote:

Hi,

In src/chown-core.h:95 (coreutils-9.5-pre1), the following message is
impossible to translate as it is created by concatenating string fragments:

static inline void
emit_from_option_description (bool user)
{
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the %sgroup of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"), user ? "owner and/or " : "");
}

A translatable message must *never ever* be produced by concatenating
substrings. Even if the substrings are themselves translated, there is no
guaranties that the order of the words is the same in every language.

There is, unfortunately, no easy solution that save keystrokes. It must be
written like this:

if (user)
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the owner and/or group of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"));
else
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the group of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"));

That make sense as both are distinct messages with different purposes.

I also suspect that the two messages should be even more different than they
currently are. The message, when user is false, suspiciously contains one more
"owner and/or" that is not removed and it says "Either may be omitted" when
only the group is mentioned at the beginning of the message.


The composed strings are accurate, but yes not translatable.
The attached simplifies the description a little,
so there is a single string to cater for both chown and chgrp.

Marking this as done.

thanks,
Pádraig
From baea9881f79690552b0aa3ca6844bbf64fa6e55e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 24 Mar 2024 20:12:53 +
Subject: [PATCH] doc: fix translation issue in chown/chgrp amalgamation

* src/chown-core.h (emit_from_option_description): The conditional
string composition here caused issues for translators.
Instead move to a more general description ...
(src/chown.c (usage): ... here.
Fixes https://bugs.gnu.org/69985
---
 src/chown-core.h | 12 
 src/chown.c  |  8 +++-
 2 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/src/chown-core.h b/src/chown-core.h
index e1396e3ea..4bd68fed4 100644
--- a/src/chown-core.h
+++ b/src/chown-core.h
@@ -89,16 +89,4 @@ chown_files (char **files, int bit_flags,
  struct Chown_option const *chopt)
   _GL_ATTRIBUTE_NONNULL ();
 
-static inline void
-emit_from_option_description (bool user)
-{
-  printf (_("\
-  --from=CURRENT_OWNER:CURRENT_GROUP\n\
- change the %sgroup of each file only if\n\
- its current owner and/or group match those specified\n\
- here.  Either may be omitted, in which case a match\n\
- is not required for the omitted attribute\n\
-"), user ? "owner and/or " : "");
-}
-
 #endif /* CHOWN_CORE_H */
diff --git a/src/chown.c b/src/chown.c
index 90ce84d67..3194dc75f 100644
--- a/src/chown.c
+++ b/src/chown.c
@@ -109,7 +109,13 @@ With --reference, change the group of each FILE to that of RFILE.\n\
  (useful only on systems that can change the\n\
  ownership of a symlink)\n\
 "), stdout);
-  emit_from_option_description (chown_mode == CHOWN_CHOWN);
+  fputs (_("\
+  --from=CURRENT_OWNER:CURRENT_GROUP\n\
+ change the ownership of each file only if its\n\
+ current owner and/or group match those specified here.\n\
+ Either may be omitted, in which case a match\n\
+ is not required for the omitted attribute\n\
+"), stdout);
   fputs (_("\
   --no-preserve-root  do not treat '/' specially (the default)\n\
   --preserve-rootfail to operate recursively on '/'\n\
-- 
2.44.0

bug#69979: [PATCH-v2][doc] ls -l doc and size -> maj, min for device files

2024-03-24 Thread Pádraig Brady


On 24/03/2024 12:40, Stephane Chazelas wrote:

Tags: patch

My bad, the patch was incorrect, it should have said
"replaced by the corresponding device major and minor numbers as two decimal
numbers separated by a comma and at least one space.", as
there's not always only once space between the major and minor.

See also
https://unix.stackexchange.com/questions/773014/where-do-i-find-documentation-for-the-output-of-ls-l
where the issue was reported and where the variations of output
formats between ls implementations is being mentioned.

New patch attached.


Pushed.
Marking this as done.

thanks,
Pádraig

bug#69951: coreutils: printf formatting bug for nb_NO and nn_NO locales

2024-03-23 Thread Pádraig Brady


tag 69951 notabug
close 69951
stop

On 22/03/2024 20:22, Thomas Dreibholz wrote:

Hi,

I just discovered a printf bug for at least the nb_NO and nn_NO locales
when printing numbers with thousands separator. To reproduce:

#!/bin/bash
for l in de_DE nb_NO ; do
     echo "LC_NUMERIC=$l.UTF-8"
     for n in 1 100 1000 1 10 100 1000 ; do
    LC_NUMERIC=$l.UTF-8 /usr/bin/printf "<%'10d>\n" $n
     done
done

The expected output of "%'10d" is a right-formatted number string with
10 characters.

The output of the test script is fine for e.g. LC_NUMERIC=de_DE.UTF-8
and LC_NUMERIC=en_US.UTF-8:

LC_NUMERIC=de_DE.UTF-8
< 1>
<   100>
< 1.000>
<    10.000>
<   100.000>
< 1.000.000>
<10.000.000>



However, for LC_NUMERIC=nb_NO.UTF-8 and LC_NUMERIC=nn_NO.UTF-8, the
formatting is wrong:

LC_NUMERIC=nb_NO.UTF-8
< 1>
<   100>
<   1 000>
<  10 000>
< 100 000>
<1 000 000>
<10 000 000>



I reproduced the issue with coreutils-8.32-4.1ubuntu1.1 (Ubuntu 22.04)
as well as coreutils-9.3-5.fc39.x86_64 (Fedora 39).

Under FreeBSD 14.0-RELEASE (coreutils-9.4_1), the output looks slightly
better but is still wrong:

LC_NUMERIC=nb_NO.UTF-8
< 1>
<   100>
<    1 000>
<   10 000>
<  100 000>
<1 000 000>
<10 000 000>
LC_NUMERIC=nn_NO.UTF-8
< 1>
<   100>
<    1 000>
<   10 000>
<  100 000>
<1 000 000>
<10 000 000>

May be the issue is that the thousands separator for the Norwegian
locales is a space " ", while it is "."/"," for German/US English locales.


The issue looks to be that the thousands separator for Norwegian locales
is “NARROW NO-BREAK SPACE", or more problematically the _three_ byte
UTF8 sequence E2 80 AF. So it looks like an issue with libc routines
counting bytes rather than characters in this case.

One suggestion is to do the alignment after. For example:

$ export LC_NUMERIC=nb_NO.UTF-8
$ printf "%'.f\n" $(seq -f '1E%.f' 7) | column --table-right=1 -t
10
   100
 1 000
10 000
   100 000
 1 000 000
10 000 000

Actually I've just noticed that specifying the %'10.f format
does count characters and not bytes! So another solution is:

$ export LC_NUMERIC=nb_NO.UTF-8
$ printf "%'10.f\n" $(seq -f '1E%.f' 7)
10
   100
 1 000
10 000
   100 000
 1 000 000
10 000 000

The issue if there is one is in libc at least.
It would be worth checking existing glibc reports about this
and reporting if not mentioned.

cheers,
Pádraig.

bug#69939: test utility bug: "test -a -a -a" and "test -o -o -o" fail

2024-03-22 Thread Pádraig Brady


On 22/03/2024 11:20, Vincent Lefevre wrote:

With GNU Coreutils 9.4, both "test -a -a -a" and "test -o -o -o" fail:

$ export POSIXLY_CORRECT=1
$ /usr/bin/test -a -a -a ; echo $?
/usr/bin/test: ‘-a’: unary operator expected
2
$ /usr/bin/test -o -o -o ; echo $?
/usr/bin/test: ‘-o’: unary operator expected
2

According to POSIX, they should return 0.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
says for 3 arguments:

   If $2 is a binary primary, perform the binary test of $1 and $3.

Here, $2 is -a and -o respectively, which are binary primaries.
And both $1 and $3 are non-null strings.


Agreed. Any leading '-' triggers this:

  $ env test -. -a -.
  test: ‘-.’: unary operator expected

Note we already advise (as does POSIX) to avoid '-a' and '-o':
https://debbugs.gnu.org/22909

BTW POSIX advises to use parenthesis to avoid some parsing ambiguities,
and it helps in this case too:

  $ env test \( -a \) -a \( -a \); echo $?
  0

Though I see bash 5.2.26 has issue with it :/

  $ test \( -a \) -a \( -a \)
  bash: test: `)' expected

bash doesn't like the -a in particular, and is ok with other strings:

  $ test \( -. \) -a \( -. \); echo $?
  0

cheers,
Pádraig

bug#69807: questioning automatic -i in multicolumn pr

2024-03-21 Thread Pádraig Brady


On 14/03/2024 20:31, Douglas McIlroy wrote:

Multicolumn options in pr imply option -i (tabification). The introduction
of tabs with physical rather than logical meaning makes output that is OK
for viewing only if you have correct tab stops, and is complicated for
further processing.  It caters for obsolete equipment--typewriters, on
which tabbing was appreciably faster than spacing.

As a wish-list item I propose abolishing implicit tabification. A second
choice (that doesn't buck Posix) is to provide an option to suppress
implicit tabification. At a bare minimum, document a workaround for the
inconvenient tabs, e.g. post-processing with pr -t -e.


Good call on the documentation.  I'll add this now:

commit 91e69cd2d02f015fc296e02388e0b18a293faa56 (HEAD -> master)
Author: Pádraig Brady 
Date:   Thu Mar 21 15:26:48 2024 +

doc: pr: give solution to expanding TABs in multicolumn output

* doc/coreutils.texi (pr invocation): Explicitly state that
multicolumn output will convert spaces to TABs, and show that
this can be undone with the `pr -t -e` or `expand` commands.
Suggested by Douglas McIlroy in https://bugs.gnu.org/69807

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index e36269588..37d729089 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -2636,9 +2636,11 @@ This option might well cause some lines to be truncated.>
 lines in the columns on each page are balanced.  The options @option{-e}
 and @option{-i} are on for multiple text-column output.  Together with
 @option{-J} option column alignment and line truncation is turned off.
+Since spaces are converted to TABs in multicolumn output, they can be converted
+back by further processing through @command{pr -t -e} or @command{expand}.


thanks,
Pádraig

bug#10311: RFE: Give chmod a "-h" option as well

2024-03-20 Thread Pádraig Brady


On 16/12/2011 16:29, Jan Engelhardt wrote:

Hi,

chown(1) has a -h option by which it affects symlinks directly rather
than the pointed-to file. The bonus side effect is that the
pointed-to files don't get changed in any way, which is kinda welcome
if you attempt to "fix" permissions/ownership in a directory where an
evil user could create a symlink to e.g. /etc/shadow.

Attempting chmod -R g+w /home/groups/evilgroup is still a risk, and
would necessity a more long-winded command involving find(1). It
would therefore be welcome that chmod receive an -h option that just
skips over them (besides perhaps attempting to change their
permissions as well).


Pushed at
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v9.4-162-g07a69fc3b

Marking as done.

cheers,
Pádraig.

bug#11108: [PATCH] chmod: fix symlink race condition

2024-03-20 Thread Pádraig Brady


On 28/03/2012 21:28, Paul Eggert wrote:

On 03/28/2012 01:13 PM, Jim Meyering wrote:

 $ ./chmod u+w f
 ./chmod: changing permissions of 'f': Operation not supported


Yeouch.  I undid the change for now.
Hmm, why did "make check" work for me?
I'll have to investigate later, alas.


Patch for this pushed at:
https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=v9.4-163-g425b8a2f5

Marking this as done.

cheers,
Pádraig.

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-17 Thread Pádraig Brady

On 17/03/2024 11:32, Pádraig Brady wrote:

On 17/03/2024 06:10, Paul Eggert wrote:

On 2024-03-05 06:16, Pádraig Brady wrote:

I think I'll remove the as yet unreleased mv --swap from coreutils,
given that
util-linux is as widely available as coreutils on GNU/Linux platforms.

Although removing that "mv --swap" implementation was a win, I don't
think we can simply delegate this to util-linux's exch command.
Exchanging files via a renameat-like call is not limited to the Linux
kernel; it's also possible on macOS via renameatx_np with RENAME_SWAP,
and there have been noises about adding similar things to other
operating systems.

I just now added support for macOS renameatx_np to Gnulib, so coreutils
does not need to worry about the macOS details; it can simply use
renameatu with the Linux flags. See:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=af32ee824ee18255839f9812b8ed61aa5257a82b

Even with Linux it's dicey. People may have older util-linux installed
and so lack the 'exch' utility; this is true for both Fedora 39 and
Ubuntu 23.10, the current releases. Ubuntu is also odd in that it
doesn't install all the util-linux utilities as part of the util-linux
package, so it's not clear what they will do with 'exch'.

So I propose that we implement the idea in coreutils in a better way,
that interacts more nicely with -t, -T, etc. Also, I suggest using the
Linuxish name "--exchange" instead of the macOSish name "--swap", and
(for now at least) not giving the option a single-letter equivalent as I
expect it to be useful from scripts, not interactively.

After looking at various ways to do it I came up with the attached
proposed patch. This should work on both GNU/Linux and macOS, if your OS
is recent enough and the file system supports atomic exchange.

The implementation looks good.

Re exch(1) on macos, I see util-linux is on homebrew,
so it would be relatively easy to ifdef renameatx_np in util-linux also.

I think the --no-copy situation is brittle, as scripts not using it now
would be atomic, but then if we ever supported cross fs swaps
it may become non atomic. I'd at least doc with a line in the --exchange
description in usage() to say something like:
"Use --no-copy to enforce atomic operation"

While the most flexible, it's also quite awkward to need
`mv --exchange --no-copy --no-target-directory` for most uses.
I.e. it's tempting to imply the --no-... options with --exchange,
but I suppose since scripting is the primary use case for this
flexibility trumps conciseness, so I'm ok with the verbosity I think.

Oh also in the texinfo I think it's important to mention that the swap
will "exchange all data and metadata". That's not obvious otherwise.
For example users may be wondering if only data was being exchanged
with the macos exchangedata(2) or equivalent.

cheers,
Pádraig

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-17 Thread Pádraig Brady

On 17/03/2024 06:10, Paul Eggert wrote:

On 2024-03-05 06:16, Pádraig Brady wrote:

I think I'll remove the as yet unreleased mv --swap from coreutils,
given that
util-linux is as widely available as coreutils on GNU/Linux platforms.

I just now added support for macOS renameatx_np to Gnulib, so coreutils
does not need to worry about the macOS details; it can simply use
renameatu with the Linux flags. See:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=af32ee824ee18255839f9812b8ed61aa5257a82b

The implementation looks good.

Re exch(1) on macos, I see util-linux is on homebrew,
so it would be relatively easy to ifdef renameatx_np in util-linux also.

thanks,
Pádraig

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-15 Thread Pádraig Brady


On 15/03/2024 05:21, Paul Eggert wrote:

On 2024-03-14 06:03, Pádraig Brady wrote:



How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
is used when the compiler supports -O1? That way we don't have to worry
about running the program, because (with the "volatile") clang will
error out.

Alternatively perhaps there's some way to check for the bug using
preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
__clang_major__, and __aarch64__. (This could be more fragile, though,
as clang presumably will fix the bug eventually.)


That would probably work for this edge case, but it's brittle
and I'm worried about other combinations of
compiler versions, complier options, and architectures.


Sure, but one cannot resolve such worries completely, as compiler errors
are inherently brittle: even a runtime test cannot prove their absence.

As I understand it, __bf16 is a real zoo: sometimes hardware supports
it, sometimes it doesn't, sometimes the software emulation library is
there, sometimes it's not, and so forth. Running a program on a
development host is likely to not match the runtime behavior on a
production host, so AC_RUN_IFELSE is reasonably likely to produce a
false positive, which is worse than a false negative.

Another issue, which is somewhat related, is that coreutils's current
macro BF16_SUPPORTED is too broad. Because __bf16 has so many bugs,
currently it's probably a mistake to say __bf16 is fully supported
anywhere. Even GCC is buggy; see for example:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114347

which I just now discovered when writing this email.


Nice one! (though that should not impact od)


The question isn't
so much whether __bf16 is supported, it's whether it's supported well
enough to compile and run GNU od. >

Come to think of it, for 'od' we might be better off avoiding __bf16
entirely, and simply converting the bits by hand into a 'float'
variable. This would likely be more portable, as it would work even with
compilers that lack __bf16, or that have __bf16 bugs relevant to 'od'.
And that way we could remove the configure-time test entirely. (There
would be endianness issues but they're easily addressible.)


Right.

My thinking though was that it was nice code wise to treat __bf16 like
_Float16 (and like float etc.).
More importantly I only see __bf16 support improving going forward,
so preferred to keep the code simpler (and the configure check robust).
Currently both the main compilers, and main architectures are supported
(when not cross-compiling), and tested quite robustly in the configure check.
Another thing that dissuaded me from implementing the conversions ourselves
is the plethora of floating point config options for compilers,
which made me think in general that floating point conversion
was best left to the compiler.

I've just now realized that AC_RUN_IFELSE needs an explicit default
for cross-compiling, and I've just pushed a conservative default
(which may default the other way in future when support broadens).

thanks,
Pádraig.

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-14 Thread Pádraig Brady


On 14/03/2024 13:35, Collin Funk wrote:

On 3/14/24 6:03 AM, Pádraig Brady wrote:

It would disable this feature for cross-compilation yes,
but this isn't the first instance of AC_RUN_IFELSE we use.


For completeness I should add that the above check can be
overridden if cross-compiling or whatever like:

  ./configure utils_cv_ieee_16_bit_supported=yes 
utils_cv_brain_16_bit_supported=yes


Sorry if this is not the proper place to ask, but would it be possible
to make Autoconf use an emulator when cross-compiling? This issue
would be *less* of a problem in that case.

I am more familiar with CMake where check_c_source_runs is used
performs a similar task [1]. It is essentially a wrapper function
around try_run [2]. CMake allows you to set
CMAKE_CROSSCOMPILING_EMULATOR which will make try_run use an emulator
to run programs instead [3].

I've used this with Wine and QEMU User a few times and it makes
CMake's configure tests work fine (though much slower). If this
functionality seems beneficial then I can look into adding it to
Autoconf.

[1] https://cmake.org/cmake/help/latest/module/CheckCSourceRuns.html
[2] 
https://github.com/Kitware/CMake/blob/4285dec5f012260337193f29f18899d6cf2536a4/Modules/Internal/CheckSourceRuns.cmake#L93
[3] 
https://cmake.org/cmake/help/latest/variable/CMAKE_CROSSCOMPILING_EMULATOR.html


Interesting.
Perhaps this is something that could be configured separately
on the build system, through something like binfmt_misc ?

cheers,
Pádraig

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-14 Thread Pádraig Brady


On 14/03/2024 05:59, Paul Eggert wrote:

On 2024-03-12 19:24, Grisha Levit wrote:

- AC_COMPILE_IFELSE(
+ AC_RUN_IFELSE(


This sort of change would break cross-compilation, no?


It would disable this feature for cross-compilation yes,
but this isn't the first instance of AC_RUN_IFELSE we use.


How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
is used when the compiler supports -O1? That way we don't have to worry
about running the program, because (with the "volatile") clang will
error out.

Alternatively perhaps there's some way to check for the bug using
preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
__clang_major__, and __aarch64__. (This could be more fragile, though,
as clang presumably will fix the bug eventually.)


That would probably work for this edge case, but it's brittle
and I'm worried about other combinations of
compiler versions, complier options, and architectures.

For example sse availability is in the mix here,
so this will now correctly avoid this feature with:
  CFLAGS=-mno-sse ./configure --quiet
Whereas previously it would have built but produced wrong values.

cheers,
Pádraig.

bug#68708: [PATCH] env,kill: Handle unnamed signals

2024-03-13 Thread Pádraig Brady


On 25/01/2024 19:52, Grisha Levit wrote:

On Thu, Jan 25, 2024, 09:50 Pádraig Brady  wrote:
This mostly looks good, except:

- No need to clear the errno before kill(3).
- Better to use SIG%d rather than the bare %d for signal _names_, as we already 
parse this format


Makes sense, done below.

* src/operand2sig.c (operand2sig): Drop signame argument, accept all
signal numbers <= SIGNUM_BOUND.  All callers updated.
* src/env.c (parse_signal_action_params, reset_signal_handlers)
(parse_block_signal_params, set_signal_proc_mask)
(list_signal_handling): Accept all signal numbers <= SIGNUM_BOUND,
use SIG%d for printing if necessary.
* src/kill.c (list_signals, main): Likewise.
(send_signals): Check errno from kill(3) for bad signo.
* src/timeout.c (main): Update operand2sig call.
* tests/misc/kill.sh: Test listing all signal numbers.


I've made a few adjustments:

- Clarified the commit message.

- I've gone back to have kill -l produce a bare number for unnamed signals, 
because:
this is consistent with util-linux,
SIG%d is only parseable by coreutils (not util-linux or bash),
easier to programatically determine if a name is defined.
I've left as SIG%d for -t output as that's a coreutils specific option,
and not really programatically consumable anyway.

- I've added a validation check for `env --block=32` so that it fails
like `env --default=32` and `env --ignore=32`.
I.e. exits with EXIT_CANCELED.

- Added a NEWS entry.

I'll apply this later.

thanks!
PádraigFrom b8d1b00e21a86270fbbe5903a41319894db266df Mon Sep 17 00:00:00 2001
From: Grisha Levit 
Date: Thu, 25 Jan 2024 14:52:50 -0500
Subject: [PATCH] env,kill,timeout: support unnamed signals

Some signals with values less that the max signal number for the system
do not have defined names.  For example, currently on amd64 Linux,
signals 32 and 33 do not have defined names, and Android has a wider
gap of undefined names where it reserves some realtime signals.

Previously the signal listing in env ended up reusing the name
of the last printed valid signal (the repeated HUP below):

$ env --list-signal-handling true
HUP( 1): IGNORE
HUP(32): BLOCK
HUP(38): IGNORE

..and the corresponding signal numbers were rejected as operands for the
env, kill, and timeout commands.

This patch removes the requirement that sig2str returns 0 for a signal
number associated with an operand.  This allows unnamed signals to be in
the sets `env' attempts to manipulate when a --*-signal option is used
with no argument, and kill(1) and timeout(1) to send such unnamed
signals.

* src/operand2sig.c (operand2sig): Drop signame argument, accept all
signal numbers <= SIGNUM_BOUND.  All callers updated.
* src/env.c (parse_signal_action_params, reset_signal_handlers)
(parse_block_signal_params, set_signal_proc_mask)
(list_signal_handling): Accept all signal numbers <= SIGNUM_BOUND,
use SIG%d for printing if necessary.
* src/kill.c (list_signals, main): Likewise.
(send_signals): Check errno from kill(3) for bad signo.
* src/timeout.c (main): Update operand2sig call.
* tests/misc/kill.sh: Test listing all signal numbers.
* NEWS: Mention the improvement.
---
 NEWS   |  3 +++
 src/env.c  | 29 ++---
 src/kill.c | 24 
 src/operand2sig.c  |  8 
 src/operand2sig.h  |  2 +-
 src/timeout.c  |  3 +--
 tests/misc/kill.sh | 10 +-
 7 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/NEWS b/NEWS
index 20cadf183..f21efc7c0 100644
--- a/NEWS
+++ b/NEWS
@@ -92,6 +92,9 @@ GNU coreutils NEWS-*- outline -*-
   This was previously 128KiB and increasing to 256KiB was seen to increase
   throughput by 10-20% when reading cached files on modern systems.
 
+  env,kill,timeout now support unnamed signals. kill(1) for example now
+  supports sending such signals, and env(1) will list them appropriately.
+
   SELinux operations in file copy operations are now more efficient,
   avoiding unneeded MCS/MLS label translation.
 
diff --git a/src/env.c b/src/env.c
index 73d9847f4..ed6628f8f 100644
--- a/src/env.c
+++ b/src/env.c
@@ -538,7 +538,6 @@ parse_split_string (char const *str, int *orig_optind,
 static void
 parse_signal_action_params (char const *arg, bool set_default)
 {
-  char signame[SIG2STR_MAX];
   char *opt_sig;
   char *optarg_writable;
 
@@ -548,8 +547,7 @@ parse_signal_action_params (char const *arg, bool set_default)
  Some signals cannot be set to ignore or default (e.g., SIGKILL,
  SIGSTOP on most OSes, and SIGCONT on AIX.) - so ignore errors.  */
   for (int i = 1 ; i <= SIGNUM_BOUND; i++)
-if (sig2str (i, signame) == 0)
-  signals[i] = set_default ? DEFAULT_NOERR : IGNORE_NOERR;
+signals[i] = set_default ? DEFAULT_NOERR : IGNORE_NOERR;
   return;
 }
 
@@ -558,7 +556,7 @@ parse_signal_action_params (char

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-13 Thread Pádraig Brady


On 13/03/2024 02:24, Grisha Levit wrote:

Recent clang provides __bf16 on aarch64 but it is broken.

If built with -O0, the conversion is wrong:

 $ printf '\x3F\x80' | od --end=big -An -tfB | tr -d ' '
 1.875

If built with -O1 or higher, compilation fails:

 fatal error: error in backend: Cannot select: 0xb47a58d29780: f32 = 
fp_extend 0xb47a58d31720
   0xb47a58d31720: bf16,ch = CopyFromReg 0xb47b78c53720, 
Register:bf16 %13
 0xb47a58d29470: bf16 = Register %13
 In function: print_bfloat

The latter issue does not cause the existing configure test to fail
because the promotion is optimized out.

* configure.ac: Ensure 16 bit float promotion code does not get
optimized out, and produces an expected result.


Looks good!

I'll follow up with another patch to all these involved checks to be cached / 
bypassed,
by wrapping them in AC_CACHE_VAL().

Marking this as done.

thanks!
Pádraig

bug#69724: [PATCH] dircolors: add more archive extensions

2024-03-11 Thread Pádraig Brady


I'm erring on the side of applying this,
though I'm a bit wary of an endlessly expanding list,
as there is no end to what can be compressed for example.

cheers,
Pádraig

bug#69636: Re: [PATCH] Improve quality of format-checking code for seq.

2024-03-08 Thread Pádraig Brady


tag 69636 notabug
close 69636
stop

On 08/03/2024 12:29, User wrote:


Jim Meyering  wrote:


Paul Eggert  wrote:

* src/seq.c (validate_format): Remove. Migrate its checks into...
(long_double_format): Report an error and exit if an error is found,
instead of returning NULL.  All callers changed.
Use a more-consistent format for diagnostics.
* tests/misc/seq: Adjust to the more-consistent format for diagnostics.
---
   src/seq.c  |   71

+++

   tests/misc/seq |   10 
   2 files changed, 25 insertions(+), 56 deletions(-)


Thanks.
As far as I can see this is a "no semantic change"
patch (modulo diagnostics), and it looks fine, so I pushed it.


Dear Paul Eggert and Jim Meyering,


Thank you very much for the contribution and care! I hope you have
awesome and cozy day!

It is a nice addition to check the format in the most and explicitly
safe cases, I believe, however it disallows to use it in cases when you
need to just repeat a character without ANY processing directive! For
example, `a=10; seq -f 'x' -s '' -- "$a";` would print 10 characters "x"
without the check added prior and significantly (or little but still)
support development in long codes which would require explicit loops or
additional dependencies like awesome Perl.

Is it possible to reconsider the check or add an option to ignore the
format which does not have a required directive? If you know a better
alternative to the standard utility which would result in the same small
code but great result, please suggest if possible!

Again, appreciated, and please stay safe!


This suggestion has some merit,
but there are a few ways to generate repetitive data already:

  a=10

  seq $a | xargs printf -- 'x%.0s'

  printf "%${a}s" '' | tr ' ' x

  yes x | head -n$a | tr -d '\n'

Also if only outputting to a terminal, then you can get the
repetition to happen within the terminal code like:

  printf -- "x\e[${a}b"

Therefore I'm not sure it's worth relaxing the validation in seq.

thanks,
Pádraig

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-05 Thread Pádraig Brady


On 05/03/2024 04:10, Paul Eggert wrote:

On 3/4/24 16:43, Dominique Martinet wrote:

Adding Rob to the loop because this impacts compatibility with
toybox/maybe busybox implementations


Busybox does not use RENAME_EXCHANGE, so this isn't a Busybox issue.

Toybox mv added -x to its development version yesterday:

https://github.com/landley/toybox/commit/a2419ad52d489bf1a84a9f3aa73afb351642c765

so there's little prior art there, and there's still plenty of time to
fix its problems before exposing it to the world.



I also see --swap mostly used by scripts and this actually feels a bit
dangerous to me -- I'd *always* use this with -T.


Yes, it's a problem.

By "see --swap mostly used by scripts" I assume you mean scripts that
haven't been written yet, assuming that nobody had -x until yesterday



(by the way, what's this "rename" command you speak of?


https://mirrors.edge.kernel.org/pub/linux/utils/util-linux/

Now that I've looked into it further, util-linux already has an "exch"
command that does exactly what you want. This is the command that toybox
should implement rather than try to simulate it with "mv -x" (which
causes all sorts of problems).

That is, toybox should revert yesterday's change to "mv", and should
implement "exch" instead.


I think having the functionality in mv(1) is better than in rename(1),
but since exch(1) is already released that's probably
the best place for this functionality now.

A separate exch command may be overkill for just this,
but perhaps related functionality might be added to that command in future.
For e.g. some of the discussed functionality for a "replace" command
might reside there.

So I think I'll remove the as yet unreleased mv --swap from coreutils, given 
that
util-linux is as widely available as coreutils on GNU/Linux platforms.

cheers,
Pádraig

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-04 Thread Pádraig Brady


On 04/03/2024 15:47, Pádraig Brady wrote:

On 04/03/2024 00:44, Paul Eggert wrote:

Although I like the idea of exposing file swaps to the user, the first
cut of 'mv -x' has significant problems.

I expect 'mv -x A B' to act like 'mv A B' except the destination must
exist and is renamed back to A. However, this is not true for 'mv -x A
B' when B is a directory; it renames B to A rather than renaming B/A to
A as I expect. That is, 'mv -x' acts as if -T (--no-target-directory) is
also specified. There is no way to get mv's traditional behavior, or to
get mv -t behavior.

To fix this, 'mv -x' should respect the usual mv behavior with respect
to directories. For example, when D is a directory 'mv -x A B C D'
should act like 'mv A B C D' except that D's old entries should be
renamed back to A B and C. And the -t and -T options should work with -x
the same way they work when -x is not specified.

This needs to happen before the next coreutils release, to avoid
confusion about 'mv -x'.


So you're saying the current mv -x behavior should only be with -T explicitly 
specified.
With the current implementation, we should at least document that -x implies -T.

By changing as you suggest, we'd not be adding any significant functionality,
but potentially reducing confusion by following mv traditional arg handling.
I agree with you I think, but not strongly.

It's worth stating though that if users want to swap 2 directories
they'd have to `mv -Tx d1 d2`, which may also be a little confusing.

The use case we don't currently support directly is:
cd source_dir && mv -x * "$dest_dir"
Instead that currently requires:
for f in *; do mv -x "$f" "$dest_dir"; done

For completeness, stating disadvantages of pulling this use case within mv,
is that by requiring the for loop for this would allow users
to programatically determine which swap failed, and I see --swap
as primarily a programatic interface used by scripts.
Also if we made this change, We'd have to document that `mv -x 1 2 ... d`
was not atomic over the whole set.

I will look at making the change before the upcoming release.


Another point worth mentioning before changing this,
is that changing would make the --swap operation non symmetric.
I.e. `mv -x a d` would be different to `mv -x d a` where d in a directory.

cheers,
Pádraig

p.s. Re dir swapping I noticed that specifying '/' or '.' dirs always gives 
EBUSY:
  $ mv -x . .
  mv: swap of '.' and '.' failed: Device or resource busy

bug#69546: cksum: inconsistent handling of invalid length values

2024-03-04 Thread Pádraig Brady


On 04/03/2024 15:44, Daniel Hofstetter wrote:

Hi,

When specifying an invalid length value followed by a valid length
value I get the following error:

$ printf "hello" | cksum --algo=blake2b --length=12 --length=8
cksum: invalid length: ‘12’
cksum: length is not a multiple of 8

However, if the invalid length value is a multiple of 8 and greater
than 512 (the maximum digest length for blake2b), there is no error:

$ printf "hello" | cksum --algo=blake2b --length=123456 --length=8
BLAKE2b-8 (-) = 29

I think the behavior should be the same in the two scenarios, whether
it's showing an error or ignoring the invalid value.

I'm using coreutils 9.4.


I pushed a fix at:
https://github.com/coreutils/coreutils/commit/fea833591

Now only the last used --length is validated.

Marking this as done.

cheers,
Pádraig

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-04 Thread Pádraig Brady


On 04/03/2024 00:44, Paul Eggert wrote:

Although I like the idea of exposing file swaps to the user, the first
cut of 'mv -x' has significant problems.

I expect 'mv -x A B' to act like 'mv A B' except the destination must
exist and is renamed back to A. However, this is not true for 'mv -x A
B' when B is a directory; it renames B to A rather than renaming B/A to
A as I expect. That is, 'mv -x' acts as if -T (--no-target-directory) is
also specified. There is no way to get mv's traditional behavior, or to
get mv -t behavior.

To fix this, 'mv -x' should respect the usual mv behavior with respect
to directories. For example, when D is a directory 'mv -x A B C D'
should act like 'mv A B C D' except that D's old entries should be
renamed back to A B and C. And the -t and -T options should work with -x
the same way they work when -x is not specified.

This needs to happen before the next coreutils release, to avoid
confusion about 'mv -x'.


So you're saying the current mv -x behavior should only be with -T explicitly 
specified.
With the current implementation, we should at least document that -x implies -T.

By changing as you suggest, we'd not be adding any significant functionality,
but potentially reducing confusion by following mv traditional arg handling.
I agree with you I think, but not strongly.

It's worth stating though that if users want to swap 2 directories
they'd have to `mv -Tx d1 d2`, which may also be a little confusing.

The use case we don't currently support directly is:
cd source_dir && mv -x * "$dest_dir"
Instead that currently requires:
for f in *; do mv -x "$f" "$dest_dir"; done

For completeness, stating disadvantages of pulling this use case within mv,
is that by requiring the for loop for this would allow users
to programatically determine which swap failed, and I see --swap
as primarily a programatic interface used by scripts.
Also if we made this change, We'd have to document that `mv -x 1 2 ... d`
was not atomic over the whole set.

I will look at making the change before the upcoming release.

cheers,
Pádraig

bug#69488: tr (question)

2024-03-01 Thread Pádraig Brady


On 01/03/2024 15:33, lacsaP Patatetom wrote:

hi,

I did a few tests with tr and I'm surprised by the results...

$ echo éèçà
éèçà

these characters are encoded in utf-8 on 2 bytes :

$ echo éèçà | xxd
: c3a9 c3a8 c3a7 c3a0 0a   .

now I use tr to remove non-printable characters :

$ echo éèçà | tr -cd '[:print:]'
$ echo éèçà | tr -cd '[:print:]' | wc
   0   0   0

all characters are deleted by tr
now I want to keep the "é" character :

$ echo éèçà | tr -cd '[:print:]é'
é���

why do the "�" characters appear ?

regards, lacsaP.



It's a known issue that tr is currently non multi-byte aware.

thanks,
Pádraig

bug#69418: test failure when no french locale is installed

2024-02-27 Thread Pádraig Brady


On 26/02/2024 23:05, Bruno Haible wrote:

Testing the current coreutils git master:

On a Debian 12 system, in which I have not installed a French UTF-8 locale,
I see a test failure of tests/misc/join-utf8.

The essential lines from test-suite.log:

+ test set = set
+ LC_ALL=none
../tests/misc/join-utf8.sh: line 24: warning: setlocale: LC_ALL: cannot change 
locale (none): No such file or directory

The cause is that on such a system, LOCALE_FR_UTF8, as determined by
gnulib/m4/locale-fr.m4, is 'none', not empty or absent.

The attached patch fixes the failure.



Applied.

This also indicates that gnulib ensures that LOCALE_FR_UTF8
is set to "none" when not present or usable,
so I've made that simplification in other tests now.

Marking this as done.

thanks!
Pádraig.

bug#69369: wc -w ignores breaking space over UCHAR_MAX

2024-02-25 Thread Pádraig Brady


On 24/02/2024 20:44, Aearil via GNU coreutils Bug Reports wrote:

Hi,

wc -w doesn't seem to recognize whitespace characters with a codepoint
over UCHAR_MAX (255) as word separators. For example, using the
character EM SPACE U+2003:

$ printf "foo\u2003bar" | ./wc -w
1

I should get a word count of 2, but instead the space is ignored while
counting words. Meanwhile, wc v9.4 gives the correct answer:

$ printf "foo\u2003bar" | wc -w
2

It looks like the regression has been introduced by [f40c6b5] and
would be fixed by something like the following change:

diff --git a/src/wc.c b/src/wc.c
index f5a921534..9d456f8c0 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -528,7 +528,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus, 
off_t current_pos)
if (width > 0)
  linepos += width;
  }
-  in_word2 = !iswnbspace (wide_char);
+  in_word2 = !iswspace (wide_char) && !iswnbspace 
(wide_char);
  }

/* Count words by counting word starts, i.e., each


Nice one.
Great to catch this before release.
I've augmented your patch with a test,
and will push the attached later.

Marking this as done.

thanks!
Pádraig
From ced8c64c986b79c0bfa74028a9581e07d5df1974 Mon Sep 17 00:00:00 2001
From: Aearil 
Date: Sat, 24 Feb 2024 21:44:24 +0100
Subject: [PATCH] wc: fix -w with breaking space over UCHAR_MAX

* src/wc.c (wc): Fix regression introduced in commit v9.4-48-gf40c6b5cf.
* tests/wc/wc-nbsh.sh: Add test cases for "standard" spaces.
Fixes https://bugs.gnu.org/69369
---
 src/wc.c| 2 +-
 tests/wc/wc-nbsp.sh | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/wc.c b/src/wc.c
index f5a921534..9d456f8c0 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -528,7 +528,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus, off_t current_pos)
   if (width > 0)
 linepos += width;
 }
-  in_word2 = !iswnbspace (wide_char);
+  in_word2 = !iswspace (wide_char) && !iswnbspace (wide_char);
 }
 
   /* Count words by counting word starts, i.e., each
diff --git a/tests/wc/wc-nbsp.sh b/tests/wc/wc-nbsp.sh
index 371cc8b5b..39a8baccc 100755
--- a/tests/wc/wc-nbsp.sh
+++ b/tests/wc/wc-nbsp.sh
@@ -38,10 +38,15 @@ fi
 
 export LC_ALL=en_US.UTF-8
 if test "$(locale charmap 2>/dev/null)" = UTF-8; then
+  #non breaking space class
   check_word_sep '\u00A0'
   check_word_sep '\u2007'
   check_word_sep '\u202F'
   check_word_sep '\u2060'
+
+  #sampling of "standard" space class
+  check_word_sep '\u0020'
+  check_word_sep '\u2003'
 fi
 
 export LC_ALL=ru_RU.KOI8-R
-- 
2.43.0

bug#69368: [PATCH] Allow --zero with --check

2024-02-25 Thread Pádraig Brady


On 24/02/2024 19:07, Ricardo Branco wrote:

The --zero option is useless if not supported by --check.

Attached patch fixes it.

https://github.com/coreutils/coreutils/pull/81


I'm not sure about this.

By adding this support we diverge the checksum file formats supported by check.
I.e. users may inadvertently create such files that are not usable
by any previous version of the checksum utilities.

Also having --check support NUL delimited files
doesn't add any additional functionality, as the checksum files
are already automatically escaped appropriately.

Also NUL delimited checksum files can be harder to manage with other text tools.

--zero was added for the case of piping to other tools
that would find the filename escaping problematic.

cheers,
Pádraig

bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-02-24 Thread Pádraig Brady


On 01/02/2024 00:36, Paul Eggert wrote:

On 1/31/24 06:06, Pádraig Brady wrote:

To my mind the most protective option takes precedence.


That's not how POSIX works with mv -i and mv -f. The last flag wins. I
assume this is so that people can have aliases or shell scripts that
make -i the default, but you can override by specifying -f on the
command line. E.g., in mymv:

 #!/bin/sh
 mv -i "$@"

then "mymv -f a b" works as expected.

Wouldn't a similar argument apply to cp's --update options?

Or perhaps we should play it safe, and reject any combination of
--update etc. options that are incompatible. We can always change our
mind later and say that later options override earlier ones, or do
something else that's less conservative.


OK I err'd on the side of changing as little as possible wrt precedence.
-n still has precedence over any -u,--update option.
That's simplest to understand (while not changing existing precedence),
and shouldn't cause any practical issues.

I plan to push the 2 attached patches for this tomorrow.

thanks,
PádraigFrom 51cf6f3ff272467bc9cde75c48d0250446be9b9c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sat, 24 Feb 2024 19:51:56 +
Subject: [PATCH 1/2] cp,mv: reinstate that -n exits with success if files
 skipped

* src/cp.c (main): Adjust so that -n will exit success if skipped files.
* src/mv.c (main): Likewise.
* doc/coreutils.texi (cp invocation): Adjust the description of -n.
* src/system.h (emit_update_parameters_note): Adjust --update=none
comparison.
* tests/cp/cp-i.sh: Adjust -n exit status checks.
* tests/mv/mv-n.sh: Likewise.
* NEWS: Mention the change in behavior.
Fixes https://bugs.gnu.org/62572
---
 NEWS   |  4 
 doc/coreutils.texi | 17 +
 src/cp.c   | 14 --
 src/mv.c   | 10 ++
 src/system.h   |  4 ++--
 tests/cp/cp-i.sh   | 11 +--
 tests/mv/mv-n.sh   | 11 +--
 7 files changed, 39 insertions(+), 32 deletions(-)

diff --git a/NEWS b/NEWS
index 36b7fd1fe..a52b4cf66 100644
--- a/NEWS
+++ b/NEWS
@@ -43,6 +43,10 @@ GNU coreutils NEWS-*- outline -*-
   basenc --base16 -d now supports lower case hexadecimal characters.
   Previously an error was given for lower case hex digits.
 
+  cp --no-clobber, and mv -n no longer exit with failure status if
+  existing files are encountered in the destination.  Instead they revert
+  to the behavior from before v9.2, silently skipping existing files.
+
   ls --dired now implies long format output without hyperlinks enabled,
   and will take precedence over previously specified formats or hyperlink mode.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index c42126955..911e15b46 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9059,12 +9059,13 @@ a regular file in the destination tree.
 @itemx --no-clobber
 @opindex -n
 @opindex --no-clobber
-Do not overwrite an existing file; silently fail instead.
-This option overrides a previous
-@option{-i} option.  This option is mutually exclusive with @option{-b} or
-@option{--backup} option.
-See also the @option{--update=none} option which will
-skip existing files but not fail.
+Do not overwrite an existing file; silently skip instead.
+This option overrides a previous @option{-i} option.
+This option is mutually exclusive with @option{-b} or @option{--backup} option.
+This option is deprecated due to having a different exit status from
+other platforms.  See also the @option{--update} option which will
+give more control over how to deal with existing files in the destination,
+and over the exit status in particular.
 
 @item -P
 @itemx --no-dereference
@@ -9335,8 +9336,8 @@ This is the default operation when an @option{--update} option is not specified,
 and results in all existing files in the destination being replaced.
 
 @item none
-This is similar to the @option{--no-clobber} option, in that no files in the
-destination are replaced, but also skipping a file does not induce a failure.
+This is like the deprecated @option{--no-clobber} option, where no files in the
+destination are replaced, and also skipping a file does not induce a failure.
 
 @item older
 This is the default operation when @option{--update} is specified, and results
diff --git a/src/cp.c b/src/cp.c
index 0355ed97f..36ae4fb66 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -195,8 +195,8 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
   -L, --dereferencealways follow symbolic links in SOURCE\n\
 "), stdout);
   fputs (_("\
-  -n, --no-clobber ensure no existing files overwritten, and fail\n\
- silently instead. See also --update\n\
+  -n, --no-clobber silently skip existing files.\n\
+ See also --update\n\
 "), stdout);
   fputs (_("\
   -P, --no-de

bug#68871: Can't use od to print half-precision floats

2024-02-05 Thread Pádraig Brady


On 04/02/2024 21:59, Paul Eggert wrote:

Thanks. One minor comment about this:


+@item B
+brain 16 bit float
+@item H
+half precision float


It might be helpful explain these two formats, e.g., to cite:

https://en.wikipedia.org/wiki/Bfloat16_floating-point_format

and

https://en.wikipedia.org/wiki/Half-precision_floating-point_format

respectively, since the formats are new compared to the other formats
being discussed.


Good call.
I added those references and pushed.

cheers,
Pádraig.

bug#68871: Can't use od to print half-precision floats

2024-02-04 Thread Pádraig Brady


On 02/02/2024 01:47, Paul Eggert wrote:

On 2/1/24 13:59, Pádraig Brady wrote:


bfloat16 looks like a truncated single precision IEEE,
so we should be able to just pad the extra 16 bits with zeros
when converting to single precision internally for processing.


Sounds good. This would mean od could work even the platform doesn't
support bfloat16_t, since od.c could fall back on the above code (though
I suppose it could be endianness-dependent).


I see that __bf16 is supported by released versions of gcc and clang,
so rather than add conversion complexity to the core od print loop,
the attached relies on compiler support for _Float16 and __bf16 types,
which compilers will expand to more targets going forward.

I tested the attached on:

gcc 13, clang 17 x86 (Both types supported)
gcc 7 aarch64 (Only -fH supported)
gcc 13 ppc(be) (Neither supported (both will be with GCC 14))

I'll commit this later.
Marking this bug as done.

thanks,
PádraigFrom 5c1349185e050b8bf8fb1d356c54170112d0e673 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Thu, 1 Feb 2024 17:59:51 +
Subject: [PATCH] od: support half precision floating point

Rely on compiler support for _Float16 and __bf16
to support -fH and -fB formats respectively.
I.e. IEEE 16 bit, and brain 16 bit floats respectively.
Modern GCC and LLVM compilers support both types.

clang-sect=half-precision-floating-point
https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html
https://clang.llvm.org/docs/LanguageExtensions.html#$clang-sect
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0192r4.html
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1467r9.html

This was tested on:
gcc 13, clang 17 x86 (Both types supported)
gcc 7 aarch64 (Only -fH supported)
gcc 13 ppc(be) (Neither supported. -fH will be with GCC 14)

* src/od.c: Support -tf2 or -tfH to print IEEE 16 bit floating point,
or -tfB to print Brain 16 bit floating point.
* configure.ac: Check for _Float16 and __bf16 types.
* doc/coreutils.texi (od invocation): Mention the new -f types.
* tests/od/od-float.sh: Add test cases.
* NEWS: Mention the new feature.
Addresses https://bugs.gnu.org/68871
---
 NEWS |  3 ++
 configure.ac |  3 ++
 doc/coreutils.texi   |  4 +++
 src/od.c | 73 ++--
 tests/od/od-float.sh | 21 +
 5 files changed, 101 insertions(+), 3 deletions(-)

diff --git a/NEWS b/NEWS
index dc5d875dd..5b5befd2c 100644
--- a/NEWS
+++ b/NEWS
@@ -57,6 +57,9 @@ GNU coreutils NEWS-*- outline -*-
   chgrp now accepts the --from=OWNER:GROUP option to restrict changes to files
   with matching current OWNER and/or GROUP, as already supported by chown(1).
 
+  od now supports printing IEEE half precision floating point with -t fH,
+  or brain 16 bit floating point with -t fB, where supported by the compiler.
+
   tail now supports following multiple processes, with repeated --pid options.
 
 ** Improvements
diff --git a/configure.ac b/configure.ac
index 48ab4ef53..64ff32a96 100644
--- a/configure.ac
+++ b/configure.ac
@@ -522,6 +522,9 @@ CFLAGS=$ac_save_CFLAGS
 LDFLAGS=$ac_save_LDFLAGS
 ac_c_werror_flag=$cu_save_c_werror_flag
 
+# Test compiler support for half precision floating point types (for od)
+AC_CHECK_TYPES([_Float16, __bf16])
+
 ac_save_CFLAGS=$CFLAGS
 CFLAGS="-mavx -mpclmul $CFLAGS"
 AC_MSG_CHECKING([if pclmul intrinsic exists])
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index d13e0f834..c7a49de31 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -2127,6 +2127,10 @@ long
 For floating point (@code{f}):
 
 @table @asis
+@item B
+brain 16 bit float
+@item H
+half precision float
 @item F
 float
 @item D
diff --git a/src/od.c b/src/od.c
index 75bed5e7d..1e2489367 100644
--- a/src/od.c
+++ b/src/od.c
@@ -19,6 +19,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -49,6 +50,26 @@ typedef unsigned long long int unsigned_long_long_int;
 typedef unsigned long int unsigned_long_long_int;
 #endif
 
+#if HAVE__FLOAT16
+  /* Available since clang 6 (2018), and gcc 7 (2017).  */
+  typedef _Float16 float16;
+#else
+# define HAVE__FLOAT16 0
+  /* This is just a place-holder to avoid a few '#if' directives.
+ In this case, the type isn't actually used.  */
+  typedef float float16;
+#endif
+
+#if HAVE___BF16
+  /* Available since clang 11 (2020), and gcc 13 (2023). */
+  typedef __bf16 bfloat16;
+#else
+# define HAVE___BF16 0
+  /* This is just a place-holder to avoid a few '#if' directives.
+ In this case, the type isn't actually used.  */
+  typedef float bfloat16;
+#endif
+
 enum size_spec
   {
 NO_SIZE,
@@ -58,6 +79,7 @@ enum size_spec
 LONG,
 LONG_LONG,
 /* FIXME: add INTMAX support, too */
+FLOAT_HALF,
 FLOAT_SINGLE,
 FLOAT_DOUBLE,
 FLOAT_LONG_DOUBLE,
@@ -71,6 +93,8 @@ enum output_format
 OCTAL,
 HEXA

bug#68871: Can't use od to print half-precision floats

2024-02-01 Thread Pádraig Brady


On 01/02/2024 19:16, Paul Eggert wrote:

Oh, and another thought: suppose someone wants to use od on bfloat16_t
values? They're popular in machine learning applications, and likely
will be more popular than float16_t overall. See:

https://sourceware.org/pipermail/libc-alpha/2024-February/154382.html


True. I suppose we would select between these like:

 -t f2, -t fh = IEEE half precision
-t fb = brain floating point

bfloat16 looks like a truncated single precision IEEE,
so we should be able to just pad the extra 16 bits with zeros
when converting to single precision internally for processing.

cheers,
Pádraig

bug#68871: Can't use od to print half-precision floats

2024-02-01 Thread Pádraig Brady


On 01/02/2024 12:43, Rozenberg, Eyal (Consultant) wrote:

If I try to use the od utility to print half-precision (FP16) floating-point 
values, I get:

$ od -t f2 myfloats.bin
od: invalid type string 'f2';
this system doesn't provide a 2-byte floating point type

I'm not exactly sure what "this system" means, but that should work and print 
out my floats.

Eyal

PS - This is my first bug-coreutils post, please be gentle.


I just had a read of these:
https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0192r4.html

On the face of it it seems that various 16 bit floating point formats have 
consolidated,
and we could add support for _Float16 in od.
I see gnulib already enables __STDC_WANT_IEC_60559_TYPES_EXT__
so we should be able to do something like the attached.

This is just off the top of my head, and I haven't tested
or thought about this much at all.
Any testing your could do would be appreciated.

thanks,
Pádraigcommit 82fc25ab7f34a8198a8f435c1396801d30e8c6b6
Author: Pádraig Brady 
Date:   Thu Feb 1 17:59:51 2024 +

od: support half precision floating point

* src/od.c: Suport -t f2 to print 16 bit floating point.
* NEWS: Mention the new feature.
Addresses https://bugs.gnu.org/68871

diff --git a/src/od.c b/src/od.c
index 75bed5e7d..2f03a729e 100644
--- a/src/od.c
+++ b/src/od.c
@@ -19,6 +19,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -49,6 +50,12 @@ typedef unsigned long long int unsigned_long_long_int;
 typedef unsigned long int unsigned_long_long_int;
 #endif
 
+#ifndef FLT16_MAX
+/* This is just a place-holder to avoid a few '#if' directives.
+   In this case, the type isn't actually used.  */
+typedef float _Float16;
+#endif
+
 enum size_spec
   {
 NO_SIZE,
@@ -58,6 +65,7 @@ enum size_spec
 LONG,
 LONG_LONG,
 /* FIXME: add INTMAX support, too */
+FLOAT_HALF,
 FLOAT_SINGLE,
 FLOAT_DOUBLE,
 FLOAT_LONG_DOUBLE,
@@ -156,6 +164,7 @@ static const int width_bytes[] =
   sizeof (int),
   sizeof (long int),
   sizeof (unsigned_long_long_int),
+  sizeof (_Float16),
   sizeof (float),
   sizeof (double),
   sizeof (long double)
@@ -477,6 +486,7 @@ PRINT_TYPE (print_int, unsigned int)
 PRINT_TYPE (print_long, unsigned long int)
 PRINT_TYPE (print_long_long, unsigned_long_long_int)
 
+PRINT_FLOATTYPE (print_halffloat, _Float16, ftoastr, FLT_BUFSIZE_BOUND)
 PRINT_FLOATTYPE (print_float, float, ftoastr, FLT_BUFSIZE_BOUND)
 PRINT_FLOATTYPE (print_double, double, dtoastr, DBL_BUFSIZE_BOUND)
 PRINT_FLOATTYPE (print_long_double, long double, ldtoastr, LDBL_BUFSIZE_BOUND)
@@ -824,6 +834,11 @@ decode_one_format (char const *s_orig, char const *s, char const **next,
 
 switch (size_spec)
   {
+  case FLOAT_HALF:
+print_function = print_halffloat;
+field_width = FLT_STRLEN_BOUND_L (decimal_point_len);
+break;
+
   case FLOAT_SINGLE:
 print_function = print_float;
 field_width = FLT_STRLEN_BOUND_L (decimal_point_len);
@@ -1598,6 +1613,9 @@ main (int argc, char **argv)
   for (i = 0; i <= MAX_FP_TYPE_SIZE; i++)
 fp_type_size[i] = NO_SIZE;
 
+#ifdef FLT16_MAX
+  fp_type_size[sizeof (_Float16)] = FLOAT_HALF;
+#endif
   fp_type_size[sizeof (float)] = FLOAT_SINGLE;
   /* The array entry for 'double' is filled in after that for 'long double'
  so that if they are the same size, we avoid any overhead of

bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-01-31 Thread Pádraig Brady


On 30/01/2024 18:31, Paul Eggert wrote:

On 2024-01-30 03:18, Pádraig Brady wrote:

So we now have the proposed change as:

    - revert -n to old silent success behavior
    - document -n as deprecated
    - Leave --update=none as is (will be synonymous with -n)
    - Provide --update=none-fail to diagnose and exit failure


Thanks, that's a better proposal, but I still see several opportunities
for confusion.

If I understand things correctly, cp --update=none is not synonymous
with the proposed (i.e., old-behavior) cp -n, because -n overrides
previous -i options but --update=none does not. Also, -n overrides
either previous or following --update=UPDATE options, but --update=none
overrides only previous --update=UPDATE options. (For what it's worth,
FreeBSD -n overrides


Good point.
Well -n is a protection mechanism really, so should override.
Since --update now incorporates a protection mode, it should also override I 
think.


Some of this complication seems to be for consistency with how mv
behaves with -f, -i, -n, and --update, and similarly with how rm behaves
with -f, -i, -I, and --interactive. To be honest I don't quite
understand the reason for all this complexity, which suggests it should
be documented somewhere (the manual?) if it isn't already.


To my mind the most protective option takes precedence.
So from least to most that would be -f -n --update=none --update=none-fail -i
So the main consideration there is that -n should not override 
--update=none-fail


This raises more questions:

* If we deprecate cp -n, what about mv -n? FreeBSD mv -n behaves like
Coreutils mv -n: it silently does nothing and exits successfully. So
there's no compatibility argument for changing mv -n's behavior.
However, whatever --update option we add to cp (to output a diagnostic
and exit with failure) should surely also be added to mv, to aid
consistency.


Yes I agree.


* Should cp --update=none be changed so that it really behaves like the
old cp -n, in that it overrides other options in ways that differ from
how the other --update=UPDATE options behave? I'm leaning toward "no" as
this adds complexity that I don't see the use for.

* If we don't change cp --update=none's overriding behavior, is it still
OK to tell users to substitute --update=none for -n even though the two
options are not exactly equivalent? I'm leaning towards "yes" but would
like other opinions.


Yes I think we should still give that advice to deprecate -n,
if we ensure that -n does not override --update=none=fail.

thanks,
Pádraig

bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-01-30 Thread Pádraig Brady


On 29/01/2024 21:44, Paul Eggert wrote:

On 1/29/24 08:11, Pádraig Brady wrote:


Right, that's why I'm still leaning towards my proposal in the last mail.


Well, I won't insist on doing nothing; however, the proposal needs
ironing out and now's a good time to do it before installing changes.



    - revert to previous exit success -n behavior
    - document -n as deprecated
    - provide --update=noclobber to give exit failure functionality


So --update=noclobber would differ in meaning from the deprecated-in-9.5
--no-clobber, but would agree in meaning with 9.4 --no-clobber? That
sounds pretty confusing for future users. (And a nit: why should one
spelling have a hyphen but the other doesn't?)


That's a fair point; just avoid "no-clobber" entirely.
We could choose a new name so. How about --update=none-fail


      - BTW, it probably makes sense to print a diagnostic for each
skipped file here
    as it's exceptional behavior, for which we're exiting with
failure for.


Coreutils 9.4 cp -n already does that, no? So I'm not sure what's being
proposed here.

$ touch a b
$ cp -n a b; echo $?
cp: not replacing 'b'
1


Sorry I got confused between your suggestion of added diagnostics,
and FreeBSD's silent behavior here. Looks like we just keep the
existing behavior of diagnosing 'not replacing' iff exiting with failure.


    - the existing --update=none provides the exit success functionality


It seems to me that this proposal conflates two questions:

* What rules should cp use to decide whether to update a destination?

* When cp decides not to update a destination, what should it do? Exit
with nonzero status? Output a diagnostic? Both? Neither?

Aren't these independent axes? If so, shouldn't they have independent
options? For example, since we have --update=older, shouldn't there be a
way to say "I want to copy A to B only if B is older than A, and I want
the exit status to be zero only if A was copied to B"?


Well they're not entirely independent.
It's more appropriate to output a diagnostic if exiting failure,
and POSIX also advises along the same lines.
We also have --verbose to control diagnostics somewhat
for the non failure case.

So we now have the proposed change as:

  - revert -n to old silent success behavior
  - document -n as deprecated
  - Leave --update=none as is (will be synonymous with -n)
  - Provide --update=none-fail to diagnose and exit failure

thanks,
Pádraig.

bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-01-29 Thread Pádraig Brady


On 29/01/2024 14:01, Michael Stone wrote:

On Sun, Jan 28, 2024 at 11:14:14PM -0800, Paul Eggert wrote:

I'm not sure reverting would be best. It would introduce more
confusion, and would make coreutils incompatible with FreeBSD again.


Reverting makes more sense than the current situation. I do not
understand why you seem to value FreeBSD compatibility more than
compatibility with the vast majority of installed coreutils/linux
systems.


Yes, it's not a good place to be. Surely current coreutils is better
than what Debian is doing.


You've introduced a silent incompatibility and I'm trying to find some
way to make that clear. If upstream would provide a better solution I
would certainly use it. I have despaired of there being such since your
attitude thus far seems to be entirely dismissive of compatibility
concerns.


That's a bit unfair.  The current upstream -n behavior is with a view
to being _more_ compat across all systems.
Now I agree this may not be worth it in this case,
but it is a laudable goal.


Another possibility is to add a warning that is emitted only at the
end of 'cp'. The warning would occur only if the exit code differs
because of this cp -n business.


You'd only emit a notification of a change in behavior if some
(potentially uncommon/rarely encountered) situation arises which would
actually trigger breakage? So people can't prepare ahead of time and
change their script to handle the necessary change in logic, they can
only maybe figure out why something broke at 2am when the uncommon event
occurred?


Yes I agree with this point, mostly.
Outputting a diagnostic would help users diagnose what's going on,
and help users to fix forward and avoid their problematic -n usage.
But yes, the crux of the issue with fixing issues as they occur,
is it depends on the state of the destination and so can become an issue
at any time.  Now I previously did an audit with github and debian code search
and noticed very few problematic uses of cp -n, but that does miss
the mountain of private code.


At the end of the day, -n is basically a useless option with unknowable
semantics which should be avoided by everyone. In the past it was an
option which wasn't portable between coreutils/linux and freebsd systems,
and I guess you've "fixed" that (by making it an option everyone should
avoid entirely), but let's be honest about how common that concern was.


Right, that's why I'm still leaning towards my proposal in the last mail.
  - revert to previous exit success -n behavior
  - document -n as deprecated
  - provide --update=noclobber to give exit failure functionality
- BTW, it probably makes sense to print a diagnostic for each skipped file 
here
  as it's exceptional behavior, for which we're exiting with failure for.
  - the existing --update=none provides the exit success functionality

With the above in place for the next coreutils release,
then debian could remove its noisy patch.

thanks,
Pádraig

bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-01-28 Thread Pádraig Brady


On 17/12/2023 14:46, Pádraig Brady wrote:

On 16/12/2023 21:46, Bernhard Voelker wrote:

On 12/15/23 21:13, Michael Stone wrote:

On Fri, Dec 15, 2023 at 11:21:06AM -0800, Paul Eggert wrote:

Stlll, Pádraig gave a reasonable summary of why the change was made,


To clarify my summary a little, there I said that -n now _immediately_ fails.
I should have said _silently_ fails.  I.e. the complete copy operation
proceeds as before, and only the exit status is at issue here.


despite its incompatibility with previous behavior. (One thing I'd add
is that the FreeBSD behavior is inherently less race-prone.)


Whether the implementation is race-prone or not is an internal thing.
I think we're currently discussing more on a user-perspective level.

IIUC then the question is whether `cp -n` should continue to behave like
the (new) `cp --update=none` which returns EXIT_SUCCESS.

Regardless what other implementations do, when reading the -n description
from a user's point of view:

 -n, --no-clobber do not overwrite an existing file (overrides a
-u or previous -i option). See also --update

then I'd expect the tool to just skip existing files like `rsync 
--ignore-existing`
does.  In that regard I would be surprised if skipping files would result in an 
error.
Well, I would understand if there'd be a '--no-clobber=fail' option.


Agreed we should improve the docs a bit for this option.
I'll apply this at least:

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 1f8b356d1..bf0f424d3 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9057,6 +9057,8 @@ Do not overwrite an existing file; silently fail instead.
   This option overrides a previous
   @option{-i} option.  This option is mutually exclusive with @option{-b} or
   @option{--backup} option.
+See also the @option{--update=none} option which will
+skip existing files but not fail.

   @item -P
   @itemx --no-dereference
diff --git a/src/cp.c b/src/cp.c
index 04a5cbee3..3ccc4c4e6 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -192,8 +192,8 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
 -L, --dereferencealways follow symbolic links in SOURCE\n\
   "), stdout);
 fputs (_("\
-  -n, --no-clobber do not overwrite an existing file (overrides 
a\n\
- -u or previous -i option). See also 
--update\n\
+  -n, --no-clobber ensure no existing files overwritten, and 
fail\n\
+ silently instead. See also --update\n\
   "), stdout);
 fputs (_("\
 -P, --no-dereference never follow symbolic links in SOURCE\n\



As Kamil added the option in 2009, I'd assume that the same patch was already
active in RHEL versions for quite some longer time.
Now changing the exit code feels kind of rough.


Well RHEL 6 came out a bit after (2010), and had the --no-clobber change,
while RHEL 5 before that did not.

Taking about distros, it's worth noting that the change is Fedora 39
which has been released for a month now.
We'll keep a close eye on issues, but haven't heard much as
of yet at least.


Therefore, from a pure user's perspective and regarding many years of 
precedence,
I am 80:20 for reverting the exit code change.


Thanks for your thoughts,
appreciated as always.


We were undecided how to proceed as of the above discussion,
with some hoping to consolidate -n behavior across all systems,
with others preferring to keep compat with original Linux behavior.

Things have changed in Debian I see, so that cp -n now aggressively warns like:
https://sources.debian.org/patches/coreutils/9.4-3/cp-n.diff/

  $ cp -n /bin/true tmp
  cp: warning: behavior of -n is non-portable and may change in future; use 
--update=none instead

This is problematic as:

  - It's noisy
  - There is no way to get the behavior of indicating failure if existing files 
present
  - The --update=none advice is only portable to newer coreutils

To summarise existing use cases of -n:

  1. User expected -n to exit failure if existing files
  2. User didn't care about existing status
  3. User expected -n to exit success if existing files

Use case 3 is the problematic case we changed behavior for,
but also I think that's the minority of the 3 use cases
and could have been fixed forward.

The debian advice forces use case 2 to change,
and provides bad advice for use case 1,
and in fact there is no solution now for use case 1.

At this stage it seems best for us go back to the original Linux behiavor (use 
case 3),
and to silently deprecate -n in docs to document the portability issues with it.
We should also provide --update=noclobber for use case 1.
Having the control on the --update option, allows use to more clearly deprecate 
-n.

I do realise folks may expect use case 1 now with -n,
especially on releases like Fedora 39, but given the feedback
that would be the least bad option.

thanks,
Pádraig

bug#68708: [PATCH] env,kill: Handle unnamed signals

2024-01-25 Thread Pádraig Brady

On 24/01/2024 20:40, Grisha Levit wrote:

Android reserves [1] some realtime signals and redefines [2] SIGRTMIN,
leaving a gap between the signals that have SIG* constants defined in
signal.h and SIGRTMIN.

When passed such a signal number, gnulib sig2str returns -1 and leaves
its signame argument unchanged.

The signal listing in env ends up reusing the name of the last printed
valid signal:

$ env --list-signal-handling true
HUP( 1): IGNORE
HUP(32): BLOCK
HUP(38): IGNORE

..and the corresponding signal numbers are rejected as operands for the
env, kill, and timeout commands.

This patch removes the requirement that sig2str returns 0 for a signal
number associated with an operand, and allows unnamed signals to be in
the sets `env' attempts to manipulate when a --*-signal option is used
with no argument.

This does leave the possibility of numbers lower than SIGNUM_BOUND that
are not valid signals, but I'm not sure that's really a problem for the
existing code paths (if it is, adding checks for sigset_t manipulations
would probably be enough).

To be on the safe side, added a check to report kill(3) EINVAL as a bad
signo (rather than always blaming the pid).

This does not change the default list printed with `kill -l'. When a
name is to be printed, the signal number associated with an unnamed
signal is used.

[1]:
https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/platform/bionic/reserved_signals.h
[2]:
https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/include/signal.h;l=51

This mostly looks good, except:

- No need to clear the errno before kill(3).
- Better to use SIG%d rather than the bare %d for signal _names_, as we already
parse this format

thanks!
Pádraig

bug#68267: [PATCH] maint: add attributes to two functions without side effects

2024-01-06 Thread Pádraig Brady


On 05/01/2024 16:44, Samuel Tardieu wrote:

* src/date.c (res_width): This function computes its result solely
from the value of its parameter and qualifies for the const attribute.
* src/tee.c (get_next_out): This function has no side effect and
qualifies for the pure attribute.

Those two functions were flagged by GCC 12.3.0.


I'm guessing GCC 12 needs help here as it can't determine the loops are finite.
(as per: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109914 )

Though I'm not seeing this suggestion with GCC 13.2.1,
so perhaps GCC 12 can determine the loops are finite?

I'll apply this since GCC 13 is less that a year old,
but in general we try to avoid littering code like this.

thanks,
Pádraig

bug#68283: ls -l issues on cifs mounted directories

2024-01-06 Thread Pádraig Brady


On 06/01/2024 13:08, Bjoern Voigt via GNU coreutils Bug Reports wrote:

After upgrading coreutils from version 9.3 to 9.4, "ls -l" shows error
messages, if the files or directories are in a cifs/smb3 mounted directory.

Example:
/mnt/cifstest here is mounted with "mount -t cifs -o username=myuser
//192.168.1.2/all /mnt/cifstest/". The server is a Samba 4.19.2 server.
The kernel is version 6.1.71.

# ls -l /mnt/cifstest/
ls: /mnt/cifstest/: Resource temporarily unavailable




"ls -l" from coreutils 9.3 does not show any errors.

Probably it is not just the error message. "dmesg -w" shows several
errors after executing "ls -l":

[17563.999109] smb2_reconnect: 3 callbacks suppressed
[17563.999113] CIFS: VFS: reconnect tcon failed rc = -11



I debugged this problem a bit. The changes in
coreutils-9.4/lib/file-has-acl.c:file_has_acl are probably responsible
for the errors. The system call listxttr produces the error message
"Resource temporarily unavailable". I could not find the reasons for
error messages shown with "dmesg".

The listxattr.c test program in the manual page (man listxattr) shows
the following output:

# ./listxattr /mnt/cifstest/
listxattr: Resource temporarily unavailable

Work-around are to downgrade to coreutils 9.3 or not to use the "-l"
option for ls.


Thanks for the details.
EAGAIN is not a documented return from listxattr,
and it seems like the cifs client should not let that leak.
Perhaps the kernel needs another change along the lines of:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6efa994e35a
(That was included in v5.18).

Perhaps something to ask at linux-c...@vger.kernel.org ?

If this is a common enough issue then it might be appropriate for us to handle,
but given the current info, ls is just informing you of an actual issue on your 
system.

thanks,
Pádraig

bug#68256: df fails on /run/user/1000/doc with "Operation not permitted"

2024-01-05 Thread Pádraig Brady


tag 68256 notabug
close 68256
stop

On 05/01/2024 09:22, Nada Machkova wrote:

hello
I have just upgraded Debian Bullseye
and simple df command respond at user CLI
$ df -hT
df: /run/user/1000/doc: Operation not permitted
...
but when I do the same as root there is NO error.
So I UNmounted relevant file and AFTER that df response has NO error for user
# fusermount -u /run/user/1000/doc
But I need to do it after each reboot :-(

I've just analyzed system status and versions
and checked related bug https://github.com/flatpak/xdg-desktop-portal/issues/553
PLS see the following details and let me know if it is coreutils bug
thank you for your time
Nada


Debian Bullseye (using coreutils 8.32) would need to apply:
https://github.com/coreutils/gnulib/commit/9a38d499ca.patch

That patch was included in the coreutils >= 9.0 releases,
and avoids including this flatpak related fuse.portal mount in the default df 
list.

I also have such a fuse.portal mount on my Fedora 39 system:

  # grep /run/user/1001/doc /proc/mounts
  portal /run/user/1001/doc fuse.portal ...

and weirdly root can statfs() but not stat(), and non root is vice versa :/

  # stat /run/user/1001/doc
  stat: cannot statx '/run/user/1001/doc': Permission denied

  $ stat /run/user/1001/doc
  File: /run/user/1001/doc
  Size: 0   Blocks: 0  IO Block: 4096   directory
  Device: 0,77  Inode: 1   Links: 2
  Access: (0500/dr-x--)  Uid: ..
  Context: system_u:object_r:fusefs_t:s0

  $ stat -f /run/user/1001/doc
  stat: ... '/run/user/1001/doc': Operation not permitted
  # stat -f /run/user/1001/doc
  File: "/run/user/1001/doc"
  ID: 0Namelen: 0   Type: fuseblk
  Block size: 0  Fundamental block size: 0
  Blocks: Total: 0  Free: 0  Available: 0
  Inodes: Total: 0  Free: 0

cheers,
Pádraig

bug#68216: An instance of DD receiving data from another instance with bs=1M, reports the wrong record count.

2024-01-02 Thread Pádraig Brady


On 02/01/2024 22:45, ja...@johnstone.net.au wrote:

Hi Pádraig,

iflag=fullblock seems to resolve this experience and the man page
excerpt provided makes sense though those lines make no appearance in
the man pages on Archlinux. I will follow up with the distro maintainers.


As would explicitly specifying obs on the second dd instance.

cheers,
Pádraig

bug#68216: An instance of DD receiving data from another instance with bs=1M reports the wrong record count.

2024-01-02 Thread Pádraig Brady


forcemerge 8171 68216
stop

On 02/01/2024 10:45, ja...@johnstone.net.au wrote:

Hi all,

It seems an instance of `dd`  with only an output file doesn't report
the count of records accurately despite bs= being described as
influencing both ibs= and obs=

dd'ing from an unstable remote over ssh today I had it continue from the
last received block between crashes in preparation for a P2V. The remote
had bs=1M set with its in=device and the local instance had bs=1M too
with the destination of= device but upon referring to the receiving
end's stable count of records I noticed it ignored my bs=1M setting.
This were worked around using obs=1M to make the counter show a value I
could reliable skip=/seek= with to resume.

I have an example here:

 /$ dd if=/dev/zero bs=1M count=1 | dd bs=1M of=/dev/null
 1+0 records in
 1+0 records out
 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0010021 s, 1.0 GB/s
 *0+16 records in
 0+16 records out*
 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.000771722 s, 1.4 GB/s/

This is being experienced on Archlinux with kernel '6.6.4' and package
'core/coreutils 9.4-2'

dd's man page for this distribution shows:

 /bs=BYTES
    read and write up to BYTES bytes at a time (default:
 512); overrides ibs and obs/

So I feel it not overriding obs= when invoked with only of= and bs=
arguments could be a bug.

Running a single dd with /if=source of=dest bs=1M/ works as intended.

While in my case this isn't a fatal bug working with virtual block
devices it could be problematic for use cases which require a proper
treatment of block sizes such as tape devices.


Well the behavior is expected and documented:

‘bs=BYTES’
 Set both input and output block sizes to BYTES.  This makes ‘dd’
 read and write BYTES per block, overriding any ‘ibs’ and ‘obs’
 settings.  In addition, if no data-transforming ‘conv’ operand is
 specified, input is copied to the output as soon as it’s read, even
 if it is smaller than the block size.

For details please see https://bugs.gnu.org/8171

thanks,
Pádraig.

bug#68064: Date addition error with “day Monthname” versus “Monthname day”

2023-12-27 Thread Pádraig Brady


tag 68064 notabug
close 68064
stop

On 27/12/2023 17:29, Larry Ploetz wrote:

It seems like there might be a problem with date addition when the base
date is specified as “day Monthname” instead of “Monthname day”, where
the offset is being interpreted as an absolute year value. This may be
locale-specific.

 :bin larry$ locale
 LANG="en_US.UTF-8"
 LC_COLLATE="en_US.UTF-8"
 LC_CTYPE="en_US.UTF-8"
 LC_MESSAGES="en_US.UTF-8"
 LC_MONETARY="en_US.UTF-8"
 LC_NUMERIC="en_US.UTF-8"
 LC_TIME="en_US.UTF-8"
 LC_ALL=
 :bin larry$ ./date -d "$(./date -d today +%d\ %b) + 1 day"
 Fri Dec 28 00:00:00 LMT 0001
 :bin larry$ ./date -d "$(./date -d today +%b\ %d) + 1 day"
 Thu Dec 28 00:00:00 CST 2023
 :bin larry$ ./date --version
 date (GNU coreutils) 9.4.97-98d463
 Copyright (C) 2023 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or 
later.
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.


This is due to ambiguity in date input formats.
Your case is:

  $ date --debug -d "27 Dec + 1 day"
  date: parsed date part: (Y-M-D) -001-12-27
  date: parsed relative part: +1 day(s)

A slightly adjusted case highlighting the ambiguity:

  $ date --debug -d "27 Dec + 1 week"
  date: parsed date part: (Y-M-D) -001-12-27
  date: parsed relative part: +7 day(s)

So really the solution here is to avoid the ambiguity
by explicitly specifying the year, or more abstractly
specifying "today" rather than a partial date.

Another gotcha highlighted by the --debug option above is:

  date: warning: using midnight as starting time: 00:00:00
  date: starting date/time: '(Y-M-D) 0001-12-27 00:00:00'
  date: warning: when adding relative days, it is recommended to specify noon

Another gotcha is that only English month abbreviations
are supported on input, so your command will fail in other locales.

For a summary of date input issues see:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-date-command-is-not-working-right_002e
https://www.gnu.org/software/coreutils/manual/html_node/Date-input-formats.html#index-date-input-formats

cheers,
Pádraig

bug#62572: cp --no-clobber behavior has changed

2023-12-17 Thread Pádraig Brady


On 16/12/2023 21:46, Bernhard Voelker wrote:

On 12/15/23 21:13, Michael Stone wrote:

On Fri, Dec 15, 2023 at 11:21:06AM -0800, Paul Eggert wrote:

Stlll, Pádraig gave a reasonable summary of why the change was made,


To clarify my summary a little, there I said that -n now _immediately_ fails.
I should have said _silently_ fails.  I.e. the complete copy operation
proceeds as before, and only the exit status is at issue here.


despite its incompatibility with previous behavior. (One thing I'd add
is that the FreeBSD behavior is inherently less race-prone.)


Whether the implementation is race-prone or not is an internal thing.
I think we're currently discussing more on a user-perspective level.

IIUC then the question is whether `cp -n` should continue to behave like
the (new) `cp --update=none` which returns EXIT_SUCCESS.

Regardless what other implementations do, when reading the -n description
from a user's point of view:

-n, --no-clobber do not overwrite an existing file (overrides a
   -u or previous -i option). See also --update

then I'd expect the tool to just skip existing files like `rsync 
--ignore-existing`
does.  In that regard I would be surprised if skipping files would result in an 
error.
Well, I would understand if there'd be a '--no-clobber=fail' option.


Agreed we should improve the docs a bit for this option.
I'll apply this at least:

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 1f8b356d1..bf0f424d3 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9057,6 +9057,8 @@ Do not overwrite an existing file; silently fail instead.
 This option overrides a previous
 @option{-i} option.  This option is mutually exclusive with @option{-b} or
 @option{--backup} option.
+See also the @option{--update=none} option which will
+skip existing files but not fail.

 @item -P
 @itemx --no-dereference
diff --git a/src/cp.c b/src/cp.c
index 04a5cbee3..3ccc4c4e6 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -192,8 +192,8 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
   -L, --dereferencealways follow symbolic links in SOURCE\n\
 "), stdout);
   fputs (_("\
-  -n, --no-clobber do not overwrite an existing file (overrides 
a\n\
- -u or previous -i option). See also 
--update\n\
+  -n, --no-clobber ensure no existing files overwritten, and 
fail\n\
+ silently instead. See also --update\n\
 "), stdout);
   fputs (_("\
   -P, --no-dereference never follow symbolic links in SOURCE\n\



As Kamil added the option in 2009, I'd assume that the same patch was already
active in RHEL versions for quite some longer time.
Now changing the exit code feels kind of rough.


Well RHEL 6 came out a bit after (2010), and had the --no-clobber change,
while RHEL 5 before that did not.

Taking about distros, it's worth noting that the change is Fedora 39
which has been released for a month now.
We'll keep a close eye on issues, but haven't heard much as
of yet at least.


Therefore, from a pure user's perspective and regarding many years of 
precedence,
I am 80:20 for reverting the exit code change.


Thanks for your thoughts,
appreciated as always.

cheers,
Pádraig

bug#62572: cp --no-clobber behavior has changed

2023-12-15 Thread Pádraig Brady


On 15/12/2023 15:56, Michael Stone wrote:

I tend to think this was a serious mistake: it breaks the behavior of
existing scripts with no deprecation period. A stated advantage is
better compatibility with freebsd, but I don't understand why that is
more desirable than compatibility with all deployed gnu/linux systems? I
also don't think it's sufficient to try to lawyer out by saying that the
current behavior was undocumented: the previous documentation said that
-n would "silently do nothing" and that the return code would be zero on
success. Logically, unless cp fails to "do nothing", it should exit with
a zero code.

Such a drastic change in behavior demands a new flag, not a radical
repurposing of a widely used existing flag.

I was hoping to see more action on this bug, but that hasn't happened.
I'm not sure I see a way forward for debian other than reverting to the
old behavior. I am reluctant to do so as that will likely lead to
divergent behavior between distributions, but breaking scripts without a
compelling reason is also not good. I would encourage coreutils to
reconsider the change and finding a non-breaking way forward.


Yes it's a fair point.
It's an awkward case, and worth discussing.

To summarise:

  coreutils >= 7.1 had -n skip existing in dest (2009)
  coreutils >= 9.2 has -n immediately fail if existing in dest
  coreutils >= 9.3 has --update=none to skip existing in dest

  FreeBSD >= 4.7/macos has -n immediately fail if existing in dest

  bash has noclobber as a file protection mechanism,
  and fails immediately upon trying to overwrite a file.
  This is more consistent with the new coreutils behavior.

I see a reasonable amount of cp -n usage across github:
https://github.com/search?q=/cp+.*+-n+.*/+path:*.sh&type=code

Now it's not clear which behavior these github usages expect,
and the original docs didn't make it clear which behavior to expect.
A quick scan of the github usages also seem mainly to expect
a protection rather than an update use case, so failing
immediately would be the most appropriate action there too.
Also the original coreutils bug report here expected the new behaviour.

So we probably all agree that failing immediately is the
most appropriate / consistent -n behavior,
but GNU had diverged from that so there are about 10 years
of scripts that may expect the silent skip behavior.

Two options I see are:

- Leave as is and fix -n usages that expected the skip behavior
- Deprecate -n entirely and prompt to use --update={fail,none}

Advantages of leaving as is:
We get consistency of "noclobber" behavior across systems / shells.
We fix cases where previously scripts could have proceeded with
stale old files in place.

Disadvantages of leaving as is:
Users expecting the skip behavior, have to change to --update=none.

There is no potential for data loss etc. so it just comes
down to how disruptive it is, or how often -n was used
with the "skip behavior" assumption.

We've not had much push back as of yet,
and my current thinking is it's not that disruptive a change.
So I'd be 55:45 if favor of keeping things as is.

thanks,
Pádraig.

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 3376 matches

Mail list logo