bug#71242: PR: src/stat.c - add magic for bcachefs

2024-05-28 Thread Pádraig Brady

On 28/05/2024 12:08, Paul Hedderly via GNU coreutils Bug Reports wrote:

Currently stat does not know bcachefs

$:~/gits/coreutils$ sudo stat -f -c %T /
UNKNOWN (0xca451a4e)

But it just needs to know that magic

$:~/gits/coreutils$ sudo ./src/stat -f -c %T /
bcachefs


Please pull this small commit:

https://github.com/coreutils/coreutils/commit/6da16c99ff0c6ec35943dd2db5f51c25efb8b508


I pushed the following before receiving this bug report:
https://github.com/coreutils/coreutils/commit/3ce31e6f1

Marking this as done.

thanks,
Pádraig





bug#71191: od does unnecessary reads instead of seeking

2024-05-25 Thread Pádraig Brady

On 25/05/2024 08:05, Joseph C. Sible wrote:

Consider running the following command, for looking at the pagemap
bits of a given memory page:

od -t x8 -N 8 -j 34359738368 /proc/PID/pagemap

That file supports seeking, but od will unnecessarily read and discard
32GB worth of data instead of doing so.

Looking at the skip function in od.c, it looks like this happens
because the file has a bogus st_size of 0 (as is typical for files in
/proc) despite usable_st_size returning true for it, which results in
the calls to fseeko never even being tried.


This is related to https://bugs.gnu.org/36291
Note the workaround with dd mentioned there ¹.
In that report I detailed four cases of /proc /sys and /dev files
that od needs to handle, and suggested we try the lseek() for
the non regular file case (where we don't worry about seeking past EOF).

As an aside we can actually _seek_ in /proc and /sys files
¹ dd used to report an error when seeking in such files,
when in fact seeking is actually supported. The warning was removed
(for all empty files) in https://github.com/coreutils/coreutils/commit/ac6b8d822
(as part of https://bugs.gnu.org/70231).
But this case is more complicated by the fact that od supports combining files,
so you need to avoid seeking beyond EOF.
I.e. `od --skip-bytes 1 empty nonempty` would give different results
if we blindly tried an lseek() if st_stize==0

I wonder could you efficiently seek to offset, avoiding seeking past EOF,
while ignoring st_size with something like:
  if (lseek(N-1))
if read(1)
  skip -= N
else
  lseek(-(N-1))
I'm not sure how general that would be.

cheers,
Pádraig





bug#71186: email address for patch

2024-05-25 Thread Pádraig Brady

tab 71186 notabug
close 71186
stop

On 24/05/2024 19:15, Данила Скачедубов wrote:

Hi! I have a few questions. Please tell me, I want to compile the main
files so that they are output by the man command. As far as I
understand, this is done by the help2man-1.48.5 command. But at the
same time, I can't compile correctly, let's say chown.x > chown.1, What
could be the problem? And the second question: I carefully studied the
documentation on the rules for preparing patches and it said that I
could get the address by running --help, but I did not see the email
address. Maybe it's him- [1]coreut...@gnu.org ? Thank you very much in
advance for the answers!

--
Danila Skachedubov

References

1. https://e.mail.ru/compose?To=coreut...@gnu.org


Proposed bug fixes go to bug-gnu...@gnu.org
General discussions go to coreut...@gnu.org

Usage for help2man is:

  help2man chown -i chown.x > chown.1

cheers,
Pádraig





bug#71029: single-binary build broken on systems with separate libintl

2024-05-21 Thread Pádraig Brady

On 18/05/2024 18:09, Audrey Dutcher wrote:

I presume you have the same issue with coreutils 9.4 ?


Correct.


I presume the problematic flags are propagated through LIBINTL or MBRTOWC_LIB. 
What are those set to for reference in your Makefile?


On plain FreeBSD it is set to /usr/local/lib/libintl.so -Wl,-rpath
-Wl,/usr/local/lib. Within a nix jail on FreeBSD (which is where I
encountered the issue first) it is simply -lintl.

Thanks for getting back to me!
- Audrey


Could you test with the attached,
which should reenable automake protections in this area.

cheers,
PádraigFrom 5691ff399e824c79090f2c7f3d08218fde4f9560 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Tue, 21 May 2024 13:08:45 +0100
Subject: [PATCH] build: fix build failure in --enable-single-binary mode

* src/local.mk: Avoid overriding automake generated DEPENDENCIES,
so that it applies its adjustments to LDADD to avoid propagating
flags (like -Wl,-rpath) into make targets.  This was seen on FreeBSD
where LIBINTL is set to:
/usr/local/lib/libintl.so -Wl,-rpath -Wl,/usr/local/lib
Instead let automake generate a sanitized src_coreutils_DEPENDENCIES
(based on LDADD), which we then augment with the EXTRA_... variable.
---
 src/local.mk | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/local.mk b/src/local.mk
index 3356f8a2a..8133925ac 100644
--- a/src/local.mk
+++ b/src/local.mk
@@ -482,13 +482,13 @@ if SINGLE_BINARY
 src_coreutils_CFLAGS = -DSINGLE_BINARY $(AM_CFLAGS)
 #src_coreutils_LDFLAGS = $(AM_LDFLAGS)
 src_coreutils_LDADD = $(single_binary_deps) $(LDADD) $(single_binary_libs)
-src_coreutils_DEPENDENCIES = $(LDADD) $(single_binary_deps)
+EXTRA_src_coreutils_DEPENDENCIES = $(single_binary_deps)
 
 include $(top_srcdir)/src/single-binary.mk
 
 # Creates symlinks or shebangs to the installed programs when building
 # coreutils single binary.
-EXTRA_src_coreutils_DEPENDENCIES = src/coreutils_$(single_binary_install_type)
+EXTRA_src_coreutils_DEPENDENCIES += src/coreutils_$(single_binary_install_type)
 endif SINGLE_BINARY
 
 CLEANFILES += src/coreutils_symlinks
-- 
2.44.0



bug#71029: single-binary build broken on systems with separate libintl

2024-05-18 Thread Pádraig Brady

On 17/05/2024 17:11, Audrey Dutcher wrote:

On my FreeBSD system, downloading coreutils-9.5.tar.xz, then building
with `./configure --enable-single-binary && make` does not succeed,
with the error message `don't know how to make -Wl,-rpath. Stop`

I believe the root cause of this is
https://github.com/coreutils/coreutils/blob/52e024b/src/local.mk#L485,
which mixes DEPENDENCIES and LDADD. This seems bad, for very relevant
reasons!


I presume you have the same issue with coreutils 9.4 ?

I presume the problematic flags are propagated through LIBINTL or MBRTOWC_LIB.
What are those set to for reference in your Makefile?

I'll have a look at cleaning this up.

thanks,
Pádraig





bug#70946: [PATCH] doc: improve the man page for sleep

2024-05-15 Thread Pádraig Brady

On 15/05/2024 03:46, Nikolaos Chatzikonstantinou wrote:

On Tue, May 14, 2024 at 4:03 PM Nikolaos Chatzikonstantinou
 wrote:


On Tue, May 14, 2024, 3:59 PM Pádraig Brady  wrote:


On 14/05/2024 17:36, Nikolaos Chatzikonstantinou wrote:

See attachment.


Well just above your new mention of floating-point, we have:
"NUMBER need not be an integer".
How about I adjust your patch to adjust that text to say:
"NUMBER can be an integer or floating-point".

Sounds good! Either I missed it or my system had an older man page.


If I may suggest, instead write it like this:

 Pause for NUMBER seconds, where NUMBER is an integer or floating-point.

Then the sentence "NUMBER need not be an integer" can be removed. Of
course the info manual shows more details, but I think that note in
the man page will do. Let me know if you'd rather that I write and
submit a patch myself instead.


OK I pushed a change in your name, changing from:

Pause for NUMBER seconds.  SUFFIX may be 's' for seconds (the default),'m' for 
minutes, 'h' for hours or 'd' for days.  NUMBER need not be an
integer.  Given two or more arguments, pause for the amount of time
specified by the sum of their values.

to:

Pause for NUMBER seconds, where NUMBER is an integer or floating-point.
SUFFIX may be 's','m','h', or 'd', for seconds, minutes, hours, days.
With multiple arguments, pause for the sum of their values.

Marking this as done.

thanks,
Pádraig





bug#70946: [PATCH] doc: improve the man page for sleep

2024-05-14 Thread Pádraig Brady

On 14/05/2024 17:36, Nikolaos Chatzikonstantinou wrote:

See attachment.


Well just above your new mention of floating-point, we have:
"NUMBER need not be an integer".
How about I adjust your patch to adjust that text to say:
"NUMBER can be an integer or floating-point".

cheers,
Pádraig.





bug#70887: In coreutils 9.5, "cp" command does not honor "--interactive" option

2024-05-12 Thread Pádraig Brady

On 12/05/2024 16:06, Paul Eggert wrote:

On 2024-05-12 04:49, Pádraig Brady wrote:


@@ -1151,7 +1151,8 @@ main (int argc, char **argv)
  {
/* Default cp operation.  */
x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
  }
else if (update_opt == UPDATE_NONE)
  {
@@ -1166,7 +1167,8 @@ main (int argc, char **argv)
else if (update_opt == UPDATE_OLDER)
  {
x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;


Thanks for looking into this messy area. Here is a comment from another
pair of eyes.

Could you elaborate a bit more about why these two bits of code change
x.interactive at all? That is, why doesn't update_opt simply affect
x.update? Why does update_opt bother to override a previous setting of
x.interactive to I_ALWAYS_YES, I_ALWAYS_NO, or I_ALWAYS_SKIP?

Another way to put it: shouldn't x.update simply reflect the value of
the --update option, whereas x.interactive reflects reflects whether -f,
-i, -n are used? Although this would require changes to copy.c, it'd
make the code easier to follow.


I agree that some refactoring would be good here.
At least x.update should be renamed to x.update_older.

As interactive selection, and file dates all relate
to selecting which files to update, it's tempting to conflate the settings.
However you're right that this introduces complexities when
trying to avoid all inconsistencies. Currently for example:
  $ cp -v -i --update=none new old  # Won't prompt as expected
  $ cp -v --update=none -i new old  # Unexpectedly ignores update option

So yes we should separate these things.


Another way to put it: why should, for example, --update=all override a
previous -f or (partly) -n but not a previous -i?


Right -f is significant for mv here (for completeness -f for cp is a separate 
thing).
I.e. we need to treat I_ALWAYS_YES specially in mv with the current scheme.

BTW -n is not overridden by any --update option currently,
and this change effectively applies the same logic to -i now.

thanks,
Pádraig





bug#70887: In coreutils 9.5, "cp" command does not honor "--interactive" option

2024-05-12 Thread Pádraig Brady

On 12/05/2024 00:03, Robert Hill wrote:

After upgrading coreutils from 9.0 to 9.5, the following change occurred:

In coreutils 9.0, the command "cp -Tipruvx /src-dir /dst-dir" requested
interactive confirmation before replacing an old destination file with a
newer source file, as expected.

In coreutils 9.5, the command "cp -Tipruvx /src-dir /dst-dir" no longer
requests interactive confirmation, but just goes ahead and replaces old
destination files with newer source files, which is not expected.

Thank you in advance for looking at this, Bob.


Right.

The thinking was for 9.3 that the new long form --update={older,all} options
would override a previous -i, especially as -i was commonly set in root
users' cp and mv aliases on Red Hat flavored distros.
Then in 9.5 we expanded this so -u behaved the same as --update=older.

In retrospect, users can avoid these aliases in various ways,
and the protective -i option should really combine with -u
rather than being overridden by it.

For completeness, -i following -u would always reinstate the protection.

The attached changes the behavior back to that -i is never overridden.

thanks,
PádraigFrom 2ec640a3d2719ac0204d4976baabe6082b85e502 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 12 May 2024 12:21:19 +0100
Subject: [PATCH] cp,mv: ensure -i is not overridden by -u

Have -i combine with -u instead.
In coreutils 9.3 we had --update={all,older} override -i
In coreutils 9.5 this was expanded to -u
(to make it consistent with --update=older).

* NEWS: Mention the bug fix.
* src/cp.c (main): Don't have -u disable prompting.
* src/mv.c (main): Likewise.
* tests/cp/cp-i.sh: Add a test case.
* tests/mv/update.sh: Likewise.
Fixes https://bugs.gnu.org/70887
---
 NEWS   |  3 +++
 src/cp.c   |  6 --
 src/mv.c   |  6 --
 tests/cp/cp-i.sh   | 13 +
 tests/mv/update.sh |  3 +++
 5 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index febb9ac68..430c2ec54 100644
--- a/NEWS
+++ b/NEWS
@@ -8,6 +8,9 @@ GNU coreutils NEWS-*- outline -*-
   rejected as an invalid option.
   [bug introduced in coreutils-9.5]
 
+  cp,mv --update no longer overrides --interactive.
+  [bug introduced in coreutils-9.3]
+
   ls and printf fix shell quoted output in the edge case of escaped
   first and last characters, and single quotes in the string.
   [bug introduced in coreutils-8.26]
diff --git a/src/cp.c b/src/cp.c
index 06dbad155..cae4563cb 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -1151,7 +1151,8 @@ main (int argc, char **argv)
 {
   /* Default cp operation.  */
   x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
   else if (update_opt == UPDATE_NONE)
 {
@@ -1166,7 +1167,8 @@ main (int argc, char **argv)
   else if (update_opt == UPDATE_OLDER)
 {
   x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
 }
   break;
diff --git a/src/mv.c b/src/mv.c
index 692943a70..0162cd728 100644
--- a/src/mv.c
+++ b/src/mv.c
@@ -394,7 +394,8 @@ main (int argc, char **argv)
 {
   /* Default mv operation.  */
   x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
   else if (update_opt == UPDATE_NONE)
 {
@@ -409,7 +410,8 @@ main (int argc, char **argv)
   else if (update_opt == UPDATE_OLDER)
 {
   x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
 }
   break;
diff --git a/tests/cp/cp-i.sh b/tests/cp/cp-i.sh
index f99f743dc..28969f9c8 100755
--- a/tests/cp/cp-i.sh
+++ b/tests/cp/cp-i.sh
@@ -70,4 +70,17 @@ returns_ 1 cp -bn c d 2>/dev/null || fail=1
 returns_ 1 cp -b --update=none c d 2>/dev/null || fail=1
 returns_ 1 cp -b --update=none-fail c d 2>/dev/null || fail=1
 
+# Verify -i combines with -u,
+echo old > old || framework_failure_
+touch -d yesterday old || framework_failure_
+echo new > new || framework_failure_
+# coreutils 9.3 had --update={all,older} ignore -i
+echo n | returns_ 1 cp -vi --update=older new old 2>/dev/null >out8 || fail=1
+compare /dev/null out8 || fail=1
+echo n | returns_ 1 cp -vi --update=all new old 2>/dev/null >out8 || fail=1
+compare /dev/null out8 || fail=1
+# coreutils 9.5 also had -u ignore -i
+echo n | returns_ 1 cp -vi 

bug#70873: join --return-error-if-any-unmatched-lines

2024-05-11 Thread Pádraig Brady

On 11/05/2024 10:14, Dan Jacobson wrote:

join should have an option to return an error value in the shell's $?
if any lines are not matched.

Currently the man page doesn't even mention a return value. So it is not
set in stone yet.

Currently one must save -v output in a file then use test -s do detect
if there were any non-matched lines. And then exit one script with
non-zero. Big hassle.

join (GNU coreutils) 9.4


This does seem to have some merit.
Perhaps --check-pairable similar to the existing --check-order option.

cheers,
Pádraig





bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-05-09 Thread Pádraig Brady

On 13/04/2024 18:42, Pádraig Brady wrote:

On 13/04/2024 17:39, Pádraig Brady wrote:

install(1) defaults to mode 600 for new files, and uses set_acl() with that
(since 2007 https://github.com/coreutils/coreutils/commit/f634e8844 )
The psuedo code that install(1) uses is:

copy_reg()
 if (x->set_mode) /* install */
   set_acl(dest, x->mode /* 600 */)
 ctx->acl = acl_from_mode ( /* 600 */)
 acl_set_fd (ctx->acl) /* fails EACCES */
 if (! acls_set)
must_chmod = true;
 if (must_chmod)
   saved_errno = EACCES;
   chmod (ctx->mode /* 600 */)
   if (save_errno)
 return -1;

This issue only only seems to be on CIFS.
I'm seeing lot of weird behavior with ACLs there:

 acl_set_fd (acl_from_mode (600)) -> EACCES
 acl_set_fd (acl_from_mode (755)) -> EINVAL
 getxattr ("system.posix_acl_access") -> EOPNOTSUPP

Note we ignore EINVAL and EOPNOTSUPP errors in set_acl(),
and it's just the EACCES that's problematic.
Note this is quite similar to https://debbugs.gnu.org/65599
where Paul also noticed EACCES with fsetxattr() (and others) on CIFS.

The attached is a potential solution which I tested as working
on the same matoro system that Bruno used.

I think I'll apply that after thinking a bit more about it.


Actually that probably isn't appropriate,
as fsetxattr() can validly return EACESS
even though not documented in the man page at least.
The following demonstrates that:
.
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main(void)
{
  int wfd;
  /* Note S_IWUSR is not set.  */
  if ((wfd=open("writable", O_CREAT|O_WRONLY|O_EXCL, S_IRUSR)) == -1)
  fprintf(stderr, "open('writable') error [%m]\n");
  if (write(wfd, "data", 1) == -1)
  fprintf(stderr, "write() error [%m]\n");
  if (fsetxattr(wfd, "user.test", "test", 4, 0) == -1)
  fprintf(stderr, "fsetxattr() error [%m]\n");
}

Another solution might be to file_has_acl() in copy.c
and skip if "not supported" or nothing is returned.
The following would do that, but I'm not sure about this
(as I'm not sure about ACLs in general TBH).
Note listxattr() returns 0 on CIFS here,
while getxattr ("system.posix_acl_access") returns EOPNOTSUPP,
and file_has_acl() uses listxattr() first.

diff --git a/src/copy.c b/src/copy.c
index d584a27eb..2145d89d5 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1673,8 +1673,13 @@ set_dest_mode:
   }
 else if (x->set_mode)
   {
-  if (set_acl (dst_name, dest_desc, x->mode) != 0)
-return_val = false;
+  errno = 0;
+  int n = file_has_acl (dst_name, );
+  if (0 < n || (errno && ! (is_ENOTSUP (errno) || errno == ENOSYS)))
+{
+  if (set_acl (dst_name, dest_desc, x->mode) != 0)
+return_val = false;
+}
   }


BTW I'm surprised this wasn't reported previously for CIFS,
so I due to this bug and https://debbugs.gnu.org/65599
I suspect a recentish change in CIFS wrt EACCES.


Thinking more about this, the solution presented above wouldn't work,
and I think this needs to be addressed in CIFS.
I.e. we may still need to reset the mode even if the file has no ACLs,
as generally only dirs get default ACLs copied, as demonstrated below:

$ mkdir acl$ setfacl -d -m o::rw acl
$ getfacl acl
# file: acl
# owner: padraig
# group: padraig
user::rwx
group::r-x
other::r-x
default:user::rwx
default:group::r-x
default:other::rw-

$ touch acl/file
$ ls -l acl/file
-rw-r--rw-. 1 padraig padraig 0 May  9 13:11 acl/file
$ getfacl acl/file
# file: acl/file
# owner: padraig
# group: padraig
user::rw-
group::r--
other::rw-

cheers,
Pádraig.





bug#70814: pwd(1) man page: Unclear "avoid" wording

2024-05-07 Thread Pádraig Brady

On 07/05/2024 05:19, Bruce Jerrick wrote:

This wording in the pwd(1) man page is unclear:

     -P, --physical
     avoid all symlinks

"resolve all symlinks" would be much better wording.


Pushed that in your name.

Marking this as done.

thanks,
Pádraig





bug#70801: Weird behaviour when standard input is closed

2024-05-06 Thread Pádraig Brady

On 06/05/2024 08:38, Bernard Burette wrote:

Hi,



If I try:

$ cat <&-

cat: -: Bad file descriptor

cat: closing standard input: Bad file descriptor

$



The error on stdin beign closed is displayed twice plus "-" is for a FILE 
argument to replace standard input, It would make more sense to me to have someting like:

$ cat <&-

cat: standard input: Bad file descriptor

$



This is the "git diff" of my proposed change:

diff --git a/src/cat.c b/src/cat.c

index b33faeb35..4be189d85 100644

--- a/src/cat.c

+++ b/src/cat.c

@@ -50,6 +50,14 @@

/* Name of input file.  May be "-".  */

static char const *infile;



+/* Pretty name of input file */

+static char const *

+quotef_infile (void)

+{

+  if (STREQ (infile, "-")) return _("standard input");

+  return quotef (infile);

+}

+

/* Descriptor on which input file is open.  */

static int input_desc;



@@ -164,7 +172,7 @@ simple_cat (char *buf, idx_t bufsize)

    size_t n_read = safe_read (input_desc, buf, bufsize);

    if (n_read == SAFE_READ_ERROR)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    return false;

  }



@@ -313,7 +321,7 @@ cat (char *inbuf, idx_t insize, char *outbuf, idx_t outsize,

    size_t n_read = safe_read (input_desc, inbuf, insize);

    if (n_read == SAFE_READ_ERROR)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    write_pending (outbuf, );

    newlines2 = newlines;

    return false;

@@ -526,7 +534,7 @@ copy_cat (void)

  || errno == EBADF || errno == EXDEV || errno == ETXTBSY

  || errno == EPERM)

    return 0;

-    error (0, errno, "%s", quotef (infile));

+    error (0, errno, "%s", quotef_infile());

  return -1;

    }

}

@@ -684,7 +692,7 @@ main (int argc, char **argv)

    input_desc = open (infile, file_open_mode);

    if (input_desc < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

    continue;

  }

@@ -692,7 +700,7 @@ main (int argc, char **argv)



    if (fstat (input_desc, _buf) < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

    goto contin;

  }

@@ -719,7 +727,7 @@ main (int argc, char **argv)

  }

    if (exhausting)

  {

-  error (0, 0, _("%s: input file is output file"), quotef 
(infile));

+  error (0, 0, _("%s: input file is output file"), 
quotef_infile());

    ok = false;

    goto contin;

  }

@@ -794,7 +802,7 @@ main (int argc, char **argv)

  contin:

    if (!reading_stdin && close (input_desc) < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

  }

  }

@@ -807,7 +815,8 @@ main (int argc, char **argv)

  }



    if (have_read_stdin && close (STDIN_FILENO) < 0)

-    error (EXIT_FAILURE, errno, _("closing standard input"));

+    if (ok)

+  error (EXIT_FAILURE, errno, _("closing standard input"));



    return ok ? EXIT_SUCCESS : EXIT_FAILURE;

}



There are a lot more of coreutil utilities that (could) behave weird on a 
closed standard input, here is my first list:

b2sum base32 base64 cat cksum comm csplit cut expand factor fmt fold head join 
md5sum nl nohup numfmt od paste pr ptx sha1sum sha224sum sha256sum sha384sum 
sha512sum shred shuf sort split stdbuf stty sum tac tail tee tr tsort unexpand 
uniq wc





I would like to work on those as well but it would help to have:

1) some kind of consensus on "this is a better way of displaying the error", 
for example other tools also go wild like:

  grep: (standard input): Bad file descriptor

  - why parentheses?

  sed: read error on stdin: Bad file descriptor

  - stdin is the C name for standard input, documented?

2) a better way of offering changes than sending a diff patch in a e-mail.


Yes it would be good to fix up at least the duplicated error.
We did the same on the write side of things with:
https://github.com/coreutils/coreutils/commit/0b2ff7637
Also we generally checked that readers diagnosed errors in:
https://github.com/coreutils/coreutils/blob/master/tests/misc/read-errors.sh

So something similar could probably be done here.

cheers,
Pádraig





bug#70727: cp doesn't recognize the new --update=none-fail option

2024-05-03 Thread Pádraig Brady

On 03/05/2024 05:12, Attila Fidan via GNU coreutils Bug Reports wrote:

Hi,

I wanted to use the new cp --update=none-fail option introduced in 9.5,
but it said "invalid argument ‘none-fail’ for ‘--update’". It turns out
that the commit (49912bac286eb3c0ef7d1567ae790193ad5eb2e8) adding it
forgot to add the new operation to update_type[] and
update_type_string[] in cp.c like it did for mv.c. After patching
coreutils locally the functionality works as expected.

It seems like the test suite didn't catch this because there's no
cp/update.sh test like there is for mv. There's a test for if using
--backup and --update=none-fail are mutually exclusive by checking if cp
returns 1, but an invalid argument also makes cp return 1 :)

I didn't include a patch in case a change to the test suite is wanted,
but the proposed change is tiny and rather obvious.


Well that's embarrassing.
I implemented in cp first, tested that manually,
then must have messed up that hunk when rebasing.

The attached should fix this.

Marking this as done.

thanks,
Pádraig.From de49e993ea8b6dcdf6cada3c0f44a6371514f952 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Fri, 3 May 2024 10:18:50 +0100
Subject: [PATCH] cp: actually support --update=none-fail

* src/cp.c: Add the entries for the --update=none-fail option.
* tests/mv/update.sh: Add a test case.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/70727
---
 NEWS   |  4 
 src/cp.c   |  4 ++--
 tests/mv/update.sh | 11 +++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 389f72516..7e8ccb34f 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,10 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Bug fixes
 
+  cp fixes support for --update=none-fail, which would have been
+  rejected as an invalid option.
+  [bug introduced in coreutils-9.5]
+
   ls and printf fix shell quoted output in the edge case of escaped
   first and last characters, and single quotes in the string.
   [bug introduced in coreutils-8.26]
diff --git a/src/cp.c b/src/cp.c
index 28b0217db..06dbad155 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -104,11 +104,11 @@ ARGMATCH_VERIFY (reflink_type_string, reflink_type);
 
 static char const *const update_type_string[] =
 {
-  "all", "none", "older", nullptr
+  "all", "none", "none-fail", "older", nullptr
 };
 static enum Update_type const update_type[] =
 {
-  UPDATE_ALL, UPDATE_NONE, UPDATE_OLDER,
+  UPDATE_ALL, UPDATE_NONE, UPDATE_NONE_FAIL, UPDATE_OLDER,
 };
 ARGMATCH_VERIFY (update_type_string, update_type);
 
diff --git a/tests/mv/update.sh b/tests/mv/update.sh
index 164357803..39ff677b9 100755
--- a/tests/mv/update.sh
+++ b/tests/mv/update.sh
@@ -38,6 +38,17 @@ for interactive in '' -i; do
   done
 done
 
+# These should accept all options
+for update_option in '--update' '--update=older' '--update=all' \
+ '--update=none' '--update=none-fail'; do
+
+  touch file1 || framework_failure_
+  mv $update_option file1 file2 || fail=1
+  test -f file1 && fail=1
+  cp $update_option file2 file1 || fail=1
+  rm file1 file2 || framework_failure_
+done
+
 # These should perform the rename / copy
 for update_option in '--update' '--update=older' '--update=all' \
  '--update=none --update=all'; do
-- 
2.44.0



bug#70714: realpath no error for unreadable-symlink

2024-05-02 Thread Pádraig Brady

On 02/05/2024 07:16, Nineteendo INC wrote:

coreutils version: stable 9.5 (bottled)
OS version: macOS 13.6.6 (22G630)

`realpath` doesn’t behave correctly for unreadable symlinks:

wannes@Stefans-iMac ~ % ln -s . src
wannes@Stefans-iMac ~ % grealpath -e src/..
/Users
wannes@Stefans-iMac ~ % chmod -h 000 src
wannes@Stefans-iMac ~ % grealpath -e src/..
/Users/wannes

Expected behaviour:

wannes@Stefans-iMac ~ % grealpath -e src/..
grealpath: src/..: Permission denied


Right, looks like we'll have to cater for EACCES on darwin.
I'll have a look





bug#70699: Possible bug in /usr/bin/paste

2024-05-01 Thread Pádraig Brady

On 01/05/2024 15:28, Art Shelest via GNU coreutils Bug Reports wrote:

Good morning,

I am seeing an aberrant behavior from the /usr/bin/paste utility when working 
with Windows-style CR/LF text files.
The repro is for Mint Mate (Virginia).

If I change the line endings in the first file to Unix format (LF), it works as 
expected.
If I change the line endings to Max (CR), it breaks even worse.

$ hexdump -C letters.txt
  61 61 09 41 41 0d 0a 62  62 09 42 42 0d 0a|aa.AA..bb.BB..|

$ cat letters.txt
aa   AA
bb   BB
$ cat numbers.txt
1
2
$ paste letters.txt numbers.txt
aa   1A
bb   2B
$


Expected:
$ paste letters.txt numbers.txt
aa   AA   1
bb   BB   2

Thank you.



paste(1) is treating the CR like a standard character,
and when outputting that back to the terminal it "messes up" the expected 
output.
I suggest you convert any such files to unix format before processing.

thanks,
Pádraig





bug#70532: sort: Mention counting fields from the end

2024-04-23 Thread Pádraig Brady

On 23/04/2024 11:14, Dan Jacobson wrote:

In (info "(coreutils) sort invocation") be sure to add an example of a
way or workaround for counting fields from the end of the line. E.g., we
want to sort on the last field, but don't know for sure how many fields
a line might contain. E.g., sort by surname, when lines consist of First
[Middle...] Surname. perl -a uses $F[-1]. so maybe sort(1) could also
use a negative field number. Same for character number.


All good suggestions. I'll at least add an example along the lines of:

  awk '{print $NF, $0}' | sort -k1,1 | cut -f2- -d' '

cheers,
Pádraig.





bug#70418: ls bug

2024-04-17 Thread Pádraig Brady

On 16/04/2024 23:17, Paul Eggert wrote:

On 4/16/24 14:30, Toby Kelsey wrote:

The man page doesn't explain this format conflict, while the info page
(info '(coreutils) ls invocation' or 'info ls') claims '-f' implies '-1'
which is also incorrect: 'ls -1f' gives different output to to 'ls -f'.


Yes, this area of GNU 'ls' a mess. Option order should not matter here.

Option order didn't matter in 7th Edition Unix, where -f overrode -l
regardless of whether -f came before or after -l. And option order
doesn't matter in FreeBSD, where -f and -l are orthogonal. GNU ls is an
odd hybrid of 7th Edition and FreeBSD and messes this up.

Rather than document the hybrid mess, let's bite the bullet and fix it.
FreeBSD behavior makes more sense, so let's do that. Proposed patch
attached.


+1

Related to this is the recent adjustment of usage() for -f:
https://bugs.gnu.org/67765

Patch looks good, and conforms to POSIX.

thanks!
Pádraig





bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady

On 16/04/2024 15:47, Alejandro Colomar wrote:

Hi Pádraig,

On Tue, Apr 16, 2024 at 03:25:22PM +0100, Pádraig Brady wrote:

What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently


I don't know.  The reporter didn't tell.  I see you also asked on the
Github original report.


Note that check originally came from:
https://github.com/coreutils/coreutils/commit/dea4262fa

I suppose we could relax the check as follows, for files of apparent size 0
which would cater for this, and others that may also have unstable inodes.


H.  Since you couldn't reprodude it in a recent Darwin, maybe it's
just a bug in an old Darwin.  And since noone else seems to have met
this Darwin's bug, maybe we can just ignore it.  (And if it were a
regression in a more recent Darwin, hopefully they should fix their
kernel.)

I'm not happy relaxing a security check, without making sure that there
are no implications at all.

I vote for claiming only limited support to such a Darwin system.  I
already workarounded it in the Linux man-pages, by not piping to
install(1) in a common task; and nobody else seems to be affected.

Unless you feel confident that it's perfectly fine to do it.  But I have
no sympathy for workarounding Darwin bugs here.


I agree if it's older Darwin only, we can ignore.
The version I tested on is 3 years old now though,
so I'm not sure whether the issue is on newer or older.

Note we had similar issue on Solaris,
where we used an fstat() wrapper to adjust things:
https://bugs.gnu.org/35713

A related suggestion was from Marc Chantreux (CC'd)
to support '-' to imply stdin, which would be more portable.
There is some merit to that suggestion too.

cheers,
Pádraig





bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady

On 16/04/2024 12:33, Pádraig Brady wrote:

On 16/04/2024 01:19, Alejandro Colomar wrote:

Hi!

I don't own a Darwin system, so I can't help much reproduce.  However,
I've received a bug report to the Linux man-pages, that our build
system (GNUmakefile-based), which ends up calling

... | install /dev/stdin $@

doesn't work on Darwin.  Here's the original bug report:
<https://github.com/NixOS/nixpkgs/pull/300797>.

Here are the reported error messages:

...
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addseverity.3
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/adjtime.3
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addmntent.3]
 Error 1
make: *** Waiting for unfinished jobs
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/acosh.3]
 Error 1
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
...

I don't see why install(1) should fail to read /dev/stdin under any
POSIX system


What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently


Note that check originally came from:
https://github.com/coreutils/coreutils/commit/dea4262fa

I suppose we could relax the check as follows, for files of apparent size 0
which would cater for this, and others that may also have unstable inodes.

cheers,
Pádraig.

diff --git a/src/copy.c b/src/copy.c
index 2145d89d5..fb5f0a1a0 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1263,8 +1263,11 @@ copy_reg (char const *src_name, char const *dst_name,
 }

   /* Compare the source dev/ino from the open file to the incoming,
- saved ones obtained via a previous call to stat.  */
-  if (! psame_inode (src_sb, _open_sb))
+ saved ones obtained via a previous call to stat.  Restrict
+ the check to files with an apparent size, to support "files"
+ with unstable inodes, like /dev/stdin on macOS.  */
+  if (! psame_inode (src_sb, _open_sb)
+  && (src_sb->st_size || src_open_sb.st_size))
 {
   error (0, 0,
  _("skipping file %s, as it was replaced while being copied"),







bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady

On 16/04/2024 01:19, Alejandro Colomar wrote:

Hi!

I don't own a Darwin system, so I can't help much reproduce.  However,
I've received a bug report to the Linux man-pages, that our build
system (GNUmakefile-based), which ends up calling

... | install /dev/stdin $@

doesn't work on Darwin.  Here's the original bug report:
.

Here are the reported error messages:

...
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addseverity.3
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/adjtime.3
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addmntent.3]
 Error 1
make: *** Waiting for unfinished jobs
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/acosh.3]
 Error 1
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
...

I don't see why install(1) should fail to read /dev/stdin under any
POSIX system


What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently

cheers,
Pádraig





bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-15 Thread Pádraig Brady

On 15/04/2024 15:37, Andreas Grünbacher wrote:

Hello,

Am So., 14. Apr. 2024 um 00:43 Uhr schrieb Pádraig Brady :

On 13/04/2024 20:29, Bruno Haible wrote:

Hi Pádraig,

I wrote:

5) The same thing with 'cp -a' succeeds:

$ build-sparc64/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0
$ build-sparc64-no-acl/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0


You wrote:

The psuedo code that install(1) uses is:

copy_reg()
 if (x->set_mode) /* install */
   set_acl(dest, x->mode /* 600 */)
 ctx->acl = acl_from_mode ( /* 600 */)
 acl_set_fd (ctx->acl) /* fails EACCES */
 if (! acls_set)
must_chmod = true;
 if (must_chmod)
   saved_errno = EACCES;
   chmod (ctx->mode /* 600 */)
   if (save_errno)
 return -1;


And, for comparison, what is the pseudo-code that 'cp -a' uses?
I would guess that there must be a relevant difference between both.


The cp pseudo code is:

copy_reg()
if (preserve_xattr)
  copy_attr()
ret = attr_copy_fd()
if (ret == -1 && require_preserve_xattr /*false*/)
  return failure;
if (preserve_mode)
  copy_acl()
qcopy_acl()
  #if USE_XATTR /* true */
fchmod() /* chmod before setting ACLs as doing after may reset */
return attr_copy_fd() /* successful if no ACLs in source */
  #endif

If however you add ACLs in the source, you induce a similar failure:

$ setfacl -m u:nobody:r /var/tmp/foo3942
$ src/cp -a /var/tmp/foo3942 foo3942; echo $?
src/cp: preserving permissions for ‘foo3942’: Permission denied
1

The corresponding strace is:

fchmod(4, 0100640)  = 0
flistxattr(3, NULL, 0)  = 24
flistxattr(3, "system.posix_acl_access\0", 24) = 24
fgetxattr(3, "system.posix_acl_access", NULL, 0) = 44
fgetxattr(3, "system.posix_acl_access", "\2\0...\4", 44) = 44
fsetxattr(4, "system.posix_acl_access", "\2\0...\4", 44, 0) = -1 EACCES 
(Permission denied)


Why does CIFS think EACCES is an appropriate error to return here? The
fchmod() succeeds, so changing the file permissions via fsetxattr()
should really succeed as well.


Right, it seems like a CIFS bug (already CC'd)
Even if it returned ENOTSUP (like getxattr(...posix...) does) it would be ok as 
we handle that.
It would be good to avoid it though.

You confirmed privately to me that the set_acl() is to clear any default ACLs
that may have been added to the newly created file, so the posted solution in
https://bugs.gnu.org/70214#11 that only does the set_acl() iff file_has_acl()
should avoid the CIFS issue.  It would be a bit less efficient if there were
ACLs, but a bit more efficient in the common case where there weren't ACLs.

cheers,
Pádraig





bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady

On 13/04/2024 20:29, Bruno Haible wrote:

Hi Pádraig,

I wrote:

5) The same thing with 'cp -a' succeeds:

$ build-sparc64/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0
$ build-sparc64-no-acl/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0


You wrote:

The psuedo code that install(1) uses is:

copy_reg()
if (x->set_mode) /* install */
  set_acl(dest, x->mode /* 600 */)
ctx->acl = acl_from_mode ( /* 600 */)
acl_set_fd (ctx->acl) /* fails EACCES */
if (! acls_set)
   must_chmod = true;
if (must_chmod)
  saved_errno = EACCES;
  chmod (ctx->mode /* 600 */)
  if (save_errno)
return -1;


And, for comparison, what is the pseudo-code that 'cp -a' uses?
I would guess that there must be a relevant difference between both.


The cp pseudo code is:

copy_reg()
  if (preserve_xattr)
copy_attr()
  ret = attr_copy_fd()
  if (ret == -1 && require_preserve_xattr /*false*/)
return failure;
  if (preserve_mode)
copy_acl()
  qcopy_acl()
#if USE_XATTR /* true */
  fchmod() /* chmod before setting ACLs as doing after may reset */
  return attr_copy_fd() /* successful if no ACLs in source */
#endif

If however you add ACLs in the source, you induce a similar failure:

$ setfacl -m u:nobody:r /var/tmp/foo3942
$ src/cp -a /var/tmp/foo3942 foo3942; echo $?
src/cp: preserving permissions for ‘foo3942’: Permission denied
1

The corresponding strace is:

fchmod(4, 0100640)  = 0
flistxattr(3, NULL, 0)  = 24
flistxattr(3, "system.posix_acl_access\0", 24) = 24
fgetxattr(3, "system.posix_acl_access", NULL, 0) = 44
fgetxattr(3, "system.posix_acl_access", "\2\0...\4", 44) = 44
fsetxattr(4, "system.posix_acl_access", "\2\0...\4", 44, 0) = -1 EACCES 
(Permission denied)

BTW I was wondering about the need for install(1) to set_acl() at all,
rather than just using chmod.
The following comment in lib/set-permissions.c may be pertinent:

/* If we can't set an acl which we expect to be able to set, try setting
   the permissions to ctx->mode. Due to possible inherited permissions,
   we cannot simply chmod */

BTW this is all under kernel version:

$ uname -r
6.8.5-gentoo-sparc64

With these cifs options:

$ mount | grep cifs
//syslog.matoro.tk/guest-pixelbeat on /media/guest-homedirs/pixelbeat type cifs
(rw,nosuid,relatime,vers=1.0,cache=strict,username=nobody,uid=30017,forceuid,
gid=30017,forcegid,addr=fd05:::::::0001,
soft,unix,posixpaths,serverino,mapposix,acl,
rsize=1048576,wsize=65536,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)

cheers,
Pádraig





bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady

On 13/04/2024 17:39, Pádraig Brady wrote:

install(1) defaults to mode 600 for new files, and uses set_acl() with that
(since 2007 https://github.com/coreutils/coreutils/commit/f634e8844 )
The psuedo code that install(1) uses is:

copy_reg()
if (x->set_mode) /* install */
  set_acl(dest, x->mode /* 600 */)
ctx->acl = acl_from_mode ( /* 600 */)
acl_set_fd (ctx->acl) /* fails EACCES */
if (! acls_set)
   must_chmod = true;
if (must_chmod)
  saved_errno = EACCES;
  chmod (ctx->mode /* 600 */)
  if (save_errno)
return -1;

This issue only only seems to be on CIFS.
I'm seeing lot of weird behavior with ACLs there:

acl_set_fd (acl_from_mode (600)) -> EACCES
acl_set_fd (acl_from_mode (755)) -> EINVAL
getxattr ("system.posix_acl_access") -> EOPNOTSUPP

Note we ignore EINVAL and EOPNOTSUPP errors in set_acl(),
and it's just the EACCES that's problematic.
Note this is quite similar to https://debbugs.gnu.org/65599
where Paul also noticed EACCES with fsetxattr() (and others) on CIFS.

The attached is a potential solution which I tested as working
on the same matoro system that Bruno used.

I think I'll apply that after thinking a bit more about it.


Actually that probably isn't appropriate,
as fsetxattr() can validly return EACESS
even though not documented in the man page at least.
The following demonstrates that:
.
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main(void)
{
int wfd;
/* Note S_IWUSR is not set.  */
if ((wfd=open("writable", O_CREAT|O_WRONLY|O_EXCL, S_IRUSR)) == -1)
fprintf(stderr, "open('writable') error [%m]\n");
if (write(wfd, "data", 1) == -1)
fprintf(stderr, "write() error [%m]\n");
if (fsetxattr(wfd, "user.test", "test", 4, 0) == -1)
fprintf(stderr, "fsetxattr() error [%m]\n");
}

Another solution might be to file_has_acl() in copy.c
and skip if "not supported" or nothing is returned.
The following would do that, but I'm not sure about this
(as I'm not sure about ACLs in general TBH).
Note listxattr() returns 0 on CIFS here,
while getxattr ("system.posix_acl_access") returns EOPNOTSUPP,
and file_has_acl() uses listxattr() first.

diff --git a/src/copy.c b/src/copy.c
index d584a27eb..2145d89d5 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1673,8 +1673,13 @@ set_dest_mode:
 }
   else if (x->set_mode)
 {
-  if (set_acl (dst_name, dest_desc, x->mode) != 0)
-return_val = false;
+  errno = 0;
+  int n = file_has_acl (dst_name, );
+  if (0 < n || (errno && ! (is_ENOTSUP (errno) || errno == ENOSYS)))
+{
+  if (set_acl (dst_name, dest_desc, x->mode) != 0)
+return_val = false;
+}
 }


BTW I'm surprised this wasn't reported previously for CIFS,
so I due to this bug and https://debbugs.gnu.org/65599
I suspect a recentish change in CIFS wrt EACCES.

cheers,
Pádraig





bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady
rtions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index c72165e268..fd094d1091 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-04-13  Pádraig Brady  
+
+	acl-permissions: avoid erroneous failure on CIFS
+	* lib/set-permissions.c (set_acls): On Linux (and others), also discount
+	EACESS as a valid errno with FD operations, as CIFS was seen to
+	return that erroneously in some cases.
+
 2024-04-13  Bruno Haible  
 
 	gnulib-tool.py: Code tweak.
diff --git a/lib/set-permissions.c b/lib/set-permissions.c
index 83a355faa5..7f8e55f5cd 100644
--- a/lib/set-permissions.c
+++ b/lib/set-permissions.c
@@ -520,7 +520,13 @@ set_acls (struct permission_context *ctx, const char *name, int desc,
 ret = acl_set_file (name, ACL_TYPE_ACCESS, ctx->acl);
   if (ret != 0)
 {
-  if (! acl_errno_valid (errno))
+  if (! acl_errno_valid (errno)
+#   if defined __linux__
+  /* Special case EACCES for fd operations as CIFS
+ was seen to erroneously return that in some cases.  */
+  || (HAVE_ACL_SET_FD && desc != -1 && errno == EACCES)
+#   endif
+ )
 {
   ctx->acls_not_supported = true;
   if (from_mode || acl_access_nontrivial (ctx->acl) == 0)
-- 
2.44.0



bug#35247: [PATCH] Add alacritty to terminal list supporting colors

2024-04-10 Thread Pádraig Brady

Things have changed a little since the original request.
alacritty sets $COLORTERM, and dircolors now auto accepts that since:
https://github.com/coreutils/coreutils/commit/75c9fc674

There are some complications with remote shells, but
they should boil down to setting up ssh to send/accept COLORTERM.

For some background for setting colors based on env vars
from the Fedora side of things at least, see:
https://fedoraproject.org/wiki/Features/256_Color_Terminals
https://fedoraproject.org/wiki/Changes/Drop_256term.sh

cheers,
Pádraig.





bug#70298: uname -i returns unknown since fedora 38

2024-04-09 Thread Pádraig Brady

On 09/04/2024 10:17, Collin Funk wrote:

On 4/9/24 12:57 AM, Paul Eggert wrote:

Indeed there is, and I merged your bug report into that old one. It'd be nice 
if someone could get to the bottom of that bug.


I decided to look into this a bit, since I also have an unknown
'uname -i' and 'uname -p'.

It seems that this option is a Solaris thing. I found the commit that
it was introduced [1]. It also adds some other Solaris compatibility
stuff, so everything seems to add up so far.

The first function it tries is sysinfo (SI_PLATFORM, ...) which seems
to be a Solaris thing [2]. I think Linux has sysinfo but not
SI_PLATFORM.

Then it tries sysctl (...) with a UNAME_HARDWARE_PLATFORM macro to deal
with the BSDs. I think OpenBSD might be missing that definition in the
#ifdef, but I have no way of testing it at the moment [3].

I'm not sure if 'uname -i', 'uname -p', or 'uname -m' are supposed to
be any different from each other. Maybe someone who knows Solaris can
help more?

Illumios defines 'sysinfo' as an alias to 'systeminfo' [4]. There it
returns an 'extern char *platform' in the given buffer when passed
SI_PLATFORM [5]. I've found the definition of the variable, but I
don't know where it is actually set to something useful [6]. Hopefully
that information helps someone...

[1] 
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=aaeb7a61c4662fd28cf2bc161b740352711538d2
[2] 
https://github.com/illumos/illumos-gate/blob/cf618897f43ea305e3a426f93bbcef4e8106829c/usr/src/uts/common/sys/systeminfo.h#L86
[3] https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/uname.c#n36
[4] 
https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/common/sys/sysinfo.S
[5] 
https://github.com/illumos/illumos-gate/blob/cf618897f43ea305e3a426f93bbcef4e8106829c/usr/src/uts/common/syscall/systeminfo.c#L124
[6] 
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/conf/param.c#L533


Thanks for looking at this.
From the Fedora side, they dropped a Fedora specific patch for Fedora 38.

https://bugzilla.redhat.com/show_bug.cgi?id=548834
https://bugzilla.redhat.com/show_bug.cgi?id=2208048

cheers,
Pádraig.






bug#70231: Performance issue on sort with zero-sized pseudo files

2024-04-07 Thread Pádraig Brady

On 06/04/2024 23:22, Paul Eggert wrote:

On 2024-04-06 03:09, Pádraig Brady wrote:

I'll apply this.


Heh, I beat you to it by looking for similar errors elsewhere and
applying the attached patches to fix the issues I found. None of them
look like serious bugs.


Cool. I thought the sort(1) change worthy of a NEWS entry so pushed one.
Marking this as done.


BTW we should improve sort buffer handling in general


Oh yes.

PS. My current little task is to get i18n to work better with 'sort'.
Among other things I want Unicode-style full case folding.


Excellent, that will help keep the related uniq(1) and join(1)
commands more aligned in their ordering.

cheers,
Pádraig





bug#70219: Bug/Issue with timeout and signals

2024-04-06 Thread Pádraig Brady

tag 70219 notabug
close 70219
stop

On 06/04/2024 16:50, Branden R. Williams via GNU coreutils Bug Reports wrote:

   -k, --kill-after=DURATION
  also send a KILL signal if COMMAND is still running
this long after the initial signal was sent


If you read the above carefully, please note the words _also_ and _initial_.
I.e. -k will cause _another_ signal to be sent, iff the first doesn't terminate 
the command.

Hopefully this example with all pertinent options explains things.
I.e. the first signal sent after 1s is ignored,
and the second kill signal sent after another 2s forcefully kills the command.

  $ date +%s; timeout -k 2s -s0 1s sleep inf; date +%s
  1712427916
  Killed
  1712427919

cheers,
Pádraig





bug#70231: Performance issue on sort with zero-sized pseudo files

2024-04-06 Thread Pádraig Brady

On 06/04/2024 03:52, Takashi Kusumi wrote:

Hi,

I have found a performance issue with the sort command when used on
pseudo files with zero size. For instance, sorting `/proc/kallsyms`, as
demonstrated below, takes significantly longer than executing with
`cat`, generating numerous temporary files. I confirmed this issue on
v8.32 as well as on commit 8f3989d in the master branch.

$ time cat /proc/kallsyms | sort > /dev/null
real0m0.954s
user0m0.873s
sys 0m0.096s

$ time sort /proc/kallsyms > /dev/null
real0m8.555s
user0m3.367s
sys 0m5.064s

$ strace -e trace=openat sort /proc/kallsyms 2>&1 > /dev/null \
  | grep /tmp/sort | head -100
...
openat(AT_FDCWD, "/tmp/sortM6Y6Y1", ...
openat(AT_FDCWD, "/tmp/sortPrHKMG", ...

$ strace -e trace=openat -c sort /proc/kallsyms > /dev/null
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
100.006.419777  19333258 8 openat
-- --- --- - - 
100.006.419777  19333258 8 total

It appears that the buffer size allocated for pseudo files with zero
size is insufficient, likely because it is based on their file size,
which is zero. As seen in the attached patch, I think using
`INPUT_FILE_SIZE_GUESS` to calculate the buffer size when the file size
is zero would resolve this issue.


I'll apply this.

BTW we should improve sort buffer handling in general. From my TODO...

0. Have sort --debug output memory buffer sizes and space avail at $TMPDIR(s)
1. auto increase buffer when reading from pipe or zero sized files.
This will be more efficient and more importantly enable parallel operation.
See http://superuser.com/questions/938558/sort-parallel-isnt-parallelizing/
At least your more appropriate default buffer sizes in this case.
I.e. bigger mins and probably smaller maxs as half avail mem is too aggressive.
2. check() should not need full buffer size?
only merge buffer size or something small at least.
3. Look at minimizing the amount of mem used by default.
Hmm, sort auto adjusts down to avail mem in initbuf() (Test with ulimit -v)
4. Careful with too small buffers as that may initiate
an extra merge step (see section above).

If anyone wants to look at the above give me a heads up,
or I'll get to it sometime in the next release cycle.

thanks!
Pádraig.






bug#70126: Missing a full stop in a coreutils-9.5-pre2 message

2024-04-01 Thread Pádraig Brady

On 01/04/2024 16:30, Petr Pisar wrote:

Hello,

while translating coreutils-9.5-pre2 I noticed this message:

#: src/chown.c:123
msgid ""
"  --reference=RFILE  use RFILE's ownership rather than specifying values\n"
" RFILE is always dereferenced if a symbolic link.\n"

I believe that there is missing a full stop after "values" on the first line.


Fixed.
Marking this as done.

thanks!
Pádraig






bug#70104: "FAIL: test-canonicalize" coreutils v9.5 on musl libc

2024-03-31 Thread Pádraig Brady

On 31/03/2024 10:02, Adept's Lab wrote:

test-canonicalize.c:411: assertion 'strcmp (result1, "//") == 0' failed

^ the only error log message I get. Fail was not presented with previous
stable versions.


This is on musl 1.1.24 as detailed at:
https://github.com/coreutils/coreutils/issues/83

CC'ing bug-gnulib

cheers,
Pádraig





bug#70070: FAIL: test-localtime_r

2024-03-29 Thread Pádraig Brady

On 29/03/2024 12:40, Andreas Schwab wrote:

FAIL: test-localtime_r
==

test-localtime_r.c:58: assertion 'result->tm_hour == 18' failed
FAIL test-localtime_r (exit status: 134)

FAIL: test-localtime_r-mt
=

thread2 disturbed by thread1!
thread1 disturbed by thread2!
FAIL test-localtime_r-mt (exit status: 134)



CC'ing gnulib as those tests are from gnulib.
It may help to detail your platform.

thanks,
Pádraig.





bug#69995: Untranslatable string

2024-03-25 Thread Pádraig Brady

On 25/03/2024 14:02, Göran Uddeborg wrote:

While translating the new version of coreutils to Swedish, I came
across this code from the end of chown-core.h:

   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the %sgroup of each file only if\n\
  its current owner and/or group match those 
specified\n\
  here.  Either may be omitted, in which case a match\n\
  is not required for the omitted attribute\n\
"), user ? "owner and/or " : "");

As this stands, the part "owner and/or " will not be translated.

Simply marking that part too for translation is not a good solution, even
if it would happen to work for Swedish. See
https://www.gnu.org/savannah-checkouts/gnu/gettext/manual/gettext.html#Entire-sentences


Fixed with:
https://github.com/coreutils/coreutils/commit/26fd96a96

thanks,
Pádraig





bug#69985: Untranslatable message in src/chown-core.h:95

2024-03-24 Thread Pádraig Brady

On 24/03/2024 16:57, Frédéric Marchal wrote:

Hi,

In src/chown-core.h:95 (coreutils-9.5-pre1), the following message is
impossible to translate as it is created by concatenating string fragments:

static inline void
emit_from_option_description (bool user)
{
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the %sgroup of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"), user ? "owner and/or " : "");
}

A translatable message must *never ever* be produced by concatenating
substrings. Even if the substrings are themselves translated, there is no
guaranties that the order of the words is the same in every language.

There is, unfortunately, no easy solution that save keystrokes. It must be
written like this:

if (user)
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the owner and/or group of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"));
else
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the group of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"));

That make sense as both are distinct messages with different purposes.

I also suspect that the two messages should be even more different than they
currently are. The message, when user is false, suspiciously contains one more
"owner and/or" that is not removed and it says "Either may be omitted" when
only the group is mentioned at the beginning of the message.


The composed strings are accurate, but yes not translatable.
The attached simplifies the description a little,
so there is a single string to cater for both chown and chgrp.

Marking this as done.

thanks,
Pádraig
From baea9881f79690552b0aa3ca6844bbf64fa6e55e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 24 Mar 2024 20:12:53 +
Subject: [PATCH] doc: fix translation issue in chown/chgrp amalgamation

* src/chown-core.h (emit_from_option_description): The conditional
string composition here caused issues for translators.
Instead move to a more general description ...
(src/chown.c (usage): ... here.
Fixes https://bugs.gnu.org/69985
---
 src/chown-core.h | 12 
 src/chown.c  |  8 +++-
 2 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/src/chown-core.h b/src/chown-core.h
index e1396e3ea..4bd68fed4 100644
--- a/src/chown-core.h
+++ b/src/chown-core.h
@@ -89,16 +89,4 @@ chown_files (char **files, int bit_flags,
  struct Chown_option const *chopt)
   _GL_ATTRIBUTE_NONNULL ();
 
-static inline void
-emit_from_option_description (bool user)
-{
-  printf (_("\
-  --from=CURRENT_OWNER:CURRENT_GROUP\n\
- change the %sgroup of each file only if\n\
- its current owner and/or group match those specified\n\
- here.  Either may be omitted, in which case a match\n\
- is not required for the omitted attribute\n\
-"), user ? "owner and/or " : "");
-}
-
 #endif /* CHOWN_CORE_H */
diff --git a/src/chown.c b/src/chown.c
index 90ce84d67..3194dc75f 100644
--- a/src/chown.c
+++ b/src/chown.c
@@ -109,7 +109,13 @@ With --reference, change the group of each FILE to that of RFILE.\n\
  (useful only on systems that can change the\n\
  ownership of a symlink)\n\
 "), stdout);
-  emit_from_option_description (chown_mode == CHOWN_CHOWN);
+  fputs (_("\
+  --from=CURRENT_OWNER:CURRENT_GROUP\n\
+ change the ownership of each file only if its\n\
+ current owner and/or group match those specified here.\n\
+ Either may be omitted, in which case a match\n\
+ is not required for the omitted attribute\n\
+"), stdout);
   fputs (_("\
   --no-preserve-root  do not treat '/' specially (the default)\n\
   --preserve-rootfail to operate recursively on '/'\n\
-- 
2.44.0



bug#69979: [PATCH-v2][doc] ls -l doc and size -> maj, min for device files

2024-03-24 Thread Pádraig Brady

On 24/03/2024 12:40, Stephane Chazelas wrote:

Tags: patch

My bad, the patch was incorrect, it should have said
"replaced by the corresponding device major and minor numbers as two decimal
numbers separated by a comma and at least one space.", as
there's not always only once space between the major and minor.

See also
https://unix.stackexchange.com/questions/773014/where-do-i-find-documentation-for-the-output-of-ls-l
where the issue was reported and where the variations of output
formats between ls implementations is being mentioned.

New patch attached.


Pushed.
Marking this as done.

thanks,
Pádraig





bug#69951: coreutils: printf formatting bug for nb_NO and nn_NO locales

2024-03-23 Thread Pádraig Brady

tag 69951 notabug
close 69951
stop

On 22/03/2024 20:22, Thomas Dreibholz wrote:

Hi,

I just discovered a printf bug for at least the nb_NO and nn_NO locales
when printing numbers with thousands separator. To reproduce:

#!/bin/bash
for l in de_DE nb_NO ; do
     echo "LC_NUMERIC=$l.UTF-8"
     for n in 1 100 1000 1 10 100 1000 ; do
    LC_NUMERIC=$l.UTF-8 /usr/bin/printf "<%'10d>\n" $n
     done
done

The expected output of "%'10d" is a right-formatted number string with
10 characters.

The output of the test script is fine for e.g. LC_NUMERIC=de_DE.UTF-8
and LC_NUMERIC=en_US.UTF-8:

LC_NUMERIC=de_DE.UTF-8
< 1>
<   100>
< 1.000>
<    10.000>
<   100.000>
< 1.000.000>
<10.000.000>



However, for LC_NUMERIC=nb_NO.UTF-8 and LC_NUMERIC=nn_NO.UTF-8, the
formatting is wrong:

LC_NUMERIC=nb_NO.UTF-8
< 1>
<   100>
<   1 000>
<  10 000>
< 100 000>
<1 000 000>
<10 000 000>



I reproduced the issue with coreutils-8.32-4.1ubuntu1.1 (Ubuntu 22.04)
as well as coreutils-9.3-5.fc39.x86_64 (Fedora 39).

Under FreeBSD 14.0-RELEASE (coreutils-9.4_1), the output looks slightly
better but is still wrong:

LC_NUMERIC=nb_NO.UTF-8
< 1>
<   100>
<    1 000>
<   10 000>
<  100 000>
<1 000 000>
<10 000 000>
LC_NUMERIC=nn_NO.UTF-8
< 1>
<   100>
<    1 000>
<   10 000>
<  100 000>
<1 000 000>
<10 000 000>

May be the issue is that the thousands separator for the Norwegian
locales is a space " ", while it is "."/"," for German/US English locales.


The issue looks to be that the thousands separator for Norwegian locales
is “NARROW NO-BREAK SPACE", or more problematically the _three_ byte
UTF8 sequence E2 80 AF. So it looks like an issue with libc routines
counting bytes rather than characters in this case.

One suggestion is to do the alignment after. For example:

$ export LC_NUMERIC=nb_NO.UTF-8
$ printf "%'.f\n" $(seq -f '1E%.f' 7) | column --table-right=1 -t
10
   100
 1 000
10 000
   100 000
 1 000 000
10 000 000

Actually I've just noticed that specifying the %'10.f format
does count characters and not bytes! So another solution is:

$ export LC_NUMERIC=nb_NO.UTF-8
$ printf "%'10.f\n" $(seq -f '1E%.f' 7)
10
   100
 1 000
10 000
   100 000
 1 000 000
10 000 000

The issue if there is one is in libc at least.
It would be worth checking existing glibc reports about this
and reporting if not mentioned.

cheers,
Pádraig.





bug#69939: test utility bug: "test -a -a -a" and "test -o -o -o" fail

2024-03-22 Thread Pádraig Brady

On 22/03/2024 11:20, Vincent Lefevre wrote:

With GNU Coreutils 9.4, both "test -a -a -a" and "test -o -o -o" fail:

$ export POSIXLY_CORRECT=1
$ /usr/bin/test -a -a -a ; echo $?
/usr/bin/test: ‘-a’: unary operator expected
2
$ /usr/bin/test -o -o -o ; echo $?
/usr/bin/test: ‘-o’: unary operator expected
2

According to POSIX, they should return 0.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
says for 3 arguments:

   If $2 is a binary primary, perform the binary test of $1 and $3.

Here, $2 is -a and -o respectively, which are binary primaries.
And both $1 and $3 are non-null strings.


Agreed. Any leading '-' triggers this:

  $ env test -. -a -.
  test: ‘-.’: unary operator expected

Note we already advise (as does POSIX) to avoid '-a' and '-o':
https://debbugs.gnu.org/22909

BTW POSIX advises to use parenthesis to avoid some parsing ambiguities,
and it helps in this case too:

  $ env test \( -a \) -a \( -a \); echo $?
  0

Though I see bash 5.2.26 has issue with it :/

  $ test \( -a \) -a \( -a \)
  bash: test: `)' expected

bash doesn't like the -a in particular, and is ok with other strings:

  $ test \( -. \) -a \( -. \); echo $?
  0

cheers,
Pádraig





bug#69807: questioning automatic -i in multicolumn pr

2024-03-21 Thread Pádraig Brady

On 14/03/2024 20:31, Douglas McIlroy wrote:

Multicolumn options in pr imply option -i (tabification). The introduction
of tabs with physical rather than logical meaning makes output that is OK
for viewing only if you have correct tab stops, and is complicated for
further processing.  It caters for obsolete equipment--typewriters, on
which tabbing was appreciably faster than spacing.

As a wish-list item I propose abolishing implicit tabification. A second
choice (that doesn't buck Posix) is to provide an option to suppress
implicit tabification. At a bare minimum, document a workaround for the
inconvenient tabs, e.g. post-processing with pr -t -e.


Good call on the documentation.  I'll add this now:

commit 91e69cd2d02f015fc296e02388e0b18a293faa56 (HEAD -> master)
Author: Pádraig Brady 
Date:   Thu Mar 21 15:26:48 2024 +

doc: pr: give solution to expanding TABs in multicolumn output

* doc/coreutils.texi (pr invocation): Explicitly state that
multicolumn output will convert spaces to TABs, and show that
this can be undone with the `pr -t -e` or `expand` commands.
Suggested by Douglas McIlroy in https://bugs.gnu.org/69807

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index e36269588..37d729089 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -2636,9 +2636,11 @@ This option might well cause some lines to be truncated.>
 lines in the columns on each page are balanced.  The options @option{-e}
 and @option{-i} are on for multiple text-column output.  Together with
 @option{-J} option column alignment and line truncation is turned off.
+Since spaces are converted to TABs in multicolumn output, they can be converted
+back by further processing through @command{pr -t -e} or @command{expand}.


thanks,
Pádraig





bug#10311: RFE: Give chmod a "-h" option as well

2024-03-20 Thread Pádraig Brady

On 16/12/2011 16:29, Jan Engelhardt wrote:

Hi,

chown(1) has a -h option by which it affects symlinks directly rather
than the pointed-to file. The bonus side effect is that the
pointed-to files don't get changed in any way, which is kinda welcome
if you attempt to "fix" permissions/ownership in a directory where an
evil user could create a symlink to e.g. /etc/shadow.

Attempting chmod -R g+w /home/groups/evilgroup is still a risk, and
would necessity a more long-winded command involving find(1). It
would therefore be welcome that chmod receive an -h option that just
skips over them (besides perhaps attempting to change their
permissions as well).


Pushed at
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v9.4-162-g07a69fc3b

Marking as done.

cheers,
Pádraig.





bug#11108: [PATCH] chmod: fix symlink race condition

2024-03-20 Thread Pádraig Brady

On 28/03/2012 21:28, Paul Eggert wrote:

On 03/28/2012 01:13 PM, Jim Meyering wrote:

 $ ./chmod u+w f
 ./chmod: changing permissions of 'f': Operation not supported


Yeouch.  I undid the change for now.
Hmm, why did "make check" work for me?
I'll have to investigate later, alas.


Patch for this pushed at:
https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=v9.4-163-g425b8a2f5

Marking this as done.

cheers,
Pádraig.





bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-17 Thread Pádraig Brady

On 17/03/2024 11:32, Pádraig Brady wrote:

On 17/03/2024 06:10, Paul Eggert wrote:

On 2024-03-05 06:16, Pádraig Brady wrote:

I think I'll remove the as yet unreleased mv --swap from coreutils,
given that
util-linux is as widely available as coreutils on GNU/Linux platforms.


Although removing that "mv --swap" implementation was a win, I don't
think we can simply delegate this to util-linux's exch command.
Exchanging files via a renameat-like call is not limited to the Linux
kernel; it's also possible on macOS via renameatx_np with RENAME_SWAP,
and there have been noises about adding similar things to other
operating systems.

I just now added support for macOS renameatx_np to Gnulib, so coreutils
does not need to worry about the macOS details; it can simply use
renameatu with the Linux flags. See:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=af32ee824ee18255839f9812b8ed61aa5257a82b

Even with Linux it's dicey. People may have older util-linux installed
and so lack the 'exch' utility; this is true for both Fedora 39 and
Ubuntu 23.10, the current releases. Ubuntu is also odd in that it
doesn't install all the util-linux utilities as part of the util-linux
package, so it's not clear what they will do with 'exch'.

So I propose that we implement the idea in coreutils in a better way,
that interacts more nicely with -t, -T, etc. Also, I suggest using the
Linuxish name "--exchange" instead of the macOSish name "--swap", and
(for now at least) not giving the option a single-letter equivalent as I
expect it to be useful from scripts, not interactively.

After looking at various ways to do it I came up with the attached
proposed patch. This should work on both GNU/Linux and macOS, if your OS
is recent enough and the file system supports atomic exchange.


The implementation looks good.

Re exch(1) on macos, I see util-linux is on homebrew,
so it would be relatively easy to ifdef renameatx_np in util-linux also.

I think the --no-copy situation is brittle, as scripts not using it now
would be atomic, but then if we ever supported cross fs swaps
it may become non atomic. I'd at least doc with a line in the --exchange
description in usage() to say something like:
"Use --no-copy to enforce atomic operation"

While the most flexible, it's also quite awkward to need
`mv --exchange --no-copy --no-target-directory` for most uses.
I.e. it's tempting to imply the --no-... options with --exchange,
but I suppose since scripting is the primary use case for this
flexibility trumps conciseness, so I'm ok with the verbosity I think.


Oh also in the texinfo I think it's important to mention that the swap
will "exchange all data and metadata". That's not obvious otherwise.
For example users may be wondering if only data was being exchanged
with the macos exchangedata(2) or equivalent.

cheers,
Pádraig





bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-17 Thread Pádraig Brady

On 17/03/2024 06:10, Paul Eggert wrote:

On 2024-03-05 06:16, Pádraig Brady wrote:

I think I'll remove the as yet unreleased mv --swap from coreutils,
given that
util-linux is as widely available as coreutils on GNU/Linux platforms.


Although removing that "mv --swap" implementation was a win, I don't
think we can simply delegate this to util-linux's exch command.
Exchanging files via a renameat-like call is not limited to the Linux
kernel; it's also possible on macOS via renameatx_np with RENAME_SWAP,
and there have been noises about adding similar things to other
operating systems.

I just now added support for macOS renameatx_np to Gnulib, so coreutils
does not need to worry about the macOS details; it can simply use
renameatu with the Linux flags. See:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=af32ee824ee18255839f9812b8ed61aa5257a82b

Even with Linux it's dicey. People may have older util-linux installed
and so lack the 'exch' utility; this is true for both Fedora 39 and
Ubuntu 23.10, the current releases. Ubuntu is also odd in that it
doesn't install all the util-linux utilities as part of the util-linux
package, so it's not clear what they will do with 'exch'.

So I propose that we implement the idea in coreutils in a better way,
that interacts more nicely with -t, -T, etc. Also, I suggest using the
Linuxish name "--exchange" instead of the macOSish name "--swap", and
(for now at least) not giving the option a single-letter equivalent as I
expect it to be useful from scripts, not interactively.

After looking at various ways to do it I came up with the attached
proposed patch. This should work on both GNU/Linux and macOS, if your OS
is recent enough and the file system supports atomic exchange.


The implementation looks good.

Re exch(1) on macos, I see util-linux is on homebrew,
so it would be relatively easy to ifdef renameatx_np in util-linux also.

I think the --no-copy situation is brittle, as scripts not using it now
would be atomic, but then if we ever supported cross fs swaps
it may become non atomic. I'd at least doc with a line in the --exchange
description in usage() to say something like:
  "Use --no-copy to enforce atomic operation"

While the most flexible, it's also quite awkward to need
`mv --exchange --no-copy --no-target-directory` for most uses.
I.e. it's tempting to imply the --no-... options with --exchange,
but I suppose since scripting is the primary use case for this
flexibility trumps conciseness, so I'm ok with the verbosity I think.

thanks,
Pádraig





bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-15 Thread Pádraig Brady

On 15/03/2024 05:21, Paul Eggert wrote:

On 2024-03-14 06:03, Pádraig Brady wrote:



How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
is used when the compiler supports -O1? That way we don't have to worry
about running the program, because (with the "volatile") clang will
error out.

Alternatively perhaps there's some way to check for the bug using
preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
__clang_major__, and __aarch64__. (This could be more fragile, though,
as clang presumably will fix the bug eventually.)


That would probably work for this edge case, but it's brittle
and I'm worried about other combinations of
compiler versions, complier options, and architectures.


Sure, but one cannot resolve such worries completely, as compiler errors
are inherently brittle: even a runtime test cannot prove their absence.

As I understand it, __bf16 is a real zoo: sometimes hardware supports
it, sometimes it doesn't, sometimes the software emulation library is
there, sometimes it's not, and so forth. Running a program on a
development host is likely to not match the runtime behavior on a
production host, so AC_RUN_IFELSE is reasonably likely to produce a
false positive, which is worse than a false negative.

Another issue, which is somewhat related, is that coreutils's current
macro BF16_SUPPORTED is too broad. Because __bf16 has so many bugs,
currently it's probably a mistake to say __bf16 is fully supported
anywhere. Even GCC is buggy; see for example:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114347

which I just now discovered when writing this email.


Nice one! (though that should not impact od)


The question isn't
so much whether __bf16 is supported, it's whether it's supported well
enough to compile and run GNU od. >

Come to think of it, for 'od' we might be better off avoiding __bf16
entirely, and simply converting the bits by hand into a 'float'
variable. This would likely be more portable, as it would work even with
compilers that lack __bf16, or that have __bf16 bugs relevant to 'od'.
And that way we could remove the configure-time test entirely. (There
would be endianness issues but they're easily addressible.)


Right.

My thinking though was that it was nice code wise to treat __bf16 like
_Float16 (and like float etc.).
More importantly I only see __bf16 support improving going forward,
so preferred to keep the code simpler (and the configure check robust).
Currently both the main compilers, and main architectures are supported
(when not cross-compiling), and tested quite robustly in the configure check.
Another thing that dissuaded me from implementing the conversions ourselves
is the plethora of floating point config options for compilers,
which made me think in general that floating point conversion
was best left to the compiler.

I've just now realized that AC_RUN_IFELSE needs an explicit default
for cross-compiling, and I've just pushed a conservative default
(which may default the other way in future when support broadens).

thanks,
Pádraig.





bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-14 Thread Pádraig Brady

On 14/03/2024 13:35, Collin Funk wrote:

On 3/14/24 6:03 AM, Pádraig Brady wrote:

It would disable this feature for cross-compilation yes,
but this isn't the first instance of AC_RUN_IFELSE we use.


For completeness I should add that the above check can be
overridden if cross-compiling or whatever like:

  ./configure utils_cv_ieee_16_bit_supported=yes 
utils_cv_brain_16_bit_supported=yes


Sorry if this is not the proper place to ask, but would it be possible
to make Autoconf use an emulator when cross-compiling? This issue
would be *less* of a problem in that case.

I am more familiar with CMake where check_c_source_runs is used
performs a similar task [1]. It is essentially a wrapper function
around try_run [2]. CMake allows you to set
CMAKE_CROSSCOMPILING_EMULATOR which will make try_run use an emulator
to run programs instead [3].

I've used this with Wine and QEMU User a few times and it makes
CMake's configure tests work fine (though much slower). If this
functionality seems beneficial then I can look into adding it to
Autoconf.

[1] https://cmake.org/cmake/help/latest/module/CheckCSourceRuns.html
[2] 
https://github.com/Kitware/CMake/blob/4285dec5f012260337193f29f18899d6cf2536a4/Modules/Internal/CheckSourceRuns.cmake#L93
[3] 
https://cmake.org/cmake/help/latest/variable/CMAKE_CROSSCOMPILING_EMULATOR.html


Interesting.
Perhaps this is something that could be configured separately
on the build system, through something like binfmt_misc ?

cheers,
Pádraig





bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-14 Thread Pádraig Brady

On 14/03/2024 05:59, Paul Eggert wrote:

On 2024-03-12 19:24, Grisha Levit wrote:

- AC_COMPILE_IFELSE(
+ AC_RUN_IFELSE(


This sort of change would break cross-compilation, no?


It would disable this feature for cross-compilation yes,
but this isn't the first instance of AC_RUN_IFELSE we use.


How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
is used when the compiler supports -O1? That way we don't have to worry
about running the program, because (with the "volatile") clang will
error out.

Alternatively perhaps there's some way to check for the bug using
preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
__clang_major__, and __aarch64__. (This could be more fragile, though,
as clang presumably will fix the bug eventually.)


That would probably work for this edge case, but it's brittle
and I'm worried about other combinations of
compiler versions, complier options, and architectures.

For example sse availability is in the mix here,
so this will now correctly avoid this feature with:
  CFLAGS=-mno-sse ./configure --quiet
Whereas previously it would have built but produced wrong values.

cheers,
Pádraig.





bug#68708: [PATCH] env,kill: Handle unnamed signals

2024-03-13 Thread Pádraig Brady

On 25/01/2024 19:52, Grisha Levit wrote:

On Thu, Jan 25, 2024, 09:50 Pádraig Brady  wrote:
This mostly looks good, except:

- No need to clear the errno before kill(3).
- Better to use SIG%d rather than the bare %d for signal _names_, as we already 
parse this format


Makes sense, done below.

* src/operand2sig.c (operand2sig): Drop signame argument, accept all
signal numbers <= SIGNUM_BOUND.  All callers updated.
* src/env.c (parse_signal_action_params, reset_signal_handlers)
(parse_block_signal_params, set_signal_proc_mask)
(list_signal_handling): Accept all signal numbers <= SIGNUM_BOUND,
use SIG%d for printing if necessary.
* src/kill.c (list_signals, main): Likewise.
(send_signals): Check errno from kill(3) for bad signo.
* src/timeout.c (main): Update operand2sig call.
* tests/misc/kill.sh: Test listing all signal numbers.


I've made a few adjustments:

- Clarified the commit message.

- I've gone back to have kill -l produce a bare number for unnamed signals, 
because:
this is consistent with util-linux,
SIG%d is only parseable by coreutils (not util-linux or bash),
easier to programatically determine if a name is defined.
I've left as SIG%d for -t output as that's a coreutils specific option,
and not really programatically consumable anyway.

- I've added a validation check for `env --block=32` so that it fails
like `env --default=32` and `env --ignore=32`.
I.e. exits with EXIT_CANCELED.

- Added a NEWS entry.

I'll apply this later.

thanks!
PádraigFrom b8d1b00e21a86270fbbe5903a41319894db266df Mon Sep 17 00:00:00 2001
From: Grisha Levit 
Date: Thu, 25 Jan 2024 14:52:50 -0500
Subject: [PATCH] env,kill,timeout: support unnamed signals

Some signals with values less that the max signal number for the system
do not have defined names.  For example, currently on amd64 Linux,
signals 32 and 33 do not have defined names, and Android has a wider
gap of undefined names where it reserves some realtime signals.

Previously the signal listing in env ended up reusing the name
of the last printed valid signal (the repeated HUP below):

$ env --list-signal-handling true
HUP( 1): IGNORE
HUP(32): BLOCK
HUP(38): IGNORE

..and the corresponding signal numbers were rejected as operands for the
env, kill, and timeout commands.

This patch removes the requirement that sig2str returns 0 for a signal
number associated with an operand.  This allows unnamed signals to be in
the sets `env' attempts to manipulate when a --*-signal option is used
with no argument, and kill(1) and timeout(1) to send such unnamed
signals.

* src/operand2sig.c (operand2sig): Drop signame argument, accept all
signal numbers <= SIGNUM_BOUND.  All callers updated.
* src/env.c (parse_signal_action_params, reset_signal_handlers)
(parse_block_signal_params, set_signal_proc_mask)
(list_signal_handling): Accept all signal numbers <= SIGNUM_BOUND,
use SIG%d for printing if necessary.
* src/kill.c (list_signals, main): Likewise.
(send_signals): Check errno from kill(3) for bad signo.
* src/timeout.c (main): Update operand2sig call.
* tests/misc/kill.sh: Test listing all signal numbers.
* NEWS: Mention the improvement.
---
 NEWS   |  3 +++
 src/env.c  | 29 ++---
 src/kill.c | 24 
 src/operand2sig.c  |  8 
 src/operand2sig.h  |  2 +-
 src/timeout.c  |  3 +--
 tests/misc/kill.sh | 10 +-
 7 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/NEWS b/NEWS
index 20cadf183..f21efc7c0 100644
--- a/NEWS
+++ b/NEWS
@@ -92,6 +92,9 @@ GNU coreutils NEWS-*- outline -*-
   This was previously 128KiB and increasing to 256KiB was seen to increase
   throughput by 10-20% when reading cached files on modern systems.
 
+  env,kill,timeout now support unnamed signals. kill(1) for example now
+  supports sending such signals, and env(1) will list them appropriately.
+
   SELinux operations in file copy operations are now more efficient,
   avoiding unneeded MCS/MLS label translation.
 
diff --git a/src/env.c b/src/env.c
index 73d9847f4..ed6628f8f 100644
--- a/src/env.c
+++ b/src/env.c
@@ -538,7 +538,6 @@ parse_split_string (char const *str, int *orig_optind,
 static void
 parse_signal_action_params (char const *arg, bool set_default)
 {
-  char signame[SIG2STR_MAX];
   char *opt_sig;
   char *optarg_writable;
 
@@ -548,8 +547,7 @@ parse_signal_action_params (char const *arg, bool set_default)
  Some signals cannot be set to ignore or default (e.g., SIGKILL,
  SIGSTOP on most OSes, and SIGCONT on AIX.) - so ignore errors.  */
   for (int i = 1 ; i <= SIGNUM_BOUND; i++)
-if (sig2str (i, signame) == 0)
-  signals[i] = set_default ? DEFAULT_NOERR : IGNORE_NOERR;
+signals[i] = set_default ? DEFAULT_NOERR : IGNORE_NOERR;
   return;
 }
 
@@ -558,7 +556,7 @@ parse_signal_action_params (char const *arg, bool set_default)
   o

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-13 Thread Pádraig Brady

On 13/03/2024 02:24, Grisha Levit wrote:

Recent clang provides __bf16 on aarch64 but it is broken.

If built with -O0, the conversion is wrong:

 $ printf '\x3F\x80' | od --end=big -An -tfB | tr -d ' '
 1.875

If built with -O1 or higher, compilation fails:

 fatal error: error in backend: Cannot select: 0xb47a58d29780: f32 = 
fp_extend 0xb47a58d31720
   0xb47a58d31720: bf16,ch = CopyFromReg 0xb47b78c53720, 
Register:bf16 %13
 0xb47a58d29470: bf16 = Register %13
 In function: print_bfloat

The latter issue does not cause the existing configure test to fail
because the promotion is optimized out.

* configure.ac: Ensure 16 bit float promotion code does not get
optimized out, and produces an expected result.


Looks good!

I'll follow up with another patch to all these involved checks to be cached / 
bypassed,
by wrapping them in AC_CACHE_VAL().

Marking this as done.

thanks!
Pádraig





bug#69724: [PATCH] dircolors: add more archive extensions

2024-03-11 Thread Pádraig Brady

I'm erring on the side of applying this,
though I'm a bit wary of an endlessly expanding list,
as there is no end to what can be compressed for example.

cheers,
Pádraig





bug#69636: Re: [PATCH] Improve quality of format-checking code for seq.

2024-03-08 Thread Pádraig Brady

tag 69636 notabug
close 69636
stop

On 08/03/2024 12:29, User wrote:


Jim Meyering  wrote:


Paul Eggert  wrote:

* src/seq.c (validate_format): Remove. Migrate its checks into...
(long_double_format): Report an error and exit if an error is found,
instead of returning NULL.  All callers changed.
Use a more-consistent format for diagnostics.
* tests/misc/seq: Adjust to the more-consistent format for diagnostics.
---
   src/seq.c  |   71

+++

   tests/misc/seq |   10 
   2 files changed, 25 insertions(+), 56 deletions(-)


Thanks.
As far as I can see this is a "no semantic change"
patch (modulo diagnostics), and it looks fine, so I pushed it.


Dear Paul Eggert and Jim Meyering,


Thank you very much for the contribution and care! I hope you have
awesome and cozy day!

It is a nice addition to check the format in the most and explicitly
safe cases, I believe, however it disallows to use it in cases when you
need to just repeat a character without ANY processing directive! For
example, `a=10; seq -f 'x' -s '' -- "$a";` would print 10 characters "x"
without the check added prior and significantly (or little but still)
support development in long codes which would require explicit loops or
additional dependencies like awesome Perl.

Is it possible to reconsider the check or add an option to ignore the
format which does not have a required directive? If you know a better
alternative to the standard utility which would result in the same small
code but great result, please suggest if possible!

Again, appreciated, and please stay safe!


This suggestion has some merit,
but there are a few ways to generate repetitive data already:

  a=10

  seq $a | xargs printf -- 'x%.0s'

  printf "%${a}s" '' | tr ' ' x

  yes x | head -n$a | tr -d '\n'

Also if only outputting to a terminal, then you can get the
repetition to happen within the terminal code like:

  printf -- "x\e[${a}b"

Therefore I'm not sure it's worth relaxing the validation in seq.

thanks,
Pádraig





bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-05 Thread Pádraig Brady

On 05/03/2024 04:10, Paul Eggert wrote:

On 3/4/24 16:43, Dominique Martinet wrote:

Adding Rob to the loop because this impacts compatibility with
toybox/maybe busybox implementations


Busybox does not use RENAME_EXCHANGE, so this isn't a Busybox issue.

Toybox mv added -x to its development version yesterday:

https://github.com/landley/toybox/commit/a2419ad52d489bf1a84a9f3aa73afb351642c765

so there's little prior art there, and there's still plenty of time to
fix its problems before exposing it to the world.



I also see --swap mostly used by scripts and this actually feels a bit
dangerous to me -- I'd *always* use this with -T.


Yes, it's a problem.

By "see --swap mostly used by scripts" I assume you mean scripts that
haven't been written yet, assuming that nobody had -x until yesterday



(by the way, what's this "rename" command you speak of?


https://mirrors.edge.kernel.org/pub/linux/utils/util-linux/

Now that I've looked into it further, util-linux already has an "exch"
command that does exactly what you want. This is the command that toybox
should implement rather than try to simulate it with "mv -x" (which
causes all sorts of problems).

That is, toybox should revert yesterday's change to "mv", and should
implement "exch" instead.


I think having the functionality in mv(1) is better than in rename(1),
but since exch(1) is already released that's probably
the best place for this functionality now.

A separate exch command may be overkill for just this,
but perhaps related functionality might be added to that command in future.
For e.g. some of the discussed functionality for a "replace" command
might reside there.

So I think I'll remove the as yet unreleased mv --swap from coreutils, given 
that
util-linux is as widely available as coreutils on GNU/Linux platforms.

cheers,
Pádraig






bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-04 Thread Pádraig Brady

On 04/03/2024 15:47, Pádraig Brady wrote:

On 04/03/2024 00:44, Paul Eggert wrote:

Although I like the idea of exposing file swaps to the user, the first
cut of 'mv -x' has significant problems.

I expect 'mv -x A B' to act like 'mv A B' except the destination must
exist and is renamed back to A. However, this is not true for 'mv -x A
B' when B is a directory; it renames B to A rather than renaming B/A to
A as I expect. That is, 'mv -x' acts as if -T (--no-target-directory) is
also specified. There is no way to get mv's traditional behavior, or to
get mv -t behavior.

To fix this, 'mv -x' should respect the usual mv behavior with respect
to directories. For example, when D is a directory 'mv -x A B C D'
should act like 'mv A B C D' except that D's old entries should be
renamed back to A B and C. And the -t and -T options should work with -x
the same way they work when -x is not specified.

This needs to happen before the next coreutils release, to avoid
confusion about 'mv -x'.


So you're saying the current mv -x behavior should only be with -T explicitly 
specified.
With the current implementation, we should at least document that -x implies -T.

By changing as you suggest, we'd not be adding any significant functionality,
but potentially reducing confusion by following mv traditional arg handling.
I agree with you I think, but not strongly.

It's worth stating though that if users want to swap 2 directories
they'd have to `mv -Tx d1 d2`, which may also be a little confusing.

The use case we don't currently support directly is:
cd source_dir && mv -x * "$dest_dir"
Instead that currently requires:
for f in *; do mv -x "$f" "$dest_dir"; done

For completeness, stating disadvantages of pulling this use case within mv,
is that by requiring the for loop for this would allow users
to programatically determine which swap failed, and I see --swap
as primarily a programatic interface used by scripts.
Also if we made this change, We'd have to document that `mv -x 1 2 ... d`
was not atomic over the whole set.

I will look at making the change before the upcoming release.


Another point worth mentioning before changing this,
is that changing would make the --swap operation non symmetric.
I.e. `mv -x a d` would be different to `mv -x d a` where d in a directory.

cheers,
Pádraig

p.s. Re dir swapping I noticed that specifying '/' or '.' dirs always gives 
EBUSY:
  $ mv -x . .
  mv: swap of '.' and '.' failed: Device or resource busy





bug#69546: cksum: inconsistent handling of invalid length values

2024-03-04 Thread Pádraig Brady

On 04/03/2024 15:44, Daniel Hofstetter wrote:

Hi,

When specifying an invalid length value followed by a valid length
value I get the following error:

$ printf "hello" | cksum --algo=blake2b --length=12 --length=8
cksum: invalid length: ‘12’
cksum: length is not a multiple of 8

However, if the invalid length value is a multiple of 8 and greater
than 512 (the maximum digest length for blake2b), there is no error:

$ printf "hello" | cksum --algo=blake2b --length=123456 --length=8
BLAKE2b-8 (-) = 29

I think the behavior should be the same in the two scenarios, whether
it's showing an error or ignoring the invalid value.

I'm using coreutils 9.4.


I pushed a fix at:
https://github.com/coreutils/coreutils/commit/fea833591

Now only the last used --length is validated.

Marking this as done.

cheers,
Pádraig






bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-04 Thread Pádraig Brady

On 04/03/2024 00:44, Paul Eggert wrote:

Although I like the idea of exposing file swaps to the user, the first
cut of 'mv -x' has significant problems.

I expect 'mv -x A B' to act like 'mv A B' except the destination must
exist and is renamed back to A. However, this is not true for 'mv -x A
B' when B is a directory; it renames B to A rather than renaming B/A to
A as I expect. That is, 'mv -x' acts as if -T (--no-target-directory) is
also specified. There is no way to get mv's traditional behavior, or to
get mv -t behavior.

To fix this, 'mv -x' should respect the usual mv behavior with respect
to directories. For example, when D is a directory 'mv -x A B C D'
should act like 'mv A B C D' except that D's old entries should be
renamed back to A B and C. And the -t and -T options should work with -x
the same way they work when -x is not specified.

This needs to happen before the next coreutils release, to avoid
confusion about 'mv -x'.


So you're saying the current mv -x behavior should only be with -T explicitly 
specified.
With the current implementation, we should at least document that -x implies -T.

By changing as you suggest, we'd not be adding any significant functionality,
but potentially reducing confusion by following mv traditional arg handling.
I agree with you I think, but not strongly.

It's worth stating though that if users want to swap 2 directories
they'd have to `mv -Tx d1 d2`, which may also be a little confusing.

The use case we don't currently support directly is:
cd source_dir && mv -x * "$dest_dir"
Instead that currently requires:
for f in *; do mv -x "$f" "$dest_dir"; done

For completeness, stating disadvantages of pulling this use case within mv,
is that by requiring the for loop for this would allow users
to programatically determine which swap failed, and I see --swap
as primarily a programatic interface used by scripts.
Also if we made this change, We'd have to document that `mv -x 1 2 ... d`
was not atomic over the whole set.

I will look at making the change before the upcoming release.

cheers,
Pádraig





bug#69488: tr (question)

2024-03-01 Thread Pádraig Brady

On 01/03/2024 15:33, lacsaP Patatetom wrote:

hi,

I did a few tests with tr and I'm surprised by the results...

$ echo éèçà
éèçà

these characters are encoded in utf-8 on 2 bytes :

$ echo éèçà | xxd
: c3a9 c3a8 c3a7 c3a0 0a   .

now I use tr to remove non-printable characters :

$ echo éèçà | tr -cd '[:print:]'
$ echo éèçà | tr -cd '[:print:]' | wc
   0   0   0

all characters are deleted by tr
now I want to keep the "é" character :

$ echo éèçà | tr -cd '[:print:]é'
��

why do the "�" characters appear ?

regards, lacsaP.



It's a known issue that tr is currently non multi-byte aware.

thanks,
Pádraig





bug#69418: test failure when no french locale is installed

2024-02-27 Thread Pádraig Brady

On 26/02/2024 23:05, Bruno Haible wrote:

Testing the current coreutils git master:

On a Debian 12 system, in which I have not installed a French UTF-8 locale,
I see a test failure of tests/misc/join-utf8.

The essential lines from test-suite.log:

+ test set = set
+ LC_ALL=none
../tests/misc/join-utf8.sh: line 24: warning: setlocale: LC_ALL: cannot change 
locale (none): No such file or directory

The cause is that on such a system, LOCALE_FR_UTF8, as determined by
gnulib/m4/locale-fr.m4, is 'none', not empty or absent.

The attached patch fixes the failure.



Applied.

This also indicates that gnulib ensures that LOCALE_FR_UTF8
is set to "none" when not present or usable,
so I've made that simplification in other tests now.

Marking this as done.

thanks!
Pádraig.





bug#69369: wc -w ignores breaking space over UCHAR_MAX

2024-02-25 Thread Pádraig Brady

On 24/02/2024 20:44, Aearil via GNU coreutils Bug Reports wrote:

Hi,

wc -w doesn't seem to recognize whitespace characters with a codepoint
over UCHAR_MAX (255) as word separators. For example, using the
character EM SPACE U+2003:

$ printf "foo\u2003bar" | ./wc -w
1

I should get a word count of 2, but instead the space is ignored while
counting words. Meanwhile, wc v9.4 gives the correct answer:

$ printf "foo\u2003bar" | wc -w
2

It looks like the regression has been introduced by [f40c6b5] and
would be fixed by something like the following change:

diff --git a/src/wc.c b/src/wc.c
index f5a921534..9d456f8c0 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -528,7 +528,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus, 
off_t current_pos)
if (width > 0)
  linepos += width;
  }
-  in_word2 = !iswnbspace (wide_char);
+  in_word2 = !iswspace (wide_char) && !iswnbspace 
(wide_char);
  }

/* Count words by counting word starts, i.e., each


Nice one.
Great to catch this before release.
I've augmented your patch with a test,
and will push the attached later.

Marking this as done.

thanks!
Pádraig
From ced8c64c986b79c0bfa74028a9581e07d5df1974 Mon Sep 17 00:00:00 2001
From: Aearil 
Date: Sat, 24 Feb 2024 21:44:24 +0100
Subject: [PATCH] wc: fix -w with breaking space over UCHAR_MAX

* src/wc.c (wc): Fix regression introduced in commit v9.4-48-gf40c6b5cf.
* tests/wc/wc-nbsh.sh: Add test cases for "standard" spaces.
Fixes https://bugs.gnu.org/69369
---
 src/wc.c| 2 +-
 tests/wc/wc-nbsp.sh | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/wc.c b/src/wc.c
index f5a921534..9d456f8c0 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -528,7 +528,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus, off_t current_pos)
   if (width > 0)
 linepos += width;
 }
-  in_word2 = !iswnbspace (wide_char);
+  in_word2 = !iswspace (wide_char) && !iswnbspace (wide_char);
 }
 
   /* Count words by counting word starts, i.e., each
diff --git a/tests/wc/wc-nbsp.sh b/tests/wc/wc-nbsp.sh
index 371cc8b5b..39a8baccc 100755
--- a/tests/wc/wc-nbsp.sh
+++ b/tests/wc/wc-nbsp.sh
@@ -38,10 +38,15 @@ fi
 
 export LC_ALL=en_US.UTF-8
 if test "$(locale charmap 2>/dev/null)" = UTF-8; then
+  #non breaking space class
   check_word_sep '\u00A0'
   check_word_sep '\u2007'
   check_word_sep '\u202F'
   check_word_sep '\u2060'
+
+  #sampling of "standard" space class
+  check_word_sep '\u0020'
+  check_word_sep '\u2003'
 fi
 
 export LC_ALL=ru_RU.KOI8-R
-- 
2.43.0



bug#69368: [PATCH] Allow --zero with --check

2024-02-25 Thread Pádraig Brady

On 24/02/2024 19:07, Ricardo Branco wrote:

The --zero option is useless if not supported by --check.

Attached patch fixes it.

https://github.com/coreutils/coreutils/pull/81


I'm not sure about this.

By adding this support we diverge the checksum file formats supported by check.
I.e. users may inadvertently create such files that are not usable
by any previous version of the checksum utilities.

Also having --check support NUL delimited files
doesn't add any additional functionality, as the checksum files
are already automatically escaped appropriately.

Also NUL delimited checksum files can be harder to manage with other text tools.

--zero was added for the case of piping to other tools
that would find the filename escaping problematic.

cheers,
Pádraig





bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-02-24 Thread Pádraig Brady

On 01/02/2024 00:36, Paul Eggert wrote:

On 1/31/24 06:06, Pádraig Brady wrote:

To my mind the most protective option takes precedence.


That's not how POSIX works with mv -i and mv -f. The last flag wins. I
assume this is so that people can have aliases or shell scripts that
make -i the default, but you can override by specifying -f on the
command line. E.g., in mymv:

 #!/bin/sh
 mv -i "$@"

then "mymv -f a b" works as expected.

Wouldn't a similar argument apply to cp's --update options?

Or perhaps we should play it safe, and reject any combination of
--update etc. options that are incompatible. We can always change our
mind later and say that later options override earlier ones, or do
something else that's less conservative.


OK I err'd on the side of changing as little as possible wrt precedence.
-n still has precedence over any -u,--update option.
That's simplest to understand (while not changing existing precedence),
and shouldn't cause any practical issues.

I plan to push the 2 attached patches for this tomorrow.

thanks,
PádraigFrom 51cf6f3ff272467bc9cde75c48d0250446be9b9c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sat, 24 Feb 2024 19:51:56 +
Subject: [PATCH 1/2] cp,mv: reinstate that -n exits with success if files
 skipped

* src/cp.c (main): Adjust so that -n will exit success if skipped files.
* src/mv.c (main): Likewise.
* doc/coreutils.texi (cp invocation): Adjust the description of -n.
* src/system.h (emit_update_parameters_note): Adjust --update=none
comparison.
* tests/cp/cp-i.sh: Adjust -n exit status checks.
* tests/mv/mv-n.sh: Likewise.
* NEWS: Mention the change in behavior.
Fixes https://bugs.gnu.org/62572
---
 NEWS   |  4 
 doc/coreutils.texi | 17 +
 src/cp.c   | 14 --
 src/mv.c   | 10 ++
 src/system.h   |  4 ++--
 tests/cp/cp-i.sh   | 11 +--
 tests/mv/mv-n.sh   | 11 +--
 7 files changed, 39 insertions(+), 32 deletions(-)

diff --git a/NEWS b/NEWS
index 36b7fd1fe..a52b4cf66 100644
--- a/NEWS
+++ b/NEWS
@@ -43,6 +43,10 @@ GNU coreutils NEWS-*- outline -*-
   basenc --base16 -d now supports lower case hexadecimal characters.
   Previously an error was given for lower case hex digits.
 
+  cp --no-clobber, and mv -n no longer exit with failure status if
+  existing files are encountered in the destination.  Instead they revert
+  to the behavior from before v9.2, silently skipping existing files.
+
   ls --dired now implies long format output without hyperlinks enabled,
   and will take precedence over previously specified formats or hyperlink mode.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index c42126955..911e15b46 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9059,12 +9059,13 @@ a regular file in the destination tree.
 @itemx --no-clobber
 @opindex -n
 @opindex --no-clobber
-Do not overwrite an existing file; silently fail instead.
-This option overrides a previous
-@option{-i} option.  This option is mutually exclusive with @option{-b} or
-@option{--backup} option.
-See also the @option{--update=none} option which will
-skip existing files but not fail.
+Do not overwrite an existing file; silently skip instead.
+This option overrides a previous @option{-i} option.
+This option is mutually exclusive with @option{-b} or @option{--backup} option.
+This option is deprecated due to having a different exit status from
+other platforms.  See also the @option{--update} option which will
+give more control over how to deal with existing files in the destination,
+and over the exit status in particular.
 
 @item -P
 @itemx --no-dereference
@@ -9335,8 +9336,8 @@ This is the default operation when an @option{--update} option is not specified,
 and results in all existing files in the destination being replaced.
 
 @item none
-This is similar to the @option{--no-clobber} option, in that no files in the
-destination are replaced, but also skipping a file does not induce a failure.
+This is like the deprecated @option{--no-clobber} option, where no files in the
+destination are replaced, and also skipping a file does not induce a failure.
 
 @item older
 This is the default operation when @option{--update} is specified, and results
diff --git a/src/cp.c b/src/cp.c
index 0355ed97f..36ae4fb66 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -195,8 +195,8 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
   -L, --dereferencealways follow symbolic links in SOURCE\n\
 "), stdout);
   fputs (_("\
-  -n, --no-clobber ensure no existing files overwritten, and fail\n\
- silently instead. See also --update\n\
+  -n, --no-clobber silently skip existing files.\n\
+ See also --update\n\
 "), stdout);
   fputs (_("\
   -P, --no-dereference never follow symb

bug#68871: Can't use od to print half-precision floats

2024-02-05 Thread Pádraig Brady

On 04/02/2024 21:59, Paul Eggert wrote:

Thanks. One minor comment about this:


+@item B
+brain 16 bit float
+@item H
+half precision float


It might be helpful explain these two formats, e.g., to cite:

https://en.wikipedia.org/wiki/Bfloat16_floating-point_format

and

https://en.wikipedia.org/wiki/Half-precision_floating-point_format

respectively, since the formats are new compared to the other formats
being discussed.


Good call.
I added those references and pushed.

cheers,
Pádraig.






bug#68871: Can't use od to print half-precision floats

2024-02-04 Thread Pádraig Brady

On 02/02/2024 01:47, Paul Eggert wrote:

On 2/1/24 13:59, Pádraig Brady wrote:


bfloat16 looks like a truncated single precision IEEE,
so we should be able to just pad the extra 16 bits with zeros
when converting to single precision internally for processing.


Sounds good. This would mean od could work even the platform doesn't
support bfloat16_t, since od.c could fall back on the above code (though
I suppose it could be endianness-dependent).


I see that __bf16 is supported by released versions of gcc and clang,
so rather than add conversion complexity to the core od print loop,
the attached relies on compiler support for _Float16 and __bf16 types,
which compilers will expand to more targets going forward.

I tested the attached on:

gcc 13, clang 17 x86 (Both types supported)
gcc 7 aarch64 (Only -fH supported)
gcc 13 ppc(be) (Neither supported (both will be with GCC 14))

I'll commit this later.
Marking this bug as done.

thanks,
PádraigFrom 5c1349185e050b8bf8fb1d356c54170112d0e673 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Thu, 1 Feb 2024 17:59:51 +
Subject: [PATCH] od: support half precision floating point

Rely on compiler support for _Float16 and __bf16
to support -fH and -fB formats respectively.
I.e. IEEE 16 bit, and brain 16 bit floats respectively.
Modern GCC and LLVM compilers support both types.

clang-sect=half-precision-floating-point
https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html
https://clang.llvm.org/docs/LanguageExtensions.html#$clang-sect
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0192r4.html
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1467r9.html

This was tested on:
gcc 13, clang 17 x86 (Both types supported)
gcc 7 aarch64 (Only -fH supported)
gcc 13 ppc(be) (Neither supported. -fH will be with GCC 14)

* src/od.c: Support -tf2 or -tfH to print IEEE 16 bit floating point,
or -tfB to print Brain 16 bit floating point.
* configure.ac: Check for _Float16 and __bf16 types.
* doc/coreutils.texi (od invocation): Mention the new -f types.
* tests/od/od-float.sh: Add test cases.
* NEWS: Mention the new feature.
Addresses https://bugs.gnu.org/68871
---
 NEWS |  3 ++
 configure.ac |  3 ++
 doc/coreutils.texi   |  4 +++
 src/od.c | 73 ++--
 tests/od/od-float.sh | 21 +
 5 files changed, 101 insertions(+), 3 deletions(-)

diff --git a/NEWS b/NEWS
index dc5d875dd..5b5befd2c 100644
--- a/NEWS
+++ b/NEWS
@@ -57,6 +57,9 @@ GNU coreutils NEWS-*- outline -*-
   chgrp now accepts the --from=OWNER:GROUP option to restrict changes to files
   with matching current OWNER and/or GROUP, as already supported by chown(1).
 
+  od now supports printing IEEE half precision floating point with -t fH,
+  or brain 16 bit floating point with -t fB, where supported by the compiler.
+
   tail now supports following multiple processes, with repeated --pid options.
 
 ** Improvements
diff --git a/configure.ac b/configure.ac
index 48ab4ef53..64ff32a96 100644
--- a/configure.ac
+++ b/configure.ac
@@ -522,6 +522,9 @@ CFLAGS=$ac_save_CFLAGS
 LDFLAGS=$ac_save_LDFLAGS
 ac_c_werror_flag=$cu_save_c_werror_flag
 
+# Test compiler support for half precision floating point types (for od)
+AC_CHECK_TYPES([_Float16, __bf16])
+
 ac_save_CFLAGS=$CFLAGS
 CFLAGS="-mavx -mpclmul $CFLAGS"
 AC_MSG_CHECKING([if pclmul intrinsic exists])
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index d13e0f834..c7a49de31 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -2127,6 +2127,10 @@ long
 For floating point (@code{f}):
 
 @table @asis
+@item B
+brain 16 bit float
+@item H
+half precision float
 @item F
 float
 @item D
diff --git a/src/od.c b/src/od.c
index 75bed5e7d..1e2489367 100644
--- a/src/od.c
+++ b/src/od.c
@@ -19,6 +19,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -49,6 +50,26 @@ typedef unsigned long long int unsigned_long_long_int;
 typedef unsigned long int unsigned_long_long_int;
 #endif
 
+#if HAVE__FLOAT16
+  /* Available since clang 6 (2018), and gcc 7 (2017).  */
+  typedef _Float16 float16;
+#else
+# define HAVE__FLOAT16 0
+  /* This is just a place-holder to avoid a few '#if' directives.
+ In this case, the type isn't actually used.  */
+  typedef float float16;
+#endif
+
+#if HAVE___BF16
+  /* Available since clang 11 (2020), and gcc 13 (2023). */
+  typedef __bf16 bfloat16;
+#else
+# define HAVE___BF16 0
+  /* This is just a place-holder to avoid a few '#if' directives.
+ In this case, the type isn't actually used.  */
+  typedef float bfloat16;
+#endif
+
 enum size_spec
   {
 NO_SIZE,
@@ -58,6 +79,7 @@ enum size_spec
 LONG,
 LONG_LONG,
 /* FIXME: add INTMAX support, too */
+FLOAT_HALF,
 FLOAT_SINGLE,
 FLOAT_DOUBLE,
 FLOAT_LONG_DOUBLE,
@@ -71,6 +93,8 @@ enum output_format
 OCTAL,
 HEXADECIMAL,
 FLOATING_POINT,
+HFLOA

bug#68871: Can't use od to print half-precision floats

2024-02-01 Thread Pádraig Brady

On 01/02/2024 19:16, Paul Eggert wrote:

Oh, and another thought: suppose someone wants to use od on bfloat16_t
values? They're popular in machine learning applications, and likely
will be more popular than float16_t overall. See:

https://sourceware.org/pipermail/libc-alpha/2024-February/154382.html


True. I suppose we would select between these like:

 -t f2, -t fh = IEEE half precision
-t fb = brain floating point

bfloat16 looks like a truncated single precision IEEE,
so we should be able to just pad the extra 16 bits with zeros
when converting to single precision internally for processing.

cheers,
Pádraig





bug#68871: Can't use od to print half-precision floats

2024-02-01 Thread Pádraig Brady

On 01/02/2024 12:43, Rozenberg, Eyal (Consultant) wrote:

If I try to use the od utility to print half-precision (FP16) floating-point 
values, I get:

$ od -t f2 myfloats.bin
od: invalid type string 'f2';
this system doesn't provide a 2-byte floating point type

I'm not exactly sure what "this system" means, but that should work and print 
out my floats.

Eyal

PS - This is my first bug-coreutils post, please be gentle.


I just had a read of these:
https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0192r4.html

On the face of it it seems that various 16 bit floating point formats have 
consolidated,
and we could add support for _Float16 in od.
I see gnulib already enables __STDC_WANT_IEC_60559_TYPES_EXT__
so we should be able to do something like the attached.

This is just off the top of my head, and I haven't tested
or thought about this much at all.
Any testing your could do would be appreciated.

thanks,
Pádraigcommit 82fc25ab7f34a8198a8f435c1396801d30e8c6b6
Author: Pádraig Brady 
Date:   Thu Feb 1 17:59:51 2024 +

od: support half precision floating point

* src/od.c: Suport -t f2 to print 16 bit floating point.
* NEWS: Mention the new feature.
Addresses https://bugs.gnu.org/68871

diff --git a/src/od.c b/src/od.c
index 75bed5e7d..2f03a729e 100644
--- a/src/od.c
+++ b/src/od.c
@@ -19,6 +19,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -49,6 +50,12 @@ typedef unsigned long long int unsigned_long_long_int;
 typedef unsigned long int unsigned_long_long_int;
 #endif
 
+#ifndef FLT16_MAX
+/* This is just a place-holder to avoid a few '#if' directives.
+   In this case, the type isn't actually used.  */
+typedef float _Float16;
+#endif
+
 enum size_spec
   {
 NO_SIZE,
@@ -58,6 +65,7 @@ enum size_spec
 LONG,
 LONG_LONG,
 /* FIXME: add INTMAX support, too */
+FLOAT_HALF,
 FLOAT_SINGLE,
 FLOAT_DOUBLE,
 FLOAT_LONG_DOUBLE,
@@ -156,6 +164,7 @@ static const int width_bytes[] =
   sizeof (int),
   sizeof (long int),
   sizeof (unsigned_long_long_int),
+  sizeof (_Float16),
   sizeof (float),
   sizeof (double),
   sizeof (long double)
@@ -477,6 +486,7 @@ PRINT_TYPE (print_int, unsigned int)
 PRINT_TYPE (print_long, unsigned long int)
 PRINT_TYPE (print_long_long, unsigned_long_long_int)
 
+PRINT_FLOATTYPE (print_halffloat, _Float16, ftoastr, FLT_BUFSIZE_BOUND)
 PRINT_FLOATTYPE (print_float, float, ftoastr, FLT_BUFSIZE_BOUND)
 PRINT_FLOATTYPE (print_double, double, dtoastr, DBL_BUFSIZE_BOUND)
 PRINT_FLOATTYPE (print_long_double, long double, ldtoastr, LDBL_BUFSIZE_BOUND)
@@ -824,6 +834,11 @@ decode_one_format (char const *s_orig, char const *s, char const **next,
 
 switch (size_spec)
   {
+  case FLOAT_HALF:
+print_function = print_halffloat;
+field_width = FLT_STRLEN_BOUND_L (decimal_point_len);
+break;
+
   case FLOAT_SINGLE:
 print_function = print_float;
 field_width = FLT_STRLEN_BOUND_L (decimal_point_len);
@@ -1598,6 +1613,9 @@ main (int argc, char **argv)
   for (i = 0; i <= MAX_FP_TYPE_SIZE; i++)
 fp_type_size[i] = NO_SIZE;
 
+#ifdef FLT16_MAX
+  fp_type_size[sizeof (_Float16)] = FLOAT_HALF;
+#endif
   fp_type_size[sizeof (float)] = FLOAT_SINGLE;
   /* The array entry for 'double' is filled in after that for 'long double'
  so that if they are the same size, we avoid any overhead of


bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-01-31 Thread Pádraig Brady

On 30/01/2024 18:31, Paul Eggert wrote:

On 2024-01-30 03:18, Pádraig Brady wrote:

So we now have the proposed change as:

    - revert -n to old silent success behavior
    - document -n as deprecated
    - Leave --update=none as is (will be synonymous with -n)
    - Provide --update=none-fail to diagnose and exit failure


Thanks, that's a better proposal, but I still see several opportunities
for confusion.

If I understand things correctly, cp --update=none is not synonymous
with the proposed (i.e., old-behavior) cp -n, because -n overrides
previous -i options but --update=none does not. Also, -n overrides
either previous or following --update=UPDATE options, but --update=none
overrides only previous --update=UPDATE options. (For what it's worth,
FreeBSD -n overrides


Good point.
Well -n is a protection mechanism really, so should override.
Since --update now incorporates a protection mode, it should also override I 
think.


Some of this complication seems to be for consistency with how mv
behaves with -f, -i, -n, and --update, and similarly with how rm behaves
with -f, -i, -I, and --interactive. To be honest I don't quite
understand the reason for all this complexity, which suggests it should
be documented somewhere (the manual?) if it isn't already.


To my mind the most protective option takes precedence.
So from least to most that would be -f -n --update=none --update=none-fail -i
So the main consideration there is that -n should not override 
--update=none-fail


This raises more questions:

* If we deprecate cp -n, what about mv -n? FreeBSD mv -n behaves like
Coreutils mv -n: it silently does nothing and exits successfully. So
there's no compatibility argument for changing mv -n's behavior.
However, whatever --update option we add to cp (to output a diagnostic
and exit with failure) should surely also be added to mv, to aid
consistency.


Yes I agree.


* Should cp --update=none be changed so that it really behaves like the
old cp -n, in that it overrides other options in ways that differ from
how the other --update=UPDATE options behave? I'm leaning toward "no" as
this adds complexity that I don't see the use for.

* If we don't change cp --update=none's overriding behavior, is it still
OK to tell users to substitute --update=none for -n even though the two
options are not exactly equivalent? I'm leaning towards "yes" but would
like other opinions.


Yes I think we should still give that advice to deprecate -n,
if we ensure that -n does not override --update=none=fail.

thanks,
Pádraig





bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-01-30 Thread Pádraig Brady

On 29/01/2024 21:44, Paul Eggert wrote:

On 1/29/24 08:11, Pádraig Brady wrote:


Right, that's why I'm still leaning towards my proposal in the last mail.


Well, I won't insist on doing nothing; however, the proposal needs
ironing out and now's a good time to do it before installing changes.



    - revert to previous exit success -n behavior
    - document -n as deprecated
    - provide --update=noclobber to give exit failure functionality


So --update=noclobber would differ in meaning from the deprecated-in-9.5
--no-clobber, but would agree in meaning with 9.4 --no-clobber? That
sounds pretty confusing for future users. (And a nit: why should one
spelling have a hyphen but the other doesn't?)


That's a fair point; just avoid "no-clobber" entirely.
We could choose a new name so. How about --update=none-fail


      - BTW, it probably makes sense to print a diagnostic for each
skipped file here
    as it's exceptional behavior, for which we're exiting with
failure for.


Coreutils 9.4 cp -n already does that, no? So I'm not sure what's being
proposed here.

$ touch a b
$ cp -n a b; echo $?
cp: not replacing 'b'
1


Sorry I got confused between your suggestion of added diagnostics,
and FreeBSD's silent behavior here. Looks like we just keep the
existing behavior of diagnosing 'not replacing' iff exiting with failure.


    - the existing --update=none provides the exit success functionality


It seems to me that this proposal conflates two questions:

* What rules should cp use to decide whether to update a destination?

* When cp decides not to update a destination, what should it do? Exit
with nonzero status? Output a diagnostic? Both? Neither?

Aren't these independent axes? If so, shouldn't they have independent
options? For example, since we have --update=older, shouldn't there be a
way to say "I want to copy A to B only if B is older than A, and I want
the exit status to be zero only if A was copied to B"?


Well they're not entirely independent.
It's more appropriate to output a diagnostic if exiting failure,
and POSIX also advises along the same lines.
We also have --verbose to control diagnostics somewhat
for the non failure case.

So we now have the proposed change as:

  - revert -n to old silent success behavior
  - document -n as deprecated
  - Leave --update=none as is (will be synonymous with -n)
  - Provide --update=none-fail to diagnose and exit failure

thanks,
Pádraig.





bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-01-29 Thread Pádraig Brady

On 29/01/2024 14:01, Michael Stone wrote:

On Sun, Jan 28, 2024 at 11:14:14PM -0800, Paul Eggert wrote:

I'm not sure reverting would be best. It would introduce more
confusion, and would make coreutils incompatible with FreeBSD again.


Reverting makes more sense than the current situation. I do not
understand why you seem to value FreeBSD compatibility more than
compatibility with the vast majority of installed coreutils/linux
systems.


Yes, it's not a good place to be. Surely current coreutils is better
than what Debian is doing.


You've introduced a silent incompatibility and I'm trying to find some
way to make that clear. If upstream would provide a better solution I
would certainly use it. I have despaired of there being such since your
attitude thus far seems to be entirely dismissive of compatibility
concerns.


That's a bit unfair.  The current upstream -n behavior is with a view
to being _more_ compat across all systems.
Now I agree this may not be worth it in this case,
but it is a laudable goal.


Another possibility is to add a warning that is emitted only at the
end of 'cp'. The warning would occur only if the exit code differs
because of this cp -n business.


You'd only emit a notification of a change in behavior if some
(potentially uncommon/rarely encountered) situation arises which would
actually trigger breakage? So people can't prepare ahead of time and
change their script to handle the necessary change in logic, they can
only maybe figure out why something broke at 2am when the uncommon event
occurred?


Yes I agree with this point, mostly.
Outputting a diagnostic would help users diagnose what's going on,
and help users to fix forward and avoid their problematic -n usage.
But yes, the crux of the issue with fixing issues as they occur,
is it depends on the state of the destination and so can become an issue
at any time.  Now I previously did an audit with github and debian code search
and noticed very few problematic uses of cp -n, but that does miss
the mountain of private code.


At the end of the day, -n is basically a useless option with unknowable
semantics which should be avoided by everyone. In the past it was an
option which wasn't portable between coreutils/linux and freebsd systems,
and I guess you've "fixed" that (by making it an option everyone should
avoid entirely), but let's be honest about how common that concern was.


Right, that's why I'm still leaning towards my proposal in the last mail.
  - revert to previous exit success -n behavior
  - document -n as deprecated
  - provide --update=noclobber to give exit failure functionality
- BTW, it probably makes sense to print a diagnostic for each skipped file 
here
  as it's exceptional behavior, for which we're exiting with failure for.
  - the existing --update=none provides the exit success functionality

With the above in place for the next coreutils release,
then debian could remove its noisy patch.

thanks,
Pádraig





bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-01-28 Thread Pádraig Brady

On 17/12/2023 14:46, Pádraig Brady wrote:

On 16/12/2023 21:46, Bernhard Voelker wrote:

On 12/15/23 21:13, Michael Stone wrote:

On Fri, Dec 15, 2023 at 11:21:06AM -0800, Paul Eggert wrote:

Stlll, Pádraig gave a reasonable summary of why the change was made,


To clarify my summary a little, there I said that -n now _immediately_ fails.
I should have said _silently_ fails.  I.e. the complete copy operation
proceeds as before, and only the exit status is at issue here.


despite its incompatibility with previous behavior. (One thing I'd add
is that the FreeBSD behavior is inherently less race-prone.)


Whether the implementation is race-prone or not is an internal thing.
I think we're currently discussing more on a user-perspective level.

IIUC then the question is whether `cp -n` should continue to behave like
the (new) `cp --update=none` which returns EXIT_SUCCESS.

Regardless what other implementations do, when reading the -n description
from a user's point of view:

 -n, --no-clobber do not overwrite an existing file (overrides a
-u or previous -i option). See also --update

then I'd expect the tool to just skip existing files like `rsync 
--ignore-existing`
does.  In that regard I would be surprised if skipping files would result in an 
error.
Well, I would understand if there'd be a '--no-clobber=fail' option.


Agreed we should improve the docs a bit for this option.
I'll apply this at least:

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 1f8b356d1..bf0f424d3 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9057,6 +9057,8 @@ Do not overwrite an existing file; silently fail instead.
   This option overrides a previous
   @option{-i} option.  This option is mutually exclusive with @option{-b} or
   @option{--backup} option.
+See also the @option{--update=none} option which will
+skip existing files but not fail.

   @item -P
   @itemx --no-dereference
diff --git a/src/cp.c b/src/cp.c
index 04a5cbee3..3ccc4c4e6 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -192,8 +192,8 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
 -L, --dereferencealways follow symbolic links in SOURCE\n\
   "), stdout);
 fputs (_("\
-  -n, --no-clobber do not overwrite an existing file (overrides 
a\n\
- -u or previous -i option). See also 
--update\n\
+  -n, --no-clobber ensure no existing files overwritten, and 
fail\n\
+ silently instead. See also --update\n\
   "), stdout);
 fputs (_("\
 -P, --no-dereference never follow symbolic links in SOURCE\n\



As Kamil added the option in 2009, I'd assume that the same patch was already
active in RHEL versions for quite some longer time.
Now changing the exit code feels kind of rough.


Well RHEL 6 came out a bit after (2010), and had the --no-clobber change,
while RHEL 5 before that did not.

Taking about distros, it's worth noting that the change is Fedora 39
which has been released for a month now.
We'll keep a close eye on issues, but haven't heard much as
of yet at least.


Therefore, from a pure user's perspective and regarding many years of 
precedence,
I am 80:20 for reverting the exit code change.


Thanks for your thoughts,
appreciated as always.


We were undecided how to proceed as of the above discussion,
with some hoping to consolidate -n behavior across all systems,
with others preferring to keep compat with original Linux behavior.

Things have changed in Debian I see, so that cp -n now aggressively warns like:
https://sources.debian.org/patches/coreutils/9.4-3/cp-n.diff/

  $ cp -n /bin/true tmp
  cp: warning: behavior of -n is non-portable and may change in future; use 
--update=none instead

This is problematic as:

  - It's noisy
  - There is no way to get the behavior of indicating failure if existing files 
present
  - The --update=none advice is only portable to newer coreutils

To summarise existing use cases of -n:

  1. User expected -n to exit failure if existing files
  2. User didn't care about existing status
  3. User expected -n to exit success if existing files

Use case 3 is the problematic case we changed behavior for,
but also I think that's the minority of the 3 use cases
and could have been fixed forward.

The debian advice forces use case 2 to change,
and provides bad advice for use case 1,
and in fact there is no solution now for use case 1.

At this stage it seems best for us go back to the original Linux behiavor (use 
case 3),
and to silently deprecate -n in docs to document the portability issues with it.
We should also provide --update=noclobber for use case 1.
Having the control on the --update option, allows use to more clearly deprecate 
-n.

I do realise folks may expect use case 1 now with -n,
especially on releases like Fedora 39, but given the feedback
that would be the least bad option.

thanks,
Pádraig





bug#68708: [PATCH] env,kill: Handle unnamed signals

2024-01-25 Thread Pádraig Brady

On 24/01/2024 20:40, Grisha Levit wrote:

Android reserves [1] some realtime signals and redefines [2] SIGRTMIN,
leaving a gap between the signals that have SIG* constants defined in
signal.h and SIGRTMIN.

When passed such a signal number, gnulib sig2str returns -1 and leaves
its signame argument unchanged.

The signal listing in env ends up reusing the name of the last printed
valid signal:

 $ env --list-signal-handling true
 HUP( 1): IGNORE
 HUP(32): BLOCK
 HUP(38): IGNORE

..and the corresponding signal numbers are rejected as operands for the
env, kill, and timeout commands.

This patch removes the requirement that sig2str returns 0 for a signal
number associated with an operand, and allows unnamed signals to be in
the sets `env' attempts to manipulate when a --*-signal option is used
with no argument.

This does leave the possibility of numbers lower than SIGNUM_BOUND that
are not valid signals, but I'm not sure that's really a problem for the
existing code paths (if it is, adding checks for sigset_t manipulations
would probably be enough).

To be on the safe side, added a check to report kill(3) EINVAL as a bad
signo (rather than always blaming the pid).

This does not change the default list printed with `kill -l'.  When a
name is to be printed, the signal number associated with an unnamed
signal is used.

[1]: 
https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/platform/bionic/reserved_signals.h
[2]: 
https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/include/signal.h;l=51



This mostly looks good, except:

- No need to clear the errno before kill(3).
- Better to use SIG%d rather than the bare %d for signal _names_, as we already 
parse this format

thanks!
Pádraig





bug#68267: [PATCH] maint: add attributes to two functions without side effects

2024-01-06 Thread Pádraig Brady

On 05/01/2024 16:44, Samuel Tardieu wrote:

* src/date.c (res_width): This function computes its result solely
from the value of its parameter and qualifies for the const attribute.
* src/tee.c (get_next_out): This function has no side effect and
qualifies for the pure attribute.

Those two functions were flagged by GCC 12.3.0.


I'm guessing GCC 12 needs help here as it can't determine the loops are finite.
(as per: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109914 )

Though I'm not seeing this suggestion with GCC 13.2.1,
so perhaps GCC 12 can determine the loops are finite?

I'll apply this since GCC 13 is less that a year old,
but in general we try to avoid littering code like this.

thanks,
Pádraig





bug#68283: ls -l issues on cifs mounted directories

2024-01-06 Thread Pádraig Brady

On 06/01/2024 13:08, Bjoern Voigt via GNU coreutils Bug Reports wrote:

After upgrading coreutils from version 9.3 to 9.4, "ls -l" shows error
messages, if the files or directories are in a cifs/smb3 mounted directory.

Example:
/mnt/cifstest here is mounted with "mount -t cifs -o username=myuser
//192.168.1.2/all /mnt/cifstest/". The server is a Samba 4.19.2 server.
The kernel is version 6.1.71.

# ls -l /mnt/cifstest/
ls: /mnt/cifstest/: Resource temporarily unavailable




"ls -l" from coreutils 9.3 does not show any errors.

Probably it is not just the error message. "dmesg -w" shows several
errors after executing "ls -l":

[17563.999109] smb2_reconnect: 3 callbacks suppressed
[17563.999113] CIFS: VFS: reconnect tcon failed rc = -11



I debugged this problem a bit. The changes in
coreutils-9.4/lib/file-has-acl.c:file_has_acl are probably responsible
for the errors. The system call listxttr produces the error message
"Resource temporarily unavailable". I could not find the reasons for
error messages shown with "dmesg".

The listxattr.c test program in the manual page (man listxattr) shows
the following output:

# ./listxattr /mnt/cifstest/
listxattr: Resource temporarily unavailable

Work-around are to downgrade to coreutils 9.3 or not to use the "-l"
option for ls.


Thanks for the details.
EAGAIN is not a documented return from listxattr,
and it seems like the cifs client should not let that leak.
Perhaps the kernel needs another change along the lines of:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6efa994e35a
(That was included in v5.18).

Perhaps something to ask at linux-c...@vger.kernel.org ?

If this is a common enough issue then it might be appropriate for us to handle,
but given the current info, ls is just informing you of an actual issue on your 
system.

thanks,
Pádraig





bug#68256: df fails on /run/user/1000/doc with "Operation not permitted"

2024-01-05 Thread Pádraig Brady

tag 68256 notabug
close 68256
stop

On 05/01/2024 09:22, Nada Machkova wrote:

hello
I have just upgraded Debian Bullseye
and simple df command respond at user CLI
$ df -hT
df: /run/user/1000/doc: Operation not permitted
...
but when I do the same as root there is NO error.
So I UNmounted relevant file and AFTER that df response has NO error for user
# fusermount -u /run/user/1000/doc
But I need to do it after each reboot :-(

I've just analyzed system status and versions
and checked related bug https://github.com/flatpak/xdg-desktop-portal/issues/553
PLS see the following details and let me know if it is coreutils bug
thank you for your time
Nada


Debian Bullseye (using coreutils 8.32) would need to apply:
https://github.com/coreutils/gnulib/commit/9a38d499ca.patch

That patch was included in the coreutils >= 9.0 releases,
and avoids including this flatpak related fuse.portal mount in the default df 
list.

I also have such a fuse.portal mount on my Fedora 39 system:

  # grep /run/user/1001/doc /proc/mounts
  portal /run/user/1001/doc fuse.portal ...

and weirdly root can statfs() but not stat(), and non root is vice versa :/

  # stat /run/user/1001/doc
  stat: cannot statx '/run/user/1001/doc': Permission denied

  $ stat /run/user/1001/doc
  File: /run/user/1001/doc
  Size: 0   Blocks: 0  IO Block: 4096   directory
  Device: 0,77  Inode: 1   Links: 2
  Access: (0500/dr-x--)  Uid: ..
  Context: system_u:object_r:fusefs_t:s0

  $ stat -f /run/user/1001/doc
  stat: ... '/run/user/1001/doc': Operation not permitted
  # stat -f /run/user/1001/doc
  File: "/run/user/1001/doc"
  ID: 0Namelen: 0   Type: fuseblk
  Block size: 0  Fundamental block size: 0
  Blocks: Total: 0  Free: 0  Available: 0
  Inodes: Total: 0  Free: 0

cheers,
Pádraig





bug#68216: An instance of DD receiving data from another instance with bs=1M, reports the wrong record count.

2024-01-02 Thread Pádraig Brady

On 02/01/2024 22:45, ja...@johnstone.net.au wrote:

Hi Pádraig,

iflag=fullblock seems to resolve this experience and the man page
excerpt provided makes sense though those lines make no appearance in
the man pages on Archlinux. I will follow up with the distro maintainers.


As would explicitly specifying obs on the second dd instance.

cheers,
Pádraig






bug#68216: An instance of DD receiving data from another instance with bs=1M reports the wrong record count.

2024-01-02 Thread Pádraig Brady

forcemerge 8171 68216
stop

On 02/01/2024 10:45, ja...@johnstone.net.au wrote:

Hi all,

It seems an instance of `dd`  with only an output file doesn't report
the count of records accurately despite bs= being described as
influencing both ibs= and obs=

dd'ing from an unstable remote over ssh today I had it continue from the
last received block between crashes in preparation for a P2V. The remote
had bs=1M set with its in=device and the local instance had bs=1M too
with the destination of= device but upon referring to the receiving
end's stable count of records I noticed it ignored my bs=1M setting.
This were worked around using obs=1M to make the counter show a value I
could reliable skip=/seek= with to resume.

I have an example here:

 /$ dd if=/dev/zero bs=1M count=1 | dd bs=1M of=/dev/null
 1+0 records in
 1+0 records out
 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0010021 s, 1.0 GB/s
 *0+16 records in
 0+16 records out*
 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.000771722 s, 1.4 GB/s/

This is being experienced on Archlinux with kernel '6.6.4' and package
'core/coreutils 9.4-2'

dd's man page for this distribution shows:

 /bs=BYTES
    read and write up to BYTES bytes at a time (default:
 512); overrides ibs and obs/

So I feel it not overriding obs= when invoked with only of= and bs=
arguments could be a bug.

Running a single dd with /if=source of=dest bs=1M/ works as intended.

While in my case this isn't a fatal bug working with virtual block
devices it could be problematic for use cases which require a proper
treatment of block sizes such as tape devices.


Well the behavior is expected and documented:

‘bs=BYTES’
 Set both input and output block sizes to BYTES.  This makes ‘dd’
 read and write BYTES per block, overriding any ‘ibs’ and ‘obs’
 settings.  In addition, if no data-transforming ‘conv’ operand is
 specified, input is copied to the output as soon as it’s read, even
 if it is smaller than the block size.

For details please see https://bugs.gnu.org/8171

thanks,
Pádraig.






bug#68064: Date addition error with “day Monthname” versus “Monthname day”

2023-12-27 Thread Pádraig Brady

tag 68064 notabug
close 68064
stop

On 27/12/2023 17:29, Larry Ploetz wrote:

It seems like there might be a problem with date addition when the base
date is specified as “day Monthname” instead of “Monthname day”, where
the offset is being interpreted as an absolute year value. This may be
locale-specific.

 :bin larry$ locale
 LANG="en_US.UTF-8"
 LC_COLLATE="en_US.UTF-8"
 LC_CTYPE="en_US.UTF-8"
 LC_MESSAGES="en_US.UTF-8"
 LC_MONETARY="en_US.UTF-8"
 LC_NUMERIC="en_US.UTF-8"
 LC_TIME="en_US.UTF-8"
 LC_ALL=
 :bin larry$ ./date -d "$(./date -d today +%d\ %b) + 1 day"
 Fri Dec 28 00:00:00 LMT 0001
 :bin larry$ ./date -d "$(./date -d today +%b\ %d) + 1 day"
 Thu Dec 28 00:00:00 CST 2023
 :bin larry$ ./date --version
 date (GNU coreutils) 9.4.97-98d463
 Copyright (C) 2023 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or 
later.
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.


This is due to ambiguity in date input formats.
Your case is:

  $ date --debug -d "27 Dec + 1 day"
  date: parsed date part: (Y-M-D) -001-12-27
  date: parsed relative part: +1 day(s)

A slightly adjusted case highlighting the ambiguity:

  $ date --debug -d "27 Dec + 1 week"
  date: parsed date part: (Y-M-D) -001-12-27
  date: parsed relative part: +7 day(s)

So really the solution here is to avoid the ambiguity
by explicitly specifying the year, or more abstractly
specifying "today" rather than a partial date.

Another gotcha highlighted by the --debug option above is:

  date: warning: using midnight as starting time: 00:00:00
  date: starting date/time: '(Y-M-D) 0001-12-27 00:00:00'
  date: warning: when adding relative days, it is recommended to specify noon

Another gotcha is that only English month abbreviations
are supported on input, so your command will fail in other locales.

For a summary of date input issues see:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-date-command-is-not-working-right_002e
https://www.gnu.org/software/coreutils/manual/html_node/Date-input-formats.html#index-date-input-formats

cheers,
Pádraig





bug#62572: cp --no-clobber behavior has changed

2023-12-17 Thread Pádraig Brady

On 16/12/2023 21:46, Bernhard Voelker wrote:

On 12/15/23 21:13, Michael Stone wrote:

On Fri, Dec 15, 2023 at 11:21:06AM -0800, Paul Eggert wrote:

Stlll, Pádraig gave a reasonable summary of why the change was made,


To clarify my summary a little, there I said that -n now _immediately_ fails.
I should have said _silently_ fails.  I.e. the complete copy operation
proceeds as before, and only the exit status is at issue here.


despite its incompatibility with previous behavior. (One thing I'd add
is that the FreeBSD behavior is inherently less race-prone.)


Whether the implementation is race-prone or not is an internal thing.
I think we're currently discussing more on a user-perspective level.

IIUC then the question is whether `cp -n` should continue to behave like
the (new) `cp --update=none` which returns EXIT_SUCCESS.

Regardless what other implementations do, when reading the -n description
from a user's point of view:

-n, --no-clobber do not overwrite an existing file (overrides a
   -u or previous -i option). See also --update

then I'd expect the tool to just skip existing files like `rsync 
--ignore-existing`
does.  In that regard I would be surprised if skipping files would result in an 
error.
Well, I would understand if there'd be a '--no-clobber=fail' option.


Agreed we should improve the docs a bit for this option.
I'll apply this at least:

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 1f8b356d1..bf0f424d3 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9057,6 +9057,8 @@ Do not overwrite an existing file; silently fail instead.
 This option overrides a previous
 @option{-i} option.  This option is mutually exclusive with @option{-b} or
 @option{--backup} option.
+See also the @option{--update=none} option which will
+skip existing files but not fail.

 @item -P
 @itemx --no-dereference
diff --git a/src/cp.c b/src/cp.c
index 04a5cbee3..3ccc4c4e6 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -192,8 +192,8 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
   -L, --dereferencealways follow symbolic links in SOURCE\n\
 "), stdout);
   fputs (_("\
-  -n, --no-clobber do not overwrite an existing file (overrides 
a\n\
- -u or previous -i option). See also 
--update\n\
+  -n, --no-clobber ensure no existing files overwritten, and 
fail\n\
+ silently instead. See also --update\n\
 "), stdout);
   fputs (_("\
   -P, --no-dereference never follow symbolic links in SOURCE\n\



As Kamil added the option in 2009, I'd assume that the same patch was already
active in RHEL versions for quite some longer time.
Now changing the exit code feels kind of rough.


Well RHEL 6 came out a bit after (2010), and had the --no-clobber change,
while RHEL 5 before that did not.

Taking about distros, it's worth noting that the change is Fedora 39
which has been released for a month now.
We'll keep a close eye on issues, but haven't heard much as
of yet at least.


Therefore, from a pure user's perspective and regarding many years of 
precedence,
I am 80:20 for reverting the exit code change.


Thanks for your thoughts,
appreciated as always.

cheers,
Pádraig






bug#62572: cp --no-clobber behavior has changed

2023-12-15 Thread Pádraig Brady

On 15/12/2023 15:56, Michael Stone wrote:

I tend to think this was a serious mistake: it breaks the behavior of
existing scripts with no deprecation period. A stated advantage is
better compatibility with freebsd, but I don't understand why that is
more desirable than compatibility with all deployed gnu/linux systems? I
also don't think it's sufficient to try to lawyer out by saying that the
current behavior was undocumented: the previous documentation said that
-n would "silently do nothing" and that the return code would be zero on
success. Logically, unless cp fails to "do nothing", it should exit with
a zero code.

Such a drastic change in behavior demands a new flag, not a radical
repurposing of a widely used existing flag.

I was hoping to see more action on this bug, but that hasn't happened.
I'm not sure I see a way forward for debian other than reverting to the
old behavior. I am reluctant to do so as that will likely lead to
divergent behavior between distributions, but breaking scripts without a
compelling reason is also not good. I would encourage coreutils to
reconsider the change and finding a non-breaking way forward.


Yes it's a fair point.
It's an awkward case, and worth discussing.

To summarise:

  coreutils >= 7.1 had -n skip existing in dest (2009)
  coreutils >= 9.2 has -n immediately fail if existing in dest
  coreutils >= 9.3 has --update=none to skip existing in dest

  FreeBSD >= 4.7/macos has -n immediately fail if existing in dest

  bash has noclobber as a file protection mechanism,
  and fails immediately upon trying to overwrite a file.
  This is more consistent with the new coreutils behavior.

I see a reasonable amount of cp -n usage across github:
https://github.com/search?q=/cp+.*+-n+.*/+path:*.sh=code

Now it's not clear which behavior these github usages expect,
and the original docs didn't make it clear which behavior to expect.
A quick scan of the github usages also seem mainly to expect
a protection rather than an update use case, so failing
immediately would be the most appropriate action there too.
Also the original coreutils bug report here expected the new behaviour.

So we probably all agree that failing immediately is the
most appropriate / consistent -n behavior,
but GNU had diverged from that so there are about 10 years
of scripts that may expect the silent skip behavior.

Two options I see are:

- Leave as is and fix -n usages that expected the skip behavior
- Deprecate -n entirely and prompt to use --update={fail,none}

Advantages of leaving as is:
We get consistency of "noclobber" behavior across systems / shells.
We fix cases where previously scripts could have proceeded with
stale old files in place.

Disadvantages of leaving as is:
Users expecting the skip behavior, have to change to --update=none.

There is no potential for data loss etc. so it just comes
down to how disruptive it is, or how often -n was used
with the "skip behavior" assumption.

We've not had much push back as of yet,
and my current thinking is it's not that disruptive a change.
So I'd be 55:45 if favor of keeping things as is.

thanks,
Pádraig.





bug#67756: autoconf-2.72d: bootstrap: The macro 'AC_PROG_GCC_TRADITIONAL' is obsolete.

2023-12-11 Thread Pádraig Brady

On 10/12/2023 21:34, Bjarni Ingi Gislason wrote:

   From:

bootstrap.loc: bootstrapping with --no-git --gnulib-srcdir=/home/bg/git/gnulib 
--skip-po --bootstrap-sync

[...]
autoreconf: running: /usr/local/bin/autoconf --include=m4 
--include=/usr/share/aclocal --force
configure.ac:58: warning: The macro 'AC_PROG_GCC_TRADITIONAL' is obsolete.
configure.ac:58: You should run autoupdate.
../lib/autoconf/c.m4:1673: AC_PROG_GCC_TRADITIONAL is expanded from...
configure.ac:58: the top level
autoreconf: running: /usr/local/bin/autoheader --include=m4 
--include=/usr/share/aclocal --force
[...]

   In "autoconf-2.72d/lib/autoconf/c.m4" is

# AC_PROG_GCC_TRADITIONAL
# ---
AU_DEFUN([AC_PROG_GCC_TRADITIONAL],
   [AC_REQUIRE([AC_PROG_CC])],
   [$0 is obsolete; use AC_PROG_CC]
)


OK pushed the following.
Marking this as done.

thanks,
Pádraig

commit 0d9547474d001e2903c18164d8929f5d2d2c0dd7 (HEAD -> master)
Author: Pádraig Brady 
Date:   Mon Dec 11 17:03:33 2023 +

maint: remove obsolete AC_PROG_GCC_TRADITIONAL

* configure.ac: Remove obsolete macro call.
Recent autoconf warns that it is obsolete.
AC_PROG_CPP sets up the -traditional-cpp option if required.
GCC ignores -traditional since commit f458d1d5 (2002).
Fixes https://bugs.gnu.org/67756

diff --git a/configure.ac b/configure.ac
index eec110790..de57b3216 100644
--- a/configure.ac
+++ b/configure.ac
@@ -55,7 +55,6 @@ gl_ASSERT_NO_GNULIB_POSIXCHECK])
 AC_PROG_CC
 AM_PROG_CC_C_O
 AC_PROG_CPP
-AC_PROG_GCC_TRADITIONAL
 AC_PROG_RANLIB
 AC_PROG_EGREP
 AC_PROG_LN_S






bug#67765: ls: manual entry for -f mentions incorrect option

2023-12-11 Thread Pádraig Brady

On 11/12/2023 09:38, Daniel Hofstetter wrote:

Hi,

The manual entry for -f on
https://www.gnu.org/software/coreutils/manual/html_node/Sorting-the-output.html#index-_002df-9
contains:

"[...] This is equivalent to the combination of --all (-a),
--sort=none (-U), -1, --color=none, and --hyperlink=none [...]."

The "-1" option seems to be incorrect because "ls -f" lists more than
one file per line with ls 9.4.


Right, that description regressed in coreutils 9.0.
I've pushed a fix.

Marking this as done.

thanks,
Pádraig






bug#67731: bootstrap, tries to fetch files although using --no-git --skip-po and --gnulib-srcdir...

2023-12-10 Thread Pádraig Brady

On 09/12/2023 20:22, Bjarni Ingi Gislason wrote:

bootstrap.loc: bootstrapping with --no-git
--gnulib-srcdir=/home/bg/git/gnulib --skip-po --bootstrap-sync
./bootstrap: updating bootstrap and restarting...
./bootstrap: Bootstrapping from checked-out coreutils sources...
./bootstrap: consider installing git-merge-changelog from gnulib
./bootstrap: getting translations into po/.reference for
coreutils...
OUTPUT:INVALID,NEW,REJECT IN= OUT=eth0 SRC=192.168.1.72
DST=80.69.83.146 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=21973 DF
PROTO=TCP
SPT=51544 DPT=443 WINDOW=64240 RES=0x00 SYN URGP=0
OUTPUT:INVALID,NEW,REJECT IN= OUT=eth0 SRC=192.168.1.72
DST=80.69.83.146 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=21974 DF
PROTO=TCP
SPT=51544 DPT=443 WINDOW=64240 RES=0x00 SYN URGP=0
failed: Connection refused.
failed: Network is unreachable.
./bootstrap: could not fetch auxiliary files


It looks like --bootstrap-sync is the issue here.
Adding a little debug shows that upgrade_bootstrap()
does not propagate params to the subsequent bootstrap run:

$ ./bootstrap --gnulib-srcdir=/home/padraig/g/gnulib --no-git --skip-po 
--bootstrap-sync
bootstrap params: --gnulib-srcdir=/home/padraig/g/gnulib --no-git --skip-po 
--bootstrap-sync
./bootstrap: updating bootstrap and restarting...
bootstrap params: --no-bootstrap-sync

The attached adjusts bootstrap so the options are passed
through the upgrade_bootstrap() function.

Marking this as done,

thanks,
PádraigFrom cb632d68ef3cb976c5b366f633478419d1791121 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 10 Dec 2023 14:46:58 +
Subject: [PATCH] bootstrap: fix option propagation with --bootstrap-sync

* build-aux/bootstrap: Ensure options are propagated through
upgrade_bootstrap().
Fixes https://bugs.gnu.org/67731
---
 ChangeLog   | 6 ++
 build-aux/bootstrap | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1a3429e03b..b0d748029d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2023-12-10  Pádraig Brady  
+
+	bootstrap: fix option propagation with --bootstrap-sync
+	* build-aux/bootstrap: Ensure options are propagated through
+	upgrade_bootstrap().
+
 2023-12-01  Paul Eggert  
 
 	frexp, frexpf: pacify clang re address-of-volatile
diff --git a/build-aux/bootstrap b/build-aux/bootstrap
index 979b3af62e..15bda089f6 100755
--- a/build-aux/bootstrap
+++ b/build-aux/bootstrap
@@ -777,7 +777,7 @@ autopull()
   if $use_gnulib || $bootstrap_sync; then
 prepare_GNULIB_SRCDIR
 if $bootstrap_sync; then
-  upgrade_bootstrap
+  upgrade_bootstrap "$@"
 fi
   fi
 
@@ -1496,7 +1496,7 @@ check_build_prerequisites $use_git
 
 if $bootstrap_sync; then
   prepare_GNULIB_SRCDIR
-  upgrade_bootstrap
+  upgrade_bootstrap "$@"
   # Since we have now upgraded if needed, no need to try it a second time below.
   bootstrap_sync=false
 fi
-- 
2.41.0



bug#67690: Bug in command sort?

2023-12-07 Thread Pádraig Brady

tag 67690 notabug
close 67690
stop

On 07/12/2023 14:36, Oleg Moiseichuk via GNU coreutils Bug Reports wrote:

Hello!

I've got a list of IP addresses, each of them is prepended by its frequency 
counter (please find attached in the file list-1.txt). I need to sort them from 
most frequent to least. I tried using this command:
sort -t '.' -n -k 1.1,1.8r -k 1.9 -k 2,2 -k 3,3 -k 4,4 list-1.txt
But I've got some weird results.


Right, once you have multiple delimiters you generally need to adjust the data


Ok, I merged these counters with IP addresses using awk (file list-2.txt). Now 
they use the same separator and I can simplify the command:
sort -t '.' -n -k 1,1r -k 2,2 -k 3,3 -k 4,4 -k 5,5 list-2.txt > sorted-a.txt
It looks like as sorted properly but some entries with the counters 13 and 10 
are misplaced.
Strangely enough, when I use direct order, they are sorted correctly:
sort -t '.' -n -k 1,1 -k 2,2 -k 3,3 -k 4,4 -k 5,5 list-2.txt > sorted-b.txt


You're using the correct approach here, but missed this from the docs:

"A position in a sort field specified with ‘-k’ may have any of the
option letters ‘MbdfghinRrV’ appended to it, in which case no global
ordering options are inherited by that particular field."

I.e. the 'r' is cancelling out the global 'n'.

So you need to specify both options from that field like:

sort -t '.' -n -k 1,1rn -k 2,2 -k 3,3 -k 4,4 -k 5,5 list-2.txt

cheers,
Pádraig

p.s. the --debug option can be useful with sort to
help identify what's being compared, and various edge cases.





bug#67680: nohup is not giving immunity to SIGHUP

2023-12-07 Thread Pádraig Brady

tag 67680 notabug
close 67680
stop

On 07/12/2023 08:13, Bob Hepple wrote:

$ nohup --version
nohup (GNU coreutils) 9.3
$ uname -a
Linux achar-void 6.5.12_1 #1 SMP PREEMPT_DYNAMIC Mon Nov 20 18:31:50
UTC 2023 x86_64 GNU/Linux

This is on void, but it also behaves the same on fedora-38

I have extracted a minimal example - it looks like a stupid thing to
do but it demonstrates the problem. A full and non-stupid script is at

https://gitlab.com/wef/dotfiles/-/blob/master/bin/fzf-launcher

Nevertheless - the following commands are run from bash in a graphical
session. 'foot' is a terminal emulator - any terminal emulator can be
used - I have tried mrxvt, urxvt, alacritty.

'imv' is a simple program that opens a window. Anything else with a
.desktop file could be used, such as 'rofi' or 'xsane'.

$ foot -e bash -c "nohup gtk-launch imv &"
... fails - the program imv briefly opens a window and immediately closes

$ foot -e bash -c "nohup gtk-launch imv & sleep 3"
 the 'imv' window shows but it is closed after 3 seconds - clearly
the program 'imv' starts but is killed by something when the script
terminates.

$ foot -e bash -c "trap '' HUP; nohup gtk-launch imv &"
... works perfectly - the 'something' that is killing imv is
apparently the HUP signal! So why does nohup not grant immunity to
HUP?


In the last case you're telling bash to ignore HUP,
so I suspect in the other case bash is causing the exit.
I'd look into `strace -e signal ...` to diagnose what's
happening exactly, and perhaps setsid(1) may be useful for your case.

thanks,
Pádraig





bug#67656: 'man touch' text about --time hard to read compared to 'touch --help'

2023-12-06 Thread Pádraig Brady

On 06/12/2023 09:03, Arsen Arsenović via GNU coreutils Bug Reports wrote:

Hi,

In the manpage of touch, 'touch' flag --time is described as:

--time=WORD
   change the specified time: WORD is access, atime, or use:
   equivalent to -a WORD is modify or mtime: equivalent to -m

... whereas, in the --help text, --time is described as:

 --time=WORDchange the specified time:
  WORD is access, atime, or use: equivalent to -a
  WORD is modify or mtime: equivalent to -m

The latter is far more readable, and the manpage appears to be
mis-formatted (perhaps by help2man?).

IMO, the help text should also say 'if WORD is ..., or ...: ...' to
provide further clarity in the text.

The formatting issue is more pressing, however, as it makes the manpage
highly confusing compared to the help or manual texts.

Could this be corrected?


I pushed the following:

  --time=WORDspecify which time to change:
   access time (-a): 'access', 'atime', 'use';
   modification time (-m): 'modify', 'mtime'

Marking this as done.

thanks,
Pádraig





bug#67626: column with -t and piped extends a single line file into multiple lines

2023-12-04 Thread Pádraig Brady

tag 67626 notabug
close 67626
stop

On 04/12/2023 16:43, Remigiusz Suwalski via GNU coreutils Bug Reports wrote:

Hi,
today I have discovered accidentally that column utility behaves oddly when 
piped to another command, as shown on example below. I am not sure whether this 
is a bug or intended behaviour.

First "column" built from sources (commit 
https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/commit/?id=94feb5a20d23904cc15cd857c4e00f35f495116c):
$ echo '_ __ ___  __ ' | ./column -t | nl # strange
1 __
2 ___
3 ___
4 _
5 __
6 ___



Bugs for column(1) should go to:

  E-MAIL: util-li...@vger.kernel.org
  Web:https://github.com/util-linux/util-linux/issues

cheers,
Pádraig





bug#67619: stty on Chromebook

2023-12-04 Thread Pádraig Brady

tag 67619 notabug
close 67619
stop

On 04/12/2023 07:41, Dan Jacobson wrote:

Seen on Chromebook:
$ stty speed
38400
$ stty speed 1200
38400
$ stty speed 1200
1200

coreutils 9.4


"speed" only prints the current setting.
A number on its own, sets the input and output speeds.
So depending on the order of the supplied "speed" or ,
it will print the current speed before or after it is set.
If you want to set speed with explicit directives you would need:
  stty ispeed 1200 ospeed 1200

cheers,
Pádraig





bug#67593: `split --number=l/N` no longer splits evenly

2023-12-03 Thread Pádraig Brady

On 03/12/2023 09:37, Paul Eggert wrote:

That's not a bug, in that 'split' is behaving as documented. The first
input line is one byte shorter than the second one. 'Split' divides the
input into two regions, and because the first region happens to be one
byte longer than the second region both input lines are sent to the
first output file.

In older coreutils, 'split' used a different algorithm to compute region
sizes, which worked better for your test case but considerably worse in
others. For example, in older coreutils:

seq 50 >in
split -n l/71 in

created 43 files of size 0, 9 files of size 2, 18 files of size 3, and
one file of size 69. Current coreutils splits much better: it creates 21
files of size 0, 9 files of size 2, and 41 files of size 3.


Related to this, I think it would be useful to add a new
split --number=L/N` mode (note the capital L), which tries harder
to evenly distribute lines.
It would only be supported when we can determine the number of lines up front,
and so wouldn't be supported when reading from a pipe for e.g.

cheers,
Pádraig.





bug#67490: [PATCH v2] tail: fix tailing sysfs files on large page kernels

2023-12-01 Thread Pádraig Brady

On 01/12/2023 01:54, Paul Eggert wrote:

On 11/30/23 12:11, Pádraig Brady wrote:

Though that will generally give 128K, which is good when processing all
of a file,
but perhaps overkill when processing just the last part of a file.


The 128 KiB number was computed as being better for apps like 'sed' that
typically read all or most of the file. 'tail' sometimes behaves that
way (e.g., 'tail -c +10') and so 'tail' should use 128 KiB in those
cases. The simplest way to do that is for 'tail' to use 128 KiB all the
time - that would cost little for uses like plain 'tail' and it could be
a significant win for uses like 'tail -c +10'.


Yes I agree we should use io_blksize() in other routines in tail
where we may dump lots of a file. However in this (most common) case
the routine is dealing with the end of a regular file,
so it's probably best to somewhat minimize the amount of data read,
and more directly check the page_size which is issue at hand.
I've pushed the fix at https://github.com/coreutils/coreutils/commit/73d119f4f
where the adjustment (which also corresponds to what we do in wc) is:

  if (sb->st_size % page_size == 0)
bufsize = MAX (BUFSIZ, page_size);

I'll follow up with another patch to address the performance aspect,
which uses io_blksize() where appropriate.


(As an aside, the 128 KiB number was computed in 2014. These days 256
KiB might be better if someone could take the time to measure)


I periodically check with the documented script,
but I should update the comment when I do that.
I'll update the date in the comment now at least.
I quickly tested a few systems here,
which suggests 256KiB _may_ be a more appropriate default now.
More testing required.

Marking this as done.

thanks,
Pádraig





bug#67490: [PATCH v2] tail: fix tailing sysfs files on large page kernels

2023-11-30 Thread Pádraig Brady

Much clearer thanks.

On my system:

  $ stat /sys/kernel/profiling
File: /sys/kernel/profiling
Size: 4096  Blocks: 0  IO Block: 4096   regular file

I can easily repro by setting the buffer size < PAGE_SIZE.

So this patch handles the case where sysfs reports a file is a certain size,
but it isn't really. In that case seeking to anywhere other than the start
doesn't give an error, but reading returns nothing. So we use a buffer size
large enough (>= PAGE_SIZE as inferred from st_blksize) so that we'll be
reading from the start of the file in this case.

Note st_blksize can have unusual values, so it might be better
to use the io_blksize() wrapper to sanitize the values.
Though that will generally give 128K, which is good when processing all of a 
file,
but perhaps overkill when processing just the last part of a file.

So perhaps it's better to use buffer_size = MAX (BUFSIZ, getpagesize ())
and avoid all the st_blksize edge cases.

I'll think about it a little, and make the adjustments.

thanks!
Pádraig.





bug#67490: [PATCH] tail: fix following /proc and /sys files when using a 64K page size

2023-11-27 Thread Pádraig Brady

On 27/11/2023 16:24, dann frazier wrote:

* src/tail.c (file_lines): Use fstat() to determine a file's block
size and dynamically allocate a buffer of that size for traversing
backwards.


Thanks for the patch.
Could you describe it a bit more.
What happens if we use smaller reads?
Also what about all the other safe_read() calls in tail.c ?

thanks,
Pádraig






bug#47103: numfmt: invalid suffix 'k'

2023-11-26 Thread Pádraig Brady

On 26/11/2023 16:09, Sven Köhler wrote:

So Pádraig's patch does allow for parsing lowercase k, but it does not
change numfmt to use lowercase k in its output in si mode.

As Pádraig has shown, ls uses lowercase k in --si mode. So it uses
lowercase k for 1000. I think that numfmt should behave the same for
consistency reasons.


It does output lowercase 'k' in SI mode.

Attached is the full patch.

Marking this as done.

Will push this tomorrow.

thanks,
PádraigFrom 48b1b54d458897ebec98fe53139cb6b972ecc274 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 26 Nov 2023 16:41:56 +
Subject: [PATCH] numfmt: support lowercase 'k' for Kilo and Kibi

For consistency with the "SI" standard, and with other coreutils
which output a lowercase 'k' in "SI" mode.

* src/numfmt.c (suffix_power): Treat 'k' like 'K' on input.
(double_to_human): Output lowercase 'k' in SI mode.
(usage): Adjust accordingly.
* doc/coreutils.texi: Mention 'k' accepted, and printed in SI mode.
* tests/misc/numfmt.pl: Adjust accordingly.
* NEWS: Mention the change in behavior.
Fixes https://bugs.gnu.org/47103
---
 NEWS |   3 +
 doc/coreutils.texi   |  35 +
 src/numfmt.c |  11 ++-
 tests/misc/numfmt.pl | 170 +++
 4 files changed, 119 insertions(+), 100 deletions(-)

diff --git a/NEWS b/NEWS
index b1088f683..1a08076a4 100644
--- a/NEWS
+++ b/NEWS
@@ -33,6 +33,9 @@ GNU coreutils NEWS-*- outline -*-
   ls --dired now implies long format output without hyperlinks enabled,
   and will take precedence over previously specified formats or hyperlink mode.
 
+  numfmt will accept lowercase 'k' to indicate Kilo or Kibi units on input,
+  and user lowercase 'k' when outputting such units in '--to=si' mode.
+
   wc no longer ignores encoding errors when counting words.
   Instead, it treats them as non white space.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 75a310fc8..7ad8c0db7 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -19109,7 +19109,7 @@ The default is no scaling, meaning all the digits of the number are printed.
 @opindex --to-unit
 Specify the output unit size (instead of the default 1).  Use this option when
 the output numbers represent other units (e.g. to represent @samp{4,000,000}
-bytes in blocks of 1KB, use @samp{--to=si --to-unit=1000}).
+bytes in blocks of 1kB, use @samp{--to=si --to-unit=1000}).
 Suffixes are handled as with @samp{--from=auto}.
 
 @optZeroTerminated
@@ -19137,7 +19137,8 @@ For output numbers, values larger than 1000 will be rounded, and printed with
 one of the following suffixes:
 
 @example
-@samp{K}  =>  @math{1000^1 = 10^3} (Kilo)
+@samp{K}  =>  @math{1000^1 = 10^3} (Kilo) (uppercase accepted on input)
+@samp{k}  =>  @math{1000^1 = 10^3} (Kilo) (lowercase used on output)
 @samp{M}  =>  @math{1000^2 = 10^6} (Mega)
 @samp{G}  =>  @math{1000^3 = 10^9} (Giga)
 @samp{T}  =>  @math{1000^4 = 10^{12}} (Tera)
@@ -19157,7 +19158,8 @@ For output numbers, values larger than 1024 will be rounded, and printed with
 one of the following suffixes:
 
 @example
-@samp{K}  =>  @math{1024^1 = 2^{10}} (Kibi)
+@samp{K}  =>  @math{1024^1 = 2^{10}} (Kibi) (uppercase used on output)
+@samp{k}  =>  @math{1024^1 = 2^{10}} (Kibi) (lowercase accepted on input)
 @samp{M}  =>  @math{1024^2 = 2^{20}} (Mebi)
 @samp{G}  =>  @math{1024^3 = 2^{30}} (Gibi)
 @samp{T}  =>  @math{1024^4 = 2^{40}} (Tebi)
@@ -19182,7 +19184,8 @@ For output numbers, values larger than 1024 will be rounded, and printed with
 one of the following suffixes:
 
 @example
-@samp{Ki}  =>  @math{1024^1 = 2^{10}} (Kibi)
+@samp{Ki}  =>  @math{1024^1 = 2^{10}} (Kibi) (uppercase used on output)
+@samp{ki}  =>  @math{1024^1 = 2^{10}} (Kibi) (lowercase accepted on input)
 @samp{Mi}  =>  @math{1024^2 = 2^{20}} (Mebi)
 @samp{Gi}  =>  @math{1024^3 = 2^{30}} (Gibi)
 @samp{Ti}  =>  @math{1024^4 = 2^{40}} (Tebi)
@@ -19212,7 +19215,7 @@ are interpreted as @emph{IEC} values.
 Converting a single number from/to @emph{human} representation:
 @example
 $ numfmt --to=si 50
-500K
+500k
 
 $ numfmt --to=iec 50
 489K
@@ -19258,9 +19261,9 @@ output sizes in human-readable format):
 @example
 # Third field (file size) will be shown in SI representation
 $ ls -log | numfmt --field 3 --header --to=si | head -n4
--rw-r--r--  1 94K Aug 23  2011 ABOUT-NLS
--rw-r--r--  13.7K Jan  7 16:15 AUTHORS
--rw-r--r--  1 36K Jun  1  2011 COPYING
+-rw-r--r--  1 94k Aug 23  2011 ABOUT-NLS
+-rw-r--r--  13.7k Jan  7 16:15 AUTHORS
+-rw-r--r--  1 36k Jun  1  2011 COPYING
 -rw-r--r--  1   0 Jan  7 15:15 ChangeLog
 
 # Second field (size) will be shown in IEC representation
@@ -19277,30 +19280,30 @@ Output can be tweaked using @option{--padding} or @option{--format}:
 @example
 # Pad to 10 characters, right-aligned
 $ du -s * | numfmt --to=si --padding=10
-  2.5K config.log
+  2.5k config.log
108 config.status
-

bug#47103: numfmt: invalid suffix 'k'

2023-11-26 Thread Pádraig Brady

On 25/11/2023 21:27, Sven Köhler wrote:

Not only --from=si is broken. Also --to=si is broken:

$ numfmt --to=si 3000
3,0K

In order to not break backwards compatibility, you probably have to
introduce a switch --lowercase-kilo such that --to=si produces proper SI
compliant output. Then have --from=si accept both uppercase and lowercase k.

I have to say, that uppercause K is quite common, but it is not correct
as far as SI prefixes are concerned.

Also note that Ki in iec-i mode is quite correct. I'm torn about iec
mode. I believe that people silently switch 1000 for 1024 and use the
lower case k as well as uppercase K. Maybe numfmt should have an option
to accept/produce both here as well?

Is there really a standard/specification that allows k/K for 1024?
Wikipedia only lists Ki as IEC prefixes.


I was thinking we only supported uppercase K for compat with output from 
existing coreutils.
But in fact it's quite the opposite. Other coreutils output lowercase k
when operating in SI mode. For e.g. this gives an error:

  $ ls -lh --si /bin/ | numfmt --from=si --field=5
  numfmt: invalid suffix in input: ‘54k’

So we should at least accept lowercase k.

As for outputting lowercase k for the SI case, the coreutils texinfo has the 
following
in relation to these Kilo prefixes:

  ‘kB’
   kilobyte: 10^3 = 1000.
  ‘k’
  ‘K’
  ‘KiB’
   kibibyte: 2^{10} = 1024.  ‘K’ is special: the SI prefix is ‘k’ and
   the ISO/IEC 8-13 prefix is ‘Ki’, but tradition and POSIX use
   ‘k’ to mean ‘KiB’.

So one might be conservative here and keep outputting uppercase K in SI mode.
However the above is really in relation to specifying block sizes, to df or dd 
etc.,
so we should probably output lower case k for consistency with other coreutils 
at least.
We could be conservative here and have a new --to=Si option (note the casing)
to explicitly select/allow variable cased SI prefixes, but I'm not sure that's 
needed.

For IEC mode, we should could just allow uppercase K,
but it's simpler and more flexible to accept lowercase k here, without much 
ambiguity.

As for not allowing uppercase K in SI mode, that's probably overkill,
and would cause more problems than it would solve. One edge case
it would solve is when working with a Kelvin suffix, to avoid
the ambiguity in the first case of the following.
That's too much of an edge case to worry about I think:

  $ numfmt --suffix=K --from=si 500K
  500K
  $ numfmt --suffix=K --from=si 500M
  5K
  $ numfmt --suffix=K --from=si 500KK
  50K


The attached make the adjustment to allow 'k' always,
and output 'k' in SI mode. Tests will need adjusting,
but no need to clutter the discussion patch with that.

cheers,
Pádraig.diff --git a/src/numfmt.c b/src/numfmt.c
index a5bdd2f4f..3a684e709 100644
--- a/src/numfmt.c
+++ b/src/numfmt.c
@@ -228,7 +228,7 @@ default_scale_base (enum scale_type scale)
 }
 }
 
-static char const zero_and_valid_suffixes[] = "0KMGTPEZYRQ";
+static char const zero_and_valid_suffixes[] = "0KkMGTPEZYRQ";
 static char const *valid_suffixes = 1 + zero_and_valid_suffixes;
 
 static inline bool
@@ -242,6 +242,7 @@ suffix_power (const char suf)
 {
   switch (suf)
 {
+case 'k':  /* kilo.  */
 case 'K':  /* kilo or kibi.  */
   return 1;
 
@@ -811,7 +812,7 @@ double_to_human (long double val, int precision,
   int prec = user_precision == -1 ? show_decimal_point : user_precision;
 
   return snprintf (buf, buf_size, fmt, prec, val,
-   suffix_power_char (power),
+   power == 1 && scale == scale_SI ? "k" : suffix_power_char (power),
&"i"[! (scale == scale_IEC_I && 0 < power)],
suffix ? suffix : "");
 }
@@ -946,12 +947,13 @@ UNIT options:\n"), stdout);
   fputs (_("\
   auto   accept optional single/two letter suffix:\n\
1K = 1000,\n\
+   1k = 1000,\n\
1Ki = 1024,\n\
1M = 100,\n\
1Mi = 1048576,\n"), stdout);
   fputs (_("\
   si accept optional single letter suffix:\n\
-   1K = 1000,\n\
+   1k = 1000,\n\
1M = 100,\n\
...\n"), stdout);
   fputs (_("\


bug#67160: defect with linux expr command on chromebook build 118

2023-11-14 Thread Pádraig Brady

tag 67160 notabug
close 67160
stop

On 14/11/2023 06:42, r n wrote:

I have a lenovo slim 3 chromebook with linux version 5.15.130-etc and when
expr encounter an asterisk for multiplying args, it return a syntax error
unexpected argument 'code 1'. All other arithmetic operators are correctly
interpreted. Please let me know if you need more information to reproduce
the issue.


This sounds like the '*' is not appropriately quoted,
and the shell is expanding it.

For details see:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#expr-2-_002a-3-does-not-work

thanks,
Pádraig





bug#66919: ls -l output is misaligned

2023-11-03 Thread Pádraig Brady

On 03/11/2023 15:00, Vitaly Chikunov wrote:

Hi,

coreutils 9.4.0.24.75e248 seems to have regression where ls -l output is
misaligned starting from size column. Example:

   /tmp$ ls -la
   total 8
   drwxrwxrwt 18 root root 380 Nov  3 17:50 .
   drwxr-xr-x 26 root root 4096 Jul 24 18:44 ..
   drwxrwxrwt  2 root root 40 Nov  2 15:20 .ICE-unix
   -r--r--r--  1 root root 11 Nov  2 15:20 .X0-lock
   drwxrwxrwt  2 root root 60 Nov  2 15:20 .X11-unix
   drwxrwxrwt  2 root root 40 Nov  2 15:20 .XIM-unix
   drwxrwxrwt  2 root root 40 Nov  2 15:20 .font-unix
   drwxr-xr-x  5 root root 100 Nov  2 15:21 .private

Thanks,

Downstream bug report for the record:
   [1] https://bugzilla.altlinux.org/47724


Fixed with:
https://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=7f2c97a24

Marking this as done.

thanks!
Pádraig





bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive

2023-10-25 Thread Pádraig Brady

On 25/10/2023 02:30, Paul Eggert wrote:

On 10/23/23 06:08, Pádraig Brady wrote:

However the default operation should be the
most common requirement (and also the RFC documented operation in this
case).
A similar case I hit very frequently is pasting hex into bc, and it's
very annoying to have to convert to uppercase before doing this.


Doesn't the isbase16 function also need updating?


Good spot. I'm adding this, and also a test
for this --ignore-garbage case.

diff --git a/src/basenc.c b/src/basenc.c
 isbase16 (char ch)
 {
-  return ('0' <= ch && ch <= '9') || ('A' <= ch && ch <= 'F');
+  return isxdigit (to_uchar (ch));
 }



Also, shouldn't we do something similar for base 32, for consistency?


I was wondering about that.

I previously checked the RFC which didn't mention lower case for base32.
But thinking about it more we probably should allow lower case for base32.

This is also related to the base64 padding change I think, in that we
might add a --strict option to only accept canonical (upper case base32) inputs.
That would also only accept canonical padding, and canonical encoding
of the trailing bits. For example would reject 'SGVsbG9='. See:
https://eprint.iacr.org/2022/361.pdf

thanks,
Pádraig.





bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive

2023-10-23 Thread Pádraig Brady

On 23/10/2023 13:50, Niels Möller wrote:

Pádraig Brady  writes:


Will apply the attached later.
Marking this as done.


Thanks! It would make some sense to me to also have options
--upper/--lower; on encoding, they would specify case of the output, on
decoding, they would reject the other case (with default being to accept
either). But less important than fixing the default behavior.


I was thinking `tr '[:lower:]' '[:upper:]'` would suffice for that when 
encoding.
When decoding I don't see much need for the strictness, but that could
also be enforced easily by prefiltering with something like `tr 'A-F' x`

The same argument could be made of course for not needing this patch at all,
by prefiltering through tr.  However the default operation should be the
most common requirement (and also the RFC documented operation in this case).
A similar case I hit very frequently is pasting hex into bc, and it's
very annoying to have to convert to uppercase before doing this.


+  basenc --base16 -d no supports lower case hexadecimal characters.
+  Previously an error was given for lower case hex digits.


s/ no / now /


Thanks, pushed.

Pádraig.






bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive

2023-10-23 Thread Pádraig Brady

On 23/10/2023 10:37, Niels Möller wrote:

Hi,

the docs for basenc --base16 says "hex encoding (RFC4648 section 8)".
The referenced section in that RFC says

   Essentially, Base 16 encoding is the standard case-insensitive hex
   encoding and may be referred to as "base16" or "hex".

I think it would be both more useful, and consistent with docs, if
basenc -d --base16 accepted either upper- or lowercase hex digits.

Current behavior, with basenc (GNU coreutils) 9.1:

   $ echo 666F6F0A |basenc --base16 -d
   foo
   $ echo 666F6f0A |basenc --base16 -d
   fobasenc: invalid input

I think both inputs should give the same output, "foo\n", at least by
default. Possibly configurable with options like --strict, --upper,
--lower, etc (--upper/--lower would be useful also for the --base16
encoding, i.e., no -d).


Agreed.
Will apply the attached later.
Marking this as done.

thanks,
PádraigFrom 69f8e90185e518d1722ed6a036f4b18779553e49 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Mon, 23 Oct 2023 12:51:19 +0100
Subject: [PATCH] basenc: --base16: support lower case hex digits

* src/basenc.c (base16_decode_ctx): Convert to uppercase
before converting from hex.
* tests/basenc/basenc.pl: Add a test case.
* NEWS: Mention the change in behavior.
Addresses https://bugs.gnu.org/66698
---
 NEWS   | 3 +++
 src/basenc.c   | 2 +-
 tests/basenc/basenc.pl | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 93f98b99d..56c2a4785 100644
--- a/NEWS
+++ b/NEWS
@@ -19,6 +19,9 @@ GNU coreutils NEWS-*- outline -*-
   base32 and base64 no longer require padding when decoding.
   Previously an error was given for non padded encoded data.
 
+  basenc --base16 -d no supports lower case hexadecimal characters.
+  Previously an error was given for lower case hex digits.
+
   ls --dired now implies long format output without hyperlinks enabled,
   and will take precedence over previously specified formats or hyperlink mode.
 
diff --git a/src/basenc.c b/src/basenc.c
index 12021e900..74cf03a49 100644
--- a/src/basenc.c
+++ b/src/basenc.c
@@ -577,7 +577,7 @@ base16_decode_ctx (struct base_decode_context *ctx,
   continue;
 }
 
-  int nib = *in++;
+  int nib = c_toupper (*in++);
   if ('0' <= nib && nib <= '9')
 nib -= '0';
   else if ('A' <= nib && nib <= 'F')
diff --git a/tests/basenc/basenc.pl b/tests/basenc/basenc.pl
index de20d2dbc..2b0e79e93 100755
--- a/tests/basenc/basenc.pl
+++ b/tests/basenc/basenc.pl
@@ -159,6 +159,7 @@ my @Tests =
  ['b16_7', '--base16 -d', {IN=>'G'}, {EXIT=>1},
   {ERR=>"$prog: invalid input\n"}],
  ['b16_8', '--base16 -d', {IN=>"AB\nCD"}, {OUT=>"\xAB\xCD"}],
+ ['b16_9', '--base16 -d', {IN=>lc ($base16_out)},  {OUT=>$base16_in}],
 
 
 
-- 
2.41.0



bug#66519: b2sum, md5sum sha*sum etc broken on filenames including backslash

2023-10-13 Thread Pádraig Brady

tag 66519 notabug
close 66519
stop

On 13/10/2023 13:31, Simon Richter M. Sc. wrote:

I noticed some broken checksums with leading backslash and wrong
filenames in my checksum files because the original filenames contained
a backslash.

Way to reproduce:
% touch test\\test.file
% b2sum test\\test.file

expected output:
786a02f742015903c6c6fd852552d272912f4740e15847618a86e217f71f5419d25e1031afee585313896444934eb04b903a685b1448b755d56f701afe9be2ce
   test\test.file

real broken output:
\786a02f742015903c6c6fd852552d272912f4740e15847618a86e217f71f5419d25e1031afee585313896444934eb04b903a685b1448b755d56f701afe9be2ce
   test\\test.file


This is expected.
File names with problematic characters like \n are escaped as above.
Note we escape '\' itself to provide some forward compatibility
to introduce escaping of other characters.


Tested with coreutils 9.3 and coreutils 9.4 and LC_ALL=C
Btw any chance we get b3sum included in coreutils?


Yes we'll look at this,
but it will only be provided through the `cksum -a blake3` interface.

cheers,
Pádraig





bug#66265: Make padding optional with base64

2023-10-07 Thread Pádraig Brady

Pushed.

Marking this as done.

cheers,
Pádraig





bug#66265: Make padding optional with base64

2023-10-06 Thread Pádraig Brady

On 29/09/2023 15:11, Pádraig Brady wrote:

On 29/09/2023 10:46, Paul Millar wrote:

Hi,

RFC 4648 says[1]:
   >   In some circumstances, the use of padding ("=") in base-encoded data
   >   is not required or used.

Currently, the 'base64' application always includes the padding when
encoding, and prints an warning/error message (on stderr) if padding is
omitted when decoding.  Decoding is nonetheless successful (the correct
data is emitted on stdout) if the base64-encoded data omits the padding.

I think the base64 application should be updated to support
base64-encoded data without padding.

My suggestion would be to add an option to base64 to control whether
padding is added when encoding.  For decoding, it might make sense to
add an option to control whether padding is expected.

(although, other approaches might be possible)

Cheers,
Paul.

[1] https://datatracker.ietf.org/doc/html/rfc4648#section-3.2


I agree with this actually.

The main advantage of padding as I see it is when concatenating encoded data.
I.e. that's the only use case where there could be ambiguity in the received
encoded data, as otherwise one can auto assume appropriate padding based on
the input length. I can't see `base64` or `basenc --base64` being used in a
problematic streaming context like that. If the utils themselves read from
a stream in a _single invocation_ they'll deal with partial blocks 
appropriately.
Looking at it another way, I don't see how base64 data over _separate 
invocations_
of these utils could be handled now, as they just give errors in this case 
anyway.
Now there would be a slight loss in error detection as truncated data would
not be noticed, however that is the case for a third of truncated sizes already.

So on input I'm inclined to auto pad.

On output one can trivially remove padding with `tr -d =`,
so I'm inclined to leave that as is.

I.e. we shouldn't need a new option for any of this.


Implementation attached

cheers,
PádraigFrom 6b00e41e5ae6de8e9f269172e62cb40d73fce5c9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Thu, 5 Oct 2023 17:00:51 +0100
Subject: [PATCH] basenc: auto pad base32 and base64 inputs when decoding

Padding of encoded data is useful in cases where
base64 encoded data is concatenated / streamed.
I.e. where there are padding chars _within_ the stream.
In other cases padding is optional and can be inferred.
Note we continue to treat partially padded encoded data
as invalid, as that may be indicative of truncation.

* src/basenc.c (do_decode): Auto pad the end of the input.
* NEWS: Mention the change in behavior.
* tests/misc/base64.pl: Adjust to not fail for missing padding.
Addresses https://bugs.gnu.org/66265
---
 NEWS |  3 +++
 src/basenc.c | 62 +---
 tests/misc/base64.pl |  6 ++---
 3 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index 18f80cb4c..93f98b99d 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,9 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Changes in behavior
 
+  base32 and base64 no longer require padding when decoding.
+  Previously an error was given for non padded encoded data.
+
   ls --dired now implies long format output without hyperlinks enabled,
   and will take precedence over previously specified formats or hyperlink mode.
 
diff --git a/src/basenc.c b/src/basenc.c
index ce259c482..12021e900 100644
--- a/src/basenc.c
+++ b/src/basenc.c
@@ -23,6 +23,7 @@
 #include 
 
 #include "system.h"
+#include "assure.h"
 #include "c-ctype.h"
 #include "fadvise.h"
 #include "quote.h"
@@ -172,10 +173,37 @@ from any other non-alphabet bytes in the encoded stream.\n"),
   exit (status);
 }
 
+#if BASE_TYPE != 64
+static int
+base32_required_padding (int len)
+{
+  int partial = len % 8;
+  return partial ? 8 - partial : 0;
+}
+#endif
+
+#if BASE_TYPE != 32
+static int
+base64_required_padding (int len)
+{
+  int partial = len % 4;
+  return partial ? 4 - partial : 0;
+}
+#endif
+
+#if BASE_TYPE == 42
+static int
+no_required_padding (int len)
+{
+  return 0;
+}
+#endif
+
 #define ENC_BLOCKSIZE (1024 * 3 * 10)
 
 #if BASE_TYPE == 32
 # define BASE_LENGTH BASE32_LENGTH
+# define REQUIRED_PADDING base32_required_padding
 /* Note that increasing this may decrease performance if --ignore-garbage
is used, because of the memmove operation below.  */
 # define DEC_BLOCKSIZE (1024 * 5)
@@ -191,6 +219,7 @@ static_assert (DEC_BLOCKSIZE % 40 == 0); /* Complete encoded blocks are used. */
 # define isbase isbase32
 #elif BASE_TYPE == 64
 # define BASE_LENGTH BASE64_LENGTH
+# define REQUIRED_PADDING base64_required_padding
 /* Note that increasing this may decrease performance if --ignore-garbage
is used, because of the memmove operation below.  */
 # define DEC_BLOCKSIZE (1024 * 3)
@@ -208,6 +237,7 @@ static_assert (DEC_BLOCKSIZE % 12 == 0); /* Complete enco

bug#66265: Make padding optional with base64

2023-09-29 Thread Pádraig Brady

On 29/09/2023 10:46, Paul Millar wrote:

Hi,

RFC 4648 says[1]:
  >   In some circumstances, the use of padding ("=") in base-encoded data
  >   is not required or used.

Currently, the 'base64' application always includes the padding when
encoding, and prints an warning/error message (on stderr) if padding is
omitted when decoding.  Decoding is nonetheless successful (the correct
data is emitted on stdout) if the base64-encoded data omits the padding.

I think the base64 application should be updated to support
base64-encoded data without padding.

My suggestion would be to add an option to base64 to control whether
padding is added when encoding.  For decoding, it might make sense to
add an option to control whether padding is expected.

(although, other approaches might be possible)

Cheers,
Paul.

[1] https://datatracker.ietf.org/doc/html/rfc4648#section-3.2


I agree with this actually.

The main advantage of padding as I see it is when concatenating encoded data.
I.e. that's the only use case where there could be ambiguity in the received
encoded data, as otherwise one can auto assume appropriate padding based on
the input length. I can't see `base64` or `basenc --base64` being used in a
problematic streaming context like that. If the utils themselves read from
a stream in a _single invocation_ they'll deal with partial blocks 
appropriately.
Looking at it another way, I don't see how base64 data over _separate 
invocations_
of these utils could be handled now, as they just give errors in this case 
anyway.
Now there would be a slight loss in error detection as truncated data would
not be noticed, however that is the case for a third of truncated sizes already.

So on input I'm inclined to auto pad.

On output one can trivially remove padding with `tr -d =`,
so I'm inclined to leave that as is.

I.e. we shouldn't need a new option for any of this.

cheers,
Pádraig





bug#66256: sorting NAN values with "general-numeric’

2023-09-28 Thread Pádraig Brady

On 28/09/2023 11:43, Jorge Stolfi wrote:


The full documentation of the "--general-numeric-sort" option of
{sort} says that NaN values are sorted "in a consistent but
machine-dependent order".

This is not good. The point of the IEEE floating-point standard was to
make the results of floating-point computations be independent of the
platform or implementation.

Please consider extending that goal to the handling of NaNs by {sort}.
   That it, all flavors of NaN (determined by their char tails, as
parsed by {strtod}) should be treated as equal.

The fact that different flavors of NaN have distinct binary
representation is not an excuse to sort them as distinct, since the
same is true of +0 and -0, which "general-numeric" sort already treats
as equal.

As a separate suggestion, please consider having {sort} abort with an
error message if any field that is supposed to be sorted with
"general-numeric" is not a valid {double} value, or has some leftover
chars that are not parsed by {strtod}.

Whether these solutions are accepted or not, please change the manpage
explanation of "-g"/"--general-numeric-sort" to say, at least, "the
field is parsed as a double-precision (64-bit) floating-point number
and sorted by its numeric value".

Thanks, and all the best,


No comment on the actual ordering of NaNs, but
note NaN ordering changed recently in coreutils 9.2,
as discussed at https://bugs.gnu.org/55212

cheers,
Pádraig





  1   2   3   4   5   6   7   8   9   10   >