bug#48189: ginstall: memory leak when omitting a directory

2021-05-03 Thread Paul Eggert
That one's not a real bug. 'install' is exiting, rather than calling 
'free' a couple of times just before exiting; calling 'free' would 
simply chew up runtime resources for no reason other than to pacify 
AddressSanitizer. So I'll close this particular bug report.


Most memory leaks found by AddressSanitizer in coreutils are false 
alarms. That being said, if you find one that isn't a false alarm we'd 
be interested in hearing about it. Stack overflows are also good to 
report, except for tricky user-specified regular expressions (which 
require exponential resources in the worst case, no matter what the 
implementation is).






bug#48106: bug: touch utility does not handle file create error properly

2021-05-01 Thread Paul Eggert

Thanks for reporting the problem. I installed the attached to fix it.
From aaa0f003303ab90778b6b426cd9e5a1f1d137ffc Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 1 May 2021 15:19:16 -0700
Subject: [PATCH] touch: fix wrong diagnostic (Bug#48106)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Roland (Bug#48106).
* src/touch.c (touch): Take more care when deciding whether
to use open_errno or utime_errno in the diagnostic.
Stop worrying about SunOS 4 (which was part of the problem),
as it’s long obsolete.  For Solaris 10, verify that EINVAL
really means the file was a directory.
---
 src/touch.c | 34 +++---
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/src/touch.c b/src/touch.c
index 653fd313b..46ddd86bb 100644
--- a/src/touch.c
+++ b/src/touch.c
@@ -122,7 +122,6 @@ get_reldate (struct timespec *result,
 static bool
 touch (char const *file)
 {
-  bool ok;
   int fd = -1;
   int open_errno = 0;
   struct timespec const *t = newtime;
@@ -134,12 +133,7 @@ touch (char const *file)
   /* Try to open FILE, creating it if necessary.  */
   fd = fd_reopen (STDIN_FILENO, file,
   O_WRONLY | O_CREAT | O_NONBLOCK | O_NOCTTY, MODE_RW_UGO);
-
-  /* Don't save a copy of errno if it's EISDIR, since that would lead
- touch to give a bogus diagnostic for e.g., 'touch /' (assuming
- we don't own / or have write access to it).  On Solaris 5.6,
- and probably other systems, it is EINVAL.  On SunOS4, it's EPERM.  */
-  if (fd == -1 && errno != EISDIR && errno != EINVAL && errno != EPERM)
+  if (fd < 0)
 open_errno = errno;
 }
 
@@ -162,9 +156,10 @@ touch (char const *file)
   t = NULL;
 }
 
-  ok = (fdutimensat (fd, AT_FDCWD, (fd == STDOUT_FILENO ? NULL : file), t,
- (no_dereference && fd == -1) ? AT_SYMLINK_NOFOLLOW : 0)
-== 0);
+  char const *file_opt = fd == STDOUT_FILENO ? NULL : file;
+  int atflag = no_dereference ? AT_SYMLINK_NOFOLLOW : 0;
+  int utime_errno = (fdutimensat (fd, AT_FDCWD, file_opt, t, atflag) == 0
+ ? 0 : errno);
 
   if (fd == STDIN_FILENO)
 {
@@ -177,13 +172,22 @@ touch (char const *file)
   else if (fd == STDOUT_FILENO)
 {
   /* Do not diagnose "touch -c - >&-".  */
-  if (!ok && errno == EBADF && no_create)
+  if (utime_errno == EBADF && no_create)
 return true;
 }
 
-  if (!ok)
+  if (utime_errno != 0)
 {
-  if (open_errno)
+  /* Don't diagnose with open_errno if FILE is a directory, as that
+ would give a bogus diagnostic for e.g., 'touch /' (assuming we
+ don't own / or have write access).  On Solaris 10 and probably
+ other systems, opening a directory like "." fails with EINVAL.
+ (On SunOS 4 it was EPERM but that's obsolete.)  */
+  struct stat st;
+  if (open_errno
+  && ! (open_errno == EISDIR
+|| (open_errno == EINVAL
+&& stat (file, &st) == 0 && S_ISDIR (st.st_mode))))
 {
   /* The wording of this diagnostic should cover at least two cases:
  - the file does not exist, but the parent directory is unwritable
@@ -193,9 +197,9 @@ touch (char const *file)
 }
   else
 {
-  if (no_create && errno == ENOENT)
+  if (no_create && utime_errno == ENOENT)
 return true;
-  error (0, errno, _("setting times of %s"), quoteaf (file));
+  error (0, utime_errno, _("setting times of %s"), quoteaf (file));
 }
   return false;
 }
-- 
2.27.0
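
The decision the patch makes (which of the two saved errno values to
report) can be sketched in isolation. This is only an illustration
under stated assumptions: the helper name should_report_open_errno is
hypothetical and does not appear in touch.c, and the real code makes
this test inline inside touch ().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of touch.c.  Return true if a failure
   to set FILE's timestamps is better explained by OPEN_ERRNO, the
   saved errno from the earlier failed open (0 if the open succeeded).  */
bool
should_report_open_errno (char const *file, int open_errno)
{
  assert (file != NULL);

  if (open_errno == 0)
    return false;   /* The open worked; there is no open error.  */
  if (open_errno == EISDIR)
    return false;   /* Directories cannot be opened for writing.  */

  /* Some systems report EINVAL for directories, so confirm with stat
     before discarding the open error.  */
  struct stat st;
  if (open_errno == EINVAL && stat (file, &st) == 0 && S_ISDIR (st.st_mode))
    return false;

  return true;
}
```

With this shape, an open failure such as EACCES is still reported,
while EISDIR (or EINVAL on a confirmed directory) falls through to the
"setting times of" diagnostic.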



bug#48034: git coreutils ./bootstrap failure

2021-04-27 Thread Paul Eggert

On 4/26/21 3:29 AM, David L. Craig wrote:


configure.ac:55: warning: The macro `AC_PROG_CC_STDC' is obsolete.


I reproduced these warnings; they come from Autoconf 2.71. You can 
ignore them, or you can update to the latest coreutils on Savannah, 
where I installed some patches (attached) to silence these.



doc/local.mk:19: installing 'build-aux/mdate-sh'
Makefile.am:213:   'doc/local.mk' included from here
doc/local.mk:19: error: required file 'build-aux/texinfo.tex' not found
Makefile.am:213:   'doc/local.mk' included from here


I don't observe this problem. gnulib-tool sets up a symlink that works:

$ ls -l build-aux/texinfo.tex gnulib/build-aux/texinfo.tex
lrwxrwxrwx 1 eggert eggert 31 Apr 26 23:59 build-aux/texinfo.tex -> ../gnulib/build-aux/texinfo.tex
-rw-rw-r-- 1 eggert eggert 379274 Apr 26 23:56 gnulib/build-aux/texinfo.tex

If you still observe a problem with a fresh build from scratch, please 
investigate why gnulib-tool isn't setting up that link for you.
From 1c97d9e2af40003a8ad872fb7f31d4e616f52532 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 26 Apr 2021 18:02:16 -0700
Subject: [PATCH 1/3] build: update gnulib submodule to latest

* src/csplit.c (load_buffer):
* src/pinky.c (create_fullname):
Use intprops-based checks rather than xalloc_oversized,
since Gnulib xalloc.h no longer includes xalloc-oversized.h.
---
 gnulib   | 2 +-
 src/csplit.c | 3 +--
 src/pinky.c  | 6 +++---
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/gnulib b/gnulib
index e54b645fc..354b9691a 16
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit e54b645fc6b8422562327443bda575c65d931fbd
+Subproject commit 354b9691accd00a531358b652689ce7f580fbe54
diff --git a/src/csplit.c b/src/csplit.c
index ee9aa6503..79bd034e3 100644
--- a/src/csplit.c
+++ b/src/csplit.c
@@ -518,9 +518,8 @@ load_buffer (void)
   if (lines_found || have_read_eof)
 break;
 
-  if (xalloc_oversized (2, b->bytes_alloc))
+  if (INT_MULTIPLY_WRAPV (b->bytes_alloc, 2, &bytes_wanted))
 xalloc_die ();
-  bytes_wanted = 2 * b->bytes_alloc;
   free_buffer (b);
   free (b);
 }
diff --git a/src/pinky.c b/src/pinky.c
index 23a43f5e4..6fea94923 100644
--- a/src/pinky.c
+++ b/src/pinky.c
@@ -110,9 +110,9 @@ create_fullname (char const *gecos_name, char const *user_name)
   if (ampersands != 0)
 {
   size_t ulen = strlen (user_name);
-  size_t product = ampersands * ulen;
-  rsize += product - ampersands;
-  if (xalloc_oversized (ulen, ampersands) || rsize < product)
+  size_t product;
+  if (INT_MULTIPLY_WRAPV (ulen, ampersands - 1, &product)
+  || INT_ADD_WRAPV (rsize, product, &rsize))
 xalloc_die ();
 }
 
-- 
2.27.0
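
For readers without Gnulib at hand, the checked-multiply idiom used in
the csplit hunk can be approximated with a compiler builtin. This is a
sketch, assuming GCC or Clang (which provide __builtin_mul_overflow);
it is a swapped-in technique, not the intprops.h macro itself, and the
function name double_size_wraps is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper.  Store 2 * ALLOC into *WANTED.  Return true if
   the multiplication wrapped around (the caller should then give up),
   false if *WANTED holds the exact product.  This mirrors the
   "true means overflow" convention of INT_MULTIPLY_WRAPV.  */
bool
double_size_wraps (size_t alloc, size_t *wanted)
{
  assert (wanted != NULL);
  return __builtin_mul_overflow (alloc, (size_t) 2, wanted);
}
```

A caller would write something like
"if (double_size_wraps (n, &wanted)) die ();" so that the check and the
arithmetic cannot drift apart, which is the point of replacing the
separate xalloc_oversized test and multiplication.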

From 1671951e05096200aeb87ca6729fcccd2901 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 26 Apr 2021 20:04:19 -0700
Subject: [PATCH 2/3] csplit: size_t overflow check

* src/csplit.c (get_new_buffer): Fix unlikely size_t overflow.
---
 src/csplit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/csplit.c b/src/csplit.c
index 79bd034e3..f188e8894 100644
--- a/src/csplit.c
+++ b/src/csplit.c
@@ -416,7 +416,8 @@ get_new_buffer (size_t min_size)
   if (alloc_size < min_size)
 {
   size_t s = min_size - alloc_size + INCR_SIZE - 1;
-  alloc_size += s - s % INCR_SIZE;
+  if (INT_ADD_WRAPV (alloc_size, s - s % INCR_SIZE, &alloc_size))
+xalloc_die ();
 }
 
   new_buffer = create_new_buffer (alloc_size);
-- 
2.27.0
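
The rounding in get_new_buffer can be tried out standalone. The sketch
below assumes GCC or Clang for __builtin_add_overflow, uses the
hypothetical name grow_alloc, and takes the increment as a parameter
rather than the INCR_SIZE constant. It also checks the first addition,
which goes slightly beyond what the patch itself does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper.  If *ALLOC_SIZE is less than MIN_SIZE, grow it
   by the shortfall rounded up to a multiple of INCR.  Return false,
   leaving *ALLOC_SIZE unchanged, if the result would not fit in
   size_t; return true otherwise.  */
bool
grow_alloc (size_t *alloc_size, size_t min_size, size_t incr)
{
  assert (alloc_size != NULL && incr != 0);

  if (*alloc_size < min_size)
    {
      size_t s, grown;
      /* s = shortfall + incr - 1, so s - s % incr rounds the shortfall
         up to a whole number of increments.  */
      if (__builtin_add_overflow (min_size - *alloc_size, incr - 1, &s))
        return false;
      if (__builtin_add_overflow (*alloc_size, s - s % incr, &grown))
        return false;
      *alloc_size = grown;
    }
  return true;
}
```

For example, starting from 10 bytes with a minimum of 25 and an
increment of 8, the shortfall of 15 is rounded up to 16 and the
allocation becomes 26.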

From 96a034f490595258f9069a3fed037ddc65df2c71 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 26 Apr 2021 23:27:59 -0700
Subject: [PATCH 3/3] maint: port to Autoconf 2.71

* configure.ac: Use AC_PROG_CC, not AC_PROG_CC_STDC.
* gl/modules/smack (configure.ac):
* m4/jm-macros.m4 (coreutils_MACROS):
* m4/xattr.m4 (gl_FUNC_XATTR):
Use AS_HELP_STRING, not AC_HELP_STRING.
* m4/check-decl.m4 (gl_CHECK_DECLS):
Do not require AC_HEADER_TIME; we no longer care about it directly.
* m4/jm-macros.m4 (coreutils_MACROS):
Do not require AC_ISC_POSIX, which became obsolete in 2006.
Use AC_LINK_IFELSE instead of AC_TRY_LINK.
---
 configure.ac |  2 +-
 gl/modules/smack |  2 +-
 m4/check-decl.m4 |  4 +---
 m4/jm-macros.m4  | 54 +++-
 m4/xattr.m4  |  4 ++--
 5 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/configure.ac b/configure.ac
index 7fbecbf8d..02291a4ae 100644
--- a/configure.ac
+++ b/configure.ac
@@ -52,7 +52,7 @@ m4_syscmd([test "${GNULIB_POSIXCHECK+set}" = set])
 m4_if(m4_sysval, [0], [], [dnl
 gl_ASSERT_NO_GNULIB_POSIXCHECK])
 
-AC_PROG_CC_STDC
+AC_PROG_CC
 AM_PROG_CC_C_O
 AC_PROG_CPP
 AC_PROG_GCC_TRADITIONAL
diff --git a/gl/modules/smack b/gl/modules/smack
index a6dcbaa62..1c4a541a6 100644
--- a/gl/modules/smack
+++ b/gl/modules/smack
@@ -10,7 +10,7 @@ configure.ac:
 # Check whether libsmack is available
 LIB_SMACK=
 AC_ARG_ENABLE([libsmack],
-  AC_HELP_STRING([--disable-libsmack

bug#48036: [PATCH] copy: do not refuse to copy a swap file

2021-04-26 Thread Paul Eggert

Thanks, I installed that into Savannah master coreutils.





bug#47883: sort -o loses data when it crashes

2021-04-24 Thread Paul Eggert
As I wrote you privately last month, the coreutils maintainers (who are 
not me) are pretty busy. The proposed change in bug#47883 would be 
incompatible with longstanding tradition and would almost certainly 
break some existing scripts running on GNU/Linux. This is not something 
to do lightly.


It might be possible to come up with a different change that would 
address the issue raised without being so disruptive. Whatever change 
(if any) is chosen, someone needs to think it through, code it up, 
document it, and test it. Although nobody's found the time to do that, 
perhaps you could volunteer or find someone who could volunteer; that 
would surely accelerate the process.


You mentioned that we have multiple bug reports (now 47059, 47883, 
48002) on basically the same topic, so I have taken the liberty of 
merging them.






bug#47883: sort -o loses data when it crashes

2021-04-21 Thread Paul Eggert

On 4/18/21 10:46 AM, Peter van Dijk wrote:

While the manual (but not the manpage) mentions the data loss, I think it would 
be great if sort did not have this problem at all, and I think the OpenGroup 
text also says it should not have this problem.


I don't know of any 'sort' implementation that does not have the problem 
at all. For example, FreeBSD 'sort -o file file' can lose 'file' in some 
(rare) cases. The only portable way to avoid this problem in a shell 
script is to output to some other file first and make sure that worked, 
before attempting to replace the input file.


Also, I don't see where the Open Group spec says what you're saying. On 
the contrary, the spec merely says that '-o output' should cause output 
to be sent to the output file. If there are multiple hard links to the 
output file, this suggests 'sort' should update the output file's 
contents without breaking any hard links. Admittedly the Open Group spec 
is a bit vague in this area, but I certainly don't see anything implying 
that GNU 'sort' does not conform to POSIX in this area.


FreeBSD 'sort' has a problem, in that 'sort -o A B' preserves all hard 
links to A's file, but 'sort -o A A' does not because it breaks the link 
from A. That's confusing.


Traditional Unix 'sort -o A' behaves the way GNU 'sort' does; it 
preserves all hard links to A's file. So there is a compatibility 
argument for doing things the way GNU 'sort' does them, even if that 
might lead to more data loss in rare cases.






bug#47412: env: fragile argument parsing

2021-03-29 Thread Paul Eggert

On 3/26/21 3:21 PM, Paul Eggert wrote:
The -S code could use some more fixes in this area too - it can 
probably still dump core on platforms like the Hurd that don't limit 
exec arg size - but one thing at a time. 


I fixed the (unlikely) bugs I found in this area by installing the attached.

From e3766c5db176ca7abbb8212d5b0b7862fb98a5be Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 29 Mar 2021 21:42:44 -0700
Subject: [PATCH] env: simplify --split-string memory management

* bootstrap.conf (gnulib_modules): Add idx.
* src/env.c: Include idx.h, minmax.h.
Prefer idx_t to ptrdiff_t when values are nonnegative.
(valid_escape_sequence, escape_char, validate_split_str)
(CHECK_START_NEW_ARG):
Remove; no longer needed now that we validate as we go.
(struct splitbuf): New type.
(splitbuf_grow, splitbuf_append_byte, check_start_new_arg)
(splitbuf_finishup): New functions.
(build_argv): New arg ARGC.  Validate and process in one go, using
the new functions; this is simpler and more reliable than the old
approach (as witness the recent bug).  Avoid integer overflow in
the unlikely case where the string contains more than INT_MAX
arguments.
(parse_split_string): Simplify by exploiting the new build_argv.
---
 bootstrap.conf |   1 +
 src/env.c  | 385 ++---
 2 files changed, 173 insertions(+), 213 deletions(-)

diff --git a/bootstrap.conf b/bootstrap.conf
index ab6b3ef0c..f55da99db 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -134,6 +134,7 @@ gnulib_modules="
   host-os
   human
   idcache
+  idx
   ignore-value
   inttostr
   inttypes
diff --git a/src/env.c b/src/env.c
index 11db374d9..e2ab39fd5 100644
--- a/src/env.c
+++ b/src/env.c
@@ -26,6 +26,8 @@
 #include "system.h"
 #include "die.h"
 #include "error.h"
+#include "idx.h"
+#include "minmax.h"
 #include "operand2sig.h"
 #include "quote.h"
 #include "sig2str.h"
@@ -41,14 +43,14 @@
 /* Array of envvars to unset.  */
 static const char **usvars;
 static size_t usvars_alloc;
-static ptrdiff_t usvars_used;
+static idx_t usvars_used;
 
 /* Annotate the output with extra info to aid the user.  */
 static bool dev_debug;
 
 /* Buffer and length of extracted envvars in -S strings.  */
 static char *varname;
-static ptrdiff_t vnlen;
+static idx_t vnlen;
 
 /* Possible actions on each signal.  */
 enum SIGNAL_MODE {
@@ -175,7 +177,7 @@ append_unset_var (const char *var)
 static void
 unset_envvars (void)
 {
-  for (ptrdiff_t i = 0; i < usvars_used; ++i)
+  for (idx_t i = 0; i < usvars_used; ++i)
 {
   devmsg ("unset:%s\n", usvars[i]);
 
@@ -190,29 +192,6 @@ unset_envvars (void)
   IF_LINT (usvars_alloc = 0);
 }
 
-static bool _GL_ATTRIBUTE_PURE
-valid_escape_sequence (const char c)
-{
-  return (c == 'c' || c == 'f' || c == 'n' || c == 'r' || c == 't' || c == 'v' \
-  || c == '#' || c == '$' || c == '_' || c == '"' || c == '\'' \
-  || c == '\\');
-}
-
-static char _GL_ATTRIBUTE_PURE
-escape_char (const char c)
-{
-  switch (c)
-{
-/* \a, \b not supported by FreeBSD's env.  */
-case 'f': return '\f';
-case 'n': return '\n';
-case 'r': return '\r';
-case 't': return '\t';
-case 'v': return '\v';
-default: assume (false);
-}
-}
-
 /* Return a pointer to the end of a valid ${VARNAME} string, or NULL.
'str' should point to the '$' character.
First letter in VARNAME must be alpha or underscore,
@@ -241,7 +220,7 @@ scan_varname (const char *str)
 static char *
 extract_varname (const char *str)
 {
-  ptrdiff_t i;
+  idx_t i;
   const char *p;
 
   p = scan_varname (str);
@@ -263,150 +242,127 @@ extract_varname (const char *str)
   return varname;
 }
 
-/* Validate the "-S" parameter, according to the syntax defined by FreeBSD's
-   env(1).  Terminate with an error message if not valid.
+/* Temporary buffer used by --split-string processing.  */
+struct splitbuf
+{
+  /* Buffer address, arg count, and half the number of elements in the buffer.
+ ARGC and ARGV are as in 'main', and ARGC + 1 <= HALF_ALLOC so
+ that the upper half of ARGV can be used for string contents.
+ This may waste up to half the space but keeps the code simple,
+ which is better for this rarely-used but security-sensitive code.
+
+ ARGV[0] is not initialized; that is the caller's responsibility
+ after finalization.
+
+ During assembly, ARGV[I] (where 0 < I < ARGC) contains the offset
+ of the Ith string (relative to ARGV + HALF_ALLOC), so that
+ reallocating ARGV does not change the validity of its contents.
+ The integer offset is cast to char * during assembly, and is
+ converted to a true char * pointer on finalization.
+
+ During assembly, ARGV[ARGC] contains the offset of the first
+ unused string byte (relative to ARGV + HALF_ALLOC).  */
+  char **argv;
+  int argc;
+  idx_t half_alloc;
+
+ 
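
The idea in the struct splitbuf comment (record offsets while the
buffer may still move, and convert them to pointers only at the end)
can be shown with a much simpler layout. This is a sketch under stated
assumptions, not the env.c code: it keeps two separate arrays instead
of sharing one allocation, and the names strlist, strlist_add and
strlist_finish are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string list.  BYTES holds the strings back to back and
   OFFSETS records where each one starts.  Offsets stay valid when
   realloc moves BYTES; raw pointers into BYTES would not.  */
struct strlist
{
  char *bytes;
  size_t bytes_used, bytes_alloc;
  size_t *offsets;
  size_t count, count_alloc;
};

/* Append a copy of S.  Return 0 on success, -1 on failure.  */
int
strlist_add (struct strlist *l, char const *s)
{
  assert (l != NULL && s != NULL);
  size_t len = strlen (s) + 1;

  if (l->bytes_alloc - l->bytes_used < len)
    {
      size_t n = l->bytes_alloc ? l->bytes_alloc : 16;
      while (n - l->bytes_used < len)
        {
          if (SIZE_MAX / 2 < n)
            return -1;          /* Doubling would overflow.  */
          n *= 2;
        }
      char *p = realloc (l->bytes, n);
      if (p == NULL)
        return -1;
      l->bytes = p;             /* May differ from the old address.  */
      l->bytes_alloc = n;
    }

  if (l->count == l->count_alloc)
    {
      size_t n = l->count_alloc ? l->count_alloc : 2;
      if (SIZE_MAX / 2 / sizeof *l->offsets < n)
        return -1;
      n *= 2;
      size_t *p = realloc (l->offsets, n * sizeof *p);
      if (p == NULL)
        return -1;
      l->offsets = p;
      l->count_alloc = n;
    }

  memcpy (l->bytes + l->bytes_used, s, len);
  l->offsets[l->count++] = l->bytes_used;
  l->bytes_used += len;
  return 0;
}

/* Convert the offsets to a NULL-terminated pointer vector.  Call this
   only once BYTES will no longer grow.  Return NULL on failure.  */
char **
strlist_finish (struct strlist const *l)
{
  assert (l != NULL);
  char **argv = malloc ((l->count + 1) * sizeof *argv);
  if (argv == NULL)
    return NULL;
  for (size_t i = 0; i < l->count; i++)
    argv[i] = l->bytes + l->offsets[i];
  argv[l->count] = NULL;
  return argv;
}
```

A zero-initialized struct strlist is a valid empty list, so a caller
can add strings in a loop and call strlist_finish once at the end.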

bug#47412: env: fragile argument parsing

2021-03-26 Thread Paul Eggert
I also installed the attached two followup patches to document this and 
issue a better warning in rare cases.


The -S code could use some more fixes in this area too - it can probably 
still dump core on platforms like the Hurd that don't limit exec arg 
size - but one thing at a time.
From 6c4efdc0f51c8e253f16da2ec60cdf647bec3c06 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 26 Mar 2021 14:00:37 -0700
Subject: [PATCH] doc: document env fix

* NEWS, doc/coreutils.texi (env invocation): Document recent change.
---
 NEWS   | 3 +++
 doc/coreutils.texi | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/NEWS b/NEWS
index 97cb4bd64..802f4b427 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,9 @@ GNU coreutils NEWS-*- outline -*-
   heavily changed during the run.
   [bug introduced in coreutils-8.25]
 
+  env -S no longer crashes when given unusual whitespace characters
+  [bug introduced in coreutils-8.30]
+
   expr no longer mishandles unmatched \(...\) in regular expressions.
   [bug introduced in coreutils-6.0]
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index ac0b4467d..06ecdd74c 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -17592,6 +17592,8 @@ hello
 
 Running @command{env -Sstring} splits the @var{string} into
 arguments based on unquoted spaces or tab characters.
+(Newlines, carriage returns, vertical tabs and form feeds are treated
+like spaces and tabs.)
 
 In the following contrived example the @command{awk} variable
 @samp{OFS} will be @code{xyz} as these spaces are inside
-- 
2.30.2

From 5f99c7533df49f25819d7bb850be5c6cb49aa13d Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 26 Mar 2021 14:51:55 -0700
Subject: [PATCH] env: improve whitespace warning

* src/env.c (main): Issue -S warning for any whitespace,
not just space.
---
 src/env.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/env.c b/src/env.c
index d07918fee..341777cb8 100644
--- a/src/env.c
+++ b/src/env.c
@@ -942,7 +942,7 @@ main (int argc, char **argv)
   int exit_status = errno == ENOENT ? EXIT_ENOENT : EXIT_CANNOT_INVOKE;
   error (0, errno, "%s", quote (argv[optind]));
 
-  if (exit_status == EXIT_ENOENT && strchr (argv[optind], ' '))
+  if (exit_status == EXIT_ENOENT && strpbrk (argv[optind], C_ISSPACE_CHARS))
 error (0, 0, _("use -[v]S to pass options in shebang lines"));
 
   return exit_status;
-- 
2.30.2
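
The difference between the old strchr test and the new strpbrk test is
easy to see in a standalone sketch. The macro is copied from the patch;
the function name has_c_space is hypothetical and is not in env.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* isspace characters in the C locale, as in the patch.  */
#define C_ISSPACE_CHARS " \t\n\v\f\r"

/* Hypothetical helper.  Return true if S contains any C-locale
   whitespace byte, not merely a space.  */
bool
has_c_space (char const *s)
{
  assert (s != NULL);
  return strpbrk (s, C_ISSPACE_CHARS) != NULL;
}
```

A name such as "perl\t-w" contains no space, so a strchr (s, ' ')
test would miss it, whereas has_c_space reports it.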



bug#47412: env: fragile argument parsing

2021-03-26 Thread Paul Eggert
Thanks for the bug report. I installed the attached to fix it and am 
closing the report.
From 6dd466eda6fa3f1f7d2a9474ec926ccd2ede98e9 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 26 Mar 2021 13:49:49 -0700
Subject: [PATCH] env: fix address violation with '\v' in -S

Problem reported by Frank Busse (Bug#47412).
* src/env.c (C_ISSPACE_CHARS): New macro.
(shortopts, build_argv, main): Treat all C-locale space
characters like space and tab, for compatibility with FreeBSD.
(validate_split_str, build_argv, parse_split_string):
Use the C locale, not the current locale, to determine whether a
byte is a space character.
---
 src/env.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/env.c b/src/env.c
index ba9da1113..e13a312cd 100644
--- a/src/env.c
+++ b/src/env.c
@@ -73,7 +73,10 @@ static bool sig_mask_changed;
 /* Whether to list non default handling.  */
 static bool report_signal_handling;
 
-static char const shortopts[] = "+C:iS:u:v0 \t";
+/* isspace characters in the C locale.  */
+#define C_ISSPACE_CHARS " \t\n\v\f\r"
+
+static char const shortopts[] = "+C:iS:u:v0" C_ISSPACE_CHARS;
 
 /* For long options that have no equivalent short option, use a
non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
@@ -277,7 +280,7 @@ validate_split_str (const char* str, size_t* /*out*/ bufsize,
   size_t buflen;
   int cnt = 1;
 
-  assert (str && str[0] && !isspace (str[0])); /* LCOV_EXCL_LINE */
+  assert (str && str[0] && !c_isspace (str[0])); /* LCOV_EXCL_LINE */
 
   dq = sq = sp = false;
   buflen = strlen (str)+1;
@@ -286,7 +289,7 @@ validate_split_str (const char* str, size_t* /*out*/ bufsize,
 {
   const char next = *(str+1);
 
-  if (isspace (*str) && !dq && !sq)
+  if (c_isspace (*str) && !dq && !sq)
 {
   sp = true;
 }
@@ -392,7 +395,7 @@ build_argv (const char* str, int extra_argc)
   } \
   } while (0)
 
-  assert (str && str[0] && !isspace (str[0])); /* LCOV_EXCL_LINE */
+  assert (str && str[0] && !c_isspace (str[0]));  /* LCOV_EXCL_LINE */
 
   validate_split_str (str, , );
 
@@ -433,13 +436,12 @@ build_argv (const char* str, int extra_argc)
   ++str;
   continue;
 
-case ' ':
-case '\t':
-  /* space/tab outside quotes starts a new argument. */
+case ' ': case '\t': case '\n': case '\v': case '\f': case '\r':
+  /* Start a new argument if outside quotes.  */
   if (sq || dq)
 break;
   sep = true;
-  str += strspn (str, " \t"); /* skip whitespace. */
+  str += strspn (str, C_ISSPACE_CHARS);
   continue;
 
 case '#':
@@ -540,7 +542,7 @@ parse_split_string (const char* str, int /*out*/ *orig_optind,
   char **newargv, **nextargv;
 
 
-  while (isspace (*str))
+  while (c_isspace (*str))
 str++;
   if (*str == '\0')
 return;
@@ -848,8 +850,7 @@ main (int argc, char **argv)
 case 'S':
   parse_split_string (optarg, , , );
   break;
-case ' ':
-case '\t':
+case ' ': case '\t': case '\n': case '\v': case '\f': case '\r':
   /* These are undocumented options. Attempt to detect
  incorrect shebang usage with extraneous space, e.g.:
 #!/usr/bin/env -i command
-- 
2.30.2
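
The key point of the fix is that every scan agrees on one fixed set of
separator bytes, independent of the current locale. A reduced sketch of
that idea follows. It is an assumption-laden illustration only: it
ignores quoting, escapes and variable expansion, all of which the real
-S parser handles, and the name count_words is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* isspace characters in the C locale, as in the patch.  */
#define C_ISSPACE_CHARS " \t\n\v\f\r"

/* Hypothetical helper.  Count the whitespace-separated words in STR,
   using the same byte set both to skip separators and to find the
   end of each word.  */
size_t
count_words (char const *str)
{
  assert (str != NULL);
  size_t n = 0;

  str += strspn (str, C_ISSPACE_CHARS);        /* Skip leading space.  */
  while (*str != '\0')
    {
      n++;
      str += strcspn (str, C_ISSPACE_CHARS);   /* Skip the word.  */
      str += strspn (str, C_ISSPACE_CHARS);    /* Skip its separator.  */
    }
  return n;
}
```

If one pass treated '\v' as a separator and another did not, the two
passes would disagree about sizes, which is the kind of mismatch the
commit message describes.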



bug#47412: env: fragile argument parsing

2021-03-26 Thread Paul Eggert

On 3/26/21 1:12 PM, Pádraig Brady wrote:


I'll fix it up.


I've got a fix. My goodness, that part of the code is messy.





bug#47384: [PATCH 1/2] hostname: fix a memory leak with -Dlint

2021-03-25 Thread Paul Eggert

On 3/25/21 11:16 AM, Paul Eggert wrote:
I'd prefer it to use IF_LINT (as in the earlier 
change), as that makes it cleaner since it's just one line of useless 
code, not three.


Installed as attached, and closing the bug report. Thanks again.
From d3749c46056ddeb1314f35b1644f52179d7a3502 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 25 Mar 2021 11:20:18 -0700
Subject: [PATCH] hostname: pacify valgrind

* src/hostname.c (main) [IF_LINT]: Free hostname (Bug#47384).
---
 src/hostname.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/hostname.c b/src/hostname.c
index 008682f39..94b070582 100644
--- a/src/hostname.c
+++ b/src/hostname.c
@@ -104,6 +104,7 @@ main (int argc, char **argv)
   if (hostname == NULL)
 die (EXIT_FAILURE, errno, _("cannot determine hostname"));
   puts (hostname);
+  IF_LINT (free (hostname));
 }
 
   if (optind + 1 < argc)
-- 
2.27.0
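
For readers unfamiliar with the idiom, a macro like IF_LINT can be
defined along the following lines. This is a sketch: the macro
definition is an assumption about its general shape rather than a quote
from the coreutils headers, and the function print_and_finish is a
hypothetical stand-in for the end of a 'main' function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Expand to CODE only when building with -Dlint.  */
#ifdef lint
# define IF_LINT(Code) Code
#else
# define IF_LINT(Code) /* empty */
#endif

/* Hypothetical stand-in for the tail end of 'main': print a
   heap-allocated copy of NAME as the program's last act.  Return 0 on
   success, -1 on failure.  */
int
print_and_finish (char const *name)
{
  assert (name != NULL);

  char *copy = malloc (strlen (name) + 1);
  if (copy == NULL)
    return -1;
  strcpy (copy, name);

  int status = puts (copy) < 0 ? -1 : 0;

  /* The process is about to exit, so the memory is reclaimed anyway.
     Free it only in lint builds, to keep leak checkers quiet.  */
  IF_LINT (free (copy));
  return status;
}
```

In a normal build the free call compiles away entirely; building with
-Dlint restores it for tools such as valgrind.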



bug#47384: [PATCH 1/2] hostname: fix a memory leak with -Dlint

2021-03-25 Thread Paul Eggert

On 3/25/21 9:33 AM, Kamil Dudka wrote:

How does it differ from the following change, which you did not block?


Thanks for reminding me of the earlier exchange. On second thought I 
guess it's OK, though I'd prefer it to use IF_LINT (as in the earlier 
change), as that makes it cleaner since it's just one line of useless 
code, not three.






bug#47383: [PATCH 2/2] ln: fix memory leaks in do_link()

2021-03-25 Thread Paul Eggert
Thanks, I installed that. I then changed "free(" to "free (" as per GNU 
style.






bug#47384: [PATCH 1/2] hostname: fix a memory leak with -Dlint

2021-03-25 Thread Paul Eggert

On 3/25/21 9:08 AM, Kamil Dudka wrote:

Wasn't that exactly what -Dlint was for when we discussed it the last time?


Sorry, don't recall the last time. This is a borderline area, admittedly.





bug#47384: [PATCH 1/2] hostname: fix a memory leak with -Dlint

2021-03-25 Thread Paul Eggert

On 3/25/21 3:57 AM, Kamil Dudka wrote:


+#ifdef lint
+  free(hostname);
+#endif


Let's not do this one. The program is about to exit so there's no need 
to free, and any static-checking tool that complains about a missing 
'free' here is issuing a false alarm. On this particular issue it's 
better to fix the tools than to clutter up source code to pacify them.






bug#47324: Missing information in documentation

2021-03-25 Thread Paul Eggert

On 3/25/21 6:49 AM, L A Walsh wrote:

    If languages such as python and perl can document their usage
in manpages, certainly gnu could be as helpful.


You must be joking. Python and Perl are not fully documented in 
manpages. Most free-software projects treat manpages the way the GNU 
project does, or don't even bother with manpages.






bug#47348: Possible bug coreutils : no cat command

2021-03-23 Thread Paul Eggert

On 3/23/21 4:22 AM, Luís via GNU coreutils Bug Reports wrote:

     In a reinstallation of Linux Mint 20 ( based on the ubuntu 20 focal fossa 
), I found that the cat command, supported by the Coreutils package 
(8.30-ubuntu2) is no longer working.


Please send a bug report to the Linux Mint maintainers, as this 
evidently has nothing to do with the upstream Coreutils project.






bug#47243: pr lacks -p

2021-03-18 Thread Paul Eggert

On 3/18/21 8:38 AM, Eric Blake wrote:

POSIX requires 'pr -p' to support paging (although it incorrectly stated
that it waits for \r, and is being fixed to wait for \n instead):
https://austingroupbugs.net/view.php?id=1433

During discussion of the behavior of -p today, the Austin Group was
surprised that coreutils' pr lacks -p altogether.


Given that hardly anybody uses pr any more, I'm surprised that the 
Austin Group still cares about its options. It's an obsolete utility, 
and ought to be deprecated.






bug#47085: du: why does 'usage' show prefixes 'Z' or 'Y' if they are disallowed?

2021-03-12 Thread Paul Eggert

On 3/11/21 8:53 PM, L A Walsh wrote:

Why are those suffixes listed as valid under the program 'usage'
and manpage, when they are automatically disallowed?


They are valid if your computer has wide-enough integers. As far as I 
know no platform supports Y and only one or two supports Z, but the 
documentation is future-proofing.


Conversely, if you're running on a really small computer that doesn't 
even support 64-bit integers, even 'T' is too wide.


I doubt whether it's worth complicating the manual for this minor 
detail, as the current diagnostic "'Y" too large" is accurate as far as 
it goes.
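
Why a suffix can be valid in principle yet rejected in practice can be
checked with a few lines of arithmetic. The sketch below assumes GCC or
Clang for __builtin_mul_overflow; the name suffix_multiplier is
hypothetical and this is not how coreutils parses sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper.  Compute BASE raised to POWER into *RESULT.
   Return false if the value does not fit in uintmax_t.  */
bool
suffix_multiplier (unsigned int base, int power, uintmax_t *result)
{
  assert (result != NULL);

  uintmax_t v = 1;
  while (power-- > 0)
    if (__builtin_mul_overflow (v, (uintmax_t) base, &v))
      return false;
  *result = v;
  return true;
}
```

With a 64-bit uintmax_t, 'E' (1024 to the 6th power, which is 2 to the
60th) fits, while 'Z' (2 to the 70th) and 'Y' (2 to the 80th) do not.
By the same arithmetic, 'T' (2 to the 40th) would not fit in a 32-bit
integer.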






bug#47023: df utilility displays G instead of GM as unit size for Gigabytes in power of 1000

2021-03-10 Thread Paul Eggert

On 3/10/21 2:50 PM, L A Walsh wrote:

You are using a local 8-bit encoding, whereas everyone else was
using UTF-8.  Your mailer re-encoded their messages into one
of the 8-bit western encodings, whereas most people use UTF-8
these days, so while their original messages with accents came
through just fine in UTF-8, your re-encoding into Western didn't
display properly.


Although his email did reencode those names into ISO 8859-1 which is 
more likely to cause problems than cure them these days, it still 
displays well on my MUA (Thunderbird) because its header said 
"Content-Type: text/plain; charset=iso-8859-1". His email is also 
displaying properly in the archive, as the 
archiving software reencodes those names back into UTF-8 and the web 
page uses "Content-Type: text/html; charset=utf-8" for all the emails.


Possibly your email client is programmed to ignore encodings in incoming 
"Content-Type" lines; that would explain the glitches you saw.






bug#47014: Design flaw: incompatible touch '-f' gnu-option causes loss of (meta)data by default

2021-03-09 Thread Paul Eggert

On 3/9/21 6:03 PM, L A Walsh wrote:

Thanks for the update and appreciate your diligence...


You're welcome. Closing the bug report.





bug#47023: df utilility displays G instead of GM as unit size for Gigabytes in power of 1000

2021-03-09 Thread Paul Eggert

On 3/9/21 4:58 AM, Philippe Bénézech via GNU coreutils Bug Reports wrote:


df displays G instead of GM as unit size for Gigabytes in power of 1000 
(but the value is correct)


$ df -BGB /home
Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2    421GB   355GB   45GB  89% /home

$ df -H /home
Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
/dev/mapper/ssd2   421G    355G   45G  89% /home


I don't see a bug here. First, I assume you meant to write "GB" rather 
than "GM". Second, "df -BGB" is documented to append units (in this 
case, "GB") to the output number, whereas "df -H" is merely documented 
to append a size indication (in this case, "G").






bug#47014: Design flaw: incompatible touch '-f' gnu-option causes loss of (meta)data by default

2021-03-08 Thread Paul Eggert

On 3/8/21 6:29 PM, L A Walsh wrote:

Warning, '-f' assuming '-r' was intended


I don't think that'd be helpful, given that -f now has a well-defined 
and common meaning that doesn't agree with what you remember, and that 
in 4.2BSD (circa 1983) -f meant something quite different from -r and 
the common current interpretation of -f (which is to ignore it) is 
extremely compatible with 4.2BSD's interpretation. See:


https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/man/man1/touch.1





bug#47014: Design flaw: incompatible touch '-f' gnu-option causes loss of (meta)data by default

2021-03-08 Thread Paul Eggert

On 3/8/21 5:50 PM, L A Walsh wrote:


Data loss shown in original bug submission.  As mentioned/documented
it was use of:
'touch -f  '


Sure, but what was the context of that command? Was it part of a shell 
script? What was the script for? Can we see a copy? That sort of thing.




I don't know which version of touch I remember it from as I've
used a few versions of unix, as in (scratching memory):
some form of SCO Unix on Intel chips (early 80's, pre IBM-PC), HPUX, Sun 
Unix(a BSD variant), SunOS (a SysV variant), IRIX(sgi),

among others whose names I don't remember.


It'd be helpful to nail that down.

On FreeBSD, touch's -f option is also a no-op, and I observe similar 
behavior on Solaris 10 (where I lack the source code). So there are good 
compatibility arguments for leaving things the way they are.






bug#47014: Design flaw: incompatible touch '-f' gnu-option causes loss of (meta)data by default

2021-03-08 Thread Paul Eggert

On 3/8/21 3:27 PM, L A Walsh wrote:

gnu accepts but ignores the previously active '-f'(from) switch


GNU "touch -f" has always been a no-op and has never meant "from" as far 
as I know - though I admit I looked back only to 1992. Perhaps you're 
thinking of some other "touch" program? If so, which one and which version?


Is the data loss you mentioned something that you actually observed? If 
so, what was the context?






bug#46815: cp integer overflow in progress (time remaining)

2021-02-27 Thread Paul Eggert

On 2/27/21 1:31 PM, Ronald Knol wrote:

I am looking at "src/cp.c" from coreutils-8.32 and it has command line
options --progress-bar (aka -g).


coreutils-8.32 doesn't have those options. Apparently you have a 
modified copy. You can verify this by getting the original from:


https://ftp.gnu.org/gnu/coreutils/coreutils-8.32.tar.gz

It would have saved time for both of us if the people who modified your 
copy had changed the package name and bug-reporting address. Once you 
find out who modified your copy, please suggest that to them.






bug#46815: cp integer overflow in progress (time remaining)

2021-02-27 Thread Paul Eggert

On 2/27/21 7:35 AM, Ronald Knol wrote:

This is "cp -argu  ". The source tree contains more
than 2TiB worth of data.

I believe the issue is in src/copy.c where (on line 355) an INT is used to
store "cur_size".

 int cur_size = g_iTotalWritten + *total_n_read / 1024;


GNU coreutils 'cp' lacks a 'g' option, and doesn't have the line number 
you mentioned. It sounds like you're dealing with a bug in a modified 
version of 'cp', which means you should direct your bug report to 
whoever made that modification.
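
The arithmetic behind the reported symptom can be checked without the modified source. The following is only a rough sketch: it assumes the counter quoted above is a 32-bit signed int holding KiB, it assumes the shell does 64-bit arithmetic, and the helper name is made up for illustration.

```sh
# Hypothetical helper, not part of coreutils: report whether a byte count,
# expressed in KiB, still fits in a 32-bit signed int (INT_MAX = 2147483647).
# Assumes the shell's arithmetic is at least 64 bits wide.
fits_in_int32_kib () {
  kib=$(( $1 / 1024 ))
  if [ "$kib" -le 2147483647 ]; then
    echo "fits: $kib KiB"
  else
    echo "overflows: $kib KiB"
  fi
}

fits_in_int32_kib 1099511627776   # 1 TiB -> fits: 1073741824 KiB
fits_in_int32_kib 2199023255552   # 2 TiB -> overflows: 2147483648 KiB
```

Under those assumptions the counter would run out at exactly 2 TiB, which matches the size mentioned in the report.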






bug#45358: bootstrap fails due to a certificate mismatch

2021-02-16 Thread Paul Eggert

On 2/15/21 3:07 AM, Grigoriy Sokolik wrote:


But be careful, this is really bad advice: fetching anything without
consistency and authority validation is really insecure!


Yes, we should instead fix the underlying problem whatever it is (not 
sure what it is since that wasn't reported).






bug#46169: Parallelize merge sort

2021-01-29 Thread Paul Eggert

On 1/29/21 1:07 AM, Ole Tange wrote:

Could you consider implementing a parallel merge, so I can retire
parsort?


Yes, improving that part of 'sort' performance has been on my long list 
of things to do for quite some time. If someone else could take up the 
task it'd be done quicker. Anyway, thanks for reporting it as a bug, 
so that we can track it in our bug database.






bug#46060: Offer ls --limit=...

2021-01-24 Thread Paul Eggert

On 1/23/21 1:13 PM, 積丹尼 Dan Jacobson wrote:

And any database command already has
a --limit option these days, and does not rely on a second program to
trim its output because it can't control itself. Indeed, on some remote
connections one would only want to launch one program, not two.


That argument would apply to any program, no? "cat", "diff", "sh", 
"node", ...


Not sure why "ls" needs a convenience flag that would complicate the 
documentation and maintenance and be so rarely useful.






bug#46048: split -n K/N loses data, sum of output files is smaller than input file.

2021-01-24 Thread Paul Eggert

On 1/24/21 8:52 AM, Pádraig Brady wrote:

-  if (lseek (STDIN_FILENO, start, SEEK_CUR) < 0)
+  if (lseek (STDIN_FILENO, start, SEEK_SET) < 0)


Dumb question: will this handle the case where you're splitting from 
stdin and stdin is a seekable file and its initial file offset is nonzero?
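
One way to set up the case being asked about, as a sketch only: use dd to consume a known number of bytes so that the next command in the same group inherits a seekable stdin with a nonzero offset. The function name is hypothetical, and 'cat' merely stands in for the command under test.

```sh
# Sketch: hand a command a seekable stdin whose file offset is already nonzero.
# dd with bs=N count=1 consumes exactly N bytes of a regular file, so whatever
# runs next in the same group starts reading at offset N.
rest_after_offset () {   # usage: rest_after_offset FILE NBYTES
  { dd bs="$2" count=1 of=/dev/null 2>/dev/null; cat; } < "$1"
}

tmp=$(mktemp) && printf '0123456789' > "$tmp"
rest_after_offset "$tmp" 4    # prints: 456789
echo
rm -f "$tmp"
```

Replacing 'cat' with a 'split -n K/N' invocation would exercise the question above; what split should output in that case is the open point, so nothing is claimed about it here.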






bug#45924: RFE: rmdir -r: recursively remove [empty] directories under the target.

2021-01-18 Thread Paul Eggert

On 1/18/21 8:08 AM, Bernhard Voelker wrote:

On 1/17/21 11:18 PM, Paul Eggert wrote:

find DIR -depth -type d -exec rmdir {} +


find(1) can also find empty directories and delete them:

   $ find DIR -type d -empty -delete


Thanks, I'd forgotten about that.

I added the attached to the manual, as the point seems worth documenting 
even if we don't change the code.
>From eebed78799a7996dd80b66c493a0fc199705dea3 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 18 Jan 2021 21:08:39 -0800
Subject: [PATCH] doc: rmdir --recursive substitutes

* doc/coreutils.texi (rmdir invocation): Add note on how to remove
empty subdirectories recursively.
---
 doc/coreutils.texi | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index fe2fc52b7..94c9fbfa5 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -11006,7 +11006,19 @@ Give a diagnostic for each successful removal.
 
 @end table
 
-@xref{rm invocation}, for how to remove non-empty directories (recursively).
+@xref{rm invocation}, for how to remove non-empty directories recursively.
+
+To remove all empty directories under @var{dirname}, including
+directories that become empty because other directories are removed,
+you can use either of the following commands:
+
+@example
+# This uses GNU extensions.
+find @var{dirname} -type d -empty -delete
+
+# This runs on any POSIX platform.
+find @var{dirname} -depth -type d -exec rmdir @{@} +
+@end example
 
 @exitstatus
 
-- 
2.27.0



bug#45924: RFE: rmdir -r: recursively remove [empty] directories under the target.

2021-01-18 Thread Paul Eggert

On 1/18/21 2:53 AM, L A Walsh wrote:

Except that 'find DIR -depth -type d -exec rmdir {} +'
is anything but simple and not something anyone outside of
a minority of *nix users would have a clue about how to create, whereas 
'rmdir -r DIR' is both direct and simple and

more easily understandable


It's not that simple. For example, it's not clear whether rmdir -r 
should also remove directories containing only empty subdirectories, 
which is what you asked for. Perhaps some people would want that, 
perhaps they'd want to remove just empty leaf directories. Or perhaps 
they'd want rmdir to remove empty subdirectories only if it has 
permission to do so. Or maybe they'd want to also remove subdirectories 
whose directory entries are all hidden (start with '.'). Or there are 
lots of other possible things people could plausibly want.


This is what 'find' is for. If people needed to do something like "rmdir 
-r" every day then it'd be plausible to add it even though there's a 
simple substitute. But people don't, so let's stick with what we have.
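
For anyone who wants to try the 'find' substitute on a scratch tree first, here is a minimal sketch. The function name is made up, and discarding rmdir's complaints about non-empty directories is a choice made for this example, not something the thread prescribes.

```sh
# Sketch: the portable command from this thread, wrapped in a hypothetical
# helper. -depth visits children before parents, and rmdir refuses anything
# that is not empty; its diagnostics for those directories are discarded.
prune_empty_posix () {
  find "$1" -depth -type d -exec rmdir {} + 2>/dev/null || true
}

# Scratch tree: a/b/c is a chain of empty directories, keep/ holds a file.
top=$(mktemp -d)
mkdir -p "$top/a/b/c" "$top/keep"
: > "$top/keep/file"
prune_empty_posix "$top"
ls "$top"     # prints: keep
rm -rf "$top"
```

This variant removes directories that contain only empty subdirectories, which is one of the several behaviors discussed above; the other variants would need different 'find' expressions.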






bug#45924: RFE: rmdir -r: recursively remove [empty] directories under the target.

2021-01-17 Thread Paul Eggert

On 1/16/21 4:29 PM, L A Walsh wrote:

Yes, you could do it some other way, like by using 'find'


That's what I'd do, yes. 'find DIR -depth -type d -exec rmdir {} +'. I 
doubt whether it's worth hacking on this at the C level (complicating 
the documentation too) when there's such a simple and portable way to do 
this unusual task already.






bug#14371: bug#45886: mkdir -m argument does not work correctly, applies incorrect permissions

2021-01-15 Thread Paul Eggert
Thanks for the bug report. I reproduced the problem and installed the 
attached patch to fix it.
>From b8375c422ffe0e018cbb4cad187d1e909195d263 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 15 Jan 2021 02:57:59 -0800
Subject: [PATCH] mkdir: fix bug when -m's more generous than umask

Problem reported by David McCall (Bug#45886).
I introduced this problem when fixing Bug#14371.
* NEWS: Mention the fix.
* src/mkdir.c (struct mkdir_options): New members umask_ancestor,
umask_self, replacing umask_value.
(make_ancestor): Use them when temporarily adjusting umask.
(main): Set them, and set the umask to umask_self instead
of leaving it alone.
* tests/mkdir/perm.sh (tests): Add test case for bug.
---
 NEWS|  3 +++
 src/mkdir.c | 30 ++
 tests/mkdir/perm.sh |  1 +
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/NEWS b/NEWS
index c2474fee3..a6ba96450 100644
--- a/NEWS
+++ b/NEWS
@@ -20,6 +20,9 @@ GNU coreutils NEWS-*- outline -*-
   ls no longer crashes when printing the SELinux context for unstatable files.
   [bug introduced in coreutils-6.9.91]
 
+  mkdir -m no longer mishandles modes more generous than the umask.
+  [bug introduced in coreutils-8.22]
+
   nl now handles single character --section-delimiter arguments,
   by assuming a second ':' character has been specified, as specified by POSIX.
   [This bug was present in "the beginning".]
diff --git a/src/mkdir.c b/src/mkdir.c
index eccc9d382..b266cee8c 100644
--- a/src/mkdir.c
+++ b/src/mkdir.c
@@ -89,8 +89,11 @@ struct mkdir_options
  made.  */
   int (*make_ancestor_function) (char const *, char const *, void *);
 
-  /* Umask value in effect.  */
-  mode_t umask_value;
+  /* Umask value for when making an ancestor.  */
+  mode_t umask_ancestor;
+
+  /* Umask value for when making the directory itself.  */
+  mode_t umask_self;
 
   /* Mode for directory itself.  */
   mode_t mode;
@@ -130,20 +133,18 @@ make_ancestor (char const *dir, char const *component, void *options)
 error (0, errno, _("failed to set default creation context for %s"),
quoteaf (dir));
 
-  mode_t user_wx = S_IWUSR | S_IXUSR;
-  bool self_denying_umask = (o->umask_value & user_wx) != 0;
-  if (self_denying_umask)
-umask (o->umask_value & ~user_wx);
+  if (o->umask_ancestor != o->umask_self)
+umask (o->umask_ancestor);
   int r = mkdir (component, S_IRWXUGO);
-  if (self_denying_umask)
+  if (o->umask_ancestor != o->umask_self)
 {
   int mkdir_errno = errno;
-  umask (o->umask_value);
+  umask (o->umask_self);
   errno = mkdir_errno;
 }
   if (r == 0)
 {
-  r = (o->umask_value & S_IRUSR) != 0;
+  r = (o->umask_ancestor & S_IRUSR) != 0;
   announce_mkdir (dir, options);
 }
   return r;
@@ -282,8 +283,7 @@ main (int argc, char **argv)
   if (options.make_ancestor_function || specified_mode)
 {
   mode_t umask_value = umask (0);
-  umask (umask_value);
-  options.umask_value = umask_value;
+  options.umask_ancestor = umask_value & ~(S_IWUSR | S_IXUSR);
 
   if (specified_mode)
 {
@@ -293,10 +293,16 @@ main (int argc, char **argv)
  quote (specified_mode));
   options.mode = mode_adjust (S_IRWXUGO, true, umask_value, change,
   _bits);
+  options.umask_self = umask_value & ~options.mode;
   free (change);
 }
   else
-options.mode = S_IRWXUGO;
+{
+  options.mode = S_IRWXUGO;
+  options.umask_self = umask_value;
+}
+
+  umask (options.umask_self);
 }
 
   return savewd_process_files (argc - optind, argv + optind,
diff --git a/tests/mkdir/perm.sh b/tests/mkdir/perm.sh
index 4d36f19b5..083a47733 100755
--- a/tests/mkdir/perm.sh
+++ b/tests/mkdir/perm.sh
@@ -35,6 +35,7 @@ tests='
 050  :   -m 312   : drwx-w-rwx : d-wx--x-w- :
 160  :   empty: drwx--xrwx : drw---xrwx :
 160  :   -m 743   : drwx--xrwx : drwxr---wx :
+022  :   -m o-w   : drwxr-xr-x : drwxrwxr-x :
 027  :   -m =+x   : drwxr-x--- : d--x--x--- :
 027  :   -m =+X   : drwxr-x--- : d--x--x--- :
 -:   -: last   : last   :
-- 
2.27.0



bug#45749: [gnu.org #1673209] [Coreutils manual] Broken page link (from and to gnu.org)

2021-01-09 Thread Paul Eggert
Thanks for reporting that. I fixed it in Coreutils master on Savannah by 
applying the attached patch, and this should propagate out to the 
website after the next Coreutils release. Closing the Coreutils bug report.
>From 86640823d63e1c881ae56c5ae0cbc5f848ce7beb Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 9 Jan 2021 13:04:40 -0800
Subject: [PATCH] doc: modernize and fix regexp xref
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* doc/coreutils.texi: Fix regexp cross-reference that had become
out-of-date (Bug#45749).  Also, fix some obsolete references to
SunOS and to /usr/dict/words, and change “Linux” to “GNU/Linux”
where appropriate.  Unfortunately the pipeline example gets more
complicated since /usr/share/dict/words is not sorted the way that
‘comm’ wants.
---
 doc/coreutils.texi | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index e9dd21c4e..fe2fc52b7 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -7714,7 +7714,7 @@ high performance (``contiguous data'') file
 @item d
 directory
 @item D
-door (Solaris 2.5 and up)
+door (Solaris)
 @c @item F
 @c semaphore, if this is a distinct file type
 @item l
@@ -7728,7 +7728,7 @@ network special file (HP-UX)
 @item p
 FIFO (named pipe)
 @item P
-port (Solaris 10 and up)
+port (Solaris)
 @c @item Q
 @c message queue, if this is a distinct file type
 @item s
@@ -11824,7 +11824,7 @@ are also listed.
 @cindex file system space, retrieving old data more quickly
 Do not invoke the @code{sync} system call before getting any usage data.
 This may make @command{df} run significantly faster on systems with many
-disks, but on some systems (notably SunOS) the results may be slightly
+disks, but on some systems (notably Solaris) the results may be slightly
 out of date.  This is the default.
 
 @item --output
@@ -11925,7 +11925,7 @@ otherwise.  @xref{Block size}.
 @opindex --sync
 @cindex file system space, retrieving current data more slowly
 Invoke the @code{sync} system call before getting any usage data.  On
-some systems (notably SunOS), doing this yields more up to date results,
+some systems (notably Solaris), doing this yields more up to date results,
 but in general this option makes @command{df} much slower, especially when
 there are many or very busy file systems.
 
@@ -11980,7 +11980,7 @@ all systems.
 @opindex xfs @r{file system type}
 @opindex btrfs @r{file system type}
 A file system on a locally-mounted hard disk.  (The system might even
-support more than one type here; Linux does.)
+support more than one type here; GNU/Linux does.)
 
 @item iso9660@r{, }cdfs
 @cindex CD-ROM file system type
@@ -13564,9 +13564,8 @@ expression operators.
 @kindex \| @r{regexp operator}
 In the regular expression, @code{\+}, @code{\?}, and @code{\|} are
 operators which respectively match one or more, zero or one, or separate
-alternatives.  SunOS and other @command{expr}'s treat these as regular
-characters.  (POSIX allows either behavior.)
-@xref{Top, , Regular Expression Library, regex, Regex}, for details of
+alternatives.  These operators are GNU extensions.  @xref{Regular Expressions,,
+Regular Expressions, grep, The GNU Grep Manual}, for details of
 regular expression syntax.  Some examples are in @ref{Examples of expr}.
 
 @item match @var{string} @var{regex}
@@ -15204,7 +15203,7 @@ Switch to a different shell layer.  Non-POSIX.
 
 @item status
 @opindex status
-Send an info signal.  Not currently supported on Linux.  Non-POSIX.
+Send an info signal.  Not currently supported on GNU/Linux.  Non-POSIX.
 
 @item start
 @opindex start
@@ -16617,8 +16616,8 @@ parsed reliably.  In the following example, @var{kernel-version} is
 
 @example
 uname -a
-@result{} Linux dumdum.example.org 5.7.9-100.fc31.x86_64@c
- #1 SMP Fri Jul 17 17:18:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
+@result{} Linux dumdum.example.org 5.9.16-200.fc33.x86_64@c
+ #1 SMP Mon Dec 21 14:08:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
 @end example
 
 
@@ -19015,7 +19014,7 @@ might be used.  What it's really about is the ``Software Tools'' philosophy
 of program development and usage.
 
 The software tools philosophy was an important and integral concept
-in the initial design and development of Unix (of which Linux and GNU are
+in the initial design and development of Unix (of which GNU/Linux and GNU are
 essentially clones).  Unfortunately, in the modern day press of
 Internetworking and flashy GUIs, it seems to have fallen by the
 wayside.  This is a shame, since it provides a powerful mental model
@@ -19443,10 +19442,7 @@ A minor modification to the above pipeline can give us a simple spelling
 checker!  To determine if you've spelled a word correctly, all you have to
 do is look it up in a dictionary.  If it is not there, then chances are
 that your spelling is incorrect.  So, we need a dictionary.
-The conventional locat

bug#45700: rm should not prompt if ! isatty(2)

2021-01-06 Thread Paul Eggert

On 1/6/21 10:56 AM, John Wiersba via GNU coreutils Bug Reports wrote:

$ touch asdf && chmod a-w asdf && rm asdf 2>&1 | cat
rm: remove write-protected regular empty file 'asdf'?  # should *not* prompt

If the prompt cannot be seen, then it can't be properly answered, so there is 
no point in prompting and consequently leaving the user with a hanging command 
and no way to know what's being expected of them.  Instead rm should attempt to 
remove the file and succeed or fail based on the result.


POSIX requires the current behavior; see clause 3 in:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm.html

Although GNU rm needn't follow POSIX blindly, it's doubtful that rm 
should remove the file in this particular case, as the longstanding 
tradition is that plain "rm" does not remove unwriteable files without 
more confirmation.


Since you know about "rm -f" I suggest using that (that's what everyone 
else does...).
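
A minimal sketch of that suggestion, using a hypothetical wrapper name:

```sh
# Sketch: remove a write-protected file without any prompt by passing -f.
# remove_quietly is a made-up name for illustration.
remove_quietly () {
  rm -f -- "$@"
}

f=$(mktemp) && chmod a-w "$f"
remove_quietly "$f"
[ -e "$f" ] || echo "removed"    # prints: removed
```

With -f there is no prompt whatever stdin and stderr are connected to, so the command cannot hang waiting for an answer nobody can see.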






bug#45648: `dd` seek/skip which way is up?

2021-01-04 Thread Paul Eggert

On 1/4/21 7:44 PM, Bela Lubkin wrote:

TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document
them as pure synonyms for 'skip' and 'seek'.


Thanks for doing all that research. It's compelling, and I think your 
patch (or something like it) should go in. I'll wait for a bit to hear 
other opinions.






bug#45648: `dd` seek/skip which way is up?

2021-01-04 Thread Paul Eggert

On 1/4/21 3:07 PM, Bernhard Voelker wrote:

I previously encountered a `dd` implementation which also accepted
'oseek=N' and 'iseek=N', which I found far more natural and easy to
remember.

What 'dd' implementation was this specifically?


Solaris dd has iseek and oseek. However, they are not aliases for skip 
and seek. If coreutils dd were to add these features I expect we should 
do them the Solaris way, instead of making them aliases for skip and 
seek. This would take more work than the proposed patches.


https://docs.oracle.com/cd/E36784_01/html/E36871/dd-1m.html





bug#45258: mkdir man page unclear in describing -m flag

2020-12-15 Thread Paul Eggert
Thanks for your bug report. I installed the attached patch; although it 
doesn't use the exact wording you proposed, I hope it works well enough.
>From 3ee0e25426a513c5da891ce6a370abed156a3b83 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 15 Dec 2020 11:52:19 -0800
Subject: [PATCH] doc: document mkdir -m -p better
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Chris Colohan wrote that the man page did not do enough to dispel
a common misunderstanding that “contributed to one of the scariest
outages Google has ever seen” (Bug#45258).
* doc/coreutils.texi (mkdir invocation):
* src/mkdir.c (usage): Document -m vs -p better.
---
 doc/coreutils.texi | 13 +
 src/mkdir.c|  3 ++-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index df0655c20..44ce7d2e0 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -10693,6 +10693,8 @@ Set the file permission bits of created directories to @var{mode},
 which uses the same syntax as
 in @command{chmod} and uses @samp{a=rwx} (read, write and execute allowed for
 everyone) for the point of the departure.  @xref{File permissions}.
+This option affects only directories given on the command line;
+it does not affect any parents that may be created via the @option{-p} option.
 
 Normally the directory has the desired file mode bits at the moment it
 is created.  As a GNU extension, @var{mode} may also mention
@@ -10708,15 +10710,18 @@ overridden in this way.
 @opindex --parents
 @cindex parent directories, creating
 Make any missing parent directories for each argument, setting their
-file permission bits to the umask modified by @samp{u+wx}.  Ignore
+file permission bits to @samp{=rwx,u+wx},
+that is, with the umask modified by @samp{u+wx}.  Ignore
 existing parent directories, and do not change their file permission
 bits.
 
-To set the file permission bits of any newly-created parent
-directories to a value that includes @samp{u+wx}, you can set the
+If the @option{-m} option is also given, it does not affect
+file permission bits of any newly-created parent directories.
+To control these bits, set the
 umask before invoking @command{mkdir}.  For example, if the shell
 command @samp{(umask u=rwx,go=rx; mkdir -p P/Q)} creates the parent
-@file{P} it sets the parent's permission bits to @samp{u=rwx,go=rx}.
+@file{P} it sets the parent's file permission bits to @samp{u=rwx,go=rx}.
+(The umask must include @samp{u=wx} for this method to work.)
 To set a parent's special mode bits as well, you can invoke
 @command{chmod} after @command{mkdir}.  @xref{Directory Setuid and
 Setgid}, for how the set-user-ID and set-group-ID bits of
diff --git a/src/mkdir.c b/src/mkdir.c
index 8f07d666e..1f4588f10 100644
--- a/src/mkdir.c
+++ b/src/mkdir.c
@@ -65,7 +65,8 @@ Create the DIRECTORY(ies), if they do not already exist.\n\
 
   fputs (_("\
   -m, --mode=MODE   set file mode (as in chmod), not a=rwx - umask\n\
-  -p, --parents no error if existing, make parent directories as needed\n\
+  -p, --parents no error if existing, make parent directories as needed,\n\
+with their file modes unaffected by any -m option.\n\
   -v, --verbose print a message for each created directory\n\
 "), stdout);
   fputs (_("\
-- 
2.27.0
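
The behavior documented by this patch can be seen with a short sketch like the following. It assumes a umask of 022 and a scratch directory without default ACLs or a setgid bit; 'perms' is a hypothetical helper that prints the 10-character mode string.

```sh
# Sketch: -m applies to the named directory only; parents made by -p follow
# the umask (modified by u+wx), not the -m argument.
perms () { ls -ld "$1" | cut -c1-10; }

work=$(mktemp -d)
(
  cd "$work" || exit 1
  umask 022
  mkdir -p -m 700 parent/child
  perms parent          # prints: drwxr-xr-x
  perms parent/child    # prints: drwx------
)
rm -rf "$work"
```

If the parent must also be private, set the umask before invoking mkdir, as the revised manual text suggests, or chmod the parent afterwards.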



bug#45093: Character 149 causing ASCII BEL output to console in Windoze port of Gnu CoreUtils

2020-12-07 Thread Paul Eggert

On 12/7/20 11:38 AM, Robert S. Kissel wrote:

If you could possibly direct me to the maintainers of the pre-compiled
Windoze port (I'm certain that I downloaded it from the gnu.org
Web-site)


Sure about that? I'm not aware of any. At any rate, wherever you downloaded it 
from should have contact info.






bug#45093: Character 149 causing ASCII BEL output to console in Windoze port of Gnu CoreUtils

2020-12-07 Thread Paul Eggert

On 12/6/20 8:23 PM, Robert S. Kissel wrote:

I'm pretty sure this is a bug in the Windoze port of head and tail,


You should have better luck writing directly to the people who prepared that 
port, as they don't hang out on this mailing list and we largely don't worry 
about MS-Windows.






bug#44763: Error when 'make'ing latest version of coreutils

2020-11-21 Thread Paul Eggert
Thanks for reporting your recipe for working around all these problems. I've 
installed patches for the problems into coreutils and gnulib and am closing the 
bug report.


On 11/21/20 3:45 PM, Chris Elvidge wrote:
git commit -m 'build: update gnulib submodule to latest' gnulib 2>&1 | tee -a 
$outfiles/out_commit.1.txt


I suggest doing the bootstrap after this 'git commit', not earlier.


Because of the abovementioned patches, you should no longer need to do the 
following steps:



# Berny's addition
git clean -xdfq && ./bootstrap 2>&1 | tee -a $outfiles/out_bootstrap.2.txt

./configure 2>&1 | tee -a $outfiles/out_configure.1.txt

# do edit to make make work
# Akim's change - make it expect a long not a long long
sed -i -e '2301s/%"PRIdMAX"/%ld/' lib/parse-datetime.y
sed -n 2301p lib/parse-datetime.y

# do three edits to make make check work
# put 'return NULL;' back before '/*NOTREACHED*/' # explained by Berny
sed -i -e '184s#\(/\*NOTREACHED\*/\)#return NULL; \1#' 
gnulib/tests/test-nl_langinfo-mt.c

sed -n 184p gnulib/tests/test-nl_langinfo-mt.c
sed -i -e '94s#\(/\*NOTREACHED\*/\)#return NULL; \1#' 
gnulib/tests/test-setlocale_null-mt-all.c

sed -n 94p gnulib/tests/test-setlocale_null-mt-all.c
sed -i -e '94s#\(/\*NOTREACHED\*/\)#return NULL; \1#' 
gnulib/tests/test-setlocale_null-mt-one.c

sed -n 94p gnulib/tests/test-setlocale_null-mt-one.c

# pause here to make sure edits done properly
read -p "Press return to continue" junk







bug#44763: Error when 'make'ing latest version of coreutils

2020-11-21 Thread Paul Eggert

On 11/21/20 6:37 AM, Chris Elvidge wrote:

parse-datetime.y: In function 'parse_datetime2':
parse-datetime.y:2301:27: error: format '%lld' expects argument of type 'long 
long int', but argument 2 has type 'time_t {aka long int}' [-Werror=format=]


That's due to a typo that I recently introduced to parse-datetime.y. Thanks for 
reporting it. (I didn't observe the problem since I tested on hosts with 64-bit 
time_t, not 32-bit.) I installed the attached patch into Gnulib and propagated 
this into Coreutils.
>From fdf0468198631a456406edc09983972edb8fa5c4 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 21 Nov 2020 19:04:10 -0800
Subject: [PATCH] parse-datetime: fix printf format typo

* lib/parse-datetime.y (parse_datetime2): Fix format typo in
previous patch to this file.  Problem reported by Chris Elvidge in
<https://bugs.gnu.org/44763#32>.
---
 ChangeLog| 5 +
 lib/parse-datetime.y | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index de92d102e..229945e86 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2020-11-21  Paul Eggert  
 
+	parse-datetime: fix printf format typo
+	* lib/parse-datetime.y (parse_datetime2): Fix format typo in
+	previous patch to this file.  Problem reported by Chris Elvidge in
+	<https://bugs.gnu.org/44763#32>.
+
 	setlocale-null-tests: work around GCC bug 44511
 	* tests/test-setlocale_null-mt-all.c:
 	* tests/test-setlocale_null-mt-one.c:
diff --git a/lib/parse-datetime.y b/lib/parse-datetime.y
index 44ae90350..e8ed691c8 100644
--- a/lib/parse-datetime.y
+++ b/lib/parse-datetime.y
@@ -2298,7 +2298,8 @@ parse_datetime2 (struct timespec *result, char const *p,
   "%+"PRIdMAX" seconds, %+d ns),\n"),
 pc.rel.hour, pc.rel.minutes, pc.rel.seconds,
 pc.rel.ns);
-dbg_printf (_("new time = %"PRIdMAX" epoch-seconds\n"), t4);
+intmax_t t4i = t4;
+dbg_printf (_("new time = %"PRIdMAX" epoch-seconds\n"), t4i);
 
 /* Warn about crossing DST due to time adjustment.
Example: https://bugs.gnu.org/8357
-- 
2.27.0



bug#44763: Error when 'make'ing latest version of coreutils

2020-11-21 Thread Paul Eggert

On 11/21/20 5:17 AM, Pádraig Brady wrote:

The info in https://bugs.gnu.org/44739 must be incorrect,
and we've two counter checks to it now.


Yes, that sounds right. Closing that bug report.





bug#44704: uniq: replace repeated lines with a message about how many repeated lines

2020-11-17 Thread Paul Eggert

On 11/17/20 5:32 AM, Brian J. Murrell wrote:
> [previous line repeated 4 times]

uniq -c already does something like that, though it outputs "5" instead of "4". 
Not sure it's worth gussying up 'uniq' to provide exactly the functionality 
requested, as output reformatting is easy enough to do yourself using awk or 
Python or whatever.
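
For instance, a sketch along these lines would do the reformatting with awk. The function name is hypothetical, and the script assumes that 'uniq -c' prints optional blanks, the count, one space, and then the line.

```sh
# Sketch: post-process "uniq -c" output into the format requested in this report.
collapse_repeats () {
  uniq -c | awk '{
    n = $1                       # occurrence count from uniq -c
    sub(/^ *[0-9]+ /, "")        # strip the count, keep the line itself
    print
    if (n > 1)
      printf "[previous line repeated %d times]\n", n - 1
  }'
}

printf 'a\na\na\nb\n' | collapse_repeats
# prints:
# a
# [previous line repeated 2 times]
# b
```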






bug#44695: error - GraphClust2 docker

2020-11-16 Thread Paul Eggert

On 11/16/20 10:58 AM, Christina Palka via GNU coreutils Bug Reports wrote:

I got the following error when attempting to install GraphClust2 using
Docker on Mac.


tail: unrecognized file system type 0x794c7630 for
‘/home/galaxy/logs/uwsgi.log’. please report this to bug-coreut...@gnu.or


Thanks for reporting that. As it happens, the problem was fixed in coreutils 
8.25 (2016-01-20); see:


https://debbugs.gnu.org/cgi/bugreport.cgi?bug=27513

so I'll close the bug report and you should be able to fix the problem by 
upgrading to a more-modern coreutils.






bug#44587: ls prints garbage when listing contents of a directory without exec permissions

2020-11-11 Thread Paul Eggert

On 11/11/20 3:24 PM, Jan Schaumann wrote:

$ ls -la dir
ls: cannot access dir/.: Permission denied
ls: cannot access dir/..: Permission denied
ls: cannot access dir/file: Permission denied
total 0
d? ? ? ? ?? .
d? ? ? ? ?? ..
-? ? ? ? ?? file
$


Expected output:

$ ls -la dir
ls: cannot access dir/.: Permission denied
ls: cannot access dir/..: Permission denied
ls: cannot access dir/file: Permission denied


As Bernhard mentioned, the actual output is intentional. The expected output 
would be less useful, as it would give the user a bit less information (e.g., it 
would not tell the user whether 'file' is a regular file or a directory).






bug#44248: Indentation of --help and --version

2020-10-26 Thread Paul Eggert
One way to attack the problem is (1) use only one-liners for option help, and 
(2) not worry about indentation so much (either in English or in German) as the 
excess indenting doesn't help readability enough to justify the translation 
hassle. To do that, I propose changes like the attached for comm. This will 
cause 'comm --help' output to look like the following, which is good enough and 
which will still work with help2man:


Usage: comm [OPTION]... FILE1 FILE2
Compare sorted files FILE1 and FILE2 line by line.

When FILE1 or FILE2 (not both) is -, read standard input.

With no options, produce three-column output.  Column one contains
lines unique to FILE1, column two contains lines unique to FILE2,
and column three contains lines common to both files.

  -1  suppress column 1 (lines unique to FILE1)
  -2  suppress column 2 (lines unique to FILE2)
  -3  suppress column 3 (lines that appear in both files)

  --check-order  check that the input is correctly sorted
  --nocheck-order  do not check that the input is correctly sorted
  --output-delimiter=STR  separate columns with STR
  --total  output a summary
  -z, --zero-terminated  line delimiter is NUL, not newline
  --help display this help and exit
  --version  output version information and exit

Note, comparisons honor the rules specified by 'LC_COLLATE'.

Examples:
  comm -12 file1 file2  Print only lines present in both file1 and file2.
  comm -3 file1 file2  Print lines in file1 not in file2, and vice versa.

GNU coreutils online help: 
Full documentation 
or available locally via: info '(coreutils) comm invocation'
diff --git a/src/comm.c b/src/comm.c
index 2bf8094bf..2893746cb 100644
--- a/src/comm.c
+++ b/src/comm.c
@@ -128,24 +128,23 @@ and column three contains lines common to both files.\n\
 "), stdout);
   fputs (_("\
 \n\
-  -1  suppress column 1 (lines unique to FILE1)\n\
-  -2  suppress column 2 (lines unique to FILE2)\n\
-  -3  suppress column 3 (lines that appear in both files)\n\
+  -1  suppress column 1 (lines unique to FILE1)\n\
+  -2  suppress column 2 (lines unique to FILE2)\n\
+  -3  suppress column 3 (lines that appear in both files)\n\
 "), stdout);
   fputs (_("\
 \n\
-  --check-order check that the input is correctly sorted, even\n\
-  if all input lines are pairable\n\
-  --nocheck-order   do not check that the input is correctly sorted\n\
+  --check-order  check that the input is correctly sorted\n\
+  --nocheck-order  do not check that the input is correctly sorted\n\
 "), stdout);
   fputs (_("\
   --output-delimiter=STR  separate columns with STR\n\
 "), stdout);
   fputs (_("\
-  --total   output a summary\n\
+  --total  output a summary\n\
 "), stdout);
   fputs (_("\
-  -z, --zero-terminatedline delimiter is NUL, not newline\n\
+  -z, --zero-terminated  line delimiter is NUL, not newline\n\
 "), stdout);
   fputs (HELP_OPTION_DESCRIPTION, stdout);
   fputs (VERSION_OPTION_DESCRIPTION, stdout);


bug#43828: invalid date converting from UTC, near DST

2020-10-06 Thread Paul Eggert

On 10/6/20 4:24 AM, Martin Fido wrote:

I have version 8.25:


Seems to have been fixed by coreutils 8.30:

$ TZ='Australia/Sydney' date -d '2020-10-04T02:00:00Z'
Sun 04 Oct 2020 01:00:00 PM AEDT





bug#43657: rm does not delete files

2020-09-28 Thread Paul Eggert

On 9/27/20 8:58 PM, Amit Rao wrote:

There's a limit? My first attempt didn't use a wildcard; i attempted to delete 
a directory.


'rm dir' fails because 'rm' by default leaves directories alone.


My second attempt was rm -rf dir/*


If "dir" has too many files that will fail due to shell limitations that have 
nothing to do with Coreutils. Use 'rm -rf dir' instead.






bug#43657: rm does not delete files

2020-09-27 Thread Paul Eggert

On 9/27/20 1:00 PM, Amit Rao wrote:

rm /path/*

does not delete files if there are a lot (say 2000) of them in a single
directory


What does the command do instead?

There is a limit as to how many arguments you can pass to 'rm'. If that's what 
you ran into, it's a problem with your kernel or your shell, not with 'rm'.
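
If the argument limit does turn out to be the cause, one workaround is to let 'find' pass the names to 'rm' in batches. This is a sketch only: 'empty_dir' is a made-up name, it removes regular files at any depth under the directory, and 2000 files may well be under the limit on your system.

```sh
# Sketch: empty a directory without expanding a huge glob on the command line.
# find hands the names to rm in batches that fit the system limit.
empty_dir () {
  find "$1" -type f -exec rm -f {} +
}

getconf ARG_MAX      # limit on argument bytes; the value varies by system

big=$(mktemp -d)
i=0
while [ "$i" -lt 2000 ]; do : > "$big/file$i"; i=$((i + 1)); done
empty_dir "$big"
ls "$big" | wc -l    # prints: 0
rmdir "$big"
```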






bug#43497: ls exit status on removed directory

2020-09-21 Thread Paul Eggert

On 9/18/20 4:15 PM, Philip Rowlands wrote:


$ mkdir /tmp/abc
$ cd /tmp/abc
$ rmdir /tmp/abc
$ ls

What happened:
no output, successful exit status

What was expected:
no output, unsuccessful exit status


POSIX says that the rmdir command is supposed to behave like the rmdir syscall. 
For the syscall, POSIX allows either of the two behaviors you mention, as 
 says 
that if the rmdir syscall's argument is "the current working directory of any 
process, it is unspecified whether the function succeeds, or whether it shall 
fail and set errno to [EBUSY]". The Linux kernel rmdir syscall succeeds, so 
coreutils rmdir succeeds.



ls tried to list the contents of . but failed to do so, at least on Linux:
open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, 0x55e10c419cf0, 32768)  = -1 ENOENT (No such file or directory)


ls doesn't use getdents directly; it uses the readdir function of the GNU C 
library, which specifically tests for this situation and sets errno to 0, with 
this comment at 
:


  /* On some systems getdents fails with ENOENT when the
     open directory has been rmdir'd already.  POSIX.1
     requires that we treat this condition like normal EOF.  */

It's not clear to me that this comment is correct for current POSIX, but anyway 
this is a matter for the GNU C library not for coreutils ls, so if you think 
there's a bug there I suggest filing a glibc bug report.
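
The scenario from the report can be replayed in a scratch directory with a sketch like the one below. It assumes a POSIX shell and 'mktemp -d'; the exit codes 98 and 99 are arbitrary markers chosen for this example. No expected result is given, since the status depends on the kernel, the C library, and the version of ls.

```shell
top=$(mktemp -d)
mkdir "$top/abc"

status=0
(
  cd "$top/abc" || exit 99
  # Some platforms refuse to remove a working directory (EBUSY).
  rmdir "$top/abc" || exit 98
  # List the now-removed working directory.
  ls
) || status=$?

rm -rf "$top"
echo "exit status: $status"
```

A status of 98 means the platform took the other branch that POSIX allows and refused the rmdir, so the ls question never arose.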






bug#43415: coreutils 8.32: install: fchmod fails with EBADF

2020-09-15 Thread Paul Eggert

On 9/14/20 6:31 PM, Cameron Nemo via GNU coreutils Bug Reports wrote:

It seems like relying on the /proc link is not ideal,
and a bug is being hidden by such behavior.
Is there any chance that this can be resolved?


It really should be fixed in the Linux kernel: it needs a proper way to 
implement POSIX fchmodat  
with the AT_SYMLINK_NOFOLLOW flag, in order to plug some security holes 
involving symlink attacks. See:


https://bugzilla.redhat.com/show_bug.cgi?id=1810141
https://lkml.org/lkml/2020/6/9/548

In the meantime, mounting /proc may be your best bet. I vaguely recall there are 
other places in glibc that assume /proc.
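
A script that runs 'install' inside a chroot or container could check for /proc up front. This is a Linux-specific sketch; the variable name "proc_state" is invented here, and the test only detects whether procfs appears to be mounted, not whether the underlying bug is present.

```shell
# /proc/self/fd is present only when procfs is mounted at /proc (Linux).
if [ -d /proc/self/fd ]; then
  proc_state=mounted
else
  proc_state=missing
fi
echo "/proc is $proc_state"
```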






bug#43162: chgrp clears setgid even when group is not changed

2020-09-01 Thread Paul Eggert

On 9/1/20 3:30 PM, Karl Berry wrote:

I was on centos7.

 (I don't observe your problem on my Fedora 31 box, for example).

Maybe there is hope for a future centos, then.


Maybe. Or it could be a filesystem or mounting issue. My filesystem was ext4 
mounted rw,relatime,seclabel, for what it's worth.


Anyway, closing the bug report.





bug#43162: chgrp clears setgid even when group is not changed

2020-09-01 Thread Paul Eggert

On 9/1/20 2:25 PM, Karl Berry wrote:

Is it necessary for chgrp to clear setgid on directories even when the
group is not actually changed? In my life at least, it is rather
annoying.


The chgrp command isn't doing that directly; it's merely invoking the fchownat 
syscall, and the syscall is clearing setgid.


POSIX requires chgrp to behave like the chown syscall even if the file's group 
is already correct, and it appears that the syscall clears the setgid bit on 
your platform (a behavior that POSIX allows, and even requires for regular 
files). So partly this is a platform issue (I don't observe your problem on my 
Fedora 31 box, for example).


I don't see an easy way to change chgrp without departing from POSIX, or perhaps 
adding a run-time option to the chown and chgrp commands. Not sure it's worth it.
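
Whether a given platform clears the bit can be checked with a sketch like this one. It assumes a POSIX shell, 'mktemp -d', and that the fourth field of 'ls -ldn' is the numeric group ID; the result names "kept", "cleared" and "untested" are made up for the example. The outcome is platform-dependent, so none is asserted.

```shell
d=$(mktemp -d)
chmod g+s "$d" 2>/dev/null

# Numeric group ID the directory already has (4th field of 'ls -ldn').
gid=$(ls -ldn "$d" | awk '{print $4}')

# A no-op group change: ask for the group the directory already has.
if [ -g "$d" ] && chgrp "$gid" "$d" 2>/dev/null; then
  if [ -g "$d" ]; then result=kept; else result=cleared; fi
else
  result=untested   # setgid could not be set, or chgrp was refused
fi

rmdir "$d"
echo "setgid after no-op chgrp: $result"
```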






bug#42804: mkdir saying it can't create folder although it created it

2020-08-14 Thread Paul Eggert

On 8/11/20 3:03 PM, Nick Levinson via GNU coreutils Bug Reports wrote:


I don't know what an example transcript is


Ah, I was asking for the output of a shell session (or however you invoked 
'mkdir'). The idea is that we would like to reproduce the bug, and need a recipe 
to do that.


If you're using GNU/Linux, the strace output would be quite helpful, e.g., run 
the shell command:


strace -o tr.txt mkdir foobar

and then send us a copy of the file tr.txt, assuming 'mkdir foobar' fails in the 
way you describe.






bug#42804: mkdir saying it can't create folder although it created it

2020-08-10 Thread Paul Eggert

On 8/10/20 11:48 AM, Nick Levinson via GNU coreutils Bug Reports wrote:

When I use mkdir, when it succeeds sometimes it has no message, which is 
acceptable, but sometimes it says this:

mkdir: cannot create directory '': File exists
But the directory didn't exist before I used mkdir to make it, and I show 
hidden files, thus also hidden directories.


Unfortunately I do not understand this bug report. I don't get the connection 
between the diagnostic and hidden directories.


Can you give an example transcript of the bug and explain it a bit more? Thanks.





bug#42766: file names with spaces are quoted in the output from ls

2020-08-09 Thread Paul Eggert

On 8/8/20 9:09 AM, David Thomas wrote:

If most people think things are a bad idea, why do them?


I don't see any real evidence that most people think the change is a bad idea. 
Although there have been complaints, that doesn't mean most people are 
complaining, or that most people are unhappy about the change.


In practice I've found the new behavior to be significantly safer. I too often 
have to deal with files with shell metacharacters in their names (people send me 
all sorts of weird stuff). The old 'ls' behavior was quite dangerous in that 
respect.



at first I was typing out the quotes to cd into them. Then I discovered it 
still worked to cd into them without typing the quotes


What file names were these, exactly? If 'ls' is overquoting, that's something we 
could fix without affecting safety.
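
The quoting under discussion can be reproduced without a terminal by selecting the style explicitly. The sketch assumes GNU ls (the --quoting-style option is a GNU extension) and uses an invented file name; "shell-escape" is believed to be the style newer GNU ls applies when writing to a terminal.

```shell
d=$(mktemp -d)
: > "$d/two words"

# Both invocations rely on a GNU ls extension.
literal=$(ls --quoting-style=literal "$d")
quoted=$(ls --quoting-style=shell-escape "$d")

rm -rf "$d"
printf '%s\n' "literal: $literal" "shell-escape: $quoted"
```

With GNU ls the first line shows the bare name and the second shows it wrapped in single quotes, which is the form that can be pasted back into a shell safely.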






bug#42470: Help text update suggestion for "date" util

2020-07-27 Thread Paul Eggert

On 7/25/20 8:07 AM, Wes Novack wrote:

Thank you! For future reference, what is the PR process?


See:

https://debbugs.gnu.org/

and look for "read more" if you're interested.





bug#42358: mv w/mkdir -p of destination

2020-07-15 Thread Paul Eggert

On 7/14/20 3:36 PM, L A Walsh wrote:

But I've found asking for features usually doesn't work and sometimes
results in work to preclude future
implementation of the feature.  Reporting bugs also, often gets ignored
until some large company reports
the same problem or until it causes a serious enough security incident.


You've often disagreed with design decisions made by maintainers, but this is 
the first time I recall you've accused them of large-company bias. Perhaps you 
should get your other grievances off your chest while you're at it.


I haven't noticed any such bias myself. Anyway, it does help to propose good 
patches, since my volunteer time is limited.






bug#42269: Remove non-GMP code from coreutils factor.c

2020-07-08 Thread Paul Eggert

On 7/8/20 12:34 PM, Torbjörn Granlund wrote:


Any number which does not happen to be B-smooth for, say B < 2^30, will
show easily measurable performance difference of 5x to 40x IIRC.


Ah, I had tried the example in the manual, (2^31 - 1) * (2^61 - 1). Even though 
it isn't B-smooth for B < 2^30, the performance difference was only 2x on my 
machine. I just now tried 2^127 - 1 and saw a similar performance difference, 
but 2^127 - 3 had a 15x difference so it's a better example.


I installed the attached to try to document this better.


I have a patch which makes the non-GMP code some 2x - 3x faster.  It's
been maturing for several years now, so I suppose I should really finish
it.  (It got tangled with code which improves the GMP case by letting it
fall into the non-GMP code as numbers get smaller.  That sounds simple
but is quite messy for various reasons.  It is also not clear how much
complexity we could defend for this command of limited utility.)


Yes, 'factor' is just a minor utility needed for POSIX compliance. Although it'd 
be nice to get that 2x-3x improvement whenever you have the time, it's not 
urgent. Thanks for your guidance on the GMP issue.


From ba1489d763b66dd1fcec08ecb4cba5917745f6bf Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 8 Jul 2020 18:58:18 -0700
Subject: [PATCH] factor: explain why non-GMP code (Bug#42269)

* doc/coreutils.texi (factor invocation):
* src/factor.c: Explain why the two-word algorithm is useful.
---
 doc/coreutils.texi | 24 ++--
 src/factor.c   |  5 +
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 6ec1e6c31..656b8bc79 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -18368,14 +18368,17 @@ Print the program version on standard output, then exit without further
 processing.
 @end table
 
-Factoring the product of the eighth and ninth Mersenne primes
-takes about 4 milliseconds of CPU time on an Intel Xeon Silver 4116.
+If the number to be factored is small (less than @math{2^{127}} on
+typical machines), @command{factor} uses a faster algorithm.
+For example, on a circa-2017 Intel Xeon Silver 4116, factoring the
+product of the eighth and ninth Mersenne primes (approximately
+@math{2^{92}}) takes about 4 ms of CPU time:
 
 @example
-M8=$(echo 2^31-1|bc)
-M9=$(echo 2^61-1|bc)
-n=$(echo "$M8 * $M9" | bc)
-bash -c "time factor $n"
+$ M8=$(echo 2^31-1 | bc)
+$ M9=$(echo 2^61-1 | bc)
+$ n=$(echo "$M8 * $M9" | bc)
+$ bash -c "time factor $n"
 4951760154835678088235319297: 2147483647 2305843009213693951
 
 real	0m0.004s
@@ -18383,11 +18386,12 @@ user	0m0.004s
 sys	0m0.000s
 @end example
 
-Similarly, factoring the eighth Fermat number @math{2^{256}+1} takes
-about 14 seconds on the same machine.
+For larger numbers, @command{factor} uses a slower algorithm.  On the
+same platform, factoring the eighth Fermat number @math{2^{256} + 1}
+takes about 14 seconds, and the slower algorithm would have taken
+about 750 ms to factor @math{2^{127} - 3} instead of the 50 ms needed by
+the faster algorithm.
 
-The single-precision code uses an algorithm
-designed for factoring smaller numbers.
 Factoring large numbers is, in general, hard.  The Pollard-Brent rho
 algorithm used by @command{factor} is particularly effective for
 numbers with relatively small factors.  If you wish to factor large
diff --git a/src/factor.c b/src/factor.c
index c1c35a562..1b1607f16 100644
--- a/src/factor.c
+++ b/src/factor.c
@@ -53,6 +53,11 @@
 trick of multiplying all n-residues by the word base, allowing cheap Hensel
 reductions mod n.
 
+The GMP code uses an algorithm that can be considerably slower;
+for example, on a circa-2017 Intel Xeon Silver 4116, factoring
+2^{127}-3 takes about 50 ms with the two-word algorithm but would
+take about 750 ms with the GMP code.
+
   Improvements:
 
 * Use modular inverses also for exact division in the Lucas code, and
-- 
2.17.1



bug#42269: Remove non-GMP code from coreutils factor.c

2020-07-08 Thread Paul Eggert

On 7/8/20 9:57 AM, Torbjörn Granlund wrote:


The non-GMP code of coreutils was extremely well-tuned by me and Niels
Möller a couple of years ago.


How time flies! The code was merged in 2012.


By leaving just the GMP code, you would create a pretty useless factor
command.  Any naive old factor command would often beat it.  It would
make much more sense to remove the factor command altogether.


OK, thanks. Then let's forget about the patch I just proposed.

Could you give an example of where the 128-bit code shines, compared to the GMP 
code on the same arguments? I could add the example as a comment in the factor.c 
code, to let me and future maintainers know why it's useful for performance.






bug#42211: Problem in sort

2020-07-06 Thread Paul Eggert
On 7/5/20 9:53 PM, Richard Freedman wrote:

> I discovered that trying to use -c with --debug causes an error - but not in 
> the version that I have
> on my mac laptop !

Ah, I had meant to suggest using --debug without -c.

> when I try to specify a "key" even for a file with only 1 column - the 
> program stops on consecutive entries
> that are identical.

When all keys compare equal, 'sort' falls back on a last-resort comparison of
the entire line to break ties, and it's finding that your lines are out of
order. You don't want 'sort' to do that, so you should specify the -s (--stable)
option. -s is a GNU extension.

Closing the bug report, as this should fix the problem for you.
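
The effect of the last-resort comparison can be seen with two lines that share a key but are out of order as whole lines. This is a small sketch with made-up input; LC_ALL=C is set only to keep the comparison predictable.

```shell
# Two lines whose numeric key (field 2) is identical but whose full text
# is out of order.
without_s=0
printf 'b 1\na 1\n' | LC_ALL=C sort -c -n -k2,2 2>/dev/null || without_s=$?

# -s disables the last-resort whole-line comparison.
with_s=0
printf 'b 1\na 1\n' | LC_ALL=C sort -c -s -n -k2,2 || with_s=$?

echo "without -s: $without_s; with -s: $with_s"
```

The first check reports disorder (exit status 1) because "b 1" sorts after "a 1" once the keys tie; the second succeeds because only the key is compared.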





bug#42211: Problem in sort

2020-07-05 Thread Paul Eggert
On 7/4/20 2:39 PM, Richard Freedman wrote:
> When I use sort -n -c on a specified column in a file sort reports an error 
> and then stops if two numbers are exactly the same.

Could you send us the input, and the output of "sort --debug -n -c -k3"
(assuming you're using column 3)? My guess is that the output will explain the
symptoms you're seeing, but if not then we'd like to see the test case. Thanks.





bug#8061: Introduce SEEK_DATA/SEEK_HOLE to extent_scan module

2020-06-25 Thread Paul Eggert
This email is a follow-up to <https://bugs.gnu.org/8601> dated 2011-05-01. Jeff,
thanks for reporting the problem. (There's a good chance this email will bounce
but I'll send it to your 2011 email address anyway.)

I recently ran into the same issue and derived the attached patches
independently. I then found your bug report, made sure the attached patches
fixed every problem that your proposal did, and installed the attached patches
into Savannah.

The attached patches 1-3 merely fix typos and refactor.

Patch 4 corresponds to your proposal; however, it differs in that its basic idea
is to use the FIEMAP code only as a fallback if SEEK_DATA doesn't work, rather
than try to add to the already-too-complicated code that fiddles with FIEMAPs.
(I don't observe any significant performance advantage to the FIEMAP stuff, but
maybe that's just me.)

Patch 5 adds opportunistic use of the copy_file_range syscall introduced in
Linux kernel 4.5 (2016) and reworked in 5.3 (2019). This should improve 'cp'
performance on kernels and file systems that support copy_file_range.
From 4fe5259ab6c9e459a6db5938d143a9c65be113d9 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 25 Jun 2020 18:10:49 -0700
Subject: [PATCH 1/5] maint: typo fix

* NEWS: Fix typo.
---
 NEWS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index d36259641..d713fa724 100644
--- a/NEWS
+++ b/NEWS
@@ -17,7 +17,7 @@ GNU coreutils NEWS-*- outline -*-
 
   cp and install now default to copy-on-write (COW) if available.
 
-  On GNU/Linux systems, ls no longer issues an error message on
+  On GNU/Linux systems, ls no longer issues an error message on a
   directory merely because it was removed.  This reverts a change
   that was made in release 8.32.
 
-- 
2.25.4

From 51981008f9892d44231c432535deac4f9b3cbe5e Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 23 Jun 2020 19:18:04 -0700
Subject: [PATCH 2/5] cp: refactor extent_copy

* src/copy.c (extent_copy): New arg SCAN, replacing
REQUIRE_NORMAL_COPY.  All callers changed.
(enum scantype): New type.
(infer_scantype): Rename from is_probably_sparse and return
the new type.  Add args FD and SCAN.  All callers changed.
---
 src/copy.c | 119 +
 1 file changed, 55 insertions(+), 64 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index 54601ce07..f694f913f 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -422,9 +422,8 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
  size_t hole_size, off_t src_total_size,
  enum Sparse_type sparse_mode,
  char const *src_name, char const *dst_name,
- bool *require_normal_copy)
+ struct extent_scan *scan)
 {
-  struct extent_scan scan;
   off_t last_ext_start = 0;
   off_t last_ext_len = 0;
 
@@ -432,45 +431,25 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
  We may need this at the end, for a final ftruncate.  */
   off_t dest_pos = 0;
 
-  extent_scan_init (src_fd, &scan);
-
-  *require_normal_copy = false;
   bool wrote_hole_at_eof = true;
-  do
+  while (true)
 {
-  bool ok = extent_scan_read (&scan);
-  if (! ok)
-{
-  if (scan.hit_final_extent)
-break;
-
-  if (scan.initial_scan_failed)
-{
-  *require_normal_copy = true;
-  return false;
-}
-
-  error (0, errno, _("%s: failed to get extents info"),
- quotef (src_name));
-  return false;
-}
-
   bool empty_extent = false;
-  for (unsigned int i = 0; i < scan.ei_count || empty_extent; i++)
+  for (unsigned int i = 0; i < scan->ei_count || empty_extent; i++)
 {
   off_t ext_start;
   off_t ext_len;
   off_t ext_hole_size;
 
-  if (i < scan.ei_count)
+  if (i < scan->ei_count)
 {
-  ext_start = scan.ext_info[i].ext_logical;
-  ext_len = scan.ext_info[i].ext_length;
+  ext_start = scan->ext_info[i].ext_logical;
+  ext_len = scan->ext_info[i].ext_length;
 }
   else /* empty extent at EOF.  */
 {
   i--;
-  ext_start = last_ext_start + scan.ext_info[i].ext_length;
+  ext_start = last_ext_start + scan->ext_info[i].ext_length;
   ext_len = 0;
 }
 
@@ -498,7 +477,7 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
 {
   error (0, errno, _("cannot lseek %s"), quoteaf (src_name));
 fail:
-  extent_scan_free (&scan);
+  extent_scan_free (scan);
   return false;
 }
 
@@ -539,7 +518,7 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
   /* For now, do not tre

bug#41944: cp: default to --reflink=auto, revisted

2020-06-18 Thread Paul Eggert
Thanks, I'd forgotten that. The performance improvement is long overdue, so I
installed the attached.
From 25725f9d41735d176d73a757430739fb71c7d043 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 18 Jun 2020 22:16:24 -0700
Subject: [PATCH] cp: default to COW
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Likewise for ‘install’.  Proposed in Bug#24400, and long past due.
* NEWS:
* doc/coreutils.texi (cp invocation):
* src/copy.h (enum Reflink_type): Document this.
* src/cp.c (cp_option_init):
* src/install.c (cp_option_init): Implement this.
---
 NEWS   |  2 ++
 doc/coreutils.texi | 19 ---
 src/copy.h |  4 ++--
 src/cp.c   |  2 +-
 src/install.c  |  2 +-
 5 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/NEWS b/NEWS
index 8ddd0e22f..655ff779f 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,8 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Changes in behavior
 
+  cp and install now default to copy-on-write (COW) if available.
+
   On GNU/Linux systems, ls no longer issues an error message on
   directory merely because it was removed.  This reverts a change
   that was made in release 8.32.
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 3432fb294..4bbb960b7 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -8864,12 +8864,14 @@ The @var{when} value can be one of the following:
 
 @table @samp
 @item always
-The default behavior: if the copy-on-write operation is not supported
+If the copy-on-write operation is not supported
 then report the failure for each file and exit with a failure status.
+Plain @option{--reflink} is equivalent to @option{--reflink=when}.
 
 @item auto
 If the copy-on-write operation is not supported then fall back
 to the standard copy behavior.
+This is the default if no @option{--reflink} option is given.
 
 @item never
 Disable copy-on-write operation and use the standard copy behavior.
@@ -8878,12 +8880,6 @@ Disable copy-on-write operation and use the standard copy behavior.
 This option is overridden by the @option{--link}, @option{--symbolic-link}
 and @option{--attributes-only} options, thus allowing it to be used
 to configure the default data copying behavior for @command{cp}.
-For example, with the following alias, @command{cp} will use the
-minimum amount of space supported by the file system.
-
-@example
-alias cp='cp --reflink=auto --sparse=always'
-@end example
 
 @item --remove-destination
 @opindex --remove-destination
@@ -8928,6 +8924,15 @@ This is useful in creating a file for use with the @command{mkswap} command,
 since such a file must not have any holes.
 @end table
 
+For example, with the following alias, @command{cp} will use the
+minimum amount of space supported by the file system.
+(Older versions of @command{cp} can also benefit from
+@option{--reflink=auto} here.)
+
+@example
+alias cp='cp --sparse=always'
+@end example
+
 @optStripTrailingSlashes
 
 @item -s
diff --git a/src/copy.h b/src/copy.h
index 874d6f71c..a0ad494b9 100644
--- a/src/copy.h
+++ b/src/copy.h
@@ -46,10 +46,10 @@ enum Sparse_type
 /* Control creation of COW files.  */
 enum Reflink_type
 {
-  /* Default to a standard copy.  */
+  /* Do a standard copy.  */
   REFLINK_NEVER,
 
-  /* Try a COW copy and fall back to a standard copy.  */
+  /* Try a COW copy and fall back to a standard copy; this is the default.  */
   REFLINK_AUTO,
 
   /* Require a COW copy and fail if not available.  */
diff --git a/src/cp.c b/src/cp.c
index 8db2c4b9e..a4ecbbc9f 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -793,7 +793,7 @@ cp_option_init (struct cp_options *x)
   x->move_mode = false;
   x->install_mode = false;
   x->one_file_system = false;
-  x->reflink_mode = REFLINK_NEVER;
+  x->reflink_mode = REFLINK_AUTO;
 
   x->preserve_ownership = false;
   x->preserve_links = false;
diff --git a/src/install.c b/src/install.c
index 22124d51b..a94053f4d 100644
--- a/src/install.c
+++ b/src/install.c
@@ -264,7 +264,7 @@ cp_option_init (struct cp_options *x)
 {
   cp_options_default (x);
   x->copy_as_regular = true;
-  x->reflink_mode = REFLINK_NEVER;
+  x->reflink_mode = REFLINK_AUTO;
   x->dereference = DEREF_ALWAYS;
   x->unlink_dest_before_opening = true;
   x->unlink_dest_after_failed_open = false;
-- 
2.17.1
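
The three --reflink modes documented in the patch above can be tried with a sketch like the following. It assumes GNU cp and 'mktemp -d'; the file names are invented, and whether a copy-on-write clone is possible depends on the file system holding the scratch directory.

```shell
d=$(mktemp -d)
printf 'hello\n' > "$d/src"

# --reflink=auto: clone if the file system supports it, else copy normally.
cp --reflink=auto "$d/src" "$d/auto"

# --reflink=always: fail rather than fall back to an ordinary copy.
if cp --reflink=always "$d/src" "$d/always" 2>/dev/null; then
  cow=supported
else
  cow=unsupported
fi

if cmp -s "$d/src" "$d/auto"; then same=yes; else same=no; fi
rm -rf "$d"
echo "identical copy: $same; copy-on-write here: $cow"
```

Either way the "auto" copy has the same contents as the source, which is why it is a safe default.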



bug#41664: du give file not accessible

2020-06-02 Thread Paul Eggert
On 6/2/20 2:42 AM, Sumit Gupta wrote:
> Is this expected or shall the command be modified to ignore such files?

It does seem reasonable that 'du' should ignore a file when the system call says
ENOENT, since the file isn't there and cannot be consuming disk space.





bug#37702: Suggestion for 'df' utility

2020-05-30 Thread Paul Eggert
On 5/30/20 4:49 AM, Erik Auerswald wrote:
> I concur that a command line option to override config file (or env var)
> settings seems useful if a config file and/or env var approach is used.

In other utilities we've been moving away from environment variables and/or
config files for the usual security and other-hassle reasons. So I'd prefer
having 'df' just do the "right" thing by default, and to have an option to
override that. The "right" thing should be to ignore all these pseudofilesystems
that hardly anybody cares about.





bug#41554: chmod allows removing x bit on chmod without a force flag, which can be inconvenient to recover from

2020-05-26 Thread Paul Eggert
On 5/26/20 6:30 PM, Will Rosecrans wrote:
> The underlying safety logic is similar to that behind the
> existing "--(no-)preserve-root"

I think not. There are all sorts of other things one shouldn't chmod either, but
we can't and shouldn't maintain a long list. Let's stop with "/".





bug#41480: Chars out of order in date.c string

2020-05-23 Thread Paul Eggert
On 5/23/20 4:41 AM, Anders Jonsson wrote:

> I noticed one thing when having a look at the Swedish translation of 
> coreutils.
> 
>>#: src/date.c:196
>>msgid ""
>>"  %F   full date; like %+4Y-%m-%d\n"

There must be some confusion here, because this translation is for coreutils
8.31 and later.

> This doesn't give the expected result when I try it in coreutils 8.30 in 
> Debian
> testing:

That's because the behavior of coreutils changed in 8.31. The translation string
you're talking about was introduced in coreutils 8.31, so I'm puzzled as to why
it'd be used with coreutils 8.30.

Here's the behavior change in 8.31:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=188d87b05190690d6f8b0577ec65ef221a711d08

and here's the closely-related documentation change in 8.31:

https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=2ab2f7a422652a9ec887e08ca8935b44e9629505





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-10 Thread Paul Eggert
On 5/7/20 7:06 PM, Eric Blake wrote:
> 
> (My personal wish: I would love a variation of mkdir that returns an open fd 
> on
> the just-created directory on success in a single syscall,

Yes! That would be a worthy addition.





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-02 Thread Paul Eggert
On 5/2/20 3:41 PM, Jonny Grant wrote:
> Is a more accurate strerror considered unreliable?
> 
> Current:
> mkdir: cannot create directory ‘test’: File exists
> 
> Proposed:
> mkdir: cannot create directory ‘test’: Is a directory

I don't understand this comment. As I understand it you're proposing a change to
the mkdir command not a change to the strerror library function, and the change
you're proposing would introduce a race condition to the mkdir command.

A better fix would be to change the mkdir system call so that it sets errno to
EISDIR in this situation. This would fix not only the mkdir utility, but also
lots of other programs; and it wouldn't introduce a race condition. So if you're
interested in getting the problem fixed, I suggest that you propose such a
change to the Linux kernel developers.





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-02 Thread Paul Eggert
On 5/2/20 6:26 AM, Jonny Grant wrote:
> If developers have race conditions in their shell scripts

I've personally fixed a bug in the GNU mkdir command that was triggered by such
races. Core utilities should be reliable even when these races are happening.





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-01 Thread Paul Eggert
On 5/1/20 1:21 PM, Jonny Grant wrote:
> yes, the fix pretty trivial for mkdir as you highlight EISDIR:
> stat(), S_ISDIR(sb.st_mode), and set errno to EISDIR or output 
> strerror(EISDIR)

That would introduce a race condition, and wouldn't behave correctly if some
other process changes the destination from a regular file to a directory between
the time we call mkdir and the time that we call stat.





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-01 Thread Paul Eggert
On 5/1/20 9:16 AM, Jonny Grant wrote:
> rm: cannot remove 'test': Is a directory

That's because rm used unlink which failed with EISDIR, which is a different
error number.

Consider this example:

$ >d # Create an empty regular file.
$ mkdir d
mkdir: cannot create directory ‘d’: File exists

Here the system call mkdir("d", 0777) failed with errno == EEXIST (File exists).
Presumably you wouldn't object to the diagnostic here because d is a regular
file, not a directory. But the mkdir system call fails in exactly the same way
if d is a directory, so the error message is the same in both cases.

Directories are files, so the error message is correct even if it confused you.
I don't see any portable and efficient way to make the diagnostic less confusing
for you, without also making diagnostic incorrect in some other scenarios (such
as the scenario described above).
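
Both cases can be compared side by side with a sketch like this. It assumes a POSIX shell and 'mktemp -d'; the names "f" and "d" are illustrative, and LC_ALL=C is set only so that the message text is not translated.

```shell
top=$(mktemp -d)
: > "$top/f"        # a regular file
mkdir "$top/d"      # a directory

# Try to create a directory over each; both attempts should fail.
st_file=0
msg_file=$(LC_ALL=C mkdir "$top/f" 2>&1) || st_file=$?
st_dir=0
msg_dir=$(LC_ALL=C mkdir "$top/d" 2>&1) || st_dir=$?

rm -rf "$top"
printf '%s\n' "$msg_file (status $st_file)" "$msg_dir (status $st_dir)"
```

Both diagnostics end in "File exists", because the mkdir system call reports EEXIST in either case and the utility has nothing else to go on.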





bug#40904: listing multiple subdirectories places filenames in different columns between each subdirectory

2020-04-27 Thread Paul Eggert
On 4/27/20 7:36 AM, Jim Clark wrote:
> When I list a hard drive "ls -AR > list.txt" and import it into Libreoffice
> Calc, then break the lines using "text-to-columns", I am not able to
> perform a fixed format break so that the filenames are placed in their own
> column.

I can't reproduce the problem. All the file names start at the beginning of the
line. Quite possibly you're using an alias, so that your 'ls' is not the plain
vanilla 'ls'. At any rate, 'find' is probably a better tool for what you want 
to do.





bug#40509: Use of fsetxattr() in cp tickles an EXT leak (possibly unnecessarily so)

2020-04-15 Thread Paul Eggert

On 4/15/20 7:11 AM, Gregg Leventhal wrote:


+xattr_size = flistxattr(src_fd, list, size);
+if ( xattr_size || errno == ERANGE )


Surely this should be 'if (flistxattr (src_fd, NULL, 0) < 0 && errno == 
ERANGE)'.


If you agree with this direction, I can continue, addressing other affected
code paths (i.e --preserve=mode).


This sounds like a good thing to do. Before you spend a lot of time on it, 
though, would you be willing to assign copyright to your work product to the FSF 
so that we could install the patch? If so, I can send you email on how to fill 
out the paperwork; if not, we'd better arrange for someone else to write the fix.






bug#40586: date and '%-N' does not appear to remove leading zeros anymore, but trailing zeros.

2020-04-12 Thread Paul Eggert

On 4/12/20 1:51 PM, Drake Jacovian wrote:

Obviously, removing trailing zeroes will change its value.


%-N is intended to be used after a decimal point, so removing trailing zeros 
does not change its value in its intended use.
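
The intended use after a decimal point looks roughly like this. The sketch assumes GNU date (both -d and %N are extensions) and an arbitrary timestamp of half a second past midnight. With a version that behaves as described in this thread the second line should read "00.5"; older versions may print something else, so no output is asserted for it.

```shell
# Half a second past midnight; GNU date assumed.
full=$(date -u -d '2020-04-12 00:00:00.5' +%S.%N)
trimmed=$(date -u -d '2020-04-12 00:00:00.5' +%S.%-N)
echo "padded: $full"
echo "trimmed: $trimmed"
```

Read as a decimal fraction, "00.5" and "00.500000000" denote the same number of seconds, which is the point being made above.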






bug#40540: Faster sort with locale

2020-04-10 Thread Paul Eggert

On 4/10/20 6:19 AM, Ole Tange wrote:

But would it be possible to convert the input string1 into a string in
a generalized format, which would sort the same way as the localized
sort, but using a simple compare?


I tried doing that a long time ago by using strxfrm, but it made 'sort' 
significantly slower. You're welcome to try again; perhaps things have changed.
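
Where plain byte order is acceptable, the locale's collation rules can be bypassed altogether by running 'sort' in the C locale. Note that this is a different technique from the strxfrm transformation discussed above, and that it changes the resulting order rather than reproducing the localized one. A minimal sketch with made-up input:

```shell
# Byte-wise comparison: all uppercase ASCII letters sort before lowercase.
sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -s -d ' ' -)
echo "$sorted"
```

This prints "A B a b", whereas many non-C locales would interleave the upper- and lowercase letters.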






bug#40509: Use of fsetxattr() in cp tickles an EXT leak (possibly unnecessarily so)

2020-04-08 Thread Paul Eggert

On 4/8/20 7:15 AM, Gregg Leventhal wrote:


rsync doesn't make set/get xattr calls and purports to preserve ACLs with
-A.


I'm not quite following your bug report, but it appears that you're saying that 
cp could somehow discover that it needn't use fgetxattr and fsetxattr on files 
that lack extended attributes, and for those files cp could stick with ordinary 
POSIX syscalls (e.g., umask, chmod) to give files proper permissions, and in 
that case 'cp' would presumably (a) operate more efficiently and (b) not trigger 
a bug in the EXT filesystem.


This sounds like a worthy suggestion, though of course it would be better to 
have a concrete proposal in the form of a coreutils patch, along with a few 
performance measurements. For starters, how does rsync do it?


Also, of course EXT should be fixed regardless of what coreutils does here.





bug#40220: date command set linux epoch time failed

2020-03-30 Thread Paul Eggert

On 3/29/20 9:32 PM, Bob Proulx wrote:

Both calls from GNU date are returning EINVAL.  Those are Linux kernel
system calls.  Those Linux kernel system calls are using
CLOCK_MONOTONIC.


OK, I think I understand now. For some reason Linux prohibits you from setting 
CLOCK_REALTIME to a value less than what CLOCK_MONOTONIC would report. I don't 
know why Linux has this restriction - it violates POSIX as near as I can tell - 
but at any rate as you say it's a problem with the Linux kernel, not with GNU 
'date'.






bug#40220: date command set linux epoch time failed

2020-03-29 Thread Paul Eggert

On 3/28/20 9:12 AM, Bob Proulx wrote:

By reading the documentation for CLOCK_MONOTONIC in clock_gettime(2):


GNU 'date' doesn't use CLOCK_MONOTONIC, so why is CLOCK_MONOTONIC relevant to 
this bug report?


Is this some busybox thing? If so, user 'shy' needs to report it to the busybox 
people, not to bug-coreutils.






bug#40220: date command set linux epoch time failed

2020-03-28 Thread Paul Eggert

On 3/27/20 11:52 PM, Bob Proulx wrote:

I tested this in a victim system and if I was very quick I was able to
log in and set the time to :10 seconds but no earlier.


Sounds like some sort of atomic-time thing, since UTC and TAI differed by 10 
seconds when they started up in 1972. Perhaps the clock in question uses TAI 
internally?






bug#39929: coreutils-8.32 fails to build on aarch64

2020-03-07 Thread Paul Eggert

On 3/5/20 11:36 PM, Bernhard Voelker wrote:

s/emits/shall not emit/

P.S. Also the check for $host_triplet containing 'linux' in test is:
a) no longer needed, ...


Thanks for catching those; I installed the attached further patch.
From ab149bd415daf1cb8ecde0b948bc0a2663611a61 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 7 Mar 2020 10:29:51 -0800
Subject: [PATCH] ls: improve removed-directory test

* tests/ls/removed-directory.sh: Remove host_triplet test.
Skip this test if one cannot remove the working directory.
From a suggestion by Bernhard Voelker (Bug#39929).
---
 tests/ls/removed-directory.sh | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/tests/ls/removed-directory.sh b/tests/ls/removed-directory.sh
index fe8f929a1..63b209dee 100755
--- a/tests/ls/removed-directory.sh
+++ b/tests/ls/removed-directory.sh
@@ -1,7 +1,7 @@
 #!/bin/sh
-# If ls is asked to list a removed directory (e.g. the parent process's
-# current working directory that has been removed by another process), it
-# emits an error message.
+# If ls is asked to list a removed directory (e.g., the parent process's
+# current working directory has been removed by another process), it
+# should not emit an error message merely because the directory is removed.
 
 # Copyright (C) 2020 Free Software Foundation, Inc.
 
@@ -21,15 +21,10 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ ls
 
-case $host_triplet in
-  *linux*) ;;
-  *) skip_ 'non linux kernel' ;;
-esac
-
 cwd=$(pwd)
 mkdir d || framework_failure_
 cd d || framework_failure_
-rmdir ../d || framework_failure_
+rmdir ../d || skip_ "can't remove working directory on this platform"
 
 ls >../out 2>../err || fail=1
 cd "$cwd" || framework_failure_
-- 
2.17.1



bug#39929: coreutils-8.32 fails to build on aarch64

2020-03-05 Thread Paul Eggert

On 3/5/20 1:43 PM, Paul Eggert wrote:

Why is this code even there at all? If readdir(3) says that the current 
directory has no entries, shouldn't 'ls' just say that? Why should ls 
report an error simply because the current directory isn't reachable 
from the filesystem? Whether the current directory is unreachable has 
nothing to do with ls's job, which is to report whether the current 
directory has entries.


Attached is a proposed patch to fix this.
From 511d0c323bc90a0ab7e8f3672b07a1144885a9e8 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 5 Mar 2020 17:25:29 -0800
Subject: [PATCH] ls: restore 8.31 behavior on removed directories

* NEWS: Mention this.
* src/ls.c: Do not include 
(print_dir): Don't worry about whether the directory is removed.
* tests/ls/removed-directory.sh: Adjust to match new (i.e., old)
behavior.
---
 NEWS                          |  6 ++++++
 src/ls.c                      | 22 ----------------------
 tests/ls/removed-directory.sh | 10 ++--------
 3 files changed, 8 insertions(+), 30 deletions(-)

diff --git a/NEWS b/NEWS
index fdc8bf5db..653e7178b 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,12 @@ GNU coreutils NEWS-*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Changes in behavior
+
+  On GNU/Linux systems, ls no longer issues an error message on
+  directory merely because it was removed.  This reverts a change
+  that was made in release 8.32.
+
 
 * Noteworthy changes in release 8.32 (2020-03-05) [stable]
 
diff --git a/src/ls.c b/src/ls.c
index 24b983287..4acf5f44d 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -49,10 +49,6 @@
 # include 
 #endif
 
-#ifdef __linux__
-# include 
-#endif
-
 #include 
 #include 
 #include 
@@ -2896,7 +2892,6 @@ print_dir (char const *name, char const *realname, bool command_line_arg)
   struct dirent *next;
   uintmax_t total_blocks = 0;
   static bool first = true;
-  bool found_any_entries = false;
 
   errno = 0;
   dirp = opendir (name);
@@ -2972,7 +2967,6 @@ print_dir (char const *name, char const *realname, bool command_line_arg)
   next = readdir (dirp);
   if (next)
 {
-  found_any_entries = true;
   if (! file_ignored (next->d_name))
 {
   enum filetype type = unknown;
@@ -3018,22 +3012,6 @@ print_dir (char const *name, char const *realname, bool command_line_arg)
   if (errno != EOVERFLOW)
 break;
 }
-#ifdef __linux__
-  else if (! found_any_entries)
-{
-  /* If readdir finds no directory entries at all, not even "." or
- "..", then double check that the directory exists.  */
-  if (syscall (SYS_getdents, dirfd (dirp), NULL, 0) == -1
-  && errno != EINVAL)
-{
-  /* We exclude EINVAL as that pertains to buffer handling,
- and we've passed NULL as the buffer for simplicity.
- ENOENT is returned if appropriate before buffer handling.  */
-  file_failure (command_line_arg, _("reading directory %s"), name);
-}
-  break;
-}
-#endif
   else
 break;
 
diff --git a/tests/ls/removed-directory.sh b/tests/ls/removed-directory.sh
index e8c835dab..fe8f929a1 100755
--- a/tests/ls/removed-directory.sh
+++ b/tests/ls/removed-directory.sh
@@ -26,20 +26,14 @@ case $host_triplet in
   *) skip_ 'non linux kernel' ;;
 esac
 
-LS_FAILURE=2
-
-cat <<\EOF >exp-err || framework_failure_
-ls: reading directory '.': No such file or directory
-EOF
-
 cwd=$(pwd)
 mkdir d || framework_failure_
 cd d || framework_failure_
 rmdir ../d || framework_failure_
 
-returns_ $LS_FAILURE ls >../out 2>../err || fail=1
+ls >../out 2>../err || fail=1
 cd "$cwd" || framework_failure_
 compare /dev/null out || fail=1
-compare exp-err err || fail=1
+compare /dev/null err || fail=1
 
 Exit $fail
-- 
2.24.1



bug#39929: coreutils-8.32 fails to build on aarch64

2020-03-05 Thread Paul Eggert

On 3/5/20 9:39 AM, Pádraig Brady wrote:

Ah well.
Does the attached address this for you.


Eeeuw.

Why is this code even there at all? If readdir(3) says that the current 
directory has no entries, shouldn't 'ls' just say that? Why should ls 
report an error simply because the current directory isn't reachable 
from the filesystem? Whether the current directory is unreachable has 
nothing to do with ls's job, which is to report whether the current 
directory has entries.






bug#39850: "du" command can not count some files

2020-03-01 Thread Paul Eggert
I don't see a bug there, as the files you say "du" is not counting have counts 
of zero.






bug#38627: uniq -c gets wrong count with non-ascii strings

2020-02-23 Thread Paul Eggert

On 2/23/20 11:43 AM, Pádraig Brady wrote:


 #include "hard-locale.h"
 #include "posixver.h"
 #include "stdio--.h"
-#include "xmemcoll.h"


Please also remove the '#include "hard-locale.h"' line.

Thanks for fixing this.





bug#39693: Sv: bug#39693: Any chance of fixing --rfc-3339 to conform to the standard?

2020-02-21 Thread Paul Eggert

On 2/20/20 11:56 PM, Mads Bondo Dydensborg wrote:

Your statement is in conflict with the message exchange, referenced by the bug 
I linked to, with, as I understand it, the authors of the standard:


Not really. In that email exchange one of the authors of the RFC 
mentioned a goal of the RFC. The part of the RFC that I quoted, though, 
is an explicit exception to that particular goal. The RFC had several 
goals, they sometimes conflicted, and the RFC's text was a compromise. I 
was involved with the drafting of the RFC, and remember the history 
reasonably well.



The ISO output from date can not be used, as it uses a "," as fractional 
separator


You can use the following if you want subsecond resolution with both 'T' 
and '.':


date '+%Y-%m-%dT%H:%M:%S.%N%:z'

This won't work for some historical timestamps (e.g., the Netherlands 
before 1937) but RFC 3339 doesn't support them anyway so it's probably 
good enough.
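
As a rough sketch of the suggestion above, the format string can be tried against a fixed instant so that the result is reproducible. This assumes GNU 'date' (the -d option and the %N and %:z conversions are GNU extensions); the variable names and the sample timestamp are illustrative only.

```shell
# Assumes GNU date: %N (nanoseconds) and %:z (+hh:mm offset) are GNU extensions.
fmt='+%Y-%m-%dT%H:%M:%S.%N%:z'

# Current time with a 'T' separator and a '.' before the fraction.
date "$fmt"

# A fixed instant (seconds since the epoch), rendered in UTC so that
# the output does not depend on the local time zone.
stamp=$(TZ=UTC0 date -d '@1582243200' "$fmt")
printf '%s\n' "$stamp"
```

Pinning TZ=UTC0 is only there to make the second command repeatable; in ordinary use the local offset is what %:z is for.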






bug#39693: Any chance of fixing --rfc-3339 to conform to the standard?

2020-02-20 Thread Paul Eggert

On 2/20/20 4:39 AM, Mads Bondo Dydensborg wrote:

As was established in 2006 and again in 2010, RFC 3339 mandates the use of 
"T" in a single-field timestamp.


No, RFC 3339 explicitly allows the use of space. It says:

  NOTE: ISO 8601 defines date and time separated by "T".
  Applications using this syntax may choose, for the sake of
  readability, to specify a full-date and full-time separated by
  (say) a space character.

This paragraph was put into the RFC at my suggestion, precisely so that GNU 
"date" output wouldn't have to contain that "T".


If you want GNU 'date' to output the 'T', you can use 'date --iso-8601=s' 
instead of 'date --rfc-3339=s'. That's the point of having these two options for 
GNU 'date'. If it weren't for this difference in behavior, GNU 'date' wouldn't 
have needed a --rfc-3339 option in the first place, and we shouldn't change the 
meaning of --rfc-3339 to eviscerate the whole point of the option.
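
The difference between the two options can be seen side by side. This is a minimal sketch assuming GNU 'date'; the fixed timestamp and variable names are illustrative, and the exact spelling of the UTC offset in the --iso-8601 output may vary between coreutils versions.

```shell
# Assumes GNU date. A fixed instant in UTC keeps the output reproducible.
rfc=$(TZ=UTC0 date -d '@1582243200' --rfc-3339=seconds)
iso=$(TZ=UTC0 date -d '@1582243200' --iso-8601=seconds)

# --rfc-3339 separates date and time with a space;
# --iso-8601 separates them with 'T'.
printf 'rfc-3339: %s\n' "$rfc"
printf 'iso-8601: %s\n' "$iso"
```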






bug#39611: coreutils v8.31 fails to compile with -Ofast

2020-02-14 Thread Paul Eggert

On 2/14/20 3:21 PM, zsugabubus wrote:

$ export CFLAGS=-Ofast # Works with -O3
$ ./configure && make


Coreutils (and many other programs) is not compatible with -Ofast, which is not 
surprising as -Ofast is documented to not work in many cases. The obvious 
workaround is to not use -Ofast.






bug#39236: [musl] coreutils cp mishandles error return from lchmod

2020-02-07 Thread Paul Eggert

On 1/22/20 2:05 PM, Rich Felker wrote:

I think we're approaching a consensus that glibc should fix this too,
so then it would just be gnulib matching the fix.


I installed the attached patch to Gnulib in preparation for the upcoming 
glibc fix. The patch causes fchmodat with AT_SYMLINK_NOFOLLOW to work on 
non-symlinks, and similarly for lchmod on non-symlinks. The idea is to 
avoid this sort of problem in the future, and to let Coreutils etc. work 
on older platforms as if glibc 2.32 (or whatever) is already in place.
From b16a04394121e7396569a13161dba02c6752b19f Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 7 Feb 2020 16:34:12 -0800
Subject: [PATCH] fchmodat: AT_SYMLINK_NOFOLLOW fix for non-symlinks

Fix lchmod, and fchmodat with AT_SYMLINK_NOFOLLOW, so that
they act like chmod on non-symlinks.
* NEWS:
* doc/glibc-functions/lchmod.texi (lchmod):
* doc/posix-functions/fchmodat.texi (fchmodat):
Mention this.
* lib/fchmodat.c: Define __need_system_sys_stat_h before including
config.h, and undef it after including sys/stat.h the first time.
Include fcntl.h, stdio.h, unistd.h, intprops.h, and include
sys/stat.h a second time after defining orig_fchmodat.
(orig_fchmodat) [HAVE_FCHMODAT]: New function.
(fchmodat) [HAVE_FCHMODAT]: Work around the AT_SYMLINK_NOFOLLOW bug.
* lib/lchmod.c: New file.
* lib/sys_stat.in.h (fchmodat, lchmod):
Support replacing these functions.
* m4/fchmodat.m4 (gl_FUNC_FCHMODAT): If fchmodat exists,
test that AT_SYMLINK_NOFOLLOW works on non-symlinks.
* m4/lchmod.m4 (gl_FUNC_LCHMOD): Check for lstat.
Test that lchmod works on non-symlinks.
* m4/sys_stat_h.m4 (gl_SYS_STAT_H_DEFAULTS):
Default REPLACE_FCHMODAT and REPLACE_LCHMOD to 0.
* modules/fchmodat (Depends-on): Add fstatat, intprops, lchmod, unistd.
(Depends-on, configure.ac): Check REPLACE_FCHMODAT too.
* modules/lchmod (Files): Add lib/lchmod.c.
(Depends-on): Add errno, fcntl-h, fchmodat, intprops, lstat, unistd.
(configure.ac): Compile lchmod.c if needed.
(lib_SOURCES): Add lchmod.c.
* modules/sys_stat (sys/stat.h): Substitute REPLACE_FCHMODAT
and REPLACE_LCHMOD.
* tests/test-fchmodat.c: Include fcntl.h, sys/stat.h.
(main): Test fchmodat with AT_SYMLINK_NOFOLLOW on non-symlinks.
---
 ChangeLog | 35 
 NEWS  |  7 +++
 doc/glibc-functions/lchmod.texi   |  4 ++
 doc/posix-functions/fchmodat.texi | 11 ++--
 lib/fchmodat.c| 89 +--
 lib/lchmod.c  | 72 +
 lib/sys_stat.in.h | 41 +++---
 m4/fchmodat.m4| 48 -
 m4/lchmod.m4  | 52 --
 m4/sys_stat_h.m4  |  4 +-
 modules/fchmodat  | 10 ++--
 modules/lchmod| 13 -
 modules/sys_stat  |  2 +
 tests/test-fchmodat.c | 10 
 14 files changed, 348 insertions(+), 50 deletions(-)
 create mode 100644 lib/lchmod.c

diff --git a/ChangeLog b/ChangeLog
index 99e0c2e9e..71dcaba6c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,38 @@
+2020-02-07  Paul Eggert  
+
+	fchmodat: AT_SYMLINK_NOFOLLOW fix for non-symlinks
+	Fix lchmod, and fchmodat with AT_SYMLINK_NOFOLLOW, so that
+	they act like chmod on non-symlinks.
+	* NEWS:
+	* doc/glibc-functions/lchmod.texi (lchmod):
+	* doc/posix-functions/fchmodat.texi (fchmodat):
+	Mention this.
+	* lib/fchmodat.c: Define __need_system_sys_stat_h before including
+	config.h, and undef it after including sys/stat.h the first time.
+	Include fcntl.h, stdio.h, unistd.h, intprops.h, and include
+	sys/stat.h a second time after defining orig_fchmodat.
+	(orig_fchmodat) [HAVE_FCHMODAT]: New function.
+	(fchmodat) [HAVE_FCHMODAT]: Work around the AT_SYMLINK_NOFOLLOW bug.
+	* lib/lchmod.c: New file.
+	* lib/sys_stat.in.h (fchmodat, lchmod):
+	Support replacing these functions.
+	* m4/fchmodat.m4 (gl_FUNC_FCHMODAT): If fchmodat exists,
+	test that AT_SYMLINK_NOFOLLOW works on non-symlinks.
+	* m4/lchmod.m4 (gl_FUNC_LCHMOD): Check for lstat.
+	Test that lchmod works on non-symlinks.
+	* m4/sys_stat_h.m4 (gl_SYS_STAT_H_DEFAULTS):
+	Default REPLACE_FCHMODAT and REPLACE_LCHMOD to 0.
+	* modules/fchmodat (Depends-on): Add fstatat, intprops, lchmod, unistd.
+	(Depends-on, configure.ac): Check REPLACE_FCHMODAT too.
+	* modules/lchmod (Files): Add lib/lchmod.c.
+	(Depends-on): Add errno, fcntl-h, fchmodat, intprops, lstat, unistd.
+	(configure.ac): Compile lchmod.c if needed.
+	(lib_SOURCES): Add lchmod.c.
+	* modules/sys_stat (sys/stat.h): Substitute REPLACE_FCHMODAT
+	and REPLACE_LCHMOD.
+	* tests/test-fchmodat.c: Include fcntl.h, sys/stat.h.
+	(main): Test fchmodat with AT_SYMLINK_NOFOLLOW on non-symlinks.
+
 2020-02-05  Marc Dionne(tiny change)
 
 	mountlist: Consider AFS filesystems as remote
diff --git a/NEWS b/NEWS
index dc5cc71f9..bc81dfc28 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,13 @@ User visible incompatible changes
 
 D

bug#39273: unwanted behavior in the combination of an scenario regarding btrfs, ssh, borg, and 'df' from the core utils

2020-02-04 Thread Paul Eggert
openat(AT_FDCWD, "/dev/sda1", O_RDONLY) = 4
pread64(4,
"\334\301u\237\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"
..., 4096, 65536) = 4096
close(4)= 0
openat(AT_FDCWD, "/dev/sda1", O_RDONLY) = 4
ioctl(4, BLKGETSIZE64, [412294840320])  = 0
close(4)= 0
sysinfo({uptime=27223, loads=[16736, 24192, 20704],
totalram=25104957440, freeram=17284509696, sharedram=173166592,
bufferram=2154496, totalswap=0, freeswap=0, procs=704, totalhigh=0,
freehigh=0, mem_unit=1}) = 0
ioctl(3, BTRFS_IOC_SPACE_INFO, {space_slots=0} => {total_spaces=4}) = 0
ioctl(3, BTRFS_IOC_SPACE_INFO, {space_slots=4} => {total_spaces=4,
spaces=...}) = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
write(1, "Overall:\n", 9Overall:
)   = 9
write(1, "Device size:\t\t 383.98GiB\n", 29Device
size:  383.98GiB
) = 29
write(1, "Device allocated:\t\t  61.03Gi"..., 34Device
allocated: 61.03GiB
) = 34
write(1, "Device unallocated:\t\t 322.95"..., 36Device
unallocated:  322.95GiB
) = 36
write(1, "Device missing:\t\t 0.00B\n", 32Device
missing:0.00B
) = 32
write(1, "Used:\t\t\t  55.61GiB\n",
23Used:   55.61GiB
) = 23
write(1, "Free (estimated):\t\t 328.22Gi"..., 51Free
(estimated):328.22GiB  (min: 328.22GiB)
) = 51
write(1, "Data ratio:\t\t\t  1.00\n", 29Data
ratio:   1.00
) = 29
write(1, "Metadata ratio:\t\t  1.00\n", 32Metadata
ratio: 1.00
) = 32
write(1, "Global reserve:\t\t 141.11MiB\t"..., 46Global
reserve: 141.11MiB  (used: 0.00B)
) = 46
write(1, "\n", 1
)   = 1
ioctl(3, BTRFS_IOC_SPACE_INFO, {space_slots=0} => {total_spaces=4}) = 0
ioctl(3, BTRFS_IOC_SPACE_INFO, {space_slots=4} => {total_spaces=4,
spaces=...}) = 0
write(1, "Data,single: Size:59.00GiB, Used"..., 51Data,single:
Size:59.00GiB, Used:53.72GiB (91.06%)
) = 51
write(1, "   /dev/sda1\t  59.00GiB\n", 24   /dev/sda1 59.00GiB
) = 24
write(1, "\n", 1
)   = 1
write(1, "Metadata,single: Size:2.00GiB, U"..., 53Metadata,single:
Size:2.00GiB, Used:1.88GiB (94.17%)
) = 53
write(1, "   /dev/sda1\t   2.00GiB\n", 24   /dev/sda1  2.00GiB
) = 24
write(1, "\n", 1
)   = 1
write(1, "System,single: Size:32.00MiB, Us"..., 52System,single:
Size:32.00MiB, Used:16.00KiB (0.05%)
) = 52
write(1, "   /dev/sda1\t  32.00MiB\n", 24   /dev/sda1 32.00MiB
) = 24
write(1, "\n", 1
)   = 1
write(1, "Unallocated:\n", 13Unallocated:
)  = 13
write(1, "   /dev/sda1\t 322.95GiB\n", 24   /dev/sda1322.95GiB
) = 24
close(3)= 0
exit_group(0)   = ?
+++ exited with 0 +++
wismerhill:/home/lux #




wismerhill:/home/lux # btrfs fi df  /
Data, single: total=59.00GiB, used=53.72GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=2.00GiB, used=1.88GiB
GlobalReserve, single: total=141.11MiB, used=0.00B



---


I hope the output helps to track down the cause so that this can be solved.
One observation is that the error appears on filesystems which are
fairly empty, less than 60% filled.

Phil :-)





On Wednesday, 29.01.2020 at 11:18 -0800, Paul Eggert wrote:

On 1/29/20 5:42 AM, Wismerhill wrote:


I tried to replicate the error, but I couldn't follow exactly the same
procedure: the disk was already filled to 70%, and was not empty
as it was when the error appeared.
I couldn't move the files away because of a space problem.
I tried with my music archive (ca. 750 GB) but the error didn't appear.

As a workaround for my problem (how I filled up the disk) I created a
borg backup locally on a 1 TB USB disk (Btrfs filesystem) without
problems and used rsync to copy that backup to the server (this happened
before I got your mail).
On the server I then got the same confusing free space: df and the KDE
Plasma widgets show 0 bytes left, but rsync finished without error,
and the borg repository is working remotely without trouble.

As soon as I run into that error again I will follow the procedure you
described and send in the requested data.



Thanks for the heads-up. Please cc 39...@debbugs.gnu.org with any
further info that you may provide.







bug#39273: unwanted behavior in the combination of an scenario regarding btrfs, ssh, borg, and 'df' from the core utils

2020-01-29 Thread Paul Eggert

On 1/29/20 5:42 AM, Wismerhill wrote:


I tried to replicate the error, but I couldn't follow exactly the same
procedure: the disk was already filled to 70%, and was not empty
as it was when the error appeared.
I couldn't move the files away because of a space problem.
I tried with my music archive (ca. 750 GB) but the error didn't appear.

As a workaround for my problem (how I filled up the disk) I created a
borg backup locally on a 1 TB USB disk (Btrfs filesystem) without
problems and used rsync to copy that backup to the server (this happened
before I got your mail).
On the server I then got the same confusing free space: df and the KDE
Plasma widgets show 0 bytes left, but rsync finished without error,
and the borg repository is working remotely without trouble.

As soon as I run into that error again I will follow the procedure you
described and send in the requested data.



Thanks for the heads-up. Please cc 39...@debbugs.gnu.org with any 
further info that you may provide.






bug#39273: unwanted behavior in the combination of an scenario regarding btrfs, ssh, borg, and 'df' from the core utils

2020-01-24 Thread Paul Eggert

On 1/24/20 11:50 AM, Wismerhill wrote:

'df' reports a wrong space calculation


What's wrong about the space calculation?

Please give the 'df' command that you ran, its faulty output, and also 
the output of 'strace' applied to the 'df' command that you ran. For 
example, on my machine, 'strace df' outputs the line:


statfs("/tmp", {f_type=TMPFS_MAGIC, f_bsize=4096, f_blocks=1018122, 
f_bfree=1007348, f_bavail=1007348, f_files=1018122, f_ffree=1018073, 
f_fsid={val=[0, 0]}, f_namelen=255, f_frsize=4096, 
f_flags=ST_VALID|ST_NOSUID|ST_NODEV}) = 0


and those are the numbers that 'df' uses to calculate what it should 
output. If those numbers are wrong, df's output will be wrong but it's 
not df's fault - it's the kernel or btrfs or whatever. If those numbers 
are right but df's output is wrong, then df is at fault.
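
To check the kernel's numbers against df's output by hand, the arithmetic can be sketched roughly as follows. The values are copied from the statfs line above; the variable names are illustrative, and the rounding here (percentage rounded up) is an assumption about df's behavior rather than a copy of its code.

```shell
# Values copied from the statfs line above.
f_frsize=4096 f_blocks=1018122 f_bfree=1007348 f_bavail=1007348

# Sizes in 1 KiB units, as in 'df -k'.
total_k=$(( f_blocks * f_frsize / 1024 ))
used_k=$(( (f_blocks - f_bfree) * f_frsize / 1024 ))
avail_k=$(( f_bavail * f_frsize / 1024 ))

# Share of the non-reserved space that is in use, rounded up.
pct=$(( (used_k * 100 + used_k + avail_k - 1) / (used_k + avail_k) ))

printf '1K-blocks=%s Used=%s Available=%s Use%%=%s%%\n' \
  "$total_k" "$used_k" "$avail_k" "$pct"
```

If the figures computed this way match what 'df -k' prints but still look wrong, the suspect is whatever filled in the statfs result, not df.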






bug#39236: [musl] coreutils cp mishandles error return from lchmod

2020-01-22 Thread Paul Eggert

On 1/22/20 7:08 AM, Florian Weimer wrote:

I think you misread what I wrote: lchmod *always* returns ENOSYS.  Even
if the file is not a symbolic link.  Likewise, fchmodat with
AT_SYMLINK_NOFOLLOW *always* returns ENOTSUP.


That's too bad, because coreutils (and many other applications, I 
expect) assumes that lchmod (and fchmodat with AT_SYMLINK_NOFOLLOW) 
acts like chmod except that it does not follow symlinks; applications 
rely on this to make it less likely that they will run afoul of a 
symlink race and chmod the wrong file. Isn't that how the Linux fstatat 
call behaves? And if so, why does glibc fstatat refuse to support this 
behavior?


To work around this bug, I suppose coreutils etc. should do something 
like the following:


1. Never use lchmod since the porting nightmare is bad enough without it.

2. On non-glibc systems (or glibc systems where the bug is fixed), use 
fchmodat with AT_SYMLINK_NOFOLLOW.


3. On glibc systems with the bug, use openat with O_NOFOLLOW and 
O_PATH, and then fchmod the resulting file descriptor.


Does this sound right? Or is there some O_PATH gotcha that I haven't 
thought about?


Come to think of it, perhaps the best thing would be to change Gnulib's 
lchmod and fchmodat modules so that they do what applications expect, 
even on buggy glibc systems. (Which would be ironic, since Gnulib's main 
goal is to put wrappers around other libraries so that they look more 
like glibc.)






bug#38627: uniq -c gets wrong count with non-ascii strings

2019-12-16 Thread Paul Eggert
On 12/15/19 11:40 AM, Roy Smith wrote:
> With the following input:
> 
>> $ cat x
>> "ⁿᵘˡˡ"
>> "ܥܝܪܐܩ"
> 
> 
> Running "uniq -c" says there's two copies of the same line!
> 
>> $ uniq -c x
>>   2 "ⁿᵘˡˡ"

Thanks for the bug report. I expect this is because GNU 'uniq' uses the
equivalent of strcoll (locale-dependent comparison) to compare lines, whereas
macOS 'uniq' uses the equivalent of strcmp (byte comparison). Since the two
lines compare equal in your locale, GNU 'uniq' says there's just one line.

The GNU 'uniq' behavior appears to be a consequence of this commit:

commit 545c2323d493c7ed9c770d9b8e45a15db6f615bc
Author: Jim Meyering 
Date:   Fri Aug 2 14:42:37 2002 +

with a change noted this way in NEWS:

* uniq now obeys the LC_COLLATE locale, as per POSIX 1003.1-2001 TC1.

However, the 2016 edition of POSIX removed mention of LC_COLLATE from 'uniq',
and I expect this means that the 2002 commit should be reverted so that GNU
'uniq' behaves like macOS 'uniq' (a behavior that I think makes more sense 
anyway).

I'll CC: this email to Jim Meyering to see whether he has an opinion about this.

In the meantime you can work around the problem by using 'LC_ALL=C uniq' instead
of plain 'uniq' in your shell script.
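
A minimal sketch of that workaround, using the two lines from the report; the scratch file and variable names are illustrative only.

```shell
# Scratch file holding the two distinct lines from the report above.
tmp=$(mktemp) || exit 1
printf '%s\n' '"ⁿᵘˡˡ"' '"ܥܝܪܐܩ"' > "$tmp"

# LC_ALL=C applies to this one command only and makes uniq compare
# lines byte by byte, so each distinct line is counted separately.
counts=$(LC_ALL=C uniq -c "$tmp")
printf '%s\n' "$counts"

rm -f "$tmp"
```

Setting LC_ALL on the command line rather than exporting it keeps the rest of the script in the user's locale.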





bug#38299: A bug while trying to decode a non encode base64

2019-11-20 Thread Paul Eggert

On 11/20/19 6:22 AM, Martin Schulte wrote:

vardhamanbn1 is a valid encoding


Thanks for explaining; closing the bug report.





bug#38168: shred vs. SSD

2019-11-11 Thread Paul Eggert
Thanks for mentioning this. I installed the attached patch to fix the problems 
that you mentioned, except that I didn't add a section on storage media, data 
remanence, and data forensics (partly because a lot of this stuff is secret).


If someone would like to contribute text in that area, it would be a good thing 
to have (if only to discourage even more users from using 'shred' :-). In the 
meantime I'll take the liberty of closing the bug report.
From adf41d7c1e8adf11857ee53d51419e218dcd8804 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 11 Nov 2019 16:52:47 -0800
Subject: [PATCH] shred: modernize documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* doc/coreutils.texi (shred invocation):
Modernize discussion to today’s technology (Bug#38168).
* src/shred.c (usage): Omit lengthy duplication of the manual’s
discussion of file systems and storage devices, as that became out
of sync with the manual.  Instead, just cite the manual.
---
 doc/coreutils.texi | 152 ++---
 src/shred.c|  42 ++---
 2 files changed, 93 insertions(+), 101 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index b552cc105..32ddba597 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9877,7 +9877,7 @@ by POSIX.
 
 @emph{Warning}: If you use @command{rm} to remove a file, it is usually
 possible to recover the contents of that file.  If you want more assurance
-that the contents are truly unrecoverable, consider using @command{shred}.
+that the contents are unrecoverable, consider using @command{shred}.
 
 The program accepts the following options.  Also see @ref{Common options}.
 
@@ -10019,51 +10019,46 @@ predates the development of the @code{getopt} standard syntax.
 @cindex erasing data
 
 @command{shred} overwrites devices or files, to help prevent even
-very expensive hardware from recovering the data.
-
-Ordinarily when you remove a file (@pxref{rm invocation}), the data is
-not actually destroyed.  Only the index listing where the file is
-stored is destroyed, and the storage is made available for reuse.
-There are undelete utilities that will attempt to reconstruct the index
-and can bring the file back if the parts were not reused.
-
-On a busy system with a nearly-full drive, space can get reused in a few
-seconds.  But there is no way to know for sure.  If you have sensitive
-data, you may want to be sure that recovery is not possible by actually
-overwriting the file with non-sensitive data.
-
-However, even after doing that, it is possible to take the disk back
-to a laboratory and use a lot of sensitive (and expensive) equipment
-to look for the faint ``echoes'' of the original data underneath the
-overwritten data.  If the data has only been overwritten once, it's not
-even that hard.
+extensive forensics from recovering the data.
+
+Ordinarily when you remove a file (@pxref{rm invocation}), its data
+and metadata are not actually destroyed.  Only the file's directory
+entry is removed, and the file's storage is reclaimed only when no
+process has the file open and no other directory entry links to the
+file.  And even if file's data and metadata's storage space is freed
+for further reuse, there are undelete utilities that will attempt to
+reconstruct the file from the data in freed storage, and that can
+bring the file back if the storage was not rewritten.
+
+On a busy system with a nearly-full device, space can get reused in a few
+seconds.  But there is no way to know for sure.  And although the
+undelete utilities and already-existing processes require insider or
+superuser access, you may be wary of the superuser,
+of processes running on your behalf, or of attackers
+that can physically access the storage device.  So if you have sensitive
+data, you may want to be sure that recovery is not possible
+by plausible attacks like these.
 
 The best way to remove something irretrievably is to destroy the media
 it's on with acid, melt it down, or the like.  For cheap removable media
-like floppy disks, this is the preferred method.  However, hard drives
-are expensive and hard to melt, so the @command{shred} utility tries
-to achieve a similar effect non-destructively.
-
-This uses many overwrite passes, with the data patterns chosen to
-maximize the damage they do to the old data.  While this will work on
-floppies, the patterns are designed for best effect on hard drives.
-For more details, see the source code and Peter Gutmann's paper
-@uref{https://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html,
-@cite{Secure Deletion of Data from Magnetic and Solid-State Memory}},
-from the proceedings of the Sixth USENIX Security Symposium (San Jose,
-California, July 22--25, 1996).
-
-@strong{Please note} that @command{shred} relies on a very important assumption:
-that the file system overwrites data in place.  This is the traditional
+this is often the preferred met
