bug#64392: cksum: escaping issues of --check output

2023-07-01 Thread Christoph Anton Mitterer
On Sat, 2023-07-01 at 18:53 +0100, Pádraig Brady wrote:
> That's not a common case I suppose,
> so I'm amenable to using the consistent escaping here.

Good :-)


> Info docs already contain:
> 
> "Without ‘--zero’, if FILE contains a backslash, newline, or carriage
> return, the line is started with a backslash, and each problematic
> character in the file name is escaped with a backslash, making the
> output unambiguous even in the presence of arbitrary file names."


Well yes, but that's in like the "common" section.

Further down, for --tag, it's explicitly mentioned again there, that
there's the escaping when \ is present as leading escaping indicator.

For --untagged and --check there's no such further mentioning ... so at
least it's a bit inconsistent... and could lead people to think it
would happen only with --tag.


Actually I'd even more "definitely" describe the escaping algorithm
above, in the sense that any \ \r and \n are escaped, and that any
other \-sequence (like \" \0 \xXX etc.) are explicitly reserved for
future use.
This especially in hindsight that other tools may also use the
tagged/unttaged output formats and add their own add-ons assuming
they're free to do so.


Cheers,
Chris.





bug#64392: cksum: escaping issues of --check output

2023-07-01 Thread Pádraig Brady

On 01/07/2023 18:20, Christoph Anton Mitterer wrote:

On Sat, 2023-07-01 at 17:07 +0100, Pádraig Brady wrote:

Right. We traditionally didn't escape any chars in the --check
output,
but that changed with
https://github.com/coreutils/coreutils/commit/646902b30
To minimize escaping, that patch only considered the '\n' character,
but we should also have considered file names with a leading '\'.

The attached should address this.


Thanks, but wouldn't it be better to use exactly the same escaping as
in the sums output? I.e. also escaping \r?


Yes maybe. I was thinking this status output would be less likely to be 
persisted,
and so would not need the same escaping requirements,
but for consistency it may be better to have the same escaping rules,
with the caveat that file names with a literal backslash anywhere
would now be escaped. That's not a common case I suppose,
so I'm amenable to using the consistent escaping here.


Also, documenting the escaping behaviour in info/manpages?


Info docs already contain:

"Without ‘--zero’, if FILE contains a backslash, newline, or carriage
return, the line is started with a backslash, and each problematic
character in the file name is escaped with a backslash, making the
output unambiguous even in the presence of arbitrary file names."

cheers,
Pádraig






bug#64392: cksum: escaping issues of --check output

2023-07-01 Thread Christoph Anton Mitterer
Oh and I've seen you really escape \ only if it's the first character.

Same here, I'd suggest to apply the same escaping rules as for the
other output, and escape '\\' '\n' and '\r' as soon as any of them
occurs in the output.





bug#64392: cksum: escaping issues of --check output

2023-07-01 Thread Christoph Anton Mitterer
On Sat, 2023-07-01 at 17:07 +0100, Pádraig Brady wrote:
> Right. We traditionally didn't escape any chars in the --check
> output,
> but that changed with
> https://github.com/coreutils/coreutils/commit/646902b30
> To minimize escaping, that patch only considered the '\n' character,
> but we should also have considered file names with a leading '\'.
> 
> The attached should address this.

Thanks, but wouldn't it be better to use exactly the same escaping as
in the sums output? I.e. also escaping \r?

Also, documenting the escaping behaviour in info/manpages?


Cheers,
Chris.





bug#64392: cksum: escaping issues of --check output

2023-07-01 Thread Pádraig Brady

On 01/07/2023 01:10, Christoph Anton Mitterer wrote:

Hey.

It seems to me that the output of --check mode in cksum (and likely
also in md5sum and friends) suffers from improper escaping (which,
IIRC, is not even documented for that output... but may be wrong):

$ touch a $'new\nline' '\n' z
$ ls -al
total 0
drwxr-xr-x 1 calestyo calestyo  24 Jul  1 02:01  .
drwxr-xr-x 1 calestyo calestyo 176 Jul  1 01:48  ..
-rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01  a
-rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01 'new'$'\n''line'
-rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01  z
-rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01 '\n'

$ cksum -a sha512 --tag * > sums.tagged



$ cksum -c sums.tagged
a: OK
\n: OK
\new\nline: OK
z: OK



Assuming the same rules for the --check output as for the sums files, a
leading \ should serve as the escaping indicator.

So for:
\new\nline: OK
that would be fine but for:
\n: OK
it's not but would rather need to be:
\\\n: OK


Right. We traditionally didn't escape any chars in the --check output,
but that changed with https://github.com/coreutils/coreutils/commit/646902b30
To minimize escaping, that patch only considered the '\n' character,
but we should also have considered file names with a leading '\'.

The attached should address this.

Marking this as done.

thanks,
PádraigFrom 7c7870a1f2a608aca1cff24f402e4a44358f08ce Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sat, 1 Jul 2023 17:01:18 +0100
Subject: [PATCH] cksum: escape filenames with a leading '\' in --check status

* src/digest.c (digest_check): Also escape in the case that the
file name has a leading '\'.
* tests/cksum/md5sum-bsd.sh: Add a test case.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/64392
---
 NEWS  | 5 +
 src/digest.c  | 4 +++-
 tests/cksum/md5sum-bsd.sh | 9 ++---
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 3c134db52..c96b153fc 100644
--- a/NEWS
+++ b/NEWS
@@ -19,6 +19,11 @@ GNU coreutils NEWS-*- outline -*-
   cksum again diagnoses read errors in its default CRC32 mode.
   [bug introduced in coreutils-9.0]
 
+  'cksum --check' now ensures filenames with a leading backslash character
+  are escaped appropriately in the status output.
+  This also applies to the standalone checksumming utilities.
+  [bug introduced in coreutils-8.25]
+
   dd again supports more than two multipliers for numbers.
   Previously numbers of the form '1024x1024x32' gave "invalid number" errors.
   [bug introduced in coreutils-9.1]
diff --git a/src/digest.c b/src/digest.c
index 851c7d118..2d4e45b1e 100644
--- a/src/digest.c
+++ b/src/digest.c
@@ -1216,8 +1216,10 @@ digest_check (char const *checkfile_name)
   bool ok;
   bool missing;
   /* Only escape in the edge case producing multiple lines,
+ or starting with a literal '\' character,
  to ease automatic processing of status output.  */
-  bool needs_escape = ! status_only && strchr (filename, '\n');
+  bool needs_escape = ! status_only && (*filename == '\\'
+|| strchr (filename, '\n'));
 
   properly_formatted_lines = true;
 
diff --git a/tests/cksum/md5sum-bsd.sh b/tests/cksum/md5sum-bsd.sh
index 5fe9b2fc9..76c62814f 100755
--- a/tests/cksum/md5sum-bsd.sh
+++ b/tests/cksum/md5sum-bsd.sh
@@ -74,13 +74,16 @@ md5sum --strict -c check.md5 || fail=1
 # with the GNU extension of escaped newlines
 nl='
 '
-tab='	'
+t='	'
 rm check.md5
-for i in 'a\b' 'a\' "a${nl}b" "a${tab}b"; do
+for i in 'a\b' 'a\' '\a' "a${nl}b" "a${t}b"; do
   : > "$i"
   md5sum --tag "$i" >> check.md5 || fail=1
 done
-md5sum --strict -c check.md5 || fail=1
+md5sum --strict -c check.md5 > out || fail=1
+printf '%s: OK\n' 'a\b' 'a\' '\\\a' '\a\nb' "a${t}b" > exp || framework_failure_
+compare exp out || fail=1
+
 
 # Ensure BSD traditional format with GNU extension escapes
 # is in the expected format
-- 
2.41.0



bug#64390: cksum: cannot --check untagged sums file

2023-07-01 Thread Christoph Anton Mitterer
On Sat, 2023-07-01 at 09:05 +0200, Andreas Schwab wrote:
> $ cksum -a sha512 -c sums.untagged

Haha... I'm so dumb. ^^
Thanks for pointing to the obvious.


Thus closing.

Cheers,
Chris.





bug#64390: cksum: cannot --check untagged sums file

2023-07-01 Thread Andreas Schwab
On Jul 01 2023, Christoph Anton Mitterer wrote:

> $ cksum -c sums.untagged 
> cksum: sums.untagged: no properly formatted checksum lines found

$ cksum -a sha512 -c sums.untagged

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."