bug#64392: cksum: escaping issues of --check output
On Sat, 2023-07-01 at 18:53 +0100, Pádraig Brady wrote: > That's not a common case I suppose, > so I'm amenable to using the consistent escaping here. Good :-) > Info docs already contain: > > "Without ‘--zero’, if FILE contains a backslash, newline, or carriage > return, the line is started with a backslash, and each problematic > character in the file name is escaped with a backslash, making the > output unambiguous even in the presence of arbitrary file names." Well yes, but that's in like the "common" section. Further down, for --tag, it's explicitly mentioned again there, that there's the escaping when \ is present as leading escaping indicator. For --untagged and --check there's no such further mentioning ... so at least it's a bit inconsistent... and could lead people to think it would happen only with --tag. Actually I'd even more "definitely" describe the escaping algorithm above, in the sense that any \ \r and \n are escaped, and that any other \-sequence (like \" \0 \xXX etc.) are explicitly reserved for future use. This especially in hindsight that other tools may also use the tagged/unttaged output formats and add their own add-ons assuming they're free to do so. Cheers, Chris.
bug#64392: cksum: escaping issues of --check output
On 01/07/2023 18:20, Christoph Anton Mitterer wrote: On Sat, 2023-07-01 at 17:07 +0100, Pádraig Brady wrote: Right. We traditionally didn't escape any chars in the --check output, but that changed with https://github.com/coreutils/coreutils/commit/646902b30 To minimize escaping, that patch only considered the '\n' character, but we should also have considered file names with a leading '\'. The attached should address this. Thanks, but wouldn't it be better to use exactly the same escaping as in the sums output? I.e. also escaping \r? Yes maybe. I was thinking this status output would be less likely to be persisted, and so would not need the same escaping requirements, but for consistency it may be better to have the same escaping rules, with the caveat that file names with a literal backslash anywhere would now be escaped. That's not a common case I suppose, so I'm amenable to using the consistent escaping here. Also, documenting the escaping behaviour in info/manpages? Info docs already contain: "Without ‘--zero’, if FILE contains a backslash, newline, or carriage return, the line is started with a backslash, and each problematic character in the file name is escaped with a backslash, making the output unambiguous even in the presence of arbitrary file names." cheers, Pádraig
bug#64392: cksum: escaping issues of --check output
Oh and I've seen you really escape \ only if it's the first character. Same here, I'd suggest to apply the same escaping rules as for the other output, and escape '\\' '\n' and '\r' as soon as any of them occurs in the output.
bug#64392: cksum: escaping issues of --check output
On Sat, 2023-07-01 at 17:07 +0100, Pádraig Brady wrote: > Right. We traditionally didn't escape any chars in the --check > output, > but that changed with > https://github.com/coreutils/coreutils/commit/646902b30 > To minimize escaping, that patch only considered the '\n' character, > but we should also have considered file names with a leading '\'. > > The attached should address this. Thanks, but wouldn't it be better to use exactly the same escaping as in the sums output? I.e. also escaping \r? Also, documenting the escaping behaviour in info/manpages? Cheers, Chris.
bug#64392: cksum: escaping issues of --check output
On 01/07/2023 01:10, Christoph Anton Mitterer wrote: Hey. It seems to me that the output of --check mode in cksum (and likely also in md5sum and friends) suffers from improper escaping (which, IIRC, is not even documented for that output... but may be wrong): $ touch a $'new\nline' '\n' z $ ls -al total 0 drwxr-xr-x 1 calestyo calestyo 24 Jul 1 02:01 . drwxr-xr-x 1 calestyo calestyo 176 Jul 1 01:48 .. -rw-r--r-- 1 calestyo calestyo 0 Jul 1 02:01 a -rw-r--r-- 1 calestyo calestyo 0 Jul 1 02:01 'new'$'\n''line' -rw-r--r-- 1 calestyo calestyo 0 Jul 1 02:01 z -rw-r--r-- 1 calestyo calestyo 0 Jul 1 02:01 '\n' $ cksum -a sha512 --tag * > sums.tagged $ cksum -c sums.tagged a: OK \n: OK \new\nline: OK z: OK Assuming the same rules for the --check output as for the sums files, a leading \ should serve as the escaping indicator. So for: \new\nline: OK that would be fine but for: \n: OK it's not but would rather need to be: \\\n: OK Right. We traditionally didn't escape any chars in the --check output, but that changed with https://github.com/coreutils/coreutils/commit/646902b30 To minimize escaping, that patch only considered the '\n' character, but we should also have considered file names with a leading '\'. The attached should address this. Marking this as done. thanks, PádraigFrom 7c7870a1f2a608aca1cff24f402e4a44358f08ce Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Sat, 1 Jul 2023 17:01:18 +0100 Subject: [PATCH] cksum: escape filenames with a leading '\' in --check status * src/digest.c (digest_check): Also escape in the case that the file name has a leading '\'. * tests/cksum/md5sum-bsd.sh: Add a test case. * NEWS: Mention the bug fix. Fixes https://bugs.gnu.org/64392 --- NEWS | 5 + src/digest.c | 4 +++- tests/cksum/md5sum-bsd.sh | 9 ++--- 3 files changed, 14 insertions(+), 4 deletions(-) diff --git a/NEWS b/NEWS index 3c134db52..c96b153fc 100644 --- a/NEWS +++ b/NEWS @@ -19,6 +19,11 @@ GNU coreutils NEWS-*- outline -*- cksum again diagnoses read errors in its default CRC32 mode. [bug introduced in coreutils-9.0] + 'cksum --check' now ensures filenames with a leading backslash character + are escaped appropriately in the status output. + This also applies to the standalone checksumming utilities. + [bug introduced in coreutils-8.25] + dd again supports more than two multipliers for numbers. Previously numbers of the form '1024x1024x32' gave "invalid number" errors. [bug introduced in coreutils-9.1] diff --git a/src/digest.c b/src/digest.c index 851c7d118..2d4e45b1e 100644 --- a/src/digest.c +++ b/src/digest.c @@ -1216,8 +1216,10 @@ digest_check (char const *checkfile_name) bool ok; bool missing; /* Only escape in the edge case producing multiple lines, + or starting with a literal '\' character, to ease automatic processing of status output. */ - bool needs_escape = ! status_only && strchr (filename, '\n'); + bool needs_escape = ! status_only && (*filename == '\\' +|| strchr (filename, '\n')); properly_formatted_lines = true; diff --git a/tests/cksum/md5sum-bsd.sh b/tests/cksum/md5sum-bsd.sh index 5fe9b2fc9..76c62814f 100755 --- a/tests/cksum/md5sum-bsd.sh +++ b/tests/cksum/md5sum-bsd.sh @@ -74,13 +74,16 @@ md5sum --strict -c check.md5 || fail=1 # with the GNU extension of escaped newlines nl=' ' -tab=' ' +t=' ' rm check.md5 -for i in 'a\b' 'a\' "a${nl}b" "a${tab}b"; do +for i in 'a\b' 'a\' '\a' "a${nl}b" "a${t}b"; do : > "$i" md5sum --tag "$i" >> check.md5 || fail=1 done -md5sum --strict -c check.md5 || fail=1 +md5sum --strict -c check.md5 > out || fail=1 +printf '%s: OK\n' 'a\b' 'a\' '\\\a' '\a\nb' "a${t}b" > exp || framework_failure_ +compare exp out || fail=1 + # Ensure BSD traditional format with GNU extension escapes # is in the expected format -- 2.41.0
bug#64390: cksum: cannot --check untagged sums file
On Sat, 2023-07-01 at 09:05 +0200, Andreas Schwab wrote: > $ cksum -a sha512 -c sums.untagged Haha... I'm so dumb. ^^ Thanks for pointing to the obvious. Thus closing. Cheers, Chris.
bug#64390: cksum: cannot --check untagged sums file
On Jul 01 2023, Christoph Anton Mitterer wrote: > $ cksum -c sums.untagged > cksum: sums.untagged: no properly formatted checksum lines found $ cksum -a sha512 -c sums.untagged -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."