On Sun, Nov 28, 2021 at 12:57:00PM -0600, David Wright wrote:
> I was taken by surprise by the following output from md5sum:
> $ echo special/*
> special/C:\nppdf32Log\debuglog.txt special/same-contents
> $ md5sum special/*
> \adfc1d2f1b1d6c7fcaa51e857c1a6f68  special/C:\\nppdf32Log\\debuglog.txt
> adfc1d2f1b1d6c7fcaa51e857c1a6f68  special/same-contents

Fun.

> I don't understand why it pollutes the first field in its output.

Well, it doesn't bother to *document* why it does this, so we can only
guess (or source-dive).

> I would have thought it sufficient to mangle the filename if it
> feels it has to (echo doesn't bother).

Perhaps it prepends the \ character to the output line to indicate to
whoever's reading this file (which may be md5sum itself, in --check mode)
that a filename mangling *has occurred* and needs to be accounted for.
Otherwise, how would the reader know whether the filename is actually
C:\\nppdf32Log\\debuglog.txt or C:\nppdf32Log\debuglog.txt

... and, upon further investigation, it turns out md5sum is part of GNU
coreutils.  Which means the man page that I've been reading *is not the
documentation*.  Fuckers.

In the blighted *info page*, there's this paragraph:

     For each FILE, ‘md5sum’ outputs by default, the MD5 checksum, a
  space, a flag indicating binary or text input mode, and the file name.
  Binary mode is indicated with ‘*’, text mode with ‘ ’ (space).  Binary
  mode is the default on systems where it’s significant, otherwise text
  mode is the default.  Without ‘--zero’, if FILE contains a backslash or
  newline, the line is started with a backslash, and each problematic
  character in the file name is escaped with a backslash, making the
  output unambiguous even in the presence of arbitrary file names.  If
  FILE is omitted or specified as ‘-’, standard input is read.
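
For what it's worth, here is a rough Python sketch of the rule that
paragraph describes, mostly to show why the leading backslash makes the
line reversible. It is not the coreutils code: the function names
(escape_name, format_line, parse_line) are made up for illustration, it
only handles the two characters the paragraph mentions (backslash and
newline), and it assumes text mode, i.e. two spaces between the checksum
and the file name.

```python
import hashlib


def escape_name(name):
    """Return (escaped_name, was_escaped) per the quoted info paragraph."""
    if "\\" not in name and "\n" not in name:
        return name, False
    # Escape backslashes first, so the backslash added for a newline
    # is not doubled afterwards.
    escaped = name.replace("\\", "\\\\").replace("\n", "\\n")
    return escaped, True


def format_line(name, data):
    """Build one md5sum-style output line for file `name` with contents `data`."""
    digest = hashlib.md5(data).hexdigest()
    escaped, was_escaped = escape_name(name)
    # The leading backslash is the flag saying "this name was mangled".
    prefix = "\\" if was_escaped else ""
    return f"{prefix}{digest}  {escaped}"


def parse_line(line):
    """Inverse of format_line: return (digest, original_name)."""
    was_escaped = line.startswith("\\")
    if was_escaped:
        line = line[1:]
    # 32 hex digits, then a space and the mode flag (a space in text mode).
    digest, name = line[:32], line[34:]
    if not was_escaped:
        # No flag: backslashes in the name (there can't be any) are literal.
        return digest, name
    out = []
    i = 0
    while i < len(name):
        if name[i] == "\\" and i + 1 < len(name):
            nxt = name[i + 1]
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(name[i])
            i += 1
    return digest, "".join(out)


if __name__ == "__main__":
    # Hypothetical file contents; only the names mirror the thread.
    for name in ("special/C:\\nppdf32Log\\debuglog.txt", "special/same-contents"):
        line = format_line(name, b"example contents\n")
        print(line)
        assert parse_line(line)[1] == name
```

With the flag in place, a reader seeing a doubled backslash in a name
knows it stands for a single one; without it, a line has no backslash
in the name at all, so nothing needs undoing.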