On 08/23/2012 10:36 AM, Pádraig Brady wrote:
On 08/23/2012 08:55 AM, Ondrej Oprala wrote:
On 08/22/2012 06:38 PM, Jim Meyering wrote:
Eric Blake wrote:
On 08/22/2012 10:05 AM, Pádraig Brady wrote:
On 08/22/2012 03:00 PM, Ondrej Oprala wrote:
Hi, I haven't heard from this thread since I posted the last patch,
are there still things to correct or should I consider it closed?
The --tag description in the texinfo is a bit sparse. It says:
"If the file is not binary, put a leading whitespace before the
algorithm's name."
More accurately it could say:
"When operating in @option{--text} mode, put a leading space before
the algorithm's name.
On systems where @option{--text} mode is significant, this keeps
compatibility with existing BSD (binary) checksums while still
allowing operation in @option{--text} mode."
It's worth considering, though: do we want to support --text mode
at all with --tag? I.e., we could avoid the default leading space,
and have --tag imply --binary and be mutually exclusive with --text.
I'm fine with --tag implying --binary. --text mode is almost always the
wrong thing to use, and it should have never been the default on
text-mode systems, nor should it be the default output (we're stuck with
that for normal output, where the presence of '*' to indicate binary
output, but only from systems where text mode differs, causes no end of
grief; but for BSD output we don't have to repeat the mistake).
I'm 60:40 for having --tag imply --binary given the above.
I'm 90:10 for having --tag imply --binary.
I agree wholeheartedly. Simpler is better.
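For reference, the two output styles under discussion can be sketched
as follows. This is only an illustration, not code from the patch:
format_sum_line and its parameters are hypothetical names, and the file
name is assumed to contain no backslash or newline, so escaping is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Format one checksum line into BUF (of size BUF_LEN).
   ALGO, DIGEST_HEX and NAME are illustrative inputs; NAME is assumed
   to need no escaping.  Return what snprintf returns.  */
static int
format_sum_line (char *buf, size_t buf_len, char const *algo,
                 char const *digest_hex, char const *name,
                 bool tag, bool binary)
{
  assert (buf != NULL && buf_len > 0);

  if (tag)
    /* BSD style: "ALGO (NAME) = DIGEST", with no text/binary marker.  */
    return snprintf (buf, buf_len, "%s (%s) = %s", algo, name, digest_hex);

  /* Default GNU style: DIGEST, a space, then ' ' for text mode
     or '*' for binary mode, then NAME.  */
  return snprintf (buf, buf_len, "%s %c%s", digest_hex,
                   binary ? '*' : ' ', name);
}
```

With --tag implying --binary, the BSD-style line never needs a
text/binary marker, which is why the leading-space idea can be dropped.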
Ok, I'm almost done with the changes, I just have one question.
What should the sum utils do if they get --tag --text as arguments?
Would it be possible to add a bool TEXT and exit with an error
if (text && prefix_tag)?
Yes, if both are specified it should do something like:
  if (prefix_tag && text)
    {
      /* This could be supported in a backwards compatible way
         by prefixing the output line with a space in text mode.
         However that's invasive enough that it was agreed to
         not support this mode with --tag, as --text use cases
         are adequately supported by the default output format. */
      error (0, 0, _("--tag does not support --text mode"));
      usage (EXIT_FAILURE);
    }
thanks!
Pádraig.
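The point of doing the check after option parsing, rather than inside
the option switch, is that it catches the conflict in either order. A
rough standalone sketch of that idea follows. It is an assumption-laden
stand-in: the real utilities use getopt_long, while this version
compares argv strings directly, and struct sum_mode and parse_mode are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the flags discussed above.  */
struct sum_mode
{
  bool prefix_tag;   /* --tag was given */
  bool text;         /* -t or --text was given */
  int binary;        /* -1 = unset, 0 = text, 1 = binary */
};

/* Scan ARGV for the mode options.  Return true if the combination is
   usable, false if both --tag and --text were requested.  This is a
   simplified stand-in for the getopt_long loop in md5sum.c.  */
static bool
parse_mode (int argc, char const *const *argv, struct sum_mode *mode)
{
  assert (mode != NULL);
  mode->prefix_tag = false;
  mode->text = false;
  mode->binary = -1;

  for (int i = 1; i < argc; i++)
    {
      if (strcmp (argv[i], "--tag") == 0)
        {
          mode->prefix_tag = true;
          mode->binary = 1;          /* --tag implies binary mode */
        }
      else if (strcmp (argv[i], "-t") == 0
               || strcmp (argv[i], "--text") == 0)
        {
          mode->text = true;
          mode->binary = 0;
        }
      else if (strcmp (argv[i], "-b") == 0
               || strcmp (argv[i], "--binary") == 0)
        mode->binary = 1;
    }

  /* Checked after the loop, so "--tag -t" and "-t --tag" both fail.  */
  return !(mode->prefix_tag && mode->text);
}
```

A caller would print the diagnostic and call usage (EXIT_FAILURE) when
this returns false.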
Alright then, I've redone the tests, made the NEWS and coreutils.texi
entries a bit more verbose, reworked the utils to operate in binary mode
when --tag is specified, and added the check for the case where both
--tag and --text are given as arguments.
Cheers,
Ondrej.
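As a quick illustration of the file name escaping that --tag output
shares with the default format, here is a simplified round trip. It is
a sketch under assumptions, not the patch's code: escape_name and
unescape_name are hypothetical helpers that work on NUL-terminated
strings and a caller-supplied buffer, whereas the patch prints directly
to stdout and unescapes a length-delimited buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Write NAME to OUT (of size OUT_LEN) with each newline turned into
   the two characters '\' 'n' and each backslash doubled.
   Return false if OUT is too small.  */
static bool
escape_name (char const *name, char *out, size_t out_len)
{
  size_t o = 0;

  assert (name != NULL && out != NULL);
  if (out_len == 0)
    return false;

  for (; *name; name++)
    {
      bool special = (*name == '\n' || *name == '\\');
      /* Leave room for this character and the final NUL.  */
      if (o + (special ? 2 : 1) >= out_len)
        return false;
      if (special)
        {
          out[o++] = '\\';
          out[o++] = (*name == '\n') ? 'n' : '\\';
        }
      else
        out[o++] = *name;
    }
  out[o] = '\0';
  return true;
}

/* Undo escape_name in place.  Return false if S ends with a lone
   backslash or has a backslash followed by anything but 'n' or '\'.  */
static bool
unescape_name (char *s)
{
  char *dst = s;

  assert (s != NULL);
  for (char const *src = s; *src; src++)
    {
      if (*src != '\\')
        {
          *dst++ = *src;
          continue;
        }
      src++;                    /* look at the escaped character */
      if (*src == 'n')
        *dst++ = '\n';
      else if (*src == '\\')
        *dst++ = '\\';
      else
        return false;           /* also covers a trailing backslash */
    }
  *dst = '\0';
  return true;
}
```

When a name needs escaping, the output line additionally starts with a
'\' so that a checker knows to unescape it.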
From 8f4858cb7f190099293dce31535d752362487443 Mon Sep 17 00:00:00 2001
From: Ondrej Oprala <[email protected]>
Date: Thu, 2 Aug 2012 13:31:50 +0200
Subject: [PATCH] md5sum,sha1sum,sha224sum,sha256sum,sha384sum,sha512sum: add
--tag option
* NEWS: Add new feature info.
* doc/coreutils.texi (md5sum invocation): Add detailed information
about the new --tag option.
* src/md5sum.c: Add the new --tag option for BSD-style output.
(bsd_split_3): Add ESCAPED_FILENAME parameter.
(print_filename): New function.
(filename_unescape): New function.
* tests/misc/md5sum-bsd: Add tests for the new feature.
---
NEWS | 4 +
doc/coreutils.texi | 10 +++
src/md5sum.c | 214 ++++++++++++++++++++++++++++++++------------------
tests/misc/md5sum-bsd | 33 ++++++++
4 files changed, 183 insertions(+), 78 deletions(-)
diff --git a/NEWS b/NEWS
index d8a47ab..b96e7d3 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,10 @@ GNU coreutils NEWS -*- outline -*-
** Bug fixes
+ md5sum now accepts the --tag option to print BSD-style output with GNU
+ file name escaping. This also affects sha1sum, sha224sum, sha256sum,
+ sha384sum and sha512sum.
+
du no longer emits a "disk-corrupted"-style diagnostic when it detects
a directory cycle that is due to a bind-mounted directory. Instead,
it detects this precise type of cycle, diagnoses it as such and
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 62b31fe..0710fce 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -3705,6 +3705,16 @@ If all listed files are readable and are consistent with the associated
MD5 checksums, exit successfully. Otherwise exit with a status code
indicating there was a failure.
+@item --tag
+@opindex --tag
+@cindex BSD output
+Create BSD-style output. If the file name contains a '\' or a newline,
+put a '\' at the beginning of the corresponding line. File names containing
+these characters are also escaped just as they would be without
+@option{--tag}. The @option{--tag} option implies binary mode. If both
+@option{--tag} and @option{--text} are specified, issue a diagnostic and
+exit nonzero.
+
@item -t
@itemx --text
@opindex -t
diff --git a/src/md5sum.c b/src/md5sum.c
index f7e0849..85c96b5 100644
--- a/src/md5sum.c
+++ b/src/md5sum.c
@@ -135,7 +135,8 @@ enum
{
STATUS_OPTION = CHAR_MAX + 1,
QUIET_OPTION,
- STRICT_OPTION
+ STRICT_OPTION,
+ TAG_OPTION
};
static struct option const long_options[] =
@@ -147,6 +148,7 @@ static struct option const long_options[] =
{ "text", no_argument, NULL, 't' },
{ "warn", no_argument, NULL, 'w' },
{ "strict", no_argument, NULL, STRICT_OPTION },
+ { "tag", no_argument, NULL, TAG_OPTION },
{ GETOPT_HELP_OPTION_DECL },
{ GETOPT_VERSION_OPTION_DECL },
{ NULL, 0, NULL, 0 }
@@ -179,6 +181,9 @@ With no FILE, or when FILE is -, read standard input.\n\
printf (_("\
-c, --check read %s sums from the FILEs and check them\n"),
DIGEST_TYPE_STRING);
+ fputs (_("\
+ --tag create a BSD-style checksum\n\
+"), stdout);
if (O_BINARY)
fputs (_("\
-t, --text read in text mode (default if reading tty stdin)\n\
@@ -215,23 +220,72 @@ space for text), and name for each FILE.\n"),
#define ISWHITE(c) ((c) == ' ' || (c) == '\t')
+/* Given a file name, S of length S_LEN, that is not NUL-terminated,
+ modify it in place, performing the equivalent of this sed substitution:
+ 's/\\n/\n/g;s/\\\\/\\/g' i.e., replacing each "\\n" string with a newline
+ and each "\\\\" with a single backslash, NUL-terminate it and return S.
+ If S is not a valid escaped file name, i.e., if it ends with an odd number
+ of backslashes or if it contains a backslash followed by anything other
+ than "n" or another backslash, return NULL. */
+
+static char *
+filename_unescape (char *s, size_t s_len)
+{
+ char *dst = s;
+
+ for (size_t i = 0; i < s_len; i++)
+ {
+ switch (s[i])
+ {
+ case '\\':
+ if (i == s_len - 1)
+ {
+ /* File name ends with an unescaped backslash: invalid. */
+ return NULL;
+ }
+ ++i;
+ switch (s[i])
+ {
+ case 'n':
+ *dst++ = '\n';
+ break;
+ case '\\':
+ *dst++ = '\\';
+ break;
+ default:
+ /* Only '\' or 'n' may follow a backslash. */
+ return NULL;
+ }
+ break;
+
+ case '\0':
+ /* The file name may not contain a NUL. */
+ return NULL;
+
+ default:
+ *dst++ = s[i];
+ break;
+ }
+ }
+ *dst = '\0';
+
+ return s;
+}
+
/* Split the checksum string S (of length S_LEN) from a BSD 'md5' or
'sha1' command into two parts: a hexadecimal digest, and the file
name. S is modified. Return true if successful. */
static bool
bsd_split_3 (char *s, size_t s_len, unsigned char **hex_digest,
- char **file_name)
+ char **file_name, bool escaped_filename)
{
size_t i;
if (s_len == 0)
return false;
- *file_name = s;
-
- /* Find end of filename. The BSD 'md5' and 'sha1' commands do not escape
- filenames, so search backwards for the last ')'. */
+ /* Find end of filename. */
i = s_len - 1;
while (i && s[i] != ')')
i--;
@@ -239,6 +293,11 @@ bsd_split_3 (char *s, size_t s_len, unsigned char **hex_digest,
if (s[i] != ')')
return false;
+ *file_name = s;
+
+ if (escaped_filename && filename_unescape (s, i) == NULL)
+ return false;
+
s[i++] = '\0';
while (ISWHITE (s[i]))
@@ -271,7 +330,14 @@ split_3 (char *s, size_t s_len,
while (ISWHITE (s[i]))
++i;
+ if (s[i] == '\\')
+ {
+ ++i;
+ escaped_filename = true;
+ }
+
/* Check for BSD-style checksum line. */
+
algo_name_len = strlen (DIGEST_TYPE_STRING);
if (STREQ_LEN (s + i, DIGEST_TYPE_STRING, algo_name_len))
{
@@ -282,7 +348,7 @@ split_3 (char *s, size_t s_len,
*binary = 0;
return bsd_split_3 (s + i + algo_name_len + 1,
s_len - (i + algo_name_len + 1),
- hex_digest, file_name);
+ hex_digest, file_name, escaped_filename);
}
}
@@ -293,11 +359,6 @@ split_3 (char *s, size_t s_len,
if (s_len - i < min_digest_line_length + (s[i] == '\\'))
return false;
- if (s[i] == '\\')
- {
- ++i;
- escaped_filename = true;
- }
*hex_digest = (unsigned char *) &s[i];
/* The first field has to be the n-character hexadecimal
@@ -333,49 +394,8 @@ split_3 (char *s, size_t s_len,
*file_name = &s[i];
if (escaped_filename)
- {
- /* Translate each '\n' string in the file name to a NEWLINE,
- and each '\\' string to a backslash. */
+ return filename_unescape (&s[i], s_len - i) != NULL;
- char *dst = &s[i];
-
- while (i < s_len)
- {
- switch (s[i])
- {
- case '\\':
- if (i == s_len - 1)
- {
- /* A valid line does not end with a backslash. */
- return false;
- }
- ++i;
- switch (s[i++])
- {
- case 'n':
- *dst++ = '\n';
- break;
- case '\\':
- *dst++ = '\\';
- break;
- default:
- /* Only '\' or 'n' may follow a backslash. */
- return false;
- }
- break;
-
- case '\0':
- /* The file name may not contain a NUL. */
- return false;
- break;
-
- default:
- *dst++ = s[i++];
- break;
- }
- }
- *dst = '\0';
- }
return true;
}
@@ -636,6 +656,31 @@ digest_check (const char *checkfile_name)
&& (!strict || n_improperly_formatted_lines == 0));
}
+static void
+print_filename (char const *file)
+{
+ /* Translate each NEWLINE byte to the string, "\\n",
+ and each backslash to "\\\\". */
+ while (*file)
+ {
+ switch (*file)
+ {
+ case '\n':
+ fputs ("\\n", stdout);
+ break;
+
+ case '\\':
+ fputs ("\\\\", stdout);
+ break;
+
+ default:
+ putchar (*file);
+ break;
+ }
+ file++;
+ }
+}
+
int
main (int argc, char **argv)
{
@@ -646,6 +691,8 @@ main (int argc, char **argv)
int opt;
bool ok = true;
int binary = -1;
+ bool prefix_tag = false;
+ bool text = false;
/* Setting values of global variables. */
initialize_main (&argc, &argv);
@@ -675,6 +722,7 @@ main (int argc, char **argv)
quiet = false;
break;
case 't':
+ text = true;
binary = 0;
break;
case 'w':
@@ -690,6 +738,10 @@ main (int argc, char **argv)
case STRICT_OPTION:
strict = true;
break;
+ case TAG_OPTION:
+ prefix_tag = true;
+ binary = 1;
+ break;
case_GETOPT_HELP_CHAR;
case_GETOPT_VERSION_CHAR (PROGRAM_NAME, AUTHORS);
default:
@@ -734,6 +786,17 @@ main (int argc, char **argv)
usage (EXIT_FAILURE);
}
+ if (text && prefix_tag)
+ {
+ /* This could be supported in a backwards compatible way
+ by prefixing the output line with a space in text mode.
+ However that's invasive enough that it was agreed to
+ not support this mode with --tag, as --text use cases
+ are adequately supported by the default output format. */
+ error (0, 0, _("--tag does not support --text mode"));
+ usage (EXIT_FAILURE);
+ }
+
if (!O_BINARY && binary < 0)
binary = 0;
@@ -754,41 +817,36 @@ main (int argc, char **argv)
ok = false;
else
{
+ if (prefix_tag)
+ {
+ if (strchr (file, '\n') || strchr (file, '\\'))
+ putchar ('\\');
+
+ fputs (DIGEST_TYPE_STRING, stdout);
+ fputs (" (", stdout);
+ print_filename (file);
+ fputs (") = ", stdout);
+ }
+
size_t i;
/* Output a leading backslash if the file name contains
a newline or backslash. */
- if (strchr (file, '\n') || strchr (file, '\\'))
+ if (!prefix_tag && (strchr (file, '\n') || strchr (file, '\\')))
putchar ('\\');
for (i = 0; i < (digest_hex_bytes / 2); ++i)
printf ("%02x", bin_buffer[i]);
- putchar (' ');
- if (file_is_binary)
- putchar ('*');
- else
- putchar (' ');
-
- /* Translate each NEWLINE byte to the string, "\\n",
- and each backslash to "\\\\". */
- for (i = 0; i < strlen (file); ++i)
+ if (!prefix_tag)
{
- switch (file[i])
- {
- case '\n':
- fputs ("\\n", stdout);
- break;
-
- case '\\':
- fputs ("\\\\", stdout);
- break;
-
- default:
- putchar (file[i]);
- break;
- }
+ putchar (' ');
+
+ putchar (file_is_binary ? '*' : ' ');
+
+ print_filename (file);
}
+
putchar ('\n');
}
}
diff --git a/tests/misc/md5sum-bsd b/tests/misc/md5sum-bsd
index 8226d7a..f29eabb 100755
--- a/tests/misc/md5sum-bsd
+++ b/tests/misc/md5sum-bsd
@@ -38,4 +38,37 @@ md5sum --strict -c check.md5 || fail=1
# an option to avoid the ambiguity.
tail -n+2 check.md5 | md5sum --strict -c && fail=1
+#--tag option test
+
+for i in 'a' ' b' '*c' 'dd' ' '; do
+ echo "$i" > "$i"
+ md5sum --tag "$i" >> check.md5sum
+done
+sed 's/ / /' check.md5sum > check.md5
+
+md5sum --strict -c check.md5sum || fail=1
+md5sum --strict -c check.md5 || fail=1
+
+#--tag testing filenames with \t \n and trailing \
+
+nl='
+'
+tab=' '
+for i in 'a\b' 'a\' "a${nl}b" "a${tab}b"; do
+ :> "$i"
+ md5sum --tag "$i" >> check.md5sum
+done
+
+md5sum --strict -c check.md5sum || fail=1
+
+ex_file='test
+\\file'
+ex_output='\MD5 (test\n\\\\file) = d41d8cd98f00b204e9800998ecf8427e'
+
+touch "$ex_file"
+echo -E "$ex_output" > exp
+md5sum --tag "$ex_file" > out
+
+compare exp out || fail=1
+
Exit $fail
--
1.7.11.4