[PATCH] mktemp: cut the output info when suffix contains XXX.
* src/mktemp.c: When --suffix was supply in cmdline, the template points to the meraged buffer(dest_name). As X is counted from template which not contains suffix, So the output should be cut. Signed-off-by: lvgenggeng --- src/mktemp.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/mktemp.c b/src/mktemp.c index 5f8e5155d..d194aeff5 100644 --- a/src/mktemp.c +++ b/src/mktemp.c @@ -156,7 +156,7 @@ main (int argc, char **argv) int status = EXIT_SUCCESS; size_t x_count; size_t suffix_len; - char *dest_name; + char *dest_name = nullptr; initialize_main (, ); set_program_name (argv[0]); @@ -255,7 +255,12 @@ main (int argc, char **argv) } x_count = count_consecutive_X_s (template, suffix - template); if (x_count < 3) -error (EXIT_FAILURE, 0, _("too few X's in template %s"), quote (template)); +{ + /* when suffix was appended to template, template should be cut. */ + if (template == dest_name) +template[suffix - template] = '\0'; + error (EXIT_FAILURE, 0, _("too few X's in template %s"), quote (template)); +} if (use_dest_dir) { -- 2.20.1
bug#69418: test failure when no french locale is installed
Testing the current coreutils git master: On a Debian 12 system, in which I have not installed a French UTF-8 locale, I see a test failure of tests/misc/join-utf8. The essential lines from test-suite.log: + test set = set + LC_ALL=none ../tests/misc/join-utf8.sh: line 24: warning: setlocale: LC_ALL: cannot change locale (none): No such file or directory The cause is that on such a system, LOCALE_FR_UTF8, as determined by gnulib/m4/locale-fr.m4, is 'none', not empty or absent. The attached patch fixes the failure. >From 74be78b1efba477016c153e0eacc93a3c661e748 Mon Sep 17 00:00:00 2001 From: Bruno Haible Date: Mon, 26 Feb 2024 22:33:18 +0100 Subject: [PATCH] join: Avoid test failure on systems which have no French UTF-8 locale * tests/misc/join-utf8.sh: Test the value of LOCALE_FR_UTF8 against 'none', not against a missing value. --- tests/misc/join-utf8.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/misc/join-utf8.sh b/tests/misc/join-utf8.sh index 2dbe2bb81..9af9e55ce 100755 --- a/tests/misc/join-utf8.sh +++ b/tests/misc/join-utf8.sh @@ -19,7 +19,7 @@ . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src print_ver_ join -test "${LOCALE_FR_UTF8+set}" = set || skip_ "French UTF-8 locale not available" +test "$LOCALE_FR_UTF8" != none || skip_ "French UTF-8 locale not available" LC_ALL=$LOCALE_FR_UTF8 export LC_ALL -- 2.34.1
Re: sort dynamic linking overhead
On 26/02/2024 18:06, Andreas Schwab wrote: On Feb 26 2024, Pádraig Brady wrote: https://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3bbdb3938 It's still bad as it adds a hidden dependency that is invisible to rpm. Right. In practice though since coreutils also links libcrypto for cksum and the separate digest utilities, this should be OK. In edge cases with a separate package per util, the dependency can be added manually. Also, the regexp in the sed command contains unquoted uses of '.' that are supposed to be matched literally. Fixed. thanks, Pádraig
Re: sort dynamic linking overhead
On 2024-02-26 06:12, Pádraig Brady wrote: On 26/02/2024 06:44, Yann Collet wrote: * xxhash128 is not a cryptographic hash function, so it doesn't attempt tobe random. Just a correction : xxh128 does try to be random. And quite hardly: a significant amount of development is spent on ensuring this property. It’s even tested with PractRand, and it could be used as a good random number generator. Being non-cryptographic means that what it doesn’t try is to make sure no one can intentionally forge a hash collision from 2 different files (other than brute-forcing, which is impractical). But that’s different, and I wouldn’t call this property “randomness”, even though randomness is a pre-requisite (but not sufficient in itself) to collision resistance. Fair enough, I should have called it a different name since it's not really random. However, 'sort -R' does have problems when hashes have collisions, as it falls back on ordinary comparisons and thus ceases to be a "random" sort, so collision resistance is a good property to have if 'sort -R' is given adversarial input. md5 shouldn't be considered as cryptographic anyway since it's broken. Although MD5 is broken as a hash for a string, it's not clear to me that it's broken in the way that GNU 'sort -R' uses MD5. GNU 'sort -R' uses a random salt, does not report hashes to the user, and does not output anything until it's read all its input. It seems to me that breaking gnu 'sort -R' would be harder than breaking HMAC-MD5, which itself is significantly harder than breaking MD5. That being said, if MD5 is broken for GNU 'sort' then we should use a better hash.
Re: sort dynamic linking overhead
On Feb 26 2024, Pádraig Brady wrote: > https://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3bbdb3938 It's still bad as it adds a hidden dependency that is invisible to rpm. Also, the regexp in the sed command contains unquoted uses of '.' that are supposed to be matched literally. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
Re: sort dynamic linking overhead
On 26/02/2024 17:39, Bruno Haible wrote: Pádraig Brady wrote: + void *handle = dlopen ("libcrypto.so", RTLD_LAZY | RTLD_GLOBAL); That only works if libopenssl-devel is installed. Good spot. I'd already pushed a fix for this at: https://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3bbdb3938 Does this work for all the various names of libcrypto in various distros? Debian 12 libcrypto.so.3 Ubuntu 22.04libcrypto.so.1.1 libcrypto.so.3 Slackware 15libcrypto.so.1.1 openSUSE 15.5 libcrypto.so.1.1 CentOS Stream 9 libcrypto.so.3 Guix 1.4libcrypto.so.1.1 Alpine 3.19 libcrypto.so.3 FreeBSD 14.0libcrypto.so.38 NetBSD 9.3 libcrypto.so.14 OpenBSD 7.4 libcrypto.so.52.0 I only tested with libcrypto.so.3, but it should match all of the above. It matches libcrypto.so.[.0-9]* cheers, Pádraig
Re: sort dynamic linking overhead
Pádraig Brady wrote: > >> + void *handle = dlopen ("libcrypto.so", RTLD_LAZY | RTLD_GLOBAL); > > > > That only works if libopenssl-devel is installed. > > Good spot. > I'd already pushed a fix for this at: > https://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3bbdb3938 Does this work for all the various names of libcrypto in various distros? Debian 12 libcrypto.so.3 Ubuntu 22.04libcrypto.so.1.1 libcrypto.so.3 Slackware 15libcrypto.so.1.1 openSUSE 15.5 libcrypto.so.1.1 CentOS Stream 9 libcrypto.so.3 Guix 1.4libcrypto.so.1.1 Alpine 3.19 libcrypto.so.3 FreeBSD 14.0libcrypto.so.38 NetBSD 9.3 libcrypto.so.14 OpenBSD 7.4 libcrypto.so.52.0 Bruno
Re: sort dynamic linking overhead
On 26/02/2024 17:11, Andreas Schwab wrote: On Feb 25 2024, Paul Eggert wrote: +/* Dynamically link the crypto library, if it needs linking. */ +static void +link_libcrypto (void) +{ +#if DLOPEN_LIBCRYPTO && HAVE_OPENSSL_MD5 + void *handle = dlopen ("libcrypto.so", RTLD_LAZY | RTLD_GLOBAL); That only works if libopenssl-devel is installed. Good spot. I'd already pushed a fix for this at: https://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3bbdb3938 thanks, Pádraig
Re: sort dynamic linking overhead
On 26/02/2024 06:25, Paul Eggert wrote: On 2023-10-09 06:48, Pádraig Brady wrote: An incremental patch attached to use xxhash128 (0.8.2) shows a good improvement (note avx2 being used on this cpu): xxhash128 is not a cryptographic hash function, so it doesn't attempt to be random. Of course most people won't care - it's random "enough" - but it would be a functionality change. blake2 is cryptographic and would be random, but would bloat the 'sort' executable with code that's hardly ever used. To attack the problem in a more conservative way, I installed the attached patch into coreutils. With it, 'sort -R' continues to use MD5 but on GNUish platforms 'sort' links libcrypto dynamically only if -R is used (Bruno's suggestion). This doesn't significantly affect 'sort -R' performance, and reduces the startup overhead of plain 'sort' to be what it was before we started passing -lcrypto to gcc by default (in coreutils 8.32). I also toyed with changing MD5 to SHA512, but that hurt performance. For what it's worth, although I tested with an Intel Xeon W-1350, which supports SHA-NI as well as various AVX-512 options, I didn't see where libcrypto (at least on Ubuntu 23.10, which has OpenSSL 3.0.10) takes advantage of these special-purpose instructions. I noticed that the dlopen() used an unversioned lib, which generally won't work as that's what's available at build time, not at run time. Attached is a fix. Note that also includes a comment for why we check for SHA512 and not MD5 (to avoid deprecation warnings). Also attached is a fix to a sc_tight_scope failure. cheers, PádraigFrom 09fed593048f744f5a76e23e58b232fd8922bf2e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Mon, 26 Feb 2024 14:42:40 + Subject: [PATCH 1/2] maint: avoid sc_tight_scope failure in sort.c * cfg.mk: Exclude the ptr_MD5_* symbols added in commit v9.4-130-g7f57ac2d2, as there is no way to declare these static given they way they're defined. --- cfg.mk | 3 +++ 1 file changed, 3 insertions(+) diff --git a/cfg.mk b/cfg.mk index c0a131b2b..7da1a4b4a 100644 --- a/cfg.mk +++ b/cfg.mk @@ -927,6 +927,9 @@ _gl_TS_other_headers = $(srcdir)/src/*.h src/*.h # Normally, the rule would detect its declaration, but that uses a # different name, __clz_tab. _gl_TS_unmarked_extern_vars = factor_clz_tab +# Avoid tight_scope rule stating these should be static +# as there is no way to achieve that with the way these are defined. +_gl_TS_unmarked_extern_vars += ptr_MD5_.* # Other tight_scope settings _gl_TS_dir = . _gl_TS_obj_files = src/*.$(OBJEXT) -- 2.43.0 From e8d92ab0aa57fa79afed52ecfebfb2d3a0f6c3ea Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Mon, 26 Feb 2024 16:38:41 + Subject: [PATCH 2/2] build: fix libcrypto version linked by sort at runtime One should link the versioned lib at runtime, and the unversioned lib at build time, as the unversioned lib may not be installed, and better couples the binary with the required version. * configure.ac: Define LIBCRYPTO_SONAME, determined from the test binary linked with -lcrypto. Also document why we use SHA512() in the check, rather than MD5(). * src/sort.c (link_libcrypto): Use the versioned lib in dlopen(). --- configure.ac | 13 ++--- src/sort.c | 2 +- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/configure.ac b/configure.ac index 043081b90..eb6c33aa4 100644 --- a/configure.ac +++ b/configure.ac @@ -365,12 +365,16 @@ AS_CASE([$LIB_CRYPTO], [AC_LANG_PROGRAM( [[#include #include + /* Use SHA512 rather than MD5 here to avoid deprecation warnings. + So need to check HAVE_OPENSSL_MD5.. with DLOPEN_LIBCRYPTO. */ ]], [[return !(dlopen ("libcrypto.so", RTLD_LAZY | RTLD_GLOBAL) && SHA512 (0, 0, 0));]])], [# readelf works with cross-builds; ldd works on more platforms. - AS_CASE([`(readelf -d conftest$EXEEXT || ldd conftest$EXEEXT - ) 2>/dev/null`], + LIBCRYPTO_SONAME="`(readelf -d conftest$EXEEXT || ldd conftest$EXEEXT + ) 2>/dev/null | + sed -n 's/.*\(libcrypto.so.[[.0-9]]*\).*/\1/p'`" + AS_CASE([$LIBCRYPTO_SONAME], [*libcrypto*], [utils_cv_dlopen_libcrypto=yes])]) LIBS=$saved_LIBS]) @@ -378,7 +382,10 @@ AS_CASE([$LIB_CRYPTO], [yes], [AC_DEFINE([DLOPEN_LIBCRYPTO], [1], [Define to 1 if dlopen exists and libcrypto is - linked dynamically.])])]) + linked dynamically.]) + AC_DEFINE_UNQUOTED([LIBCRYPTO_SONAME], ["$LIBCRYPTO_SONAME"], + [versioned libcrypto]) + ])]) # macOS >= 10.12 AC_CHECK_FUNCS([fclonefileat]) diff --git a/src/sort.c b/src/sort.c index cefe381bf..2d8324ca4 100644 --- a/src/sort.c +++
Re: sort dynamic linking overhead
On Feb 25 2024, Paul Eggert wrote: > +/* Dynamically link the crypto library, if it needs linking. */ > +static void > +link_libcrypto (void) > +{ > +#if DLOPEN_LIBCRYPTO && HAVE_OPENSSL_MD5 > + void *handle = dlopen ("libcrypto.so", RTLD_LAZY | RTLD_GLOBAL); That only works if libopenssl-devel is installed. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
Re: sort dynamic linking overhead
* xxhash128 is not a cryptographic hash function, so it doesn't attempt to be random. Just a correction : xxh128 does try to be random. And quite hardly: a significant amount of development is spent on ensuring this property. It’s even tested with PractRand, and it could be used as a good random number generator. Being non-cryptographic means that what it doesn’t try is to make sure no one can intentionally forge a hash collision from 2 different files (other than brute-forcing, which is impractical). But that’s different, and I wouldn’t call this property “randomness”, even though randomness is a pre-requisite (but not sufficient in itself) to collision resistance. From: Paul Eggert Date: Sunday, February 25, 2024 at 10:25 PM To: Pádraig Brady , Bruno Haible , bug-gnu...@gnu.org , Coreutils Cc: Yann Collet Subject: Re: sort dynamic linking overhead On 2023-10-09 06:48, Pádraig Brady wrote: > An incremental patch attached to use xxhash128 (0.8.2) > shows a good improvement (note avx2 being used on this cpu): xxhash128 is not a cryptographic hash function, so it doesn't attempt to be random. Of course most people won't care - it's random "enough" - but it would be a functionality change. blake2 is cryptographic and would be random, but would bloat the 'sort' executable with code that's hardly ever used. To attack the problem in a more conservative way, I installed the attached patch into coreutils. With it, 'sort -R' continues to use MD5 but on GNUish platforms 'sort' links libcrypto dynamically only if -R is used (Bruno's suggestion). This doesn't significantly affect 'sort -R' performance, and reduces the startup overhead of plain 'sort' to be what it was before we started passing -lcrypto to gcc by default (in coreutils 8.32). I also toyed with changing MD5 to SHA512, but that hurt performance. For what it's worth, although I tested with an Intel Xeon W-1350, which supports SHA-NI as well as various AVX-512 options, I didn't see where libcrypto (at least on Ubuntu 23.10, which has OpenSSL 3.0.10) takes advantage of these special-purpose instructions.
Re: sort dynamic linking overhead
On 26/02/2024 06:44, Yann Collet wrote: * xxhash128 is not a cryptographic hash function, so it doesn't attempt tobe random. Just a correction : xxh128 does try to be random. And quite hardly: a significant amount of development is spent on ensuring this property. It’s even tested with PractRand, and it could be used as a good random number generator. Being non-cryptographic means that what it doesn’t try is to make sure no one can intentionally forge a hash collision from 2 different files (other than brute-forcing, which is impractical). But that’s different, and I wouldn’t call this property “randomness”, even though randomness is a pre-requisite (but not sufficient in itself) to collision resistance. Right. I was looking at both md5 and xxhash128 having a 10 quality score in the SMHasher metric. I even saw a comment from you Yann that xxhas128 may have slightly better dispersion than md5. Also md5 shouldn't be considered as cryptographic anyway since it's broken. I.e. I don't think users would need to be informed of this change if made. Re Paul's committed patch, it's a good improvement, and does not add a new (xxhash) dependency. Paul, should the configure check, be testing for the MD5 routines rather than SHA512? Also an entry in the Improvements section of NEWS would be appropriate. thanks! Pádraig
Re: sort dynamic linking overhead
On 2023-10-09 06:48, Pádraig Brady wrote: An incremental patch attached to use xxhash128 (0.8.2) shows a good improvement (note avx2 being used on this cpu): xxhash128 is not a cryptographic hash function, so it doesn't attempt to be random. Of course most people won't care - it's random "enough" - but it would be a functionality change. blake2 is cryptographic and would be random, but would bloat the 'sort' executable with code that's hardly ever used. To attack the problem in a more conservative way, I installed the attached patch into coreutils. With it, 'sort -R' continues to use MD5 but on GNUish platforms 'sort' links libcrypto dynamically only if -R is used (Bruno's suggestion). This doesn't significantly affect 'sort -R' performance, and reduces the startup overhead of plain 'sort' to be what it was before we started passing -lcrypto to gcc by default (in coreutils 8.32). I also toyed with changing MD5 to SHA512, but that hurt performance. For what it's worth, although I tested with an Intel Xeon W-1350, which supports SHA-NI as well as various AVX-512 options, I didn't see where libcrypto (at least on Ubuntu 23.10, which has OpenSSL 3.0.10) takes advantage of these special-purpose instructions.From 7f57ac2d20c144242953a8dc7d95b02df0244751 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sun, 25 Feb 2024 17:13:12 -0800 Subject: [PATCH] sort: dynamically link -lcrypto if -R This saves time in the usual case, which does not need -lcrypto. * configure.ac (DLOPEN_LIBCRYPTO): New macro. * src/sort.c [DLOPEN_LIBCRYPTO && HAVE_OPENSSL_MD5]: New macros MD5_Init, MD5_Update, MD5_Final. Include "md5.h" after defining them. Include , and define new functions link_failure and symbol_address. (link_libcrypto): New function. (random_md5_state_init): Call it before using crypto functions. --- configure.ac | 29 + src/sort.c | 52 +++- 2 files changed, 80 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index c7eca1b8d..043081b90 100644 --- a/configure.ac +++ b/configure.ac @@ -351,6 +351,35 @@ if test $utils_cv_localtime_cache = yes; then AC_DEFINE([LOCALTIME_CACHE], [1], [FIXME]) fi +# Should 'sort' link libcrypto dynamically? +AS_CASE([$LIB_CRYPTO], + [-lcrypto], +[# Check for dlopen and libcrypto dynamic linking in one program, + # as there's little point to checking them separately. + AC_CACHE_CHECK([for dlopen and whether libcrypto is linked dynamically], + [utils_cv_dlopen_libcrypto], + [utils_cv_dlopen_libcrypto=no +saved_LIBS=$LIBS +LIBS="$LIBS $LIB_CRYPTO" +AC_LINK_IFELSE( + [AC_LANG_PROGRAM( + [[#include + #include + ]], + [[return !(dlopen ("libcrypto.so", RTLD_LAZY | RTLD_GLOBAL) +&& SHA512 (0, 0, 0));]])], + [# readelf works with cross-builds; ldd works on more platforms. + AS_CASE([`(readelf -d conftest$EXEEXT || ldd conftest$EXEEXT + ) 2>/dev/null`], + [*libcrypto*], + [utils_cv_dlopen_libcrypto=yes])]) +LIBS=$saved_LIBS]) + AS_CASE([$utils_cv_dlopen_libcrypto], + [yes], + [AC_DEFINE([DLOPEN_LIBCRYPTO], [1], +[Define to 1 if dlopen exists and libcrypto is + linked dynamically.])])]) + # macOS >= 10.12 AC_CHECK_FUNCS([fclonefileat]) diff --git a/src/sort.c b/src/sort.c index dea7be45d..cefe381bf 100644 --- a/src/sort.c +++ b/src/sort.c @@ -39,7 +39,6 @@ #include "hash.h" #include "heap.h" #include "ignore-value.h" -#include "md5.h" #include "mbswidth.h" #include "nproc.h" #include "physmem.h" @@ -2085,6 +2084,56 @@ getmonth (char const *month, char **ea) return 0; } +/* When using the OpenSSL implementation, dynamically link only if -R. + This saves startup time in the usual (sans -R) case. */ + +#if DLOPEN_LIBCRYPTO && HAVE_OPENSSL_MD5 +/* In the typical case where md5.h does not #undef HAVE_OPENSSL_MD5, + trick md5.h into declaring and using pointers to functions not functions. + This causes the compiler's -lcrypto option to have no effect, + as sort.o no longer uses any crypto symbols statically. */ +# define MD5_Init (*ptr_MD5_Init) +# define MD5_Update (*ptr_MD5_Update) +# define MD5_Final (*ptr_MD5_Final) +#endif + +#include "md5.h" + +#if DLOPEN_LIBCRYPTO && HAVE_OPENSSL_MD5 +# include + +/* Diagnose a dynamic linking failure. */ +static void +link_failure (void) +{ + error (SORT_FAILURE, 0, "%s", dlerror ()); +} + +/* Return a function pointer in HANDLE for SYMBOL. */ +static void * +symbol_address (void *handle, char const *symbol) +{ + void *address = dlsym (handle, symbol); + if (!address) +link_failure (); + return address; +} +#endif + +/* Dynamically link the crypto library, if it needs linking. */ +static void +link_libcrypto (void) +{ +#if