tests: Refactor functions for signalling NaNs

2023-10-09 Thread Bruno Haible
This patch centralizes the code for producing signalling NaNs.
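
For readers unfamiliar with the term, here is a minimal sketch of how a signalling NaN can be produced for 'float'. It is not the code in tests/snan.h; the names make_snanf and is_signalling_pattern are hypothetical, and it assumes IEEE 754 binary32 with the usual convention that a set top mantissa bit means "quiet" (some older MIPS and HP-PA systems use the opposite convention).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return a signalling NaN, built from its bit pattern.
   ASSUMPTION: float is IEEE 754 binary32 and the most significant
   mantissa bit is the "quiet" bit.  */
static float
make_snanf (void)
{
  static_assert (sizeof (float) == sizeof (uint32_t),
                 "float must be 32 bits wide");
  /* Exponent all ones, quiet bit clear, one lower mantissa bit set.  */
  uint32_t bits = 0x7FA00000u;
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Return nonzero if BITS encodes a signalling NaN under the same
   assumption.  */
static int
is_signalling_pattern (uint32_t bits)
{
  return (bits & 0x7F800000u) == 0x7F800000u  /* exponent all ones */
         && (bits & 0x007FFFFFu) != 0         /* mantissa nonzero: a NaN */
         && (bits & 0x00400000u) == 0;        /* quiet bit clear */
}
```

Note that on some platforms merely returning or copying such a value through a floating-point register may turn it into a quiet NaN, which is one reason this kind of code is worth keeping in a single place.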


2023-10-09  Bruno Haible  

tests: Refactor functions for signalling NaNs.
* tests/snan.h: New file, based on tests/test-isnanf.h,
tests/test-isnand.h, tests/test-isnanl.h.
* tests/test-isfinite.c: Include snan.h.
(test_isfinitef, test_isfinited, test_isfinitel): Simplify.
* tests/test-isinf.c: Include snan.h.
(test_isinff, test_isinfd, test_isinfl): Simplify.
* tests/test-isnan.c: Include snan.h.
(test_float, test_double, test_long_double): Simplify.
* tests/test-isnanf.h: Include snan.h.
(main): Simplify.
* tests/test-isnand.h: Include snan.h.
(main): Simplify.
* tests/test-isnanl.h: Include snan.h.
(main): Simplify.
* tests/test-signbit.c: Include snan.h.
(test_signbitf, test_signbitd, test_signbitl): Simplify.
* tests/test-stdio.c: Include qnan.h, snan.h instead of nan.h.
(main): Test quiet NaNs always. Also test a signalling NaN.
* modules/isfinite-tests (Files): Add tests/nan.h, tests/snan.h.
* modules/isinf-tests (Files): Likewise.
* modules/isnan-tests (Files): Add tests/snan.h.
* modules/isnanf-tests (Files): Likewise.
* modules/isnanf-nolibm-tests (Files): Likewise.
* modules/isnand-tests (Files): Likewise.
* modules/isnand-nolibm-tests (Files): Likewise.
* modules/isnanl-tests (Files): Likewise.
* modules/isnanl-nolibm-tests (Files): Likewise.
* modules/signbit-tests (Files): Likewise.
* modules/stdio-tests (Files): Add tests/qnan.h, tests/snan.h.

diff --git a/modules/isfinite-tests b/modules/isfinite-tests
index 56207be027..bf7964648b 100644
--- a/modules/isfinite-tests
+++ b/modules/isfinite-tests
@@ -1,6 +1,8 @@
 Files:
 tests/test-isfinite.c
 tests/infinity.h
+tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentf.m4
 m4/exponentd.m4
diff --git a/modules/isinf-tests b/modules/isinf-tests
index 02719fe601..fb958d74ea 100644
--- a/modules/isinf-tests
+++ b/modules/isinf-tests
@@ -1,6 +1,8 @@
 Files:
 tests/test-isinf.c
 tests/infinity.h
+tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentf.m4
 m4/exponentd.m4
diff --git a/modules/isnan-tests b/modules/isnan-tests
index c794455c5b..06ebb72156 100644
--- a/modules/isnan-tests
+++ b/modules/isnan-tests
@@ -3,6 +3,7 @@ tests/test-isnan.c
 tests/minus-zero.h
 tests/infinity.h
 tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentf.m4
 m4/exponentd.m4
diff --git a/modules/isnand-nolibm-tests b/modules/isnand-nolibm-tests
index 7b74b0b7ed..c8d92f3a75 100644
--- a/modules/isnand-nolibm-tests
+++ b/modules/isnand-nolibm-tests
@@ -4,6 +4,7 @@ tests/test-isnand.h
 tests/minus-zero.h
 tests/infinity.h
 tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentd.m4
 
diff --git a/modules/isnand-tests b/modules/isnand-tests
index fb3fc494c9..30a2c6f9e6 100644
--- a/modules/isnand-tests
+++ b/modules/isnand-tests
@@ -4,6 +4,7 @@ tests/test-isnand.h
 tests/minus-zero.h
 tests/infinity.h
 tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentd.m4
 
diff --git a/modules/isnanf-nolibm-tests b/modules/isnanf-nolibm-tests
index 4635edd9a5..acd8bd34bc 100644
--- a/modules/isnanf-nolibm-tests
+++ b/modules/isnanf-nolibm-tests
@@ -4,6 +4,7 @@ tests/test-isnanf.h
 tests/minus-zero.h
 tests/infinity.h
 tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentf.m4
 
diff --git a/modules/isnanf-tests b/modules/isnanf-tests
index 1566695546..c094c02a37 100644
--- a/modules/isnanf-tests
+++ b/modules/isnanf-tests
@@ -4,6 +4,7 @@ tests/test-isnanf.h
 tests/minus-zero.h
 tests/infinity.h
 tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentf.m4
 
diff --git a/modules/isnanl-nolibm-tests b/modules/isnanl-nolibm-tests
index 927ad69388..58a0f51fa1 100644
--- a/modules/isnanl-nolibm-tests
+++ b/modules/isnanl-nolibm-tests
@@ -4,6 +4,7 @@ tests/test-isnanl.h
 tests/minus-zero.h
 tests/infinity.h
 tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentl.m4
 
diff --git a/modules/isnanl-tests b/modules/isnanl-tests
index b00e5d5910..5a2d880c3f 100644
--- a/modules/isnanl-tests
+++ b/modules/isnanl-tests
@@ -4,6 +4,7 @@ tests/test-isnanl.h
 tests/minus-zero.h
 tests/infinity.h
 tests/nan.h
+tests/snan.h
 tests/macros.h
 m4/exponentl.m4
 
diff --git a/modules/signbit-tests b/modules/signbit-tests
index 9465de790b..3b3ef9b6b6 100644
--- a/modules/signbit-tests
+++ b/modules/signbit-tests
@@ -4,6 +4,7 @@ tests/minus-zero.h
 tests/infinity.h
 tests/nan.h
 tests/qnan.h
+tests/snan.h
 tests/macros.h
 m4/exponentf.m4
 m4/exponentd.m4
diff --git a/modules/stdio-tests b/modules/stdio-tests
index 52b753e2f7..84199d5aa3 100644
--- a/modules/stdio-tests
+++ b/modules/stdio-tests
@@ -1,6 +1,8 @@
 Files:
 tests/test-stdio.c
 tests/nan.h
+tests/qnan.h
+tests/snan.h
 tests/macros.h
 m4/exponentd.m4
 
diff --git a/tests/snan.h b/tests/snan.h
new file mode 100644
index 00..e877000ece
--- 

Re: sort dynamic linking overhead

2023-10-09 Thread Pádraig Brady

On 08/10/2023 21:53, Pádraig Brady wrote:

On 08/10/2023 14:36, Pádraig Brady wrote:

On 07/10/2023 22:29, Paul Eggert wrote:

On 2023-10-07 04:42, Pádraig Brady wrote:


The auto linking is globally controlled with the --with-openssl
configure option, but you could build sort (and md5sum)
without that dependency with:

  ./configure ac_cv_lib_crypto_MD5=no


Thanks, I was thinking more along the lines that Bruno suggested, which
is to continue to link to libcrypto, but do it with dlopen/dlsym in 'sort'
only when need_random is true.

It's not clear to me offhand whether this should be done entirely in
Coreutils, or whether we should add some Gnulib support to make it
easier to do this sort of lazier linking.
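
A rough sketch of what such lazy loading could look like follows. It is not taken from Coreutils or Gnulib: the names load_md5 and lazy_md5 are hypothetical, and the soname "libcrypto.so.3" is an assumption that would have to match the installed OpenSSL.

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Signature of OpenSSL's one-shot MD5 function.  */
typedef unsigned char *(*md5_fn) (unsigned char const *, size_t,
                                  unsigned char *);

/* Null until load_md5 succeeds.  */
static md5_fn lazy_md5;

/* Try once to load libcrypto and look up MD5.  Return true on success.
   ASSUMPTION: the soname "libcrypto.so.3" matches the installed OpenSSL.  */
static bool
load_md5 (void)
{
  static bool tried;
  if (!tried)
    {
      tried = true;
      void *handle = dlopen ("libcrypto.so.3", RTLD_LAZY | RTLD_LOCAL);
      if (handle)
        lazy_md5 = (md5_fn) dlsym (handle, "MD5");
    }
  return lazy_md5 != NULL;
}
```

The caller would invoke load_md5 only when need_random is true, and fall back to a built-in hash if it returns false. On older glibc versions the program would also need to be linked with -ldl.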


I was wondering if this was worth worrying about at all,
but it is a significant overhead that's worth improving.
To quantify the overhead I compared optimized builds,
with and without the above configure option, giving:

$ time seq 1 | xargs -I'{}' src/sort /dev/null -k'{}'
real	0m7.009s
user	0m3.462s
sys	0m3.578s

$ time seq 1 | xargs -I'{}' src/sort-lc /dev/null -k'{}'
real	0m12.950s
user	0m3.754s
sys	0m9.200s


So we should do something. Now dlopening libcrypto on demand
would work, but there may be better solutions.
sort doesn't have to use md5. It could use blake2 routines
already in coreutils to avoid the issue (and get some speed ups).
Alternatively it might use some other hash function.
For example see the other 128 bit functions compared at:
https://github.com/Cyan4973/xxHash

BTW there was mention of static linking as an option in this thread.
That is an option to provide better speed and isolation for binaries,
however it's best left to the system builders to use this for their builds.
There can be security implications for prompt library updating,
and libcrypto is particularly sensitive in this regard.


Adding coreutils list...

So above we've demonstrated that sort dynamically loading libcrypto
does nearly double the startup time for the process.

Attached is a patch to use the coreutils reference blake2b hash instead
of the optimized libcrypto md5 routines.

$ seq 100 > 1.txt

$ time src/sort-md5-lc -R < 1.txt > /dev/null
real	0m6.734s
user	0m23.258s
sys	0m0.047s

$ time src/sort-blake2 -R < 1.txt > /dev/null
real	0m7.215s
user	0m25.683s
sys	0m0.043s

$ grep 'model name' /proc/cpuinfo | head -n1
model name  : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
$ rpm -q openssl-libs
openssl-libs-3.0.9-2.fc38.x86_64

So while this avoids the startup overhead,
the reference blake2 routines are a little less efficient
than the optimized md5 libcrypto routines.
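
For context, sort -R orders lines by a hash of each key mixed with a random seed, so identical keys still compare equal while the overall order looks random. A minimal sketch of that idea follows. It is not the sort.c code: it uses a seeded 64-bit FNV-1a hash purely as a stand-in for md5/blake2b, and the names keyed_hash and compare_random are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Seeded FNV-1a over LEN bytes of S.
   ASSUMPTION: a stand-in for the real hash; not what sort uses.  */
static uint64_t
keyed_hash (uint64_t seed, char const *s, size_t len)
{
  uint64_t h = 14695981039346656037ULL ^ seed;  /* FNV offset basis */
  for (size_t i = 0; i < len; i++)
    {
      h ^= (unsigned char) s[i];
      h *= 1099511628211ULL;                    /* FNV prime */
    }
  return h;
}

/* Compare A and B by their seeded hashes.  Identical strings always
   compare equal, whatever the seed.  */
static int
compare_random (uint64_t seed, char const *a, char const *b)
{
  uint64_t ha = keyed_hash (seed, a, strlen (a));
  uint64_t hb = keyed_hash (seed, b, strlen (b));
  return (ha > hb) - (ha < hb);
}
```

Since every comparison hashes both keys, the speed of the hash function shows up directly in the timings above.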


An incremental patch attached to use xxhash128 (0.8.2)
shows a good improvement (note avx2 being used on this cpu):

  $ time src/sort-xxh -R < 1.txt > /dev/null
  real  0m4.111s
  user  0m14.429s
  sys   0m0.058s

I'm not sure how best to avail of it though.
Perhaps embed, or maybe link statically if available?

cheers,
Pádraig

From 912c5d139c24bf0ad5aa5459bd02506be30ab206 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Mon, 9 Oct 2023 14:45:01 +0100
Subject: [PATCH] sort: use XXH128 rather than blake2b

  $ seq 100 > 1.txt

  $ time src/sort-md5-lc -R < 1.txt > /dev/null
  real	0m6.734s
  user	0m23.258s
  sys	0m0.047s

  $ time src/sort-blake2 -R < 1.txt > /dev/null
  real	0m7.215s
  user	0m25.683s
  sys	0m0.043s

  Using xxhash128 (0.8.2) shows a good improvement
  (note avx2 being used on this cpu):

  $ time src/sort-xxh -R < 1.txt > /dev/null
  real	0m4.111s
  user	0m14.429s
  sys	0m0.058s

  $ grep 'model name' /proc/cpuinfo | head -n1
  model name	: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
  $ rpm -q openssl-libs
  openssl-libs-3.0.9-2.fc38.x86_64

* src/sort.c: Use xxh128 routines rather than blake2b.
* src/xxhash.h: Include xxhash appropriately.
---
 src/sort.c   | 26 +-
 src/xxhash.h |  3 +++
 2 files changed, 16 insertions(+), 13 deletions(-)
 create mode 100644 src/xxhash.h

diff --git a/src/sort.c b/src/sort.c
index 14c3951ce..5f04d6921 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -31,7 +31,7 @@
 #include "system.h"
 #include "argmatch.h"
 #include "assure.h"
-#include "blake2/blake2.h"
+#include "xxhash.h"
 #include "fadvise.h"
 #include "filevercmp.h"
 #include "flexmember.h"
@@ -2085,7 +2085,7 @@ getmonth (char const *month, char **ea)
 }
 
 /* A randomly chosen state, used for random comparison.  */
-static blake2b_state random_hash_state;
+static XXH3_state_t random_hash_state;
 #define RANDOM_HASH_BYTES 16
 
 /* Initialize the randomly chosen hash state.  */
@@ -2100,8 +2100,8 @@ random_hash_state_init (char const *random_source)
   randread (r, buf, sizeof buf);
   if (randread_free (r) != 0)
     sort_die (_("close failed"), random_source);
-  blake2b_init (&random_hash_state, RANDOM_HASH_BYTES);
-  blake2b_update (&random_hash_state, buf, sizeof buf);
+