On Wed, Apr 23, 2025 at 05:11:18PM +0200, Theo Buehler wrote:
> On Wed, Apr 23, 2025 at 12:35:51PM +0200, Andreas Bartelt wrote:
> > Hi,
> > 
> > I've noticed that aes-128-gcm performance with scp(1) on amd64 based CPUs is
> > much slower than expected on OpenBSD (i.e., I remember throughput being
> > significantly better some time ago -- I think I saw much better throughput
> > around the time when LRO and TSO were initially enabled for ix(4)). It looks
> > to me like AES-NI isn't effectively used anymore.
> 
> Right. Thanks for the report. The immediate reason for this is that ssh
> relies on calls to OpenSSL_add_all_algorithms() to initialize libcrypto.
> However, the call to OPENSSL_cpuid_setup() was removed from this function
> (OPENSSL_add_all_algorithms_noconf()) in c_all.c r1.32 aka
> 
> https://github.com/openbsd/src/commit/b2368ebdada0d6d022d20bbe96eab69dbc406e5a
> 
> which means that the cpuid probe, which selects an accelerated
> implementation when HW support is available, no longer runs on this
> path. This coincidentally happened about a week after LRO was enabled
> by bluhm for all drivers in:
> 
> https://github.com/openbsd/src/commit/3e1926f859efd008e94373bdb5bd5e8d9fb98874
> 
> Another bit that will hurt is that ssh switched from aes-128-ctr to
> aes-128-gcm by default last December:
> 
> https://github.com/openbsd/src/commit/08d45e79c0d607376dd5c42234e36d78473c3ae0
> 
> This doesn't make much of a difference in the unaccelerated case:
> 
> Without AES-NI:
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128-gcm     174617.32k   211996.90k   693919.98k   754392.03k   775449.26k
> aes-128-ctr     185805.70k   216658.12k   778577.33k   888563.84k   915544.45k
> 
> but, since our GCM ASM is pretty bad, this will hurt in the accelerated
> case. jsing will be looking into improving that since this is also
> important for TLS.
> 
> With AES-NI:
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128-gcm     669421.74k  1886143.60k  3064423.66k  3495542.89k  3564934.49k
> aes-128-ctr     990493.56k  3246635.81k  6959034.82k  9812672.93k 11506436.47k
> 
> While we could (and probably should) add OPENSSL_init_crypto() calls to
> the various *add_all* API, I think a better first fix will be this,
> which means that the cpuid_setup happens whenever a cipher or a digest
> is invoked via EVP and the accelerated implementation should be chosen
> if available:

And here's the diff including the *add_all* API changes, which are also
needed (otherwise the first call to cipher_init() will still end up using
the unaccelerated implementation).

Index: crypto_init.c
===================================================================
RCS file: /cvs/src/lib/libcrypto/crypto_init.c,v
diff -u -p -r1.22 crypto_init.c
--- crypto_init.c       17 Oct 2024 14:27:57 -0000      1.22
+++ crypto_init.c       23 Apr 2025 12:27:08 -0000
@@ -99,18 +99,24 @@ LCRYPTO_ALIAS(OPENSSL_cleanup);
 void
 OpenSSL_add_all_ciphers(void)
 {
+       /* Prayer and clean living lets you ignore errors, OpenSSL style. */
+       (void)OPENSSL_init_crypto(0, NULL);
 }
 LCRYPTO_ALIAS(OpenSSL_add_all_ciphers);
 
 void
 OpenSSL_add_all_digests(void)
 {
+       /* Prayer and clean living lets you ignore errors, OpenSSL style. */
+       (void)OPENSSL_init_crypto(0, NULL);
 }
 LCRYPTO_ALIAS(OpenSSL_add_all_digests);
 
 void
 OPENSSL_add_all_algorithms_noconf(void)
 {
+       /* Prayer and clean living lets you ignore errors, OpenSSL style. */
+       (void)OPENSSL_init_crypto(0, NULL);
 }
 LCRYPTO_ALIAS(OPENSSL_add_all_algorithms_noconf);
 
Index: evp/evp_cipher.c
===================================================================
RCS file: /cvs/src/lib/libcrypto/evp/evp_cipher.c,v
diff -u -p -r1.23 evp_cipher.c
--- evp/evp_cipher.c    10 Apr 2024 15:00:38 -0000      1.23
+++ evp/evp_cipher.c    23 Apr 2025 13:52:22 -0000
@@ -614,6 +614,9 @@ LCRYPTO_ALIAS(EVP_DecryptFinal_ex);
 EVP_CIPHER_CTX *
 EVP_CIPHER_CTX_new(void)
 {
+       if (!OPENSSL_init_crypto(0, NULL))
+               return NULL;
+
        return calloc(1, sizeof(EVP_CIPHER_CTX));
 }
 LCRYPTO_ALIAS(EVP_CIPHER_CTX_new);
Index: evp/evp_digest.c
===================================================================
RCS file: /cvs/src/lib/libcrypto/evp/evp_digest.c,v
diff -u -p -r1.14 evp_digest.c
--- evp/evp_digest.c    10 Apr 2024 15:00:38 -0000      1.14
+++ evp/evp_digest.c    23 Apr 2025 13:14:36 -0000
@@ -226,6 +226,9 @@ LCRYPTO_ALIAS(EVP_Digest);
 EVP_MD_CTX *
 EVP_MD_CTX_new(void)
 {
+       if (!OPENSSL_init_crypto(0, NULL))
+               return NULL;
+
        return calloc(1, sizeof(EVP_MD_CTX));
 }
 LCRYPTO_ALIAS(EVP_MD_CTX_new);
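
For readers who want the shape of the fix without reading libcrypto, here
is a minimal standalone sketch of the pattern the diff relies on: an
idempotent init routine that runs the CPU probe once, called both from a
legacy "add all" entry point (which cannot report errors) and from the
context constructor (which can). Every sketch_* name below is
hypothetical and not a libcrypto symbol, and the plain flag used as a
run-once guard is a single-threaded simplification I am assuming for
illustration; the real library would need a thread-safe once mechanism.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not libcrypto symbols. */
static int sketch_initialized;	/* run-once guard (single-threaded only) */
static int sketch_probe_runs;	/* how many times the probe actually ran */
static int sketch_accelerated;	/* set once the probe picked the fast path */

/* Stand-in for the CPU capability probe. */
static void
sketch_cpuid_setup(void)
{
	sketch_probe_runs++;
	sketch_accelerated = 1;
}

/* Stand-in for the init entry point; returns 1 on success. */
static int
sketch_init_crypto(void)
{
	if (!sketch_initialized) {
		sketch_cpuid_setup();
		sketch_initialized = 1;
	}
	return 1;
}

/* Legacy void entry point: the result can only be ignored. */
static void
sketch_add_all_algorithms(void)
{
	(void)sketch_init_crypto();
}

struct sketch_ctx {
	int accelerated;	/* implementation chosen at creation time */
};

/* Constructor: initializes first, so a caller that never used the
 * legacy entry point still gets the accelerated path. */
static struct sketch_ctx *
sketch_ctx_new(void)
{
	struct sketch_ctx *ctx;

	if (!sketch_init_crypto())
		return NULL;
	if ((ctx = calloc(1, sizeof(*ctx))) == NULL)
		return NULL;
	ctx->accelerated = sketch_accelerated;
	return ctx;
}

/* Returns 1 if the context is accelerated and the probe ran exactly once. */
static int
sketch_demo(void)
{
	struct sketch_ctx *ctx;
	int accelerated;

	sketch_add_all_algorithms();
	ctx = sketch_ctx_new();
	assert(ctx != NULL);
	accelerated = ctx->accelerated;
	free(ctx);

	return accelerated && sketch_probe_runs == 1;
}
```

Calling the init routine from both places costs nothing after the first
call, which is why the diff can add it to the *add_all* functions and to
the EVP constructors without worrying about ordering.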
