Hi, after building Nettle natively for Apple devices running Apple Silicon, I noticed a drastic performance difference between the native build and an x86_64 build emulated via Apple's Rosetta. The latter, despite emulation, was over 10 times faster for some algorithms, e.g. AES-128-GCM.
I found out that get_arm64_features did not detect the CPU capabilities on Apple devices at all.
The attached patch fixes the issue for me; with it in place, the CPU features are correctly detected.
AES-128-GCM benchmark results:

Native (pre-patch): 200 MB/s
Emulated:           3.2 GB/s
Native (patched):   5.2 GB/s

Regards,
Tim Kosse
--- nettle-3.9.1_orig/fat-arm64.c 2023-12-05 02:16:50
+++ nettle-3.9.1/fat-arm64.c 2023-12-05 02:15:23
@@ -50,6 +50,8 @@
 # include <sys/sysctl.h>
 # include <machine/cpu.h>
 # include <machine/armreg.h>
+#elif defined(__APPLE__)
+# include <sys/sysctl.h>
 #endif
 
 #include "nettle-types.h"
@@ -85,6 +87,16 @@
 
 #define MATCH(s, slen, literal, llen) \
   ((slen) == (llen) && memcmp ((s), (literal), llen) == 0)
+
+#if defined(__APPLE__)
+static int
+check_sysctlbyname(const char* name)
+{
+  int val;
+  size_t s = sizeof(val);
+  return sysctlbyname(name, &val, &s, NULL, 0) ? 0 : val;
+}
+#endif
 
 static void
 get_arm64_features (struct arm64_features *features)
@@ -141,6 +153,12 @@
         = (ID_AA64ISAR0_SHA1(isar0) >= ID_AA64ISAR0_SHA1_BASE);
       features->have_sha2
         = (ID_AA64ISAR0_SHA2(isar0) >= ID_AA64ISAR0_SHA2_BASE);
+#elif defined(__APPLE__)
+      /* See https://developer.apple.com/documentation/kernel/1387446-sysctlbyname/determining_instruction_set_characteristics */
+      features->have_aes = check_sysctlbyname("hw.optional.arm.FEAT_AES");
+      features->have_pmull = check_sysctlbyname("hw.optional.arm.FEAT_PMULL");
+      features->have_sha1 = check_sysctlbyname("hw.optional.arm.FEAT_SHA1");
+      features->have_sha2 = check_sysctlbyname("hw.optional.arm.FEAT_SHA2");
 #endif
     }
 }
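
For anyone who wants to check what their machine reports without rebuilding Nettle, here is a minimal standalone sketch that queries the same four hw.optional.arm.* keys as the patch. The names probe_feature, probe_keys and print_features are hypothetical and not part of Nettle. I am assuming that a key the kernel does not know makes sysctlbyname fail, which is treated as "feature absent"; on non-Apple platforms the probe simply reports 0 so the file still compiles.

```c
#include <stddef.h>
#include <stdio.h>
#if defined(__APPLE__)
# include <sys/types.h>
# include <sys/sysctl.h>
#endif

/* Hypothetical helper, not part of Nettle.
   Returns 1 if the sysctl key exists and is nonzero, 0 otherwise. */
static int
probe_feature (const char *name)
{
#if defined(__APPLE__)
  int val = 0;
  size_t len = sizeof (val);
  /* sysctlbyname returns nonzero on failure, e.g. for an unknown key. */
  if (sysctlbyname (name, &val, &len, NULL, 0) != 0)
    return 0;
  return val != 0;
#else
  (void) name;  /* No such sysctl namespace here; report "absent". */
  return 0;
#endif
}

/* The four keys queried by the patch. */
static const char *const probe_keys[] = {
  "hw.optional.arm.FEAT_AES",
  "hw.optional.arm.FEAT_PMULL",
  "hw.optional.arm.FEAT_SHA1",
  "hw.optional.arm.FEAT_SHA2",
};

/* Prints one "key: 0|1" line per key and returns how many were present. */
static int
print_features (FILE *out)
{
  size_t i;
  int present = 0;
  for (i = 0; i < sizeof (probe_keys) / sizeof (probe_keys[0]); i++)
    {
      int have = probe_feature (probe_keys[i]);
      fprintf (out, "%s: %d\n", probe_keys[i], have);
      present += have;
    }
  return present;
}
```

Calling print_features (stdout) from a small main is enough to use it. Unlike check_sysctlbyname in the patch, this sketch initializes val and normalizes the result to 0 or 1; that is only a defensive choice for the illustration, not a claim that the patch needs it.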