Branch: refs/heads/master
Home: https://github.openssl.org/openssl/openssl
Commit: b1b2146ded9ce5a84c62f30c6c4a922b449f6c90
  https://github.openssl.org/openssl/openssl/commit/b1b2146ded9ce5a84c62f30c6c4a922b449f6c90
Author: Daniel Hu <daniel...@arm.com>
Date: 2022-05-03 (Tue, 03 May 2022)

Changed paths:
  M crypto/arm64cpuid.pl
  M crypto/arm_arch.h
  M crypto/armcap.c
  A crypto/chacha/asm/chacha-armv8-sve.pl
  M crypto/chacha/asm/chacha-armv8.pl
  M crypto/chacha/build.info

Log Message:
-----------
Acceleration of chacha20 on aarch64 by SVE

This patch accelerates chacha20 on aarch64 when Scalable Vector
Extension (SVE) is supported by CPU. Tested on modern micro-architecture
with 256-bit SVE, it has the potential to improve performance up to 20%.

The solution takes a hybrid approach. SVE will handle multi-blocks that
fit the SVE vector length, with Neon/Scalar to process any tail data.

Test result:

With SVE
type          1024 bytes    8192 bytes   16384 bytes
ChaCha20     1596208.13k   1650010.79k   1653151.06k

Without SVE (by Neon/Scalar)
type          1024 bytes    8192 bytes   16384 bytes
chacha20     1355487.91k   1372678.83k   1372662.44k

The assembly code has been reviewed internally by
ARM engineer fangming.f...@arm.com

Signed-off-by: Daniel Hu <daniel...@arm.com>

Reviewed-by: Tomas Mraz <to...@openssl.org>
Reviewed-by: Paul Dale <pa...@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17916)
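
The hybrid approach described in the log message can be pictured as a split of the input length into a bulk part for the wide vector path and a tail for the Neon/Scalar path. The following is only a minimal sketch of that idea, not the code from this commit: the names `chacha_split` and `split_for_vector` and the `blocks_per_pass` parameter are hypothetical, and the figure of 8 blocks per pass used in the note below is an assumption for illustration (one 64-byte block per 32-bit lane of a 256-bit vector).

```c
#include <assert.h>
#include <stddef.h>

#define CHACHA_BLOCK_SIZE 64    /* ChaCha20 block size in bytes */

/* Hypothetical result type: bytes for the vector path and for the tail path. */
typedef struct {
    size_t bulk;    /* whole multi-block chunks, handled by the wide vector path */
    size_t tail;    /* remainder, handled by the Neon/Scalar path */
} chacha_split;

/*
 * Hypothetical helper: split len bytes into the largest prefix that is a
 * whole number of multi-block chunks, plus the leftover tail.
 * blocks_per_pass is an assumed parameter, not a value taken from the patch.
 */
static chacha_split split_for_vector(size_t len, size_t blocks_per_pass)
{
    chacha_split s;
    size_t chunk;

    assert(blocks_per_pass > 0);
    chunk = blocks_per_pass * CHACHA_BLOCK_SIZE;

    s.bulk = (len / chunk) * chunk;     /* round down to a whole chunk count */
    s.tail = len - s.bulk;              /* whatever does not fill a chunk */
    return s;
}
```

Under these assumptions, `split_for_vector(1000, 8)` assigns 512 bytes to the vector path and 488 bytes to the tail path, while an input shorter than one chunk goes entirely to the tail path.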