On Thu, 20 Feb 2025 21:49:42 GMT, Volodymyr Paprotski <vpaprot...@openjdk.org> wrote:

> Add AVX2 montgomery multiplication intrinsic. (About 60-80% gain)
>
> Also add reduction to existing AVX512 multiplication (this was left-over from
> https://github.com/openjdk/jdk/pull/19893 where a quick fix was required).
> This is mostly for cleanup, but there is about 1-2% gain.
>
> Before (no AVX512)
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score    Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt   40  3720.589 ± 17.879  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt   40  3605.940 ± 15.807  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt   40  1076.502 ±  4.190  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt   40  1069.624 ±  2.484  ops/s
>
> Benchmark                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
> KeyAgreementBench.EC.generateSecret  ECDH                 256              EC              thrpt   40   830.448 ±  2.285  ops/s
>
> After (with AVX2)
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score    Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt   40  6000.496 ± 39.923  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt   40  5739.878 ± 34.838  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt   40  1942.437 ± 12.179  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt   40  1921.770 ±  8.992  ops/s
>
> Benchmark                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
> KeyAgreementBench.EC.generateSecret  ECDH                 256              EC              thrpt   40  1399.761 ±  6.238  ops/s
>
>
> Before (with AVX512):
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score    Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt   40  9621.950 ± 27.260  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt   40  8975.654 ± 26.707  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  102...

This pull request has now been integrated.

Changeset: a269bef0
Author:    Volodymyr Paprotski <vpaprot...@openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/a269bef04cf3c9c8b731edcbf7618624f7571a2d
Stats:     760 lines in 9 files changed: 641 ins; 16 del; 103 mod

8350459: MontgomeryIntegerPolynomialP256 multiply intrinsic with AVX2 on x86_64

Reviewed-by: ascarpino, sviswanathan

-------------

PR: https://git.openjdk.org/jdk/pull/23719