On Mon, 11 May 2026 16:47:23 GMT, Volodymyr Paprotski <[email protected]> wrote:
> This PR:
> - changes the existing AVX512 SHA3 intrinsic to be more parallel
> - adds an AVX2 SHA3 intrinsic
> - changes `SHA3Parallel.java` to NR=4, to be able to exploit the AVX512
>   parallelism while keeping doubleKeccak for platforms where double
>   parallelism is preferable. (I experimented with NR=8 as well; it also
>   gains a few percent, but I think NR=4 is a sufficient tradeoff.)
>
> Performance gains:
> - `MessageDigestBench.digest`:
>   - AVX2: **16%-39%**
>   - AVX512: **24%-33%**
> - `SignatureBench.MLDSA.sign`:
>   - AVX2: **6%-12%**
>   - AVX512: **11%-18%**
> - `SignatureBench.MLDSA.verify`:
>   - AVX2: **2%-14%**
>   - AVX512: **31%-40%**
> - `KEMBench.MLKEM`:
>   - AVX2: **~5%**
>   - AVX512: **14%-23%**
> - `KEMBench.JSSE_*`:
>   - appears unaffected
>
> Note on intrinsics: as noted in the code, there are multiple entry points
> wrapping the same intrinsic:
> - `SHA3.implCompress`: a single blockSize of user data XORed into the
>   keccak state
> - `DigestBase.implCompressMultiBlock`: loop over user data and XOR it into
>   the keccak state
> - `SHA3Parallel.doubleKeccak`: (still used for AVX2) no message data, just
>   two state vectors
> - `SHA3Parallel.quadKeccak`: (AVX512 benefit) no message data, four state
>   vectors
>
> Note 1: `make test TEST="micro:org.openjdk.bench.javax.crypto.full.MessageDigestBench micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA micro:org.openjdk.bench.javax.crypto.full.KEMBench"`
>
> Note 2: I have left more targeted fuzzing and benchmarks out of this PR, but
> they are preserved [on my
> branch](https://github.com/vpaprotsk/jdk/compare/sha3-avx-quad...vpaprotsk:jdk:sha3-avx-quad-extras?expand=1).
> If there is something you would rather see pulled in, say so; otherwise I
> can include a diff in JBS for 'future reference'.
>
> ---------
> - [X] I confirm that I make this contribution in accordance with the [OpenJDK
> Interim AI Policy](https://openjdk.org/legal/ai).

This pull request has now been integrated.

Changeset: 114e3c61
Author:    Volodymyr Paprotski <[email protected]>
URL:       https://git.openjdk.org/jdk/commit/114e3c61060752e34f7a075318ea7d2cff40744b
Stats:     1362 lines in 19 files changed: 943 ins; 102 del; 317 mod

8384353: SHA3 AVX2 and AVX512 intrinsics and improvements

Reviewed-by: sviswanathan, ascarpino, semery

-------------

PR: https://git.openjdk.org/jdk/pull/31125
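
For readers unfamiliar with the entry points listed in the quoted description, the following is a minimal scalar Java sketch of how they relate. It is not the JDK code and says nothing about how the AVX2/AVX512 intrinsics are written. The names `KeccakSketch`, `absorbBlock`, `absorbBlocks` and `doubleKeccakSketch` are hypothetical, and the 136-byte rate assumes SHA3-256. The point it tries to show: the single-block and multi-block paths XOR message data into the state before permuting, while the parallel paths only permute several independent states.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class KeccakSketch {
    static final int RATE = 136; // SHA3-256 block size in bytes (assumption of this sketch)

    static final long[] RC = {
        0x0000000000000001L, 0x0000000000008082L, 0x800000000000808aL, 0x8000000080008000L,
        0x000000000000808bL, 0x0000000080000001L, 0x8000000080008081L, 0x8000000000008009L,
        0x000000000000008aL, 0x0000000000000088L, 0x0000000080008009L, 0x000000008000000aL,
        0x000000008000808bL, 0x800000000000008bL, 0x8000000000008089L, 0x8000000000008003L,
        0x8000000000008002L, 0x8000000000000080L, 0x000000000000800aL, 0x800000008000000aL,
        0x8000000080008081L, 0x8000000000008080L, 0x0000000080000001L, 0x8000000080008008L };

    // Plain Keccak-f[1600] on 25 lanes; lane (x, y) lives at index x + 5*y.
    static void keccakF1600(long[] s) {
        long[] c = new long[5];
        for (int round = 0; round < 24; round++) {
            // theta
            for (int x = 0; x < 5; x++) c[x] = s[x] ^ s[x + 5] ^ s[x + 10] ^ s[x + 15] ^ s[x + 20];
            for (int x = 0; x < 5; x++) {
                long d = c[(x + 4) % 5] ^ Long.rotateLeft(c[(x + 1) % 5], 1);
                for (int y5 = 0; y5 < 25; y5 += 5) s[x + y5] ^= d;
            }
            // rho and pi, walking the lanes along (x, y) -> (y, 2x + 3y)
            int px = 1, py = 0;
            long cur = s[1];
            for (int t = 0; t < 24; t++) {
                int ny = (2 * px + 3 * py) % 5;
                px = py;
                py = ny;
                long tmp = s[px + 5 * py];
                s[px + 5 * py] = Long.rotateLeft(cur, ((t + 1) * (t + 2) / 2) % 64);
                cur = tmp;
            }
            // chi
            for (int y5 = 0; y5 < 25; y5 += 5) {
                for (int x = 0; x < 5; x++) c[x] = s[y5 + x];
                for (int x = 0; x < 5; x++) s[y5 + x] = c[x] ^ (~c[(x + 1) % 5] & c[(x + 2) % 5]);
            }
            // iota
            s[0] ^= RC[round];
        }
    }

    // Single-block shape: XOR one block of data into the state, then permute.
    static void absorbBlock(long[] s, byte[] in, int off) {
        ByteBuffer bb = ByteBuffer.wrap(in, off, RATE).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < RATE / 8; i++) s[i] ^= bb.getLong();
        keccakF1600(s);
    }

    // Multi-block shape: loop the single-block step over all full blocks.
    static int absorbBlocks(long[] s, byte[] in, int off, int limit) {
        for (; off + RATE <= limit; off += RATE) absorbBlock(s, in, off);
        return off; // offset of the first unprocessed byte
    }

    // State-only shape: no message data, just independent permutations.
    static void doubleKeccakSketch(long[] a, long[] b) {
        keccakF1600(a);
        keccakF1600(b);
    }

    static byte[] sha3_256(byte[] msg) {
        long[] s = new long[25];
        int off = absorbBlocks(s, msg, 0, msg.length);
        byte[] last = new byte[RATE];
        System.arraycopy(msg, off, last, 0, msg.length - off);
        last[msg.length - off] ^= 0x06;   // SHA-3 domain separation and first pad bit
        last[RATE - 1] ^= (byte) 0x80;    // final pad bit
        absorbBlock(s, last, 0);
        ByteBuffer out = ByteBuffer.allocate(32).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < 4; i++) out.putLong(s[i]);
        return out.array();
    }

    static String toHex(byte[] b) {
        StringBuilder sb = new StringBuilder();
        for (byte v : b) sb.append(String.format("%02x", v));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(sha3_256(new byte[0])));
    }
}
```

In this sketch the two states passed to `doubleKeccakSketch` never interact, which is the property that lets a vectorized version process two (or, with more registers, four) states side by side.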
