On Mon, 11 May 2026 16:47:23 GMT, Volodymyr Paprotski <[email protected]> 
wrote:

> This PR:
> - changes the existing AVX512 SHA3 intrinsic to be more parallel
> - adds an AVX2 SHA3 intrinsic
> - changes `SHA3Parallel.java` to NR=4, to exploit the AVX512 
> parallelism while keeping doubleKeccak for platforms where double parallelism 
> is preferable. I experimented with NR=8 as well, which also gains a few 
> percent, but I think NR=4 is a sufficient tradeoff.
> 
> Performance gains:
> - `MessageDigestBench.digest`:
>   - AVX2: **16%-39%**
>   - AVX512: **24%-33%**
> - `SignatureBench.MLDSA.sign`:
>   - AVX2: **6%-12%**
>   - AVX512: **11%-18%**
> - `SignatureBench.MLDSA.verify`:
>   - AVX2: **2%-14%**
>   - AVX512: **31%-40%**
> - `KEMBench.MLKEM`:
>   - AVX2: **~5%**
>   - AVX512: **14%-23%**
> - `KEMBench.JSSE_*`:
>   - appears unaffected
> 
> Note on intrinsics: as noted in the code, there are multiple entry points 
> wrapping the same intrinsic:
> - `SHA3.implCompress`: single blockSize of user data xored with keccak
> - `DigestBase.implCompressMultiBlock`: loop over user data and xor with keccak
> - `SHA3Parallel.doubleKeccak`: (still used for AVX2) no message data, just 
> two state vectors
> - `SHA3Parallel.quadKeccak`: (AVX512 benefit) no message data, four state 
> vectors
> 
> Note 1: the benchmarks above can be run with `make test 
> TEST="micro:org.openjdk.bench.javax.crypto.full.MessageDigestBench 
> micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA 
> micro:org.openjdk.bench.javax.crypto.full.KEMBench"`
> Note 2: I have left more targeted fuzzing and benchmarks out of this PR, but 
> they are preserved [on my 
> branch](https://github.com/vpaprotsk/jdk/compare/sha3-avx-quad...vpaprotsk:jdk:sha3-avx-quad-extras?expand=1).
> If there is something you would rather see pulled in, let me know (otherwise, I can include a 
> diff in JBS for future reference).
> 
> ---------
> - [X] I confirm that I make this contribution in accordance with the [OpenJDK 
> Interim AI Policy](https://openjdk.org/legal/ai).
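
For readers who want to poke at the SHA3 path from plain Java, here is a minimal sketch that hashes the same data in one shot and in block-sized chunks and checks that the results agree. It assumes only the standard JCA `MessageDigest` API with the "SHA3-256" algorithm name; the class name `Sha3Smoke` and its helper methods are hypothetical and not part of the PR. Whether a given call actually reaches `SHA3.implCompress` or `DigestBase.implCompressMultiBlock`, or an intrinsic at all, depends on the JVM and CPU, and this sketch does not verify that.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper class, for illustration only.
class Sha3Smoke {
    // SHA3-256 absorbs 136 bytes per block: (1600 - 2*256) / 8.
    static final int BLOCK_SIZE = 136;

    static MessageDigest sha3() {
        try {
            return MessageDigest.getInstance("SHA3-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA3-256 not available", e);
        }
    }

    // Hash the whole input with a single digest() call.
    static byte[] oneShot(byte[] data) {
        return sha3().digest(data);
    }

    // Hash the same input by feeding it in chunks of `chunk` bytes.
    static byte[] chunked(byte[] data, int chunk) {
        MessageDigest md = sha3();
        for (int off = 0; off < data.length; off += chunk) {
            md.update(data, off, Math.min(chunk, data.length - off));
        }
        return md.digest();
    }

    static String hex(byte[] bytes) {
        return HexFormat.of().formatHex(bytes);
    }

    public static void main(String[] args) {
        // Ten full blocks plus a partial one, filled with a simple pattern.
        byte[] data = new byte[10 * BLOCK_SIZE + 7];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        byte[] a = oneShot(data);
        byte[] b = chunked(data, BLOCK_SIZE);
        System.out.println(hex(a));
        System.out.println("match: " + MessageDigest.isEqual(a, b));
    }
}
```

A mismatch between the two digests would point at a correctness problem rather than a performance one; the JMH benchmarks named in the description remain the way to measure speed.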

This pull request has now been integrated.

Changeset: 114e3c61
Author:    Volodymyr Paprotski <[email protected]>
URL:       https://git.openjdk.org/jdk/commit/114e3c61060752e34f7a075318ea7d2cff40744b
Stats:     1362 lines in 19 files changed: 943 ins; 102 del; 317 mod

8384353: SHA3 AVX2 and AVX512 intrinsics and improvements

Reviewed-by: sviswanathan, ascarpino, semery

-------------

PR: https://git.openjdk.org/jdk/pull/31125
