On Mon, 31 Mar 2025 14:40:56 GMT, Ferenc Rakoczi <d...@openjdk.org> wrote:
>> By using the AVX-512 vector registers the speed of the computation of the
>> ML-DSA algorithms (key generation, document signing, signature verification)
>> can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Reacting to comments by Volodymyr.

src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 359:

> 357:   __ kmovbl(k4, rax);
> 358:   __ addl(rax, 16);
> 359:   __ kmovbl(k5, rax);

We could use the sequence from generate_sha3_implCompress to set up the K
registers. It has fewer dependencies, since every mask is derived from k5
rather than from a running value in rax:

    __ movl(rax, 0x1F);
    __ kmovbl(k5, rax);
    __ kshiftrbl(k4, k5, 1);
    __ kshiftrbl(k3, k5, 2);
    __ kshiftrbl(k2, k5, 3);
    __ kshiftrbl(k1, k5, 4);

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23860#discussion_r2023769620
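
For reference, the mask values the suggested sequence yields can be checked
with a small host-side model. This is only a sketch, not HotSpot code: the
helper names (kshiftrb, suggested_masks, chained_masks_assumed) are made up
for illustration, and the chained variant assumes the code before line 357
starts rax at 1 and adds 2, 4 and 8 before the quoted addl(rax, 16). Only
that last step is visible in the quoted diff.

```cpp
#include <array>
#include <cstdint>

// Model of KSHIFTRB: logical right shift of an 8-bit mask register.
// Shift counts above 7 produce zero.
inline std::uint8_t kshiftrb(std::uint8_t src, unsigned count) {
    return count > 7 ? 0 : static_cast<std::uint8_t>(src >> count);
}

// Masks {k1, k2, k3, k4, k5} from the suggested sequence:
// k5 is loaded once and every other mask depends only on k5.
inline std::array<std::uint8_t, 5> suggested_masks() {
    const std::uint8_t k5 = 0x1F;
    std::array<std::uint8_t, 5> k = {
        kshiftrb(k5, 4), kshiftrb(k5, 3), kshiftrb(k5, 2), kshiftrb(k5, 1), k5};
    return k;
}

// ASSUMPTION: models a chained setup where rax starts at 1 and grows by
// 2, 4, 8, 16, with a kmovbl after each step. Only the final
// "addl(rax, 16); kmovbl(k5, rax)" step appears in the quoted lines.
inline std::array<std::uint8_t, 5> chained_masks_assumed() {
    std::array<std::uint8_t, 5> k = {};
    std::uint32_t rax = 1;
    for (unsigned i = 0; i < 5; ++i) {
        k[i] = static_cast<std::uint8_t>(rax);  // kmovbl(k<i+1>, rax)
        rax += 2u << i;                         // addl(rax, 2), 4, 8, 16, ...
    }
    return k;
}
```

Under that assumption both variants produce the same five masks. The
difference is that each kmovbl in the chained form waits on the preceding
addl, while the kshiftrbl instructions are independent of one another.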