On Thu, 15 May 2025 13:33:42 GMT, Ferenc Rakoczi <d...@openjdk.org> wrote:
>> By using the AVX-512 vector registers the speed of the computation of the
>> ML-KEM algorithms (key generation, encapsulation, decapsulation) can be
>> approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Response to review comment + loading constants with broadcast op.

src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 250:

> 248: static void montmul(int outputRegs[], int inputRegs1[], int inputRegs2[],
> 249:        int scratchRegs1[], int scratchRegs2[], MacroAssembler *_masm) {
> 250:    for (int i = 0; i < 4; i++) {

The montMul intrinsic behaves as if MONT_R_BITS were 16 and MONT_Q_INV_MOD_R were 0xF301, whereas the Java code uses MONT_R_BITS = 20 and MONT_Q_INV_MOD_R = 0x8F301. Are these equivalent?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2092137164
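
For reference, the two constant pairs quoted in the comment can be compared with a small standalone sketch. It is not the JDK code: the class and method names are hypothetical, q = 3329 is the ML-KEM modulus, and an unsigned Montgomery reduction is assumed for clarity. It checks that each constant is the inverse of q modulo its own R = 2^MONT_R_BITS, that the low 16 bits of 0x8F301 are 0xF301, and that the two Montgomery products of the same inputs differ by a factor of 2^4 mod q. Whether the intrinsic and the Java code agree therefore seems to depend on the other constants being stored in the matching Montgomery form, which this sketch does not settle.

```java
// Hypothetical helper, not part of the JDK sources.
final class MontConstCheck {
    static final long Q = 3329; // ML-KEM modulus

    // True if Q * qInv == 1 (mod 2^rBits).
    static boolean isInverseModR(long qInv, int rBits) {
        long mask = (1L << rBits) - 1;
        return ((Q * qInv) & mask) == 1;
    }

    // Unsigned Montgomery multiplication (an assumption made for clarity):
    // returns a * b * R^-1 mod Q in [0, Q), with R = 2^rBits and
    // qInv = Q^-1 mod R. Inputs are expected in [0, Q).
    static long montMul(long a, long b, int rBits, long qInv) {
        long mask = (1L << rBits) - 1;
        long t = a * b;
        long m = (t * qInv) & mask;      // m = t * Q^-1 mod R
        long u = (t - m * Q) >> rBits;   // t - m*Q is an exact multiple of R
        return Math.floorMod(u, Q);      // normalize to [0, Q)
    }

    public static void main(String[] args) {
        long a = 1234, b = 2345;         // arbitrary sample inputs
        long r16 = montMul(a, b, 16, 0xF301);
        long r20 = montMul(a, b, 20, 0x8F301);

        System.out.println("0xF301 inverts q mod 2^16:  " + isInverseModR(0xF301, 16));
        System.out.println("0x8F301 inverts q mod 2^20: " + isInverseModR(0x8F301, 20));
        System.out.println("low 16 bits match:          " + ((0x8F301 & 0xFFFF) == 0xF301));
        System.out.println("r16 == 16 * r20 mod q:      " + (r16 == (r20 * 16) % Q));
    }
}
```

The constants themselves are consistent, since reducing an inverse mod 2^20 to its low 16 bits gives the inverse mod 2^16. The products are not interchangeable as raw values, though, because R^-1 differs between the two choices of R.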