On Thu, 10 Apr 2025 13:19:05 GMT, Ferenc Rakoczi <d...@openjdk.org> wrote:
>> By using the aarch64 vector registers the speed of the computation of the
>> ML-KEM algorithms (key generation, encapsulation, decapsulation) can be
>> approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with two additional
> commits since the last revision:
>
>  - Code rearrange, some renaming, fixing comments
>  - Changes suggested by Andrew Dinn.

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5684:

> 5682:     VSeq<2> vs5(vs3[1], delta);
> 5683:     kyber_montmul16(vs5, vz, vs5, vs_front(vs2), vq);
> 5684:     // add results in pairs storing in vs3

Suggestion:

    // add results in pairs storing in vs3
    // vs3[0] <- montmul(a0, b0) + montmul(montmul(a1, b1), z0);
    // vs3[1] <- montmul(a0, b1) + montmul(a1, b0);

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5686:

> 5684:     // add results in pairs storing in vs3
> 5685:     vs_addv(vs_front(vs3), __ T8H, vs_even(vs3), vs_odd(vs3));
> 5686:     vs_addv(vs_back(vs3), __ T8H, vs_even(vs1), vs_odd(vs1));

Suggestion:

    // vs3[2] <- montmul(a2, b2) + montmul(montmul(a3, b3), z1);
    // vs3[3] <- montmul(a2, b3) + montmul(a3, b2);
    vs_addv(vs_back(vs3), __ T8H, vs_even(vs1), vs_odd(vs1));

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5687:

> 5685:     vs_addv(vs_front(vs3), __ T8H, vs_even(vs3), vs_odd(vs3));
> 5686:     vs_addv(vs_back(vs3), __ T8H, vs_even(vs1), vs_odd(vs1));
> 5687:     // montmul result by constant vc and store result in vs1

Suggestion:

    // vs1 <- montmul(vs3, montRSquareModQ)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23663#discussion_r2044712516
PR Review Comment: https://git.openjdk.org/jdk/pull/23663#discussion_r2044714830
PR Review Comment: https://git.openjdk.org/jdk/pull/23663#discussion_r2044726778
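
For anyone checking the suggested comments against the arithmetic, here is a rough scalar model of one coefficient pair. It is only a sketch and is not taken from the patch: the names `montgomery_reduce`, `montmul`, `basemul_pair` and `Pair` are hypothetical, and the constants (q = 3329, q^-1 mod 2^16 = -3327 as a signed value, 2^32 mod q = 1353) are the usual ML-KEM values that I am assuming here. It also assumes that the zeta operand `z` is supplied in Montgomery form (zeta * 2^16 mod q), which the stub may or may not do in exactly this way.

```cpp
#include <cstdint>

// Assumed ML-KEM constants; not read from the patch under review.
constexpr int32_t Q = 3329;               // modulus q
constexpr int32_t Q_INV = -3327;          // q^-1 mod 2^16, as a signed 16-bit value
constexpr int16_t R_SQUARE_MOD_Q = 1353;  // 2^32 mod q (plays the role of montRSquareModQ)

// Montgomery reduction: the result is congruent to a * 2^-16 (mod q).
int16_t montgomery_reduce(int32_t a) {
    // t = a * q^-1 mod 2^16, taken as a signed 16-bit value
    int16_t t = static_cast<int16_t>(static_cast<uint32_t>(a) *
                                     static_cast<uint32_t>(Q_INV));
    // a - t*q is an exact multiple of 2^16, so the division has no remainder
    return static_cast<int16_t>((a - static_cast<int32_t>(t) * Q) / 65536);
}

// Montgomery multiplication: the result is congruent to x * y * 2^-16 (mod q).
int16_t montmul(int16_t x, int16_t y) {
    return montgomery_reduce(static_cast<int32_t>(x) * y);
}

struct Pair {
    int16_t c0;
    int16_t c1;
};

// Scalar model of the commented steps for one pair (a0, a1) x (b0, b1).
// z is assumed to be zeta in Montgomery form.
Pair basemul_pair(int16_t a0, int16_t a1, int16_t b0, int16_t b1, int16_t z) {
    // vs3[0] <- montmul(a0, b0) + montmul(montmul(a1, b1), z0)
    int16_t s0 = static_cast<int16_t>(montmul(a0, b0) + montmul(montmul(a1, b1), z));
    // vs3[1] <- montmul(a0, b1) + montmul(a1, b0)
    int16_t s1 = static_cast<int16_t>(montmul(a0, b1) + montmul(a1, b0));
    // vs1 <- montmul(vs3, montRSquareModQ): cancels the leftover 2^-16 factor
    return Pair{montmul(s0, R_SQUARE_MOD_Q), montmul(s1, R_SQUARE_MOD_Q)};
}
```

Under those assumptions the returned values are congruent mod q to a0*b0 + a1*b1*zeta and a0*b1 + a1*b0, which is why the final multiplication by the R^2 constant is needed. The results are not normalised to [0, q).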