On Thu, 10 Apr 2025 13:19:05 GMT, Ferenc Rakoczi <d...@openjdk.org> wrote:
>> By using the aarch64 vector registers the speed of the computation of the
>> ML-KEM algorithms (key generation, encapsulation, decapsulation) can be
>> approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with two additional
> commits since the last revision:
>
>  - Code rearrange, some renaming, fixing comments
>  - Changes suggested by Andrew Dinn.

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5590:

> 5588:     __ add(tmpAddr, coeffs, 0);
> 5589:     store64shorts(vs2, tmpAddr);
> 5590:

I'd like to make explicit the fact that we have avoided doing an add here (and in the next two cases) by adding a commented out generation step i.e. at this line insert

    // __ add(tmpAddr, coeffs, 128); // unneeded as implied by preceding load

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5595:

> 5593:     __ add(tmpAddr, coeffs, 128);
> 5594:     store64shorts(vs2, tmpAddr);
> 5595:

Likewise insert:

    // __ add(tmpAddr, coeffs, 256); // unneeded as implied by preceding load

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5601:

> 5599:     store64shorts(vs2, tmpAddr);
> 5600:
> 5601:     load64shorts(vs1, tmpAddr);

Likewise insert:

    // __ add(tmpAddr, coeffs, 384); // unneeded as implied by preceding load

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23663#discussion_r2037607688
PR Review Comment: https://git.openjdk.org/jdk/pull/23663#discussion_r2037609104
PR Review Comment: https://git.openjdk.org/jdk/pull/23663#discussion_r2037611049