Re: RFR: 8349721: Add aarch64 intrinsics for ML-KEM [v7]

Andrew Dinn Thu, 10 Apr 2025 07:31:30 -0700

On Thu, 10 Apr 2025 13:19:05 GMT, Ferenc Rakoczi <[email protected]> wrote:


>> By using the aarch64 vector registers the speed of the computation of the 
>> ML-KEM algorithms (key generation, encapsulation, decapsulation) can be 
>> approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - Code rearrange, some renaming, fixing comments
>  - Changes suggested by Andrew Dinn.

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5278:

> 5276:     // level 4
> 5277:     vs_ldpq(vq, kyberConsts);
> 5278:     int offsets3[8] = { 0, 32, 64, 96, 128, 160, 192, 224 };

I'd like to add a comment here to explain the coefficient grouping,
and likewise at levels 5 and 6. So here we have:
// Up to level 3 the coefficients multiplied by or added/subtracted
// to the zetas occur in discrete blocks whose size is some multiple
// of 32. At level 4 coefficients occur in 8 discrete blocks of size 16
// so they are loaded using an ldr at 8 distinct offsets.
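
To illustrate the layout that comment describes, here is a rough standalone
sketch of how the offsets3 table follows from a fixed stride. It is not code
from the patch: the helper and constant names are hypothetical, and it assumes
each ldr reads one 16-byte q register and that the table entries are byte
offsets.

```cpp
#include <array>

// Hypothetical constants for illustration only; just the offsets table
// itself appears in the patch.
const int kLoadBytes   = 16;  // assumed: one 128-bit (q register) ldr
const int kStrideBytes = 32;  // distance between successive offsets3 entries

// Byte offset of the k-th load, for k in [0, 8).
int level4_offset(int k) { return k * kStrideBytes; }

// Rebuild the 8-entry table { 0, 32, ..., 224 } from the stride.
std::array<int, 8> level4_offsets() {
    std::array<int, 8> offs{};
    for (int k = 0; k < 8; ++k) {
        offs[k] = level4_offset(k);
    }
    return offs;
}

// Bytes skipped between the end of one load and the start of the next,
// i.e. the part of each stride that a single ldr does not cover.
int gap_bytes() { return kStrideBytes - kLoadBytes; }
```

Under those assumptions the loads are not contiguous: each one covers 16 bytes
and the next starts 16 bytes later, which is why 8 distinct offsets are needed
rather than one sequential load.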

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23663#discussion_r2037560706
