On Mon, 27 Apr 2026 11:11:03 GMT, Ferenc Rakoczi <[email protected]> wrote:

> An aarch64 implementation of the MontgomeryIntegerPolynomial256.mult() method
> and IntegerPolynomial.conditionalAssign(). Since 64-bit multiplication is not
> supported on Neon and manually performing this operation with 32-bit limbs is
> slower than with GPRs, a hybrid neon/gpr approach is used. Neon instructions
> are used to compute intermediate values used in the last two iterations of
> the main "loop", while the GPRs compute the first few iterations. At the
> method level this improves performance by ~9% and at the API level roughly 5%.
>
> ---------
> - [x] I confirm that I make this contribution in accordance with the [OpenJDK
> Interim AI Policy](https://openjdk.org/legal/ai).

This pull request has now been integrated.

Changeset: f1cd7f6a
Author:    Ferenc Rakoczi <[email protected]>
Committer: Andrew Dinn <[email protected]>
URL:       https://git.openjdk.org/jdk/commit/f1cd7f6ab9c162736ea3fc8f1523294ec004776e
Stats:     1033 lines in 6 files changed: 1030 ins; 1 del; 2 mod

8355216: Accelerate P-256 arithmetic on aarch64

Reviewed-by: adinn, aph

-------------

PR: https://git.openjdk.org/jdk/pull/30941
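
For readers unfamiliar with the second method named in the description, the following is a minimal Java sketch of the branch-free select idiom that a conditionalAssign() over limb arrays typically relies on. It is not the JDK source and not the aarch64 intrinsic from this change; the class name ConditionalAssignSketch, the limb values, and the exact signature are assumptions made for illustration only.

```java
import java.util.Arrays;

public class ConditionalAssignSketch {

    // Hypothetical helper (assumed signature, not copied from the JDK):
    // copies b into a when set == 1 and leaves a unchanged when set == 0,
    // without a branch that depends on set.
    static void conditionalAssign(int set, long[] a, long[] b) {
        long mask = -(long) set; // 0 -> all zero bits, 1 -> all one bits
        for (int i = 0; i < a.length; i++) {
            // When mask is all ones this XORs in (a[i] ^ b[i]), yielding b[i];
            // when mask is zero it XORs in 0, leaving a[i] as it was.
            a[i] ^= mask & (a[i] ^ b[i]);
        }
    }

    public static void main(String[] args) {
        long[] a = {1, 2, 3, 4, 5};
        long[] b = {9, 8, 7, 6, 5};

        conditionalAssign(0, a, b);
        System.out.println(Arrays.toString(a)); // a is unchanged

        conditionalAssign(1, a, b);
        System.out.println(Arrays.toString(a)); // a now holds the limbs of b
    }
}
```

The same memory accesses and arithmetic are performed for either value of set, which is why this pattern tends to be preferred over an if statement in elliptic-curve code that handles secret data.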
