On Thu, 21 Aug 2025 23:30:08 GMT, Ben Perez <bpe...@openjdk.org> wrote:
>> There are several places where MontgomeryIntegerPolynomialP256.mult() can be
>> optimized. In particular, since modulus[2] = 0 several multiplications can
>> be removed. Other multiplications can be replaced by shifts, which also
>> saves time. Preliminary tests indicate an improvement between 5-10%.
>
> Ben Perez has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   minor edit

Can you also simplify the line near the end of the method where you're doing:
`c2 = c7 - modulus[2] + (c1 >> BITS_PER_LIMB);` to be
`c2 = c7 + (c1 >> BITS_PER_LIMB);`?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26792#issuecomment-3239343260
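
For illustration only, here is a minimal sketch of why the two forms of the `c2` assignment agree when `modulus[2]` is 0. The class name, the 52-bit limb width, and the limb values are assumptions made for this sketch (the limbs are derived from the P-256 prime 2^256 - 2^224 + 2^192 + 2^96 - 1), not code copied from the PR.

```java
// Hypothetical sketch: names and constants are assumptions for illustration,
// not taken from MontgomeryIntegerPolynomialP256.
class ModulusLimbSketch {
    // Assumed limb width in bits.
    static final int BITS_PER_LIMB = 52;

    // Assumed 52-bit limbs of p = 2^256 - 2^224 + 2^192 + 2^96 - 1,
    // least significant limb first. Limb 2 (bits 104..155) is zero.
    static final long[] modulus = {
        0x000fffffffffffffL, 0x00000fffffffffffL, 0x0000000000000000L,
        0x0000001000000000L, 0x0000ffffffff0000L
    };

    // Form currently in the method, per the review comment.
    static long original(long c7, long c1) {
        return c7 - modulus[2] + (c1 >> BITS_PER_LIMB);
    }

    // Suggested form: subtracting a zero limb is dropped.
    static long simplified(long c7, long c1) {
        return c7 + (c1 >> BITS_PER_LIMB);
    }

    public static void main(String[] args) {
        long c7 = 0x0123456789abcL;                 // arbitrary sample value
        long c1 = (7L << BITS_PER_LIMB) | 0x1234L;  // sample value with a carry above the limb
        System.out.println(original(c7, c1) == simplified(c7, c1));
    }
}
```

The two methods differ only by the subtraction of `modulus[2]`, so they return the same value for any inputs as long as that limb stays zero.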