On Thu, 21 Aug 2025 23:30:08 GMT, Ben Perez <bpe...@openjdk.org> wrote:

>> There are several places where MontgomeryIntegerPolynomialP256.mult() can be 
>> optimized. In particular, since modulus[2] = 0 several multiplications can 
>> be removed. Other multiplications can be replaced by shifts, which also 
>> saves time. Preliminary tests indicate an improvement between 5-10%.
>
> Ben Perez has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   minor edit

Since `modulus[2]` is 0, can you also simplify the line near the end of the
method from `c2 = c7 - modulus[2] + (c1 >> BITS_PER_LIMB);` to
`c2 = c7 + (c1 >> BITS_PER_LIMB);`?
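
For reference, here is a minimal standalone sketch of why the subtraction is a no-op. It is not the JDK code: the class name `LimbCheck` and its helper methods are hypothetical, and it assumes the field is the NIST P-256 prime split into five 52-bit limbs (`BITS_PER_LIMB = 52`), which is my reading of the class and should be checked against the source.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not part of MontgomeryIntegerPolynomialP256.
public class LimbCheck {
    // Assumed limb layout: 5 limbs of 52 bits each.
    static final int BITS_PER_LIMB = 52;
    static final int NUM_LIMBS = 5;

    // Split the P-256 prime, 2^256 - 2^224 + 2^192 + 2^96 - 1, into limbs.
    static long[] limbsOfP256() {
        BigInteger p = BigInteger.ONE.shiftLeft(256)
                .subtract(BigInteger.ONE.shiftLeft(224))
                .add(BigInteger.ONE.shiftLeft(192))
                .add(BigInteger.ONE.shiftLeft(96))
                .subtract(BigInteger.ONE);
        BigInteger mask = BigInteger.ONE.shiftLeft(BITS_PER_LIMB)
                .subtract(BigInteger.ONE);
        long[] limbs = new long[NUM_LIMBS];
        for (int i = 0; i < NUM_LIMBS; i++) {
            // Limb i holds bits [52*i, 52*i + 51] of the prime.
            limbs[i] = p.shiftRight(i * BITS_PER_LIMB).and(mask).longValue();
        }
        return limbs;
    }

    // The expression as currently written in the PR.
    static long original(long c7, long c1, long[] modulus) {
        return c7 - modulus[2] + (c1 >> BITS_PER_LIMB);
    }

    // The suggested simplification.
    static long simplified(long c7, long c1) {
        return c7 + (c1 >> BITS_PER_LIMB);
    }

    public static void main(String[] args) {
        long[] modulus = limbsOfP256();
        System.out.println("modulus[2] = " + modulus[2]);

        // Arbitrary sample inputs, chosen only to exercise the carry shift.
        long c7 = 0x000F_FFFF_FFFF_FFFFL;
        long c1 = (1L << 60) + 12345L;
        System.out.println("expressions agree: "
                + (original(c7, c1, modulus) == simplified(c7, c1)));
    }
}
```

Limb 2 covers bits 104 through 155 of the prime, and the prime has no bits set between 96 and 191, so the limb is zero and the two forms of the `c2` assignment compute the same value.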

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26792#issuecomment-3239343260
