On Mon, 22 Jun 2026 19:50:13 GMT, Andrew Haley <[email protected]> wrote:

> One more thought: it might just be unrolling and inlining.

Apparently, it is C2 generating optimal code: 
 dup selector
 ldr a
 ldr b
 bsl
 str result
in the NEON registers, unrolled once, plus tail code for lengths not divisible by 4.
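
For context, a loop of roughly the following shape is the kind of source that could compile down to such a dup/ldr/ldr/bsl/str sequence. This is only a sketch: the class name `SelectSketch`, the method `select`, the `long[]` element type, and the all-ones/all-zeros mask convention are assumptions made for illustration, not the code from the PR.

```java
import java.util.Arrays;

final class SelectSketch {
    // Hypothetical helper: branch-free per-element select.
    // mask is assumed to be either -1L (all ones) or 0L (all zeros).
    // Each result element takes its bits from a where mask bits are 1
    // and from b where mask bits are 0, which is what a bitwise select does.
    static void select(long[] result, long[] a, long[] b, long mask) {
        for (int i = 0; i < result.length; i++) {
            result[i] = (a[i] & mask) | (b[i] & ~mask);
        }
    }

    public static void main(String[] args) {
        long[] a = {1, 2, 3, 4, 5};
        long[] b = {10, 20, 30, 40, 50};
        long[] r = new long[a.length];

        select(r, a, b, -1L);   // all-ones mask: copy of a
        System.out.println(Arrays.toString(r));

        select(r, a, b, 0L);    // zero mask: copy of b
        System.out.println(Arrays.toString(r));
    }
}
```

The array length of 5 is chosen so that both the vectorized body and the scalar tail would be exercised if the loop is vectorized. Whether C2 actually emits `bsl` for this shape depends on the JDK build and the hardware.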

I haven't paid much attention to this because it is not critical for the 
performance of the elliptic curve computation, but it is definitely better if 
it is optimal.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/30941#issuecomment-4773037600
