On Tue, 23 Jun 2026 18:34:13 GMT, Sergey Bylokhov <[email protected]> wrote:

> Then maybe an existing benchmark can be updated, to show some improvement or
> the new one added?

As @ferakocz said, the intrinsic is useful to ensure constant-time execution in all cases. We don't actually need it to improve performance. We would only need to worry about it if it were responsible for a noticeable degradation in performance. @ferakocz says we don't see that in the tests he has performed, and I think that means we are unlikely to face that situation with any other kernels that rely on conditionalAssign.

I'm reassured by the fact that the Java version runs roughly twice as fast as the intrinsic. That matches the opportunity, noted above, that having the bytecode available gives the compiler to halve the number of EOR instructions. Since this would not be an option in cases where the set argument is not hard-wired, I don't think we need to care about the result of this test.

@mrserb please feel free to create a new benchmark that reliably tests the Java code vs the intrinsic for the general case (i.e. where set cannot be derived or predicted). If there is a true degradation on either x86 or aarch64, then we can consider whether or not to keep the intrinsic.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/30941#issuecomment-4787118490
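
For readers who have not seen the pattern under discussion, here is a minimal sketch of a mask-based, branch-free conditional assign in Java. It is an illustration only, not the code from the PR or the JDK source: the class name, method signature, and limb layout are assumptions. Each limb costs two XORs (EOR on aarch64), which is the instruction count the comment above says the compiler can reduce when set is a known constant.

```java
import java.util.Arrays;

// Hypothetical illustration; names and signature are assumptions, not the JDK source.
final class ConditionalAssignSketch {

    // Copies b into a when set == 1 and leaves a unchanged when set == 0,
    // without branching on set. Assumes set is 0 or 1 and a.length <= b.length.
    static void conditionalAssign(int set, long[] a, long[] b) {
        long mask = -(long) set;              // 0 -> all zero bits, 1 -> all one bits
        for (int i = 0; i < a.length; i++) {
            long diff = mask & (a[i] ^ b[i]); // first XOR: bits where a and b differ
            a[i] ^= diff;                     // second XOR: flips those bits only if mask is set
        }
    }

    public static void main(String[] args) {
        long[] a = {1L, 2L, 3L};
        long[] b = {7L, 8L, 9L};
        conditionalAssign(0, a, b);
        System.out.println(Arrays.toString(a)); // a is unchanged
        conditionalAssign(1, a, b);
        System.out.println(Arrays.toString(a)); // a now holds b's values
    }
}
```

A benchmark for the general case would need to feed set from data the JIT cannot fold or predict, otherwise it measures the hard-wired case again.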
