On Mon, 17 Feb 2025 10:36:52 GMT, Emanuel Peter <epe...@openjdk.org> wrote:
> I think we should be able to see the same issue here, actually. Yes. Here a
> quick benchmark below:

I observe the same:

Warmup
    751    3    b        TestIntMax::test1 (27 bytes)
Run
Time: 360 550 158
Warmup
   1862   15    b        TestIntMax::test2 (34 bytes)
Run
Time: 92 116 170

But then with this:

diff --git a/src/hotspot/cpu/x86/x86_64.ad b/src/hotspot/cpu/x86/x86_64.ad
index 8cc4a970bfd..9abda8f4178 100644
--- a/src/hotspot/cpu/x86/x86_64.ad
+++ b/src/hotspot/cpu/x86/x86_64.ad
@@ -12037,16 +12037,20 @@ instruct cmovI_reg_l(rRegI dst, rRegI src, rFlagsReg cr)
 %}
 
 
-instruct maxI_rReg(rRegI dst, rRegI src)
+instruct maxI_rReg(rRegI dst, rRegI src, rFlagsReg cr)
 %{
   match(Set dst (MaxI dst src));
+  effect(KILL cr);
 
   ins_cost(200);
-  expand %{
-    rFlagsReg cr;
-    compI_rReg(cr, dst, src);
-    cmovI_reg_l(dst, src, cr);
+  ins_encode %{
+    Label done;
+    __ cmpl($src$$Register, $dst$$Register);
+    __ jccb(Assembler::less, done);
+    __ mov($dst$$Register, $src$$Register);
+    __ bind(done);
   %}
+  ins_pipe(pipe_cmov_reg);
 %}
 
 // ============================================================================

the performance gap narrows:

Warmup
    770    3    b        TestIntMax::test1 (27 bytes)
Run
Time: 94 951 677
Warmup
   1312   15    b        TestIntMax::test2 (34 bytes)
Run
Time: 70 053 824

(the number of test2 fluctuates quite a bit). Does it ever make sense to
implement `MaxI` with a conditional move then?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2663379660