On Fri, 9 Jan 2026 22:59:16 GMT, Mohamed Issa <[email protected]> wrote:

>> Intel&reg; AVX10 ISA [1] extensions added new floating point comparison 
>> instructions. They set the EFLAGS register so that relationships can be 
>> tested independently to avoid extra checks when one of the inputs is NaN.
>> 
>> Most of the work is covered in the architecture definition (`x86.ad`) file. 
>> A new comparison operand was created to be used by new CMove and JMP 
>> definitions with the APX specific portions of the CMove section being 
>> updated to rely on the new instructions because both sets of instructions 
>> are always expected to be available on the same platform. New floating point 
>> comparison definitions were also added.
>> 
>> This change uses the new AVX10.2 (UCOMXSS or UCOMXSD) instructions on 
>> supported platforms to avoid the extra handling required with existing 
>> (UCOMISS or UCOMISD) instructions. To make sure no new failures were 
>> introduced, tier1, tier2, and tier3 tests were run on builds with and 
>> without the changes. Additionally, the JTREG tests listed below were used to 
>> verify correctness with `-XX:-UseAPX` / `-XX:+UseAPX` options. The baseline 
>> build used is [OpenJDK 
>> v26-b26](https://github.com/openjdk/jdk/releases/tag/jdk-26%2B26).
>> 
>> 1. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/CMoveLConstants.java`
>> 2. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/TestFPComparison.java`
>> 3. 
>> `jtreg:test/hotspot/jtreg/compiler/intrinsics/math/TestSignumIntrinsic.java`
>> 4. `jtreg:test/hotspot/jtreg/compiler/vectorization/TestSignumVector.java`
>> 
>> Finally, the JMH micro-benchmark listed below was updated to separately 
>> exercise CMove and JMP code paths.
>> 
>> 1. `micro:test/micro/org/openjdk/bench/java/lang/FPComparison.java`
>> 
>> [1] 
>> https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10
>
> Mohamed Issa has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Remove extra jump instruction in signum floating point and unify three-way 
> floating point comparison logic in x86.ad

src/hotspot/cpu/x86/x86.ad line 1706:

> 1704:   Label done;
> 1705:   __ setcc(Assembler::above, dst);
> 1706:   __ jcc(Assembler::aboveEqual, done);

It would be better to keep the original sequence with optimization:
  __ movl(dst, -1);
  __ jcc(Assembler::below, done);
  __ setcc(Assembler::notEqual, dst);
This would work for both AVX10_2 and prior as the Assembler::below covers both 
unordered and less.

src/hotspot/cpu/x86/x86.ad line 9186:

> 9184: %}
> 9185: 
> 9186: instruct cmovI_regUCFE(cmpOpUCFE cop, rFlagsRegUCFE cr, rRegI dst, 
> rRegI src) %{

The following instructs could be removed and only the ndd version need to be 
kept:
cmovI_regUCFE
cmovI_memUCFE
cmovN_regUCFE
cmovP_regUCFE
cmovL_regUCFE
cmovL_memUCFE
cmpOpUCFE/rFlagsRegUCFE should only match when both AVX10_2 and APX is 
available otherwise fallback to cmpOpUCF/cmpOfUCF2/rFlagsRegUCF.
We can then remove the predicates from cmovI_reg/cmovI_regUCFE_ndd etc.
This is to avoid explosion of ad file instructs for various combination of APX 
and AVX10_2.

src/hotspot/cpu/x86/x86.ad line 14609:

> 14607:   ins_encode %{
> 14608:     __ ucomxss($src1$$XMMRegister, $src2$$XMMRegister);
> 14609:     emit_cmpfp3(masm, $dst$$Register);

It looks to me that the only difference between cmpF_reg and cmpF_regE is 
ucomxss vs ucomiss. In which case it is better to keep only the original 
cmpF_reg and remove cmpF_regE. Likewise for cmpF_memE, cmpF_immE, cmpD_regE, 
cmpD_memE, cmpD_immE.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28337#discussion_r2696383810
PR Review Comment: https://git.openjdk.org/jdk/pull/28337#discussion_r2696362925
PR Review Comment: https://git.openjdk.org/jdk/pull/28337#discussion_r2684261529

Reply via email to