Although the scalar AVX10 floating-point min/max instructions (VMINMAXSD, 
VMINMAXSS, VMINMAXSH) are compact, they are a poor fit for reduction loops. 
In a reduction, each iteration consumes the previous iteration's result, so 
these instructions sit on a serial data dependency chain carried across loop 
iterations. An alternative implementation using comparisons and jumps 
leverages branch prediction and confines the loop-carried dependency to 
cheaper instructions (e.g., MOV).
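
As a rough source-level illustration of the two shapes described above, the sketch below writes the same max reduction once with `Math.max` and once with explicit comparisons and branches. The class and method names are hypothetical, and this is only an analogy in Java: it is not the instruction sequence the PR emits, and the JIT is free to compile either method differently.

```java
// Illustrative sketch only: the class and method names are hypothetical, and
// this is Java source, not the machine code sequence emitted by the PR.
final class MinMaxReductionSketch {

    // Reduction written with Math.max: every iteration needs the previous
    // result, so the max operation sits on the loop-carried dependency chain.
    static double maxWithMathMax(double[] values) {
        double acc = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            acc = Math.max(acc, v);
        }
        return acc;
    }

    // Same reduction with comparisons and branches. When the branches are
    // well predicted, only the occasional plain assignment updates acc.
    static double maxWithBranches(double[] values) {
        double acc = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            if (v > acc) {
                acc = v;                 // common update path: a simple move
            } else if (v != v) {
                return Double.NaN;       // NaN poisons the result, as in Math.max
            } else if (v == 0.0 && acc == 0.0
                    && Double.doubleToRawLongBits(acc) != 0L) {
                acc = v;                 // acc is -0.0: take v so +0.0 wins
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        double[] data = {1.5, -2.0, 7.25, 3.0};
        System.out.println(maxWithMathMax(data));
        System.out.println(maxWithBranches(data));
    }
}
```

The extra branches in the second method exist only to keep the `Math.max` treatment of NaN and signed zeros; on typical data they are rarely taken.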

These changes remove the AVX10 floating-point min/max instructions from 
single- and double-precision floating-point reduction loops and use a separate 
instruction sequence instead. Half-precision min/max reduction loops are not 
currently detectable, so they will be handled in a separate PR. The changes 
also remove unused instruction definitions and add the necessary supporting 
infrastructure. The JTREG tests listed below were used to verify correctness, 
run with the recommended JVM options given in the corresponding source files. 
All modifications and tests used [OpenJDK 
v27-b12](https://github.com/openjdk/jdk/releases/tag/jdk-27%2B12) as the 
baseline build.

1. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector64Tests.java`
2. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector128Tests.java`
3. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector256Tests.java`
4. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector512Tests.java`
5. `jtreg:test/jdk/jdk/incubator/vector/DoubleVectorMaxTests.java`
6. `jtreg:test/jdk/jdk/incubator/vector/FloatVector64Tests.java`
7. `jtreg:test/jdk/jdk/incubator/vector/FloatVector128Tests.java`
8. `jtreg:test/jdk/jdk/incubator/vector/FloatVector256Tests.java`
9. `jtreg:test/jdk/jdk/incubator/vector/FloatVector512Tests.java`
10. `jtreg:test/jdk/jdk/incubator/vector/FloatVectorMaxTests.java`
11. `jtreg:test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java`
12. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java`
13. `jtreg:test/hotspot/jtreg/compiler/igvn/TestMinMaxIdentity.java`
14. `jtreg:test/hotspot/jtreg/compiler/intrinsics/float16/TestFloat16MaxMinSpecialValues.java`

Finally, the JMH micro-benchmarks listed below were updated to ensure all code 
paths are exercised.

1. `micro:test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java`
2. `micro:test/micro/org/openjdk/bench/vm/compiler/FpMinMaxIntrinsics.java`

-------------

Commit messages:
 - Remove half precision min/max reduction definitions and adjust corresponding 
benchmarks.
 - Use alternative instruction flow for half precision reduction loops and add 
supporting infrastructure.
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Replace scalar AVX10.2 floating point min/max instructions with more 
efficient sequence

Changes: https://git.openjdk.org/jdk/pull/29831/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29831&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8378295
  Stats: 601 lines in 9 files changed: 425 ins; 86 del; 90 mod
  Patch: https://git.openjdk.org/jdk/pull/29831.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/29831/head:pull/29831

PR: https://git.openjdk.org/jdk/pull/29831
