Although the scalar AVX10 floating point min/max instructions (VMINMAXSD, VMINMAXSS, VMINMAXSH) are compact, it's better not to use them in reduction loops. In a reduction, each iteration's min/max takes the previous iteration's result as an input, so these instructions form a serial data dependency chain across loop iterations. An alternative implementation using comparisons and jumps leverages branch prediction and limits the effects of data dependencies to cheaper instructions (e.g., MOV).
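
For readers less familiar with the pattern, here is a rough source-level sketch of the two shapes of a max reduction. It is only an illustration: the change itself is in the instructions the JIT emits, not in Java source, and the class and method names below are hypothetical. The branching variant is written to match `Math.max` semantics (NaN propagation, `+0.0` greater than `-0.0`) and is not meant to mirror the exact instruction sequence used in this PR.

```java
final class MaxReductionSketch {
    // Reduction through Math.max: the accumulator produced by one iteration
    // is an input to the next, so the min/max operations form a chain.
    static float maxWithIntrinsic(float[] a) {
        float acc = Float.NEGATIVE_INFINITY;
        for (float v : a) {
            acc = Math.max(acc, v);
        }
        return acc;
    }

    // Compare-and-branch form (hypothetical illustration). The accumulator
    // is only overwritten by a plain move when a branch is taken.
    static float maxWithBranches(float[] a) {
        float acc = Float.NEGATIVE_INFINITY;
        for (float v : a) {
            if (Float.isNaN(v)) {
                return Float.NaN;          // NaN dominates, as with Math.max
            }
            if (v > acc) {
                acc = v;                   // common case: simple move
            } else if (v == 0.0f && acc == 0.0f
                    && Float.floatToRawIntBits(v) == 0) {
                acc = v;                   // prefer +0.0 over -0.0
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        float[] data = {1.5f, -0.0f, 7.25f, 0.0f, 3.0f};
        System.out.println(maxWithIntrinsic(data));
        System.out.println(maxWithBranches(data));
    }
}
```

When the maximum changes rarely, the comparison branch in the second form is easy to predict, which is the effect the description above relies on.
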
With that background provided, these changes remove AVX10 floating point min/max instructions from single and double precision floating point reduction loops. Instead, a separate sequence of instructions is used. Currently, min/max half precision floating point reduction loops aren't detectable, so they will be handled in a separate PR. There is also some code cleanup to remove unused instruction definitions while also adding necessary supporting infrastructure.

The JTREG tests listed below were used to verify correctness with the recommended JVM options mentioned in corresponding source files. All modifications and tests used [OpenJDK v27-b12](https://github.com/openjdk/jdk/releases/tag/jdk-27%2B12) as the baseline build.

1. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector64Tests.java`
2. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector128Tests.java`
3. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector256Tests.java`
4. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector512Tests.java`
5. `jtreg:test/jdk/jdk/incubator/vector/DoubleVectorMaxTests.java`
6. `jtreg:test/jdk/jdk/incubator/vector/FloatVector64Tests.java`
7. `jtreg:test/jdk/jdk/incubator/vector/FloatVector128Tests.java`
8. `jtreg:test/jdk/jdk/incubator/vector/FloatVector256Tests.java`
9. `jtreg:test/jdk/jdk/incubator/vector/FloatVector512Tests.java`
10. `jtreg:test/jdk/jdk/incubator/vector/FloatVectorMaxTests.java`
11. `jtreg:test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java`
12. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java`
13. `jtreg:test/hotspot/jtreg/compiler/igvn/TestMinMaxIdentity.java`
14. `jtreg:test/hotspot/jtreg/compiler/intrinsics/float16/TestFloat16MaxMinSpecialValues.java`

Finally, the JMH micro-benchmarks listed below were updated to ensure all code paths are exercised.

1. `micro:test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java`
2. `micro:test/micro/org/openjdk/bench/vm/compiler/FpMinMaxIntrinsics.java`

-------------

Commit messages:
 - Remove half precision min/max reduction definitions and adjust corresponding benchmarks.
 - Use alternative instruction flow for half precision reduction loops and add supporting infrastructure.
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Replace scalar AVX10.2 floating point min/max instructions with more efficient sequence

Changes: https://git.openjdk.org/jdk/pull/29831/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29831&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8378295
  Stats: 601 lines in 9 files changed: 425 ins; 86 del; 90 mod
  Patch: https://git.openjdk.org/jdk/pull/29831.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/29831/head:pull/29831

PR: https://git.openjdk.org/jdk/pull/29831
