> Although the scalar AVX10 floating point min/max instructions (VMINMAXSD, VMINMAXSS, VMINMAXSH) are compact, it's better not to use them in reduction loops. This is because of serial data dependencies that get triggered across loop iterations. An alternate implementation using comparisons and jumps leverages branch prediction and limits the effects of data dependencies to cheaper instructions (e.g., MOV).
>
> With that background provided, these changes remove AVX10 floating point min/max instructions from single and double precision floating point reduction loops. Instead, a separate sequence of instructions is used. Currently, min/max half precision floating point reduction loops aren't detectable, so they will be handled in a separate PR. There is also some code cleanup to remove unused instruction definitions while also adding necessary supporting infrastructure. The JTREG tests listed below were used to verify correctness with the recommended JVM options mentioned in corresponding source files. All modifications and tests used [OpenJDK v27-b12](https://github.com/openjdk/jdk/releases/tag/jdk-27%2B12) as the baseline build.
>
> 1. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector64Tests.java`
> 2. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector128Tests.java`
> 3. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector256Tests.java`
> 4. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector512Tests.java`
> 5. `jtreg:test/jdk/jdk/incubator/vector/DoubleVectorMaxTests.java`
> 6. `jtreg:test/jdk/jdk/incubator/vector/FloatVector64Tests.java`
> 7. `jtreg:test/jdk/jdk/incubator/vector/FloatVector128Tests.java`
> 8. `jtreg:test/jdk/jdk/incubator/vector/FloatVector256Tests.java`
> 9. `jtreg:test/jdk/jdk/incubator/vector/FloatVector512Tests.java`
> 10. `jtreg:test/jdk/jdk/incubator/vector/FloatVectorMaxTests.java`
> 11. `jtreg:test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java`
> 12. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java`
> 13. `jtreg:test/hotspot/jtreg/compiler/igvn/TestMinMaxIdentity.java`
> 14. `jtreg:test/hotspot/jtreg/compiler/intrinsics/float16/TestFloat16MaxMinSpecialValues.java`
>
> Finally, the JMH micro-benchmarks listed below were updated to ensure all code paths are exercised.
>
> 1. `micro:test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java`
> 2. `micro:test/micro/org/openjdk/bench/vm/compiler/FpMinMaxIntrinsics.java`
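
For readers unfamiliar with the pattern, the quoted rationale can be pictured at the Java source level. The sketch below is only an illustration: the actual change is in the x86 code that C2 emits, not in Java source, and the names `MinReductionSketch`, `minWithIntrinsic`, and `minWithBranches` are hypothetical. The first method is the kind of min reduction loop the description refers to, where every iteration needs the previous result. The second spells out the same `Math.min` semantics (NaN propagation, `-0.0` ordered below `+0.0`) with comparisons and branches, so that in the common case the accumulator is only touched by a plain assignment. Which instructions a given JVM actually emits for either loop depends on the hardware and the build.

```java
final class MinReductionSketch {

    // Reduction through Math.min: each call consumes the previous result,
    // so the accumulator is a loop-carried dependency.
    static double minWithIntrinsic(double[] values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double v : values) {
            acc = Math.min(acc, v);
        }
        return acc;
    }

    // Same result written with comparisons and branches (illustrative only).
    static double minWithBranches(double[] values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double v : values) {
            if (v != v) {
                // v is NaN; Math.min would keep returning NaN from here on.
                return Double.NaN;
            }
            if (v < acc) {
                acc = v; // common case: a plain move of the new minimum
            } else if (v == 0.0 && acc == 0.0
                    && Double.doubleToRawLongBits(v) != 0L) {
                acc = v; // v is -0.0, which Math.min orders below +0.0
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        double[] data = {3.0, 0.0, -0.0, 1.5};
        System.out.println(minWithIntrinsic(data));
        System.out.println(minWithBranches(data));
    }
}
```

Both methods return `Double.POSITIVE_INFINITY` for an empty array, which keeps them interchangeable when comparing results.
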

Mohamed Issa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:

 - Merge branch 'master' into user/missa-prime/avx10_2
 - Remove half precision min/max reduction definitions and adjust corresponding benchmarks.
 - Use alternative instruction flow for half precision reduction loops and add supporting infrastructure.
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Replace scalar AVX10.2 floating point min/max instructions with more efficient sequence

-------------

Changes: https://git.openjdk.org/jdk/pull/29831/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29831&range=01
  Stats: 601 lines in 9 files changed: 425 ins; 86 del; 90 mod
  Patch: https://git.openjdk.org/jdk/pull/29831.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/29831/head:pull/29831

PR: https://git.openjdk.org/jdk/pull/29831
