On Mon, 8 Dec 2025 03:29:03 GMT, Eric Fang <[email protected]> wrote:
> This patch adds intrinsic support for UMIN and UMAX reduction operations in
> the Vector API on AArch64, enabling direct hardware instruction mapping for
> better performance.
>
> Changes:
> --------
>
> 1. C2 mid-end:
>    - Added UMinReductionVNode and UMaxReductionVNode
>
> 2. AArch64 Backend:
>    - Added uminp/umaxp/sve_uminv/sve_umaxv instructions
>    - Updated match rules for all vector sizes and element types
>    - Both NEON and SVE implementation are supported
>
> 3. Test:
>    - Added UMIN_REDUCTION_V and UMAX_REDUCTION_V to IRNode.java
>    - Added assembly tests in aarch64-asmtest.py for new instructions
>    - Added a JTReg test file VectorUMinMaxReductionTest.java
>
> Different configurations were tested on aarch64 and x86 machines, and all
> tests passed.
>
> Test results of JMH benchmarks from the panama-vector project:
> --------
>
> On a Nvidia Grace machine with 128-bit SVE:
>
> Benchmark                       Unit    Before  Error   After     Error   Uplift
> Byte128Vector.UMAXLanes         ops/ms  411.60  42.18   25226.51  33.92   61.29
> Byte128Vector.UMAXMaskedLanes   ops/ms  558.56  85.12   25182.90  28.74   45.09
> Byte128Vector.UMINLanes         ops/ms  645.58  780.76  28396.29  103.11  43.99
> Byte128Vector.UMINMaskedLanes   ops/ms  621.09  718.27  26122.62  42.68   42.06
> Byte64Vector.UMAXLanes          ops/ms  296.33  34.44   14357.74  15.95   48.45
> Byte64Vector.UMAXMaskedLanes    ops/ms  376.54  44.01   14269.24  21.41   37.90
> Byte64Vector.UMINLanes          ops/ms  373.45  426.51  15425.36  66.20   41.31
> Byte64Vector.UMINMaskedLanes    ops/ms  353.32  346.87  14201.37  13.79   40.19
> Int128Vector.UMAXLanes          ops/ms  174.79  192.51  9906.07   286.93  56.67
> Int128Vector.UMAXMaskedLanes    ops/ms  157.23  206.68  10246.77  11.44   65.17
> Int64Vector.UMAXLanes           ops/ms  95.30   126.49  4719.30   98.57   49.52
> Int64Vector.UMAXMaskedLanes     ops/ms  88.19   87.44   4693.18   19.76   53.22
> Long128Vector.UMAXLanes         ops/ms  80.62   97.82   5064.01   35.52   62.82
> Long128Vector.UMAXMaskedLanes   ops/ms  78.15   102.91  5028.24   8.74    64.34
> Long64Vector.UMAXLanes          ops/ms  47.56   62.01   46.76     52.28   0.98
> Long64Vector.UMAXMaskedLanes    ops/ms  45.44   46.76   45.79     42.91   1.01
> Short128Vector.UMAXLanes        ops/ms  316.65  410.30  14814.82  23.65   46.79
> Short128Vector.UMAXMaskedLanes  ops/ms  308.90  351.78  15155.26  31.03   49.06
> Sh...

Nice work. Thanks for your support!

I noticed that this PR contains the same commit as https://github.com/openjdk/jdk/pull/28692. Could you please split that change out of this PR? If this PR depends on https://github.com/openjdk/jdk/pull/28692, could we change the target branch to `pr/28692` instead of `master`?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28693#issuecomment-3640490083
