On Fri, 5 Sep 2025 08:13:28 GMT, erifan <d...@openjdk.org> wrote:

> Intrinsic support for `VectorMask.laneIsSet` with a **variable** input index
> was introduced in PR #14200, but was inadvertently broken by PR #25673. This
> PR restores the intrinsic functionality and adds some JTReg tests.
>
> Benchmarks on Nvidia Grace machine with 128-bit SVE:
>
> Benchmark                        Unit    Before Score  Error       After Score  Error       Uplift
> microMaskLaneIsSetByte128_var    ops/ms  21702.14415   91.902159   103472.9391  36.057447   4.767867
> microMaskLaneIsSetByte64_var     ops/ms  21468.51868   107.94177   103365.6561  69.47736    4.814754
> microMaskLaneIsSetDouble128_var  ops/ms  77489.32791   153.242699  413499.4127  311.854079  5.336211
> microMaskLaneIsSetFloat128_var   ops/ms  41034.95204   399.421823  206840.0988  74.702234   5.040583
> microMaskLaneIsSetFloat64_var    ops/ms  77607.40268   175.938921  413745.3001  149.716794  5.33126
> microMaskLaneIsSetInt128_var     ops/ms  41452.48893   76.143208   206845.9754  59.371129   4.989953
> microMaskLaneIsSetInt64_var      ops/ms  77726.2542    173.180518  413427.8838  363.575023  5.319024
> microMaskLaneIsSetLong128_var    ops/ms  77646.11218   177.496587  413403.4404  236.609314  5.3242
> microMaskLaneIsSetShort128_var   ops/ms  21374.93265   48.13101    103417.4618  34.827021   4.838259
> microMaskLaneIsSetShort64_var    ops/ms  41066.19395   353.320621  206801.109   106.408938  5.035799
>
>
> Benchmarks on Intel 6444y machine with 512-bit avx3:
>
> Benchmark                        Unit    Before Score  Error       After Score  Error       Uplift
> microMaskLaneIsSetByte128_var    ops/ms  57658.45497   240.209309  211643.8406  29.214532   3.670647
> microMaskLaneIsSetByte256_var    ops/ms  57451.68169   116.994128  211609.4652  160.48513   3.683259
> microMaskLaneIsSetByte512_var    ops/ms  57530.22411   311.63868   199802.8084  408.144015  3.473005
> microMaskLaneIsSetByte64_var     ops/ms  57642.2672    161.406221  205252.4464  196.86852   3.560797
> microMaskLaneIsSetDouble256_var  ops/ms  114401.3789   231.797375  361400.344   565.593984  3.159055
> microMaskLaneIsSetDouble512_var  ops/ms  57379.27882   159.699503  211476.1138  136.980026  3.685583
> microMaskLaneIsSetFloat128_var   ops/ms  113943.9512   141.062663  360855.3915  494.471996  3.166955
> microMaskLaneIsSetFloat256_var   ops/ms  57682.78182   138.142053  211659.5098  30.167972   3.66937
> microMaskLaneIsSetFloat512_var   ops/ms  57617.66405   301.748599  211246.8588  597.18949   3.666355
> microMaskLaneIsSetInt128_var     ops/ms  113914.5062   118.681382  360856.4465  555.097397  3.167783
> microMaskLaneIsSetInt256_var     ops/ms  57681.79883   112.391639  211555.6742  217.556981  3.667633
> microMaskLaneIsSetInt512_var     ops/ms  57350.20346   206.146723  211657.7207  68.461571   3.690618
> microMaskLane...

This looks fine to me. I took another look at [JDK-8358749](https://bugs.openjdk.org/browse/JDK-8358749), and I think this is the only place where we can really accept the non-constant input. In all other cases, we either pull `is_con()` or `const_oop()` out of the input.

I think we will bikeshed about the tests a bit.

test/micro/org/openjdk/bench/jdk/incubator/vector/VectorExtractBenchmark.java line 34:

> 32: @Warmup(iterations = 5, time = 1)
> 33: @Measurement(iterations = 5, time = 1)
> 34: @Fork(value = 1, jvmArgs = {"--add-modules=jdk.incubator.vector"})

Don't do 1 fork, do at least 3.

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27113#pullrequestreview-3188769547
PR Review Comment: https://git.openjdk.org/jdk/pull/27113#discussion_r2324679427
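
To illustrate what a test of `VectorMask.laneIsSet` with a variable index has to check, here is a minimal scalar sketch in plain Java. It is not code from the PR and does not use the Vector API. It assumes a packed-long view of a mask in which bit `i` stands for lane `i`, and the class `LaneIsSetReference` and its helpers `pack` and `laneIsSet` are hypothetical names introduced only for this sketch. The loop index is a runtime value, which corresponds to the non-constant input case discussed in the review.

```java
// Hypothetical scalar reference model; not part of the JDK or of the PR.
class LaneIsSetReference {

    // Packs per-lane booleans into a long, assuming bit i represents lane i.
    // This sketch only supports masks of up to 64 lanes.
    static long pack(boolean[] lanes) {
        if (lanes.length > Long.SIZE) {
            throw new IllegalArgumentException("at most 64 lanes supported");
        }
        long bits = 0L;
        for (int i = 0; i < lanes.length; i++) {
            if (lanes[i]) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    // Scalar model of "is lane i set?". The bounds check is a choice made
    // for this sketch so that out-of-range indices fail loudly.
    static boolean laneIsSet(long maskBits, int laneCount, int i) {
        if (i < 0 || i >= laneCount) {
            throw new IndexOutOfBoundsException(
                "lane " + i + " out of range for " + laneCount + " lanes");
        }
        return ((maskBits >>> i) & 1L) != 0;
    }

    public static void main(String[] args) {
        boolean[] lanes = {true, false, false, true, false, true, true, false};
        long bits = pack(lanes);
        // The index is a loop variable, not a compile-time constant.
        for (int i = 0; i < lanes.length; i++) {
            if (laneIsSet(bits, lanes.length, i) != lanes[i]) {
                throw new AssertionError("mismatch at lane " + i);
            }
        }
        System.out.println("all lanes match");
    }
}
```

A real JTReg test would compare the result of `VectorMask.laneIsSet(i)` against an expected value computed along these lines for every lane index, rather than against a single constant index.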