On Fri, 5 Sep 2025 08:13:28 GMT, erifan <d...@openjdk.org> wrote:

> Intrinsic support for `VectorMask.laneIsSet` with a **variable** input index 
> was introduced in PR #14200, but was inadvertently broken by PR #25673. This 
> PR restores the intrinsic functionality and adds some JTReg tests.
> 
> Benchmarks on Nvidia Grace machine with 128-bit SVE:
> 
> Benchmark                        Unit    Before Score  Before Error  After Score   After Error  Uplift
> microMaskLaneIsSetByte128_var    ops/ms  21702.14415   91.902159     103472.9391   36.057447    4.767867
> microMaskLaneIsSetByte64_var     ops/ms  21468.51868   107.94177     103365.6561   69.47736     4.814754
> microMaskLaneIsSetDouble128_var  ops/ms  77489.32791   153.242699    413499.4127   311.854079   5.336211
> microMaskLaneIsSetFloat128_var   ops/ms  41034.95204   399.421823    206840.0988   74.702234    5.040583
> microMaskLaneIsSetFloat64_var    ops/ms  77607.40268   175.938921    413745.3001   149.716794   5.33126
> microMaskLaneIsSetInt128_var     ops/ms  41452.48893   76.143208     206845.9754   59.371129    4.989953
> microMaskLaneIsSetInt64_var      ops/ms  77726.2542    173.180518    413427.8838   363.575023   5.319024
> microMaskLaneIsSetLong128_var    ops/ms  77646.11218   177.496587    413403.4404   236.609314   5.3242
> microMaskLaneIsSetShort128_var   ops/ms  21374.93265   48.13101      103417.4618   34.827021    4.838259
> microMaskLaneIsSetShort64_var    ops/ms  41066.19395   353.320621    206801.109    106.408938   5.035799
> 
> 
> Benchmarks on Intel 6444y machine with 512-bit avx3:
> 
> Benchmark                        Unit    Before Score  Before Error  After Score   After Error  Uplift
> microMaskLaneIsSetByte128_var    ops/ms  57658.45497   240.209309    211643.8406   29.214532    3.670647
> microMaskLaneIsSetByte256_var    ops/ms  57451.68169   116.994128    211609.4652   160.48513    3.683259
> microMaskLaneIsSetByte512_var    ops/ms  57530.22411   311.63868     199802.8084   408.144015   3.473005
> microMaskLaneIsSetByte64_var     ops/ms  57642.2672    161.406221    205252.4464   196.86852    3.560797
> microMaskLaneIsSetDouble256_var  ops/ms  114401.3789   231.797375    361400.344    565.593984   3.159055
> microMaskLaneIsSetDouble512_var  ops/ms  57379.27882   159.699503    211476.1138   136.980026   3.685583
> microMaskLaneIsSetFloat128_var   ops/ms  113943.9512   141.062663    360855.3915   494.471996   3.166955
> microMaskLaneIsSetFloat256_var   ops/ms  57682.78182   138.142053    211659.5098   30.167972    3.66937
> microMaskLaneIsSetFloat512_var   ops/ms  57617.66405   301.748599    211246.8588   597.18949    3.666355
> microMaskLaneIsSetInt128_var     ops/ms  113914.5062   118.681382    360856.4465   555.097397   3.167783
> microMaskLaneIsSetInt256_var     ops/ms  57681.79883   112.391639    211555.6742   217.556981   3.667633
> microMaskLaneIsSetInt512_var     ops/ms  57350.20346   206.146723    211657.7207   68.461571    3.690618
> microMaskLane...

This looks fine to me. I took another look at 
[JDK-8358749](https://bugs.openjdk.org/browse/JDK-8358749), and I think this is 
the only place where we can really accept the non-constant input. In all other 
cases, we either pull `is_con()` or `const_oop()` out of the input.
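
To make the "non-constant input" case concrete, here is a minimal plain-Java sketch of what a variable-index lane test computes. `MaskModel` is a hypothetical stand-in invented for illustration; it is not the JDK's `VectorMask` and not the HotSpot intrinsic. It only assumes the documented semantics: lane 0 maps to the least significant bit of the packed value, and an out-of-range index throws.

```java
import java.util.Objects;

/** Hypothetical plain-Java model of a vector mask (illustration only). */
final class MaskModel {
    private final boolean[] bits;

    MaskModel(boolean... bits) {
        // Assumption of this sketch: at most 64 lanes, so they fit in one long.
        if (bits.length > 64) {
            throw new IllegalArgumentException("at most 64 lanes supported");
        }
        this.bits = bits.clone();
    }

    int length() {
        return bits.length;
    }

    /** Packs the lanes into a long, with lane 0 in the least significant bit. */
    long toLong() {
        long packed = 0L;
        for (int i = 0; i < bits.length; i++) {
            if (bits[i]) {
                packed |= 1L << i;
            }
        }
        return packed;
    }

    /** Tests lane i, where i may be any runtime value in [0, length()). */
    boolean laneIsSet(int i) {
        Objects.checkIndex(i, bits.length);
        return ((toLong() >>> i) & 1L) != 0;
    }
}
```

Calling `laneIsSet(i)` from a loop such as `for (int i = 0; i < m.length(); i++)` passes an index that is not a compile-time constant, which is the shape the `_var` benchmarks above exercise.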

I think we will bikeshed about the tests a bit.

test/micro/org/openjdk/bench/jdk/incubator/vector/VectorExtractBenchmark.java line 34:

> 32: @Warmup(iterations = 5, time = 1)
> 33: @Measurement(iterations = 5, time = 1)
> 34: @Fork(value = 1, jvmArgs = {"--add-modules=jdk.incubator.vector"})

Don't use a single fork; use at least 3.

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27113#pullrequestreview-3188769547
PR Review Comment: https://git.openjdk.org/jdk/pull/27113#discussion_r2324679427
