Re: RFR: 8329077: C2 SuperWord: Add MoveD2L, MoveL2D, MoveF2I, MoveI2F

Bhavana Kilambi Fri, 01 Aug 2025 05:45:18 -0700

On Thu, 24 Jul 2025 10:29:15 GMT, Galder Zamarreño wrote:


> I've added support to vectorize `MoveD2L`, `MoveL2D`, `MoveF2I` and `MoveI2F` 
> nodes. The implementation follows a similar pattern to what is done with 
> conversion (`Conv*`) nodes. The tests in `TestCompatibleUseDefTypeSize` have 
> been updated with the new expectations.
> 
> Also added a JMH benchmark which measures throughput (the higher the number 
> the better) for methods that exercise these nodes. On darwin/aarch64 it shows:
> 
> 
> Benchmark                                (seed)  (size)   Mode  Cnt      Base      Patch   Units   Diff
> VectorBitConversion.doubleToLongBits          0    2048  thrpt    8  1168.782   1157.717  ops/ms    -1%
> VectorBitConversion.doubleToRawLongBits       0    2048  thrpt    8  3999.387   7353.936  ops/ms   +83%
> VectorBitConversion.floatToIntBits            0    2048  thrpt    8  1200.338   1188.206  ops/ms    -1%
> VectorBitConversion.floatToRawIntBits         0    2048  thrpt    8  4058.248  14792.474  ops/ms  +264%
> VectorBitConversion.intBitsToFloat            0    2048  thrpt    8  3050.313  14984.246  ops/ms  +391%
> VectorBitConversion.longBitsToDouble          0    2048  thrpt    8  3022.691   7379.360  ops/ms  +144%
> 
> 
> The improvements observed are a result of vectorization. The lack of 
> vectorization in `doubleToLongBits` and `floatToIntBits` demonstrates that 
> these changes do not affect their performance. These methods do not vectorize 
> because of flow control.
> 
> I've run the tier1-3 tests on linux/aarch64 and didn't observe any 
> regressions.

test/micro/org/openjdk/bench/java/lang/VectorBitConversion.java line 67:

> 65: 
> 66:     @Benchmark
> 67:     public long[] doubleToLongBits() {

Would something like this be more concise (and maybe more readable as well)?

@Benchmark
public long[] doubleToLongBits() {
    for (int i = 0; i < doubles.length; i++) {
        resultLongs[i] = Double.doubleToLongBits(doubles[i]);
    }
    return resultLongs;
}


The loop should still get vectorized (if vectorizable).

Same for other benchmarks.
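
For context on why the raw and non-raw variants behave differently in the numbers quoted above, here is a minimal standalone sketch (not part of the patch). It assumes the flow control mentioned in the PR description is the NaN canonicalization that `Double.doubleToLongBits` is documented to perform. The class name `BitConversionSketch` and its helper methods are hypothetical and only mirror the loop shape suggested above.

```java
// Hypothetical illustration class; not taken from the PR or the benchmark.
class BitConversionSketch {

    // Same loop shape as the suggested benchmark body, using the
    // NaN-canonicalizing conversion.
    static long[] toLongBits(double[] in, long[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Double.doubleToLongBits(in[i]);
        }
        return out;
    }

    // Same loop shape, using the raw conversion that returns the bit
    // pattern without collapsing NaN values.
    static long[] toRawLongBits(double[] in, long[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Double.doubleToRawLongBits(in[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // A quiet NaN carrying a non-zero payload bit.
        double nanWithPayload = Double.longBitsToDouble(0x7ff8000000000001L);
        double[] in = { 1.0, nanWithPayload };

        long[] canonical = toLongBits(in, new long[in.length]);
        long[] raw = toRawLongBits(in, new long[in.length]);

        // Ordinary values convert identically in both variants.
        System.out.println(Long.toHexString(canonical[0]));
        System.out.println(Long.toHexString(raw[0]));

        // Only the non-raw variant is specified to collapse every NaN
        // to the canonical pattern 0x7ff8000000000000L.
        System.out.println(Long.toHexString(canonical[1]));
        System.out.println(Long.toHexString(raw[1]));
    }
}
```

If that assumption holds, the extra NaN check in the non-raw variants is what keeps those two benchmarks flat, independent of how the benchmark loop itself is written.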

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26457#discussion_r2247880010
