Re: RFR: 8286279: [vectorapi] Only check index of masked lanes if offset is out of array boundary for masked store [v3]

2022-06-06 Thread Paul Sandoz
On Thu, 2 Jun 2022 01:29:54 GMT, Xiaohong Gong  wrote:

>> Checking whether the indexes of masked lanes are inside of the valid memory 
>> boundary is necessary for masked vector memory access. However, this could 
>> be saved if the given offset is inside of the vector range that could make 
>> sure no IOOBE (IndexOutOfBoundaryException) happens. The masked load APIs 
>> have saved this kind of check for common cases. And this patch did the 
>> similar optimization for the masked vector store.
>> 
>> The performance for the new added store masked benchmarks improves about 
>> `1.83x ~ 2.62x` on a x86 system:
>> 
>> Benchmark   BeforeAfter Gain Units
>> StoreMaskedBenchmark.byteStoreArrayMask   12757.936 23291.118  1.826 ops/ms
>> StoreMaskedBenchmark.doubleStoreArrayMask  1520.932  3921.616  2.578 ops/ms
>> StoreMaskedBenchmark.floatStoreArrayMask   2713.031  7122.535  2.625 ops/ms
>> StoreMaskedBenchmark.intStoreArrayMask 4113.772  8220.206  1.998 ops/ms
>> StoreMaskedBenchmark.longStoreArrayMask1993.986  4874.148  2.444 ops/ms
>> StoreMaskedBenchmark.shortStoreArrayMask   8543.593 17821.086  2.086 ops/ms
>> 
>> Similar performane gain can also be observed on ARM hardware.
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains three commits:
> 
>  - Merge branch 'jdk:master' to JDK-8286279
>  - Wrap the offset check into a static method
>  - 8286279: [vectorapi] Only check index of masked lanes if offset is out of 
> array boundary for masked store

Marked as reviewed by psandoz (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/8620


Re: RFR: 8286279: [vectorapi] Only check index of masked lanes if offset is out of array boundary for masked store [v3]

2022-06-01 Thread Xiaohong Gong
> Checking whether the indexes of masked lanes are inside of the valid memory 
> boundary is necessary for masked vector memory access. However, this could be 
> saved if the given offset is inside of the vector range that could make sure 
> no IOOBE (IndexOutOfBoundaryException) happens. The masked load APIs have 
> saved this kind of check for common cases. And this patch did the similar 
> optimization for the masked vector store.
> 
> The performance for the new added store masked benchmarks improves about 
> `1.83x ~ 2.62x` on a x86 system:
> 
> Benchmark   BeforeAfter Gain Units
> StoreMaskedBenchmark.byteStoreArrayMask   12757.936 23291.118  1.826 ops/ms
> StoreMaskedBenchmark.doubleStoreArrayMask  1520.932  3921.616  2.578 ops/ms
> StoreMaskedBenchmark.floatStoreArrayMask   2713.031  7122.535  2.625 ops/ms
> StoreMaskedBenchmark.intStoreArrayMask 4113.772  8220.206  1.998 ops/ms
> StoreMaskedBenchmark.longStoreArrayMask1993.986  4874.148  2.444 ops/ms
> StoreMaskedBenchmark.shortStoreArrayMask   8543.593 17821.086  2.086 ops/ms
> 
> Similar performane gain can also be observed on ARM hardware.

Xiaohong Gong has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains three commits:

 - Merge branch 'jdk:master' to JDK-8286279
 - Wrap the offset check into a static method
 - 8286279: [vectorapi] Only check index of masked lanes if offset is out of 
array boundary for masked store

-

Changes: https://git.openjdk.java.net/jdk/pull/8620/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8620&range=02
  Stats: 216 lines in 9 files changed: 179 ins; 0 del; 37 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8620.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8620/head:pull/8620

PR: https://git.openjdk.java.net/jdk/pull/8620