> The goal of this PR is to fix the performance regression in Arrays.fill() x86 
> stubs caused by masked AVX stores. The fix is to replace the masked AVX 
> stores with store instructions without masks (i.e. unmasked stores). 
> `fill32_masked()` and `fill64_masked()` stubs are replaced with 
> `fill32_unmasked()` and `fill64_unmasked()` respectively.
> 
> To speed up unmasked stores, array fills for sizes < 64 bytes are broken down 
> into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size.
> 
> 
> ### **Performance comparison for byte array fills repeated 1 million times in a loop**
> 
> 
> ByteArray size (UseAVX=3) | +OptimizeFill (masked store stub) [secs] | -OptimizeFill (no stub) [secs] | This PR: +OptimizeFill (unmasked store stub) [secs]
> -- | -- | -- | --
> 1 | 0.46 | 0.14 | 0.189
> 2 | 0.46 | 0.16 | 0.191
> 3 | 0.46 | 0.176 | 0.199
> 4 | 0.46 | 0.244 | 0.212
> 5 | 0.46 | 0.29 | 0.364
> 10 | 0.46 | 0.58 | 0.354
> 15 | 0.46 | 0.42 | 0.325
> 16 | 0.46 | 0.46 | 0.281
> 17 | 0.21 | 0.5 | 0.365
> 20 | 0.21 | 0.37 | 0.326
> 25 | 0.21 | 0.59 | 0.343
> 31 | 0.21 | 0.53 | 0.317
> 32 | 0.21 | 0.58 | 0.249
> 35 | 0.5 | 0.77 | 0.303
> 40 | 0.5 | 0.61 | 0.312
> 45 | 0.5 | 0.52 | 0.364
> 48 | 0.5 | 0.66 | 0.283
> 49 | 0.22 | 0.69 | 0.367
> 50 | 0.22 | 0.78 | 0.344
> 55 | 0.22 | 0.67 | 0.332
> 60 | 0.22 | 0.67 | 0.312
> 64 | 0.22 | 0.82 | 0.253
> 70 | 0.51 | 1.1 | 0.394
> 80 | 0.49 | 0.89 | 0.346
> 90 | 0.225 | 0.68 | 0.385
> 100 | 0.54 | 1.09 | 0.364
> 110 | 0.6 | 0.98 | 0.416
> 120 | 0.26 | 0.75 | 0.367
> 128 | 0.266 | 1.1 | 0.342
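
The size-based breakdown described in the quoted PR text (fills under 64 bytes split into 32B, 16B, 8B, 4B, 2B and 1B stores) can be modeled in plain Java as below. This is only an illustrative sketch: the class and method names (`FillTailSketch`, `storeSequence`, `fill`) are hypothetical, the greedy widest-first ordering is an assumption, and the actual stubs are emitted as x86 assembly by the HotSpot stub generator and may order or overlap stores differently.

```java
import java.util.ArrayList;
import java.util.List;

public class FillTailSketch {
    // Store widths named in the PR description, widest first.
    static final int[] WIDTHS = {32, 16, 8, 4, 2, 1};

    // Returns the store widths (in bytes) that a fill of `size` bytes
    // decomposes into under this model. Assumes 0 <= size < 64.
    static List<Integer> storeSequence(int size) {
        if (size < 0 || size >= 64) {
            throw new IllegalArgumentException("size must be in [0, 64)");
        }
        List<Integer> seq = new ArrayList<>();
        int remaining = size;
        for (int w : WIDTHS) {
            if (remaining >= w) {   // each width is needed at most once
                seq.add(w);
                remaining -= w;
            }
        }
        return seq;
    }

    // Fills a[offset .. offset+size) with `value`, one modeled store at a time.
    static void fill(byte[] a, int offset, int size, byte value) {
        int pos = offset;
        for (int w : storeSequence(size)) {
            // Stands in for a single unmasked w-byte store.
            for (int i = 0; i < w; i++) {
                a[pos + i] = value;
            }
            pos += w;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeSequence(45)); // prints [32, 8, 4, 1]
    }
}
```

Because every size below 64 has a unique binary decomposition, each store width is used at most once, so a fill in this model needs no mask and at most six stores.
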

Srinivas Vamsi Parasa has updated the pull request incrementally with one 
additional commit since the last revision:

  Update ALL of ArraysFill JMH micro

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28442/files
  - new: https://git.openjdk.org/jdk/pull/28442/files/5edff7f7..620ae44e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=11-12

  Stats: 8 lines in 1 file changed: 4 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/28442.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442

PR: https://git.openjdk.org/jdk/pull/28442
