> The goal of this PR is to fix the performance regression in Arrays.fill() x86
> stubs caused by masked AVX stores. The fix is to replace the masked AVX
> stores with store instructions without masks (i.e. unmasked stores).
> `fill32_masked()` and `fill64_masked()` stubs are replaced with
> `fill32_unmasked()` and `fill64_unmasked()` respectively.
>
> To speed up unmasked stores, array fills for sizes < 64 bytes are broken down
> into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size.
>
>
> ### **Performance comparison for byte array fills repeated 1 million times in a loop**
>
>
> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | This PR: +OptimizeFill (Unmasked store stub) [secs]
> -- | -- | -- | --
> 1 | 0.46 | 0.14 | 0.189
> 2 | 0.46 | 0.16 | 0.191
> 3 | 0.46 | 0.176 | 0.199
> 4 | 0.46 | 0.244 | 0.212
> 5 | 0.46 | 0.29 | 0.364
> 10 | 0.46 | 0.58 | 0.354
> 15 | 0.46 | 0.42 | 0.325
> 16 | 0.46 | 0.46 | 0.281
> 17 | 0.21 | 0.5 | 0.365
> 20 | 0.21 | 0.37 | 0.326
> 25 | 0.21 | 0.59 | 0.343
> 31 | 0.21 | 0.53 | 0.317
> 32 | 0.21 | 0.58 | 0.249
> 35 | 0.5 | 0.77 | 0.303
> 40 | 0.5 | 0.61 | 0.312
> 45 | 0.5 | 0.52 | 0.364
> 48 | 0.5 | 0.66 | 0.283
> 49 | 0.22 | 0.69 | 0.367
> 50 | 0.22 | 0.78 | 0.344
> 55 | 0.22 | 0.67 | 0.332
> 60 | 0.22 | 0.67 | 0.312
> 64 | 0.22 | 0.82 | 0.253
> 70 | 0.51 | 1.1 | 0.394
> 80 | 0.49 | 0.89 | 0.346
> 90 | 0.225 | 0.68 | 0.385
> 100 | 0.54 | 1.09 | 0.364
> 110 | 0.6 | 0.98 | 0.416
> 120 | 0.26 | 0.75 | 0.367
> 128 | 0.266 | 1.1 | 0.342
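
The size breakdown described in the quoted PR text (fills under 64 bytes split into 32B, 16B, 8B, 4B, 2B and 1B stores) can be illustrated in plain Java. This is only a sketch under stated assumptions: the class `FillChunks` and its methods are hypothetical names, and the greedy widest-first order is assumed rather than taken from the patch. The actual change lives in the x86 stub generator and emits store instructions, not Java loops.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not code from the PR.
class FillChunks {
    // Store widths in bytes, widest first, as listed in the PR description.
    private static final int[] WIDTHS = {32, 16, 8, 4, 2, 1};

    // Returns the store widths a greedy decomposition would use
    // for a fill of `size` bytes, where 0 <= size < 64.
    public static List<Integer> chunkSizes(int size) {
        if (size < 0 || size >= 64) {
            throw new IllegalArgumentException("size must be in [0, 64): " + size);
        }
        List<Integer> chunks = new ArrayList<>();
        int remaining = size;
        for (int width : WIDTHS) {
            // Because remaining < 2 * width at every step, each width is used at most once.
            if (remaining >= width) {
                chunks.add(width);
                remaining -= width;
            }
        }
        return chunks;
    }

    // Fills the whole array one chunk at a time; each ranged Arrays.fill call
    // stands in for a single unmasked store of that width.
    public static void fill(byte[] a, byte value) {
        int offset = 0;
        for (int width : chunkSizes(a.length)) {
            Arrays.fill(a, offset, offset + width, value);
            offset += width;
        }
    }

    public static void main(String[] args) {
        System.out.println(chunkSizes(45)); // prints [32, 8, 4, 1]
        byte[] a = new byte[45];
        fill(a, (byte) 7);
        System.out.println(a[0] + " " + a[44]); // prints 7 7
    }
}
```

Under this assumed scheme a fill of fewer than 64 bytes needs at most six stores (63 = 32 + 16 + 8 + 4 + 2 + 1), and none of them requires a mask register.
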
Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:

  Update ALL of ArraysFill JMH micro

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28442/files
  - new: https://git.openjdk.org/jdk/pull/28442/files/5edff7f7..620ae44e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=11-12

  Stats: 8 lines in 1 file changed: 4 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/28442.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442

PR: https://git.openjdk.org/jdk/pull/28442
