On Fri, 9 May 2025 14:11:27 GMT, Andrew Haley <a...@openjdk.org> wrote:

> This intrinsic is generally faster than the current implementation for Panama
> segment operations for all writes larger than about 8 bytes in size,
> increasing to more than 2* the performance on larger memory blocks on
> Graviton 2, between "panama" (C2 generated, what we use now) and "unsafe"
> (this intrinsic).
>
> Benchmark                       (aligned)  (size)  Mode  Cnt     Score    Error  Units
> MemorySegmentFillUnsafe.panama       true  262143  avgt   10  7295.638 ±  0.422  ns/op
> MemorySegmentFillUnsafe.panama      false  262143  avgt   10  8345.300 ± 80.161  ns/op
> MemorySegmentFillUnsafe.unsafe       true  262143  avgt   10  2930.594 ±  0.180  ns/op
> MemorySegmentFillUnsafe.unsafe      false  262143  avgt   10  3136.828 ±  0.232  ns/op

Apple M1:

Benchmark                              (ELEM_SIZE)  Mode  Cnt        Score       Error  Units
SegmentBulkFill.heapSegmentFillJava              2  avgt   10        1.727 ±     0.017  ns/op
SegmentBulkFill.heapSegmentFillJava              3  avgt   10        1.721 ±     0.002  ns/op
SegmentBulkFill.heapSegmentFillJava              4  avgt   10        1.876 ±     0.002  ns/op
SegmentBulkFill.heapSegmentFillJava              5  avgt   10        1.876 ±     0.001  ns/op
SegmentBulkFill.heapSegmentFillJava              6  avgt   10        1.876 ±     0.002  ns/op
SegmentBulkFill.heapSegmentFillJava              7  avgt   10        1.876 ±     0.002  ns/op
SegmentBulkFill.heapSegmentFillJava              8  avgt   10        2.502 ±     0.003  ns/op
SegmentBulkFill.heapSegmentFillJava             64  avgt   10        4.064 ±     0.002  ns/op
SegmentBulkFill.heapSegmentFillJava            512  avgt   10        6.601 ±     0.051  ns/op
SegmentBulkFill.heapSegmentFillJava           4096  avgt   10       44.050 ±     0.076  ns/op
SegmentBulkFill.heapSegmentFillJava          32768  avgt   10      330.328 ±     0.450  ns/op
SegmentBulkFill.heapSegmentFillJava         262144  avgt   10     4138.154 ±     6.509  ns/op
SegmentBulkFill.heapSegmentFillJava        2097152  avgt   10    33089.966 ±    48.068  ns/op
SegmentBulkFill.heapSegmentFillJava       16777216  avgt   10   352669.548 ±   571.433  ns/op
SegmentBulkFill.heapSegmentFillJava      134217728  avgt   10  4482510.192 ±  7177.637  ns/op
SegmentBulkFill.heapSegmentFillLoop              2  avgt   10        1.977 ±     0.003  ns/op
SegmentBulkFill.heapSegmentFillLoop              3  avgt   10        3.447 ±     0.002  ns/op
SegmentBulkFill.heapSegmentFillLoop              4  avgt   10        4.073 ±     0.042  ns/op
SegmentBulkFill.heapSegmentFillLoop              5  avgt   10        4.377 ±     0.004  ns/op
SegmentBulkFill.heapSegmentFillLoop              6  avgt   10        5.337 ±     0.071  ns/op
SegmentBulkFill.heapSegmentFillLoop              7  avgt   10        5.629 ±     0.004  ns/op
SegmentBulkFill.heapSegmentFillLoop              8  avgt   10        5.947 ±     0.010  ns/op
SegmentBulkFill.heapSegmentFillLoop             64  avgt   10        8.127 ±     0.003  ns/op
SegmentBulkFill.heapSegmentFillLoop            512  avgt   10       16.045 ±     0.027  ns/op
SegmentBulkFill.heapSegmentFillLoop           4096  avgt   10       46.627 ±     0.164  ns/op
SegmentBulkFill.heapSegmentFillLoop          32768  avgt   10      333.233 ±     1.040  ns/op
SegmentBulkFill.heapSegmentFillLoop         262144  avgt   10     4134.009 ±    11.125  ns/op
SegmentBulkFill.heapSegmentFillLoop        2097152  avgt   10    33148.671 ±   322.905  ns/op
SegmentBulkFill.heapSegmentFillLoop       16777216  avgt   10   343832.913 ±   233.881  ns/op
SegmentBulkFill.heapSegmentFillLoop      134217728  avgt   10  4475821.911 ±  6101.380  ns/op
SegmentBulkFill.heapSegmentFillUnsafe            2  avgt   10        3.133 ±     0.034  ns/op
SegmentBulkFill.heapSegmentFillUnsafe            3  avgt   10        3.130 ±     0.005  ns/op
SegmentBulkFill.heapSegmentFillUnsafe            4  avgt   10        3.128 ±     0.004  ns/op
SegmentBulkFill.heapSegmentFillUnsafe            5  avgt   10        3.139 ±     0.030  ns/op
SegmentBulkFill.heapSegmentFillUnsafe            6  avgt   10        3.135 ±     0.035  ns/op
SegmentBulkFill.heapSegmentFillUnsafe            7  avgt   10        3.135 ±     0.030  ns/op
SegmentBulkFill.heapSegmentFillUnsafe            8  avgt   10        2.665 ±     0.006  ns/op
SegmentBulkFill.heapSegmentFillUnsafe           64  avgt   10        2.841 ±     0.032  ns/op
SegmentBulkFill.heapSegmentFillUnsafe          512  avgt   10        6.246 ±     0.100  ns/op
SegmentBulkFill.heapSegmentFillUnsafe         4096  avgt   10       41.241 ±     0.107  ns/op
SegmentBulkFill.heapSegmentFillUnsafe        32768  avgt   10      331.001 ±     4.521  ns/op
SegmentBulkFill.heapSegmentFillUnsafe       262144  avgt   10     3038.808 ±    29.750  ns/op
SegmentBulkFill.heapSegmentFillUnsafe      2097152  avgt   10    21996.375 ±  2617.947  ns/op
SegmentBulkFill.heapSegmentFillUnsafe     16777216  avgt   10   241814.864 ± 24300.854  ns/op
SegmentBulkFill.heapSegmentFillUnsafe    134217728  avgt   10  2811655.392 ± 24737.911  ns/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25147#issuecomment-2866961810