On Fri, 9 May 2025 15:39:35 GMT, Andrew Haley <a...@openjdk.org> wrote:
>> This intrinsic is generally faster than the current implementation for
>> Panama segment operations for all writes larger than about 8 bytes in size,
>> increasing to more than 2* the performance on larger memory blocks on
>> Graviton 2, between "panama" (C2 generated, what we use now) and "unsafe"
>> (this intrinsic).
>>
>>
>> Benchmark                       (aligned)  (size)  Mode  Cnt     Score     Error  Units
>> MemorySegmentFillUnsafe.panama       true  262143  avgt   10  7295.638 ±   0.422  ns/op
>> MemorySegmentFillUnsafe.panama      false  262143  avgt   10  8345.300 ±  80.161  ns/op
>> MemorySegmentFillUnsafe.unsafe       true  262143  avgt   10  2930.594 ±   0.180  ns/op
>> MemorySegmentFillUnsafe.unsafe      false  262143  avgt   10  3136.828 ±   0.232  ns/op
>
> Andrew Haley has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   generate_unsafecopy_common_error_exit

Looking at the improvements made, I suggest we also change (in `SegmentBulkOperations`):

    private static final int NATIVE_THRESHOLD_FILL = powerOfPropertyOr("fill", Architecture.isAARCH64() ? 18 : 5);

to

    private static final int NATIVE_THRESHOLD_FILL = powerOfPropertyOr("fill", 5);

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25147#issuecomment-2871092439
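
For readers unfamiliar with the threshold, here is a minimal sketch of what the two defaults would mean in bytes. It assumes the second argument is an exponent that is turned into a power of two, as the helper's name suggests. The class name, the property prefix `sketch.segment.`, and the helper body are hypothetical stand-ins, not the actual JDK-internal code.

```java
public class FillThresholdSketch {

    // Hypothetical stand-in for the JDK-internal helper: reads an integer
    // exponent from a system property and returns 2^exponent, falling back
    // to defaultPower when the property is not set.
    static int powerOfPropertyOr(String name, int defaultPower) {
        // "sketch.segment." is an invented prefix used only in this sketch.
        String value = System.getProperty("sketch.segment." + name);
        int power = (value != null) ? Integer.parseInt(value) : defaultPower;
        return 1 << power;
    }

    public static void main(String[] args) {
        // Default quoted for AArch64 in the comment above (exponent 18).
        int currentAarch64 = powerOfPropertyOr("fill", 18);
        // Proposed default for all architectures (exponent 5).
        int proposed = powerOfPropertyOr("fill", 5);

        System.out.println("AArch64 default threshold: " + currentAarch64 + " bytes");
        System.out.println("Proposed threshold:        " + proposed + " bytes");
    }
}
```

Under that assumption, and with no property override, exponent 18 corresponds to 262144 bytes and exponent 5 to 32 bytes. The benchmark size of 262143 is one byte below 2^18, which would be consistent with the "panama" runs staying on the C2-generated path, though the post itself does not state this.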