On Fri, 9 May 2025 15:39:35 GMT, Andrew Haley <[email protected]> wrote:
>> This intrinsic is generally faster than the current implementation for
>> Panama segment operations for all writes larger than about 8 bytes in size,
>> increasing to more than 2* the performance on larger memory blocks on
>> Graviton 2, between "panama" (C2 generated, what we use now) and "unsafe"
>> (this intrinsic).
>>
>>
>> Benchmark                       (aligned)  (size)  Mode  Cnt     Score    Error  Units
>> MemorySegmentFillUnsafe.panama       true  262143  avgt   10  7295.638 ±  0.422  ns/op
>> MemorySegmentFillUnsafe.panama      false  262143  avgt   10  8345.300 ± 80.161  ns/op
>> MemorySegmentFillUnsafe.unsafe       true  262143  avgt   10  2930.594 ±  0.180  ns/op
>> MemorySegmentFillUnsafe.unsafe      false  262143  avgt   10  3136.828 ±  0.232  ns/op
>
> Andrew Haley has updated the pull request incrementally with one additional
> commit since the last revision:
>
> generate_unsafecopy_common_error_exit

Looking at the improvements made, I suggest we also change (in
`SegmentBulkOperations`):

    private static final int NATIVE_THRESHOLD_FILL = powerOfPropertyOr("fill",
            Architecture.isAARCH64() ? 18 : 5);

to

    private static final int NATIVE_THRESHOLD_FILL = powerOfPropertyOr("fill", 5);

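
For context, here is a minimal sketch of what the two values would mean in bytes. It assumes that `powerOfPropertyOr` treats its second argument as a power-of-two exponent (as the name suggests) and that fills at or above the resulting threshold take the native path. The class and method names below are hypothetical and are not taken from the JDK sources.

```java
// Hypothetical sketch; these names do not come from the JDK sources.
final class FillThresholdSketch {

    // Assumption: the exponent is clamped to a safe shift range and
    // converted into a byte count with a left shift.
    static int thresholdBytes(int power) {
        int clamped = Math.max(0, Math.min(power, Integer.SIZE - 2));
        return 1 << clamped;
    }

    // Assumption: sizes below the threshold stay on the Java path,
    // sizes at or above it are handed to the native/intrinsic fill.
    static boolean usesNativeFill(long sizeBytes, int power) {
        return sizeBytes >= thresholdBytes(power);
    }

    public static void main(String[] args) {
        long benchmarkSize = 262_143; // size used in the quoted benchmark
        for (int power : new int[] {18, 5}) {
            System.out.println("power=" + power
                    + " threshold=" + thresholdBytes(power) + " bytes"
                    + " nativeFill(" + benchmarkSize + ")="
                    + usesNativeFill(benchmarkSize, power));
        }
    }
}
```

Under these assumptions, a power of 18 gives a threshold of 262144 bytes, so the 262143-byte benchmark size sits one byte below it and would stay on the Java path. A power of 5 gives 32 bytes, so the same fill would go to the intrinsic.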
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25147#issuecomment-2871092439