On Mon, 12 May 2025 09:37:24 GMT, Andrew Haley <a...@openjdk.org> wrote:

>> This intrinsic is generally faster than the current implementation for 
>> Panama segment operations for all writes larger than about 8 bytes in size, 
>> increasing to more than 2* the performance on larger memory blocks on 
>> Graviton 2, between "panama" (C2 generated, what we use now) and "unsafe" 
>> (this intrinsic).
>> 
>> 
>> Benchmark                       (aligned)  (size)  Mode  Cnt     Score    Error  Units
>> MemorySegmentFillUnsafe.panama       true  262143  avgt   10  7295.638 ±  0.422  ns/op
>> MemorySegmentFillUnsafe.panama      false  262143  avgt   10  8345.300 ± 80.161  ns/op
>> MemorySegmentFillUnsafe.unsafe       true  262143  avgt   10  2930.594 ±  0.180  ns/op
>> MemorySegmentFillUnsafe.unsafe      false  262143  avgt   10  3136.828 ±  0.232  ns/op
>
> Andrew Haley has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Stub stack frame

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 2619:

> 2617: 
> 2618:     __ bind(tail);
> 2619:     // __ add(count, count, 64);

I can see why you commented this out (and prefer that to deleting it). However, 
a comment explaining why it is not needed might avoid maintainers being 
side-tracked.
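
As an aside, the "more than 2*" claim can be checked against the quoted benchmark figures. The following is only a quick sketch, not part of the PR: the names `scores` and `speedup` are illustrative, and the numbers are copied from the JMH table quoted above (size 262143, ns/op).

```python
# Average time per fill in ns/op, copied from the quoted JMH table (size = 262143).
# Keys are (benchmark name, aligned flag); these names are illustrative only.
scores = {
    ("panama", True): 7295.638,
    ("panama", False): 8345.300,
    ("unsafe", True): 2930.594,
    ("unsafe", False): 3136.828,
}


def speedup(aligned: bool) -> float:
    """Ratio of panama time to unsafe time; values above 1 favour the intrinsic."""
    return scores[("panama", aligned)] / scores[("unsafe", aligned)]


for aligned in (True, False):
    print(f"aligned={aligned}: {speedup(aligned):.2f}x")
```

On these figures the intrinsic comes out roughly 2.5x faster for aligned fills and a little more than that for unaligned ones, which is consistent with the description.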

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25147#discussion_r2084293642
