On Tue, 25 Feb 2025 09:51:08 GMT, Maurizio Cimadamore <mcimadam...@openjdk.org> 
wrote:

>> Benchmark results for the latest revision appears performance neutral. 
>> bytestacks same as last revision. 11 jobs left in tier 1-3, no failure so 
>> far. Also created CSR for this minor behavioral change.
>> 
>> Benchmark                                                (polluteProfile)  
>> Mode  Cnt  Score   Error  Units
>> LoopOverNonConstantHeap.BB_get                                      false  
>> avgt   30  0.934 ± 0.027  ns/op
>> LoopOverNonConstantHeap.BB_get                                       true  
>> avgt   30  0.946 ± 0.028  ns/op
>> LoopOverNonConstantHeap.BB_loop                                     false  
>> avgt   30  0.208 ± 0.004  ms/op
>> LoopOverNonConstantHeap.BB_loop                                      true  
>> avgt   30  0.211 ± 0.003  ms/op
>> LoopOverNonConstantHeap.segment_get                                 false  
>> avgt   30  1.123 ± 0.040  ns/op
>> LoopOverNonConstantHeap.segment_get                                  true  
>> avgt   30  1.120 ± 0.040  ns/op
>> LoopOverNonConstantHeap.segment_loop                                false  
>> avgt   30  0.205 ± 0.004  ms/op
>> LoopOverNonConstantHeap.segment_loop                                 true  
>> avgt   30  0.202 ± 0.003  ms/op
>> LoopOverNonConstantHeap.segment_loop_instance                       false  
>> avgt   30  0.209 ± 0.005  ms/op
>> LoopOverNonConstantHeap.segment_loop_instance                        true  
>> avgt   30  0.202 ± 0.003  ms/op
>> LoopOverNonConstantHeap.segment_loop_instance_unaligned             false  
>> avgt   30  0.209 ± 0.004  ms/op
>> LoopOverNonConstantHeap.segment_loop_instance_unaligned              true  
>> avgt   30  0.210 ± 0.004  ms/op
>> LoopOverNonConstantHeap.segment_loop_readonly                       false  
>> avgt   30  0.206 ± 0.004  ms/op
>> LoopOverNonConstantHeap.segment_loop_readonly                        true  
>> avgt   30  0.206 ± 0.005  ms/op
>> LoopOverNonConstantHeap.segment_loop_slice                          false  
>> avgt   30  0.203 ± 0.002  ms/op
>> LoopOverNonConstantHeap.segment_loop_slice                           true  
>> avgt   30  0.207 ± 0.004  ms/op
>> LoopOverNonConstantHeap.segment_loop_unaligned                      false  
>> avgt   30  0.206 ± 0.004  ms/op
>> LoopOverNonConstantHeap.segment_loop_unaligned                       true  
>> avgt   30  0.209 ± 0.003  ms/op
>> LoopOverNonConstantHeap.unsafe_get                                  false  
>> avgt   30  0.386 ± 0.017  ns/op
>> LoopOverNonConstantHeap.unsafe_get                                   true  
>> avgt   30  0.381 ± 0.017  ns/op
>> Loo...
>
> Great work @liach -- I've bumped the number of reviewers as this whole area 
> is rather tricky, and I think the PR can probably benefit from another 
> independent pass.

> Unfortunately no. I don't really know about the Foreign Memory API as I only 
> started to investigate its performance bottlenecks recently. @mcimadamore or 
> @JornVernee were more involved in the development and likely know better.

After a chat with @PaulSandoz , it seems like the exclusion was not a 
deliberate one. Probably, sub-int atomic unsafe primitives were added in 
parallel with the var handle work:

https://github.com/openjdk/jdk/commit/e663206d

The above commit did add support for sub-word atomics, but only in the field 
access case. The BB view/array view var handles were left out. Then came FFM, 
whose implementation was originally based off the BB view var handle -- so we 
ended up inheriting same limitations.

It would be good to address this -- in a separate PR. As for this PR, as we 
discussed offline, I'd suggest to keep the ScopedMemoryAccess changes (even if 
some methods are now unused) and drop the commented lines from the var handle 
gensrc file. Then we can come back with a new PR and add support for missing 
access modes.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23720#issuecomment-2690284867

Reply via email to