On Thu, 22 May 2025 11:52:34 GMT, Per Minborg <pminb...@openjdk.org> wrote:

>> This PR builds on a concept John Rose told me about some time ago. Instead 
>> of combining memory operations of various sizes, a single large and skewed 
>> memory operation can be made to clean up the tail of remaining bytes.
>> 
>> This has the effect of simplifying and shortening the code. The number of 
>> branches to evaluate is reduced.
>> 
>> It should be noted that the performance of the fill operation affects the 
>> allocation of new segments (as they are zeroed out before being returned to 
>> the client code).
>
> Per Minborg has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Update benchmark to reflect new fill method

Very cool!  I'm glad it worked.  This came out of some background work I was 
doing to find fast ways to feed a vectorized loop over an input measured in 
bytes (any number of them).

https://cr.openjdk.org/~jrose/jvm/PartialMemoryWord.cpp
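
For readers following along, here is a minimal sketch of the skewed-store idea 
as described in the PR summary above. It is not the code from the PR, nor the 
contents of the file linked above; the class and method names (SkewedFill, 
fill) are made up for illustration, and it works on a plain byte[] through a 
ByteBuffer rather than on a memory segment. The bulk loop writes whole 8-byte 
words, and one final 8-byte store, positioned so that it ends exactly at the 
end of the array, cleans up whatever tail remains.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only; names are not taken from the JDK or the PR.
class SkewedFill {

    /** Fills dst with value, using one overlapping 8-byte store for the tail. */
    static void fill(byte[] dst, byte value) {
        int len = dst.length;
        if (len < Long.BYTES) {
            // Too short for a single word store: fall back to a byte loop.
            for (int i = 0; i < len; i++) {
                dst[i] = value;
            }
            return;
        }
        // Broadcast the byte into all eight lanes of a long.
        long pattern = (value & 0xFFL) * 0x0101010101010101L;
        // Byte order does not matter here because every lane is identical.
        ByteBuffer buf = ByteBuffer.wrap(dst);
        // Whole words, stopping while 1..8 bytes are still unwritten.
        for (int i = 0; i < len - Long.BYTES; i += Long.BYTES) {
            buf.putLong(i, pattern);
        }
        // Skewed store: ends exactly at len and may overlap the previous word.
        buf.putLong(len - Long.BYTES, pattern);
    }

    public static void main(String[] args) {
        byte[] a = new byte[13];
        fill(a, (byte) 7);
        System.out.println(Arrays.toString(a));
    }
}
```

The final store replaces a chain of 4-, 2- and 1-byte stores with a single 
unconditional one, which is where the reduction in branches comes from. 
Rewriting a few bytes with the same value is harmless for a fill.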

The corresponding read technique also works quite well.  If you combine the 
partial overlapping reads correctly, each byte is read exactly once, which 
might be a good property for building concurrent data structures.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25383#issuecomment-2901911070
