On Fri, 8 Aug 2025 13:25:27 GMT, Per Minborg <pminb...@openjdk.org> wrote:

>> This PR proposes to use overlapping memory areas in
>> `SegmentBulkOperations::copy`, similar to what is proposed for
>> `SegmentBulkOperations::fill` in https://github.com/openjdk/jdk/pull/25383.
>>
>> This PR passes `tier1`, `tier2`, and `tier3` testing on multiple platforms.
>
> Per Minborg has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Update copyright year

src/java.base/share/classes/jdk/internal/foreign/AbstractMemorySegmentImpl.java line 257:

> 255:     // Returns 1 if the regions overlap, otherwise 0.
> 256:     @ForceInline
> 257:     int overlaps(AbstractMemorySegmentImpl that) {

This modification provides a 10% total (i.e., for copy) performance
improvement across all platforms for some likely common copy sizes, such as
16 and 24.

Branching can be an expensive operation because it can disrupt the execution
pipeline, and this change avoids some of it. On the flip side, both
expressions are now always evaluated (in their branchless form).

All things considered, this seems to be a performance win at the expense of
more complex code, so there is a tradeoff to consider. I've tried to mitigate
the added complexity with comments in the code.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26672#discussion_r2265793851
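
For readers unfamiliar with the idea, here is a minimal sketch of how an overlap check can return 1 or 0 without branching. It is not the code from the PR: the class `BranchlessOverlap`, the method names, and the `(start, length)` parameters are hypothetical, and the sketch assumes non-negative values whose sums do not overflow a `long`.

```java
// Hypothetical illustration only; not the implementation in AbstractMemorySegmentImpl.
final class BranchlessOverlap {

    // Returns 1 if [aStart, aStart + aLen) and [bStart, bStart + bLen) overlap, otherwise 0.
    // Assumes all arguments are non-negative and the sums fit in a long.
    static int overlaps(long aStart, long aLen, long bStart, long bLen) {
        long aEnd = aStart + aLen;
        long bEnd = bStart + bLen;
        // (aStart - bEnd) is negative iff aStart < bEnd, and
        // (bStart - aEnd) is negative iff bStart < aEnd.
        // The sign bit of the bitwise AND is set only if both differences are negative,
        // so both conditions are always evaluated and no conditional jump is needed.
        return (int) (((aStart - bEnd) & (bStart - aEnd)) >>> 63);
    }

    // Conventional branching form of the same predicate, for comparison.
    static int overlapsBranching(long aStart, long aLen, long bStart, long bLen) {
        return (aStart < bStart + bLen && bStart < aStart + aLen) ? 1 : 0;
    }
}
```

Both methods compute the same predicate. The branchless form always does both subtractions, which mirrors the tradeoff described in the comment: fewer branches in exchange for always evaluating both expressions and for code that is harder to read.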