Hi,

When administering my mailing lists, my attention was drawn to this pull 
request: https://github.com/openjdk/jdk/pull/28575, which tries to tackle this 
scaling problem. Although it was dismissed, I remembered that I had dealt 
with a similar problem in the past, so I took a closer look...

Here's an alternative take on the problem. It reuses a maintained public 
component of the JDK, LongAdder, so in that respect it adds no extra complexity 
or maintenance burden. It also does not change the internal API of 
MemorySessionImpl, and the patch itself is smaller.

For experimenting and benchmarking, I created a separate implementation of just 
the acquire/release/close logic, with both the existing "simple" and the new 
"striped" variants, here:

https://github.com/plevart/acquire-release-close
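
To illustrate the basic idea behind the "striped" variant: instead of a single 
contended atomic counter, acquires and releases each land on a LongAdder, whose 
internal cells spread updates across threads. The sketch below is a hypothetical 
simplification (class and method names are mine, not from the patch); in 
particular, the close handshake here is single-threaded-safe only, while a real 
implementation needs a stronger protocol before trusting the counter snapshot:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, much-simplified sketch of striped acquire/release/close.
final class StripedScope {
    private final LongAdder acquires = new LongAdder(); // striped, low-contention
    private final LongAdder releases = new LongAdder();
    private volatile boolean closed;

    boolean acquire() {
        acquires.increment();      // publish the acquire first...
        if (closed) {              // ...then re-check the closed flag
            releases.increment();  // roll back if we raced with close
            return false;
        }
        return true;
    }

    void release() {
        releases.increment();
    }

    boolean tryClose() {
        closed = true;             // stop new acquires
        // LongAdder.sum() is only an accurate total in the absence of
        // concurrent updates; a real implementation would need an extra
        // handshake (e.g. a closing state) before relying on this snapshot.
        return acquires.sum() == releases.sum();
    }
}
```

The point of the stripe is that on the acquire/release fast path threads touch 
different LongAdder cells instead of CAS-ing one shared word, which is where the 
scaling difference in the numbers below comes from.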

Running it on my 8-core (16-thread) Linux PC gives promising results, with no 
regression for single-threaded use:


** Simple, measure run #1...
concurrency: 1, nanos: 39909697 (x 1.0)
concurrency: 2, nanos: 164735444 (x 4.127704702944751)
concurrency: 4, nanos: 394283724 (x 9.87939657873123)
concurrency: 8, nanos: 672278915 (x 16.84500172978011)
concurrency: 16, nanos: 2169282886 (x 54.3547821473062)
** Simple, measure run #2...
concurrency: 1, nanos: 40318379 (x 1.0)
concurrency: 2, nanos: 163438657 (x 4.053701092496799)
concurrency: 4, nanos: 399382210 (x 9.905710991009832)
concurrency: 8, nanos: 694862623 (x 17.23438888750959)
concurrency: 16, nanos: 2182386494 (x 54.12882531810121)
** Simple, measure run #3...
concurrency: 1, nanos: 39871197 (x 1.0)
concurrency: 2, nanos: 168843686 (x 4.234728292707139)
concurrency: 4, nanos: 375489497 (x 9.417562683156966)
concurrency: 8, nanos: 675885694 (x 16.951728186138983)
concurrency: 16, nanos: 2083500812 (x 52.255787856080666)
** end.

** Striped, measure run #1...
concurrency: 1, nanos: 36698350 (x 1.0)
concurrency: 2, nanos: 47349695 (x 1.290240433152989)
concurrency: 4, nanos: 58622304 (x 1.5974098018030782)
concurrency: 8, nanos: 60548173 (x 1.6498881557345222)
concurrency: 16, nanos: 70607406 (x 1.9239940215295783)
** Striped, measure run #2...
concurrency: 1, nanos: 37217044 (x 1.0)
concurrency: 2, nanos: 38610020 (x 1.0374284427317764)
concurrency: 4, nanos: 39166893 (x 1.0523912914738742)
concurrency: 8, nanos: 51778829 (x 1.3912665659314587)
concurrency: 16, nanos: 70277394 (x 1.8883120862581133)
** Striped, measure run #3...
concurrency: 1, nanos: 37589735 (x 1.0)
concurrency: 2, nanos: 38748261 (x 1.0308202758013592)
concurrency: 4, nanos: 38656911 (x 1.0283900910714054)
concurrency: 8, nanos: 40530711 (x 1.0782388064188269)
concurrency: 16, nanos: 52545852 (x 1.3978776918751887)
** end.

-------------

Commit messages:
 - 8371260: Improve scaling of downcalls using MemorySegments allocated with 
shared arenas, take 2

Changes: https://git.openjdk.org/jdk/pull/29866/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29866&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8371260
  Stats: 62 lines in 3 files changed: 32 ins; 13 del; 17 mod
  Patch: https://git.openjdk.org/jdk/pull/29866.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/29866/head:pull/29866

PR: https://git.openjdk.org/jdk/pull/29866
