On Wed, 17 Jun 2026 11:54:04 GMT, Per Minborg <[email protected]> wrote:

>> ## Summary
>> 
>> This PR proposes to introduce a pooled confined arena as an optimization for 
>> `Arena.ofConfined()`, where small native allocations can be served from a 
>> reusable per-thread/per-slot memory pool instead of calling the regular 
>> native allocator for every short-lived arena. The arena remains confined to 
>> its owner thread and is still closed normally, but its backing storage can 
>> be reset and reused when the arena closes. The feature requires no API 
>> changes.
>> 
>> ### Outline
>> 
>> Platform threads: one lazily allocated pool per Thread, encoded in 
>> `Thread.confinedMemoryPool`.
>> Virtual threads: fixed shared native pool with CAS-protected slots, because 
>> per-virtual-thread native pools would not scale.
>> 
>> Pooled memory is zeroed out upon _closing_ an Arena to minimize data 
>> visibility between reuse. This means the data is visible only within a TWR 
>> block, and never outside it.
>> 
>> By default, a confined arena has access to 64 bytes of pooled data.  The 
>> pool size is configurable via a system property and can be 8, 16, 32, or 64 
>> bytes. Pooling can also be turned off completely by setting the pool 
>> power-of-two size to zero. Nested confined arenas are not supported.
>> 
>> ## Static Analysis
>> 
>> An extensive static corpus analysis of third-party libraries and the JDK 
>> itself has been conducted with respect to `Arena.ofConfined()` usage, 
>> revealing that confined arenas were used _only_ in TWR blocks and _never_ in 
>> an unstructured way. The static analysis further revealed that in most 
>> cases, only a small amount of native memory was ever allocated, usually less 
>> than 32 bytes, and in many cases, 8 bytes or less. This usage pattern lends 
>> itself well to pooling. 
>> 
>> ## Dynamic Analysis
>> 
>> A dynamic statistical analysis of actual runs was also made, where various 
>> properties of confined arenas were recorded and summarized during a complete 
>> tier1 test run. While a tier1 run is not necessarily representative of a 
>> typical application workload, it provided some interesting results:
>> 
>> The run produced 93 per-process histogram blocks and 788,773,092 closed 
>> confined arenas. The result is dominated by arenas with no native allocation 
>> at all: 375,934,768 arenas (47.661%) are in the zero-byte bucket. Counting 
>> arenas up to 63 bytes covers 99.997% of all arena closures.
>> 
>> The largest count bucket is 8-15 bytes per arena with 400,951,293 arenas 
>> (50.832% of all arenas). The largest byte bucket is 8-15 bytes per arena 
>> with 3,207,623,039 B (3,059.03 MiB) (46.794% of all by...
>
> Per Minborg has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Remove class

Left some initial comments. I still need to take a closer look.

src/java.base/share/classes/java/lang/Thread.java line 394:

> 392:      * Returns a pointer to the pooled memory or zero if the pool cannot 
> be acquired.
> 393:      */
> 394:     @ForceInline

This doesn't look like a good use case for `@ForceInline`. Force inline should 
generally only be used for methods that 'evaporate' to almost nothing, e.g. 
because some intrinsic folds away most of the code (i.e. cases where estimates 
based on the bytecode size of the method are wrong). That doesn't seem to be 
the case here.

src/java.base/share/classes/java/lang/Thread.java line 469:

> 467:             case 0: break;
> 468:             default: throw new AssertionError(size);
> 469:         }

This doesn't look right. This would at most zero 8 bytes of memory (just at 
different offsets). Is there a loop missing here?
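
To illustrate the concern, here is a minimal sketch of a looping variant that clears every 8-byte word up to `size`. It is not the PR's code: `PoolZeroSketch` and `zero` are hypothetical names, a direct `ByteBuffer` stands in for the pooled native memory, and `size` is assumed to be a multiple of 8 (as the pool sizes 8, 16, 32, and 64 are).

```java
import java.nio.ByteBuffer;

public class PoolZeroSketch {

    // Hypothetical helper: zeroes the first `size` bytes of `pool`,
    // one 8-byte word per iteration. Assumes `size` is a multiple of 8.
    static void zero(ByteBuffer pool, int size) {
        if (size < 0 || size > pool.capacity() || (size & 7) != 0) {
            throw new IllegalArgumentException("bad size: " + size);
        }
        for (int offset = 0; offset < size; offset += Long.BYTES) {
            // Absolute put: writes at `offset` without moving the position.
            pool.putLong(offset, 0L);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a 64-byte pool, pre-filled with non-zero bytes.
        ByteBuffer pool = ByteBuffer.allocateDirect(64);
        for (int i = 0; i < pool.capacity(); i++) {
            pool.put(i, (byte) 0xFF);
        }
        zero(pool, 32);
        System.out.println("byte 31 = " + pool.get(31) + ", byte 32 = " + pool.get(32));
    }
}
```

With a loop like this, the whole used range is cleared rather than a single 8-byte word at a size-dependent offset.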

src/java.base/share/classes/jdk/internal/foreign/ConfinedSession.java line 97:

> 95:     @Override
> 96:     @ForceInline
> 97:     NativeMemorySegmentImpl allocateLowLevel(long byteSize, long 
> byteAlignment, boolean init) {

This shouldn't be in `ConfinedSession`/`MemorySessionImpl`. Sessions are for 
tracking lifetimes, arenas should handle allocations. I suggest moving this to 
`ArenaImpl`, and having a sub class of `ArenaImpl` for confined allocations 
(whose close can also clean up the pool).
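
A rough sketch of the split I have in mind, under heavy simplification. None of these types are the JDK's: `Session`, `BasicArena`, and `PooledConfinedArena` are hypothetical stand-ins for `MemorySessionImpl`, `ArenaImpl`, and the proposed subclass, a `ByteBuffer` stands in for native memory, and alignment and thread confinement checks are left out.

```java
import java.nio.ByteBuffer;

public class ArenaSplitSketch {

    /** Lifetime tracking only: knows whether the scope is still open. */
    static class Session {
        private boolean open = true;

        void checkOpen() {
            if (!open) throw new IllegalStateException("already closed");
        }

        void close() {
            checkOpen();
            open = false;
        }
    }

    /** Base arena: owns a session and performs ordinary allocations. */
    static class BasicArena implements AutoCloseable {
        final Session session = new Session();

        ByteBuffer allocate(int byteSize) {
            session.checkOpen();
            return ByteBuffer.allocateDirect(byteSize);
        }

        @Override
        public void close() {
            session.close();
        }
    }

    /** Confined variant: serves small requests from a pool, resets it on close. */
    static class PooledConfinedArena extends BasicArena {
        private final ByteBuffer pool;
        private int used; // bytes handed out from the pool so far

        PooledConfinedArena(ByteBuffer pool) {
            this.pool = pool;
        }

        @Override
        ByteBuffer allocate(int byteSize) {
            session.checkOpen();
            if (byteSize < 0 || byteSize > pool.capacity() - used) {
                return super.allocate(byteSize); // fall back to the regular path
            }
            ByteBuffer slice = pool.slice(used, byteSize);
            used += byteSize;
            return slice;
        }

        @Override
        public void close() {
            super.close();
            // Pool cleanup lives in the arena, not in the session.
            for (int i = 0; i < used; i++) {
                pool.put(i, (byte) 0);
            }
            used = 0;
        }
    }

    public static void main(String[] args) {
        ByteBuffer pool = ByteBuffer.allocateDirect(64);
        try (PooledConfinedArena arena = new PooledConfinedArena(pool)) {
            arena.allocate(8).putLong(0, 42L);
        }
        System.out.println("pool word after close: " + pool.getLong(0));
    }
}
```

The session stays a pure lifetime object, and everything pool-related (bump allocation, fallback, zeroing on close) is contained in the arena subclass.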

-------------

PR Review: https://git.openjdk.org/jdk/pull/31365#pullrequestreview-4459282119
PR Review Comment: https://git.openjdk.org/jdk/pull/31365#discussion_r3429073269
PR Review Comment: https://git.openjdk.org/jdk/pull/31365#discussion_r3381181494
PR Review Comment: https://git.openjdk.org/jdk/pull/31365#discussion_r3428997256
