While working on improving the TLAB sizing code for ZGC @kstefanj ran into an issue with the following tests failing: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorInterpreterObjectTest.java serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java
The reason for seeing the problems now is that with the new sizing code ZGC used smaller TLABs at first, before resizing them to a proper size (to lower the waste). In the HeapMonitor tests we don't allocate enough to trigger GCs that will resize the TLABs so most of the tests will now run with small TLABs. This should not be a problem, but it turns out the current sampling code is not working properly when you get a lot of outside TLAB allocations. You get those when trying to allocate a fairly large object (~1400B) that won't fit into the TLAB, but there are still quite a bit of room in the TLAB so we don't want to throw it away and take a new one. The problem in the current code is that we keep track of when to sample with multiple variables and when getting out of TLAB allocations these get out of sync. The proposed patch is the result of a restructuring and fixing of the the code that me and @kstefanj did to fix this issue. The idea is to better account how much we have allocated in three different buckets: * Outside of TLAB allocations * Accounted TLAB allocations * The last bit of TLAB allocations that hasn't been accounted yet And then use the sum of that to compare against a *non-changing* threshold to see if it is time to take a sample. There are a few things to think about when reading this code: * The thread can allocate and retire multiple TLABs before we cross the sample threshold. * The sampling can take multiple samples in a single TLAB * Any overshooting of the sample threshold triggers only one sample and the extra bytes are ignored when checking for the next sample. There are some restructuring in the patch to confine the ThreadHeapSampler variables and code. For example: 1) Moved accounting variables out of TLAB and into the ThreadHeapSampler 2) Moved thread allocation accounting and sampling code out of the TLAB 3) Moved retiring of TLABs out of make_parseable (needed to support (2)) Some of that could be extracted into a separate PR if that's deemed worthwhile. Tested with the HeapMonitor tests various TLAB sizes. If there are anyone using these APIs it would be nice to get feedback if these changes work well for you. ------------- Commit messages: - 8356372: JVMTI heap sampling not working properly with outside TLAB allocations Changes: https://git.openjdk.org/jdk/pull/25114/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25114&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356372 Stats: 224 lines in 14 files changed: 136 ins; 42 del; 46 mod Patch: https://git.openjdk.org/jdk/pull/25114.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25114/head:pull/25114 PR: https://git.openjdk.org/jdk/pull/25114