While working on improving the TLAB sizing code for ZGC @kstefanj ran into an 
issue with the following tests failing:
serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorInterpreterObjectTest.java
serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java

The reason for seeing the problems now is that with the new sizing code ZGC 
used smaller TLABs at first, before resizing them to a proper size (to lower 
the waste). In the HeapMonitor tests we don't allocate enough to trigger GCs 
that will resize the TLABs so most of the tests will now run with small TLABs. 
This should not be a problem, but it turns out the current sampling code is not 
working properly when you get a lot of outside TLAB allocations. You get those 
when trying to allocate a fairly large object (~1400B) that won't fit into the 
TLAB, but there are still quite a bit of room in the TLAB so we don't want to 
throw it away and take a new one.

The problem in the current code is that we keep track of when to sample with 
multiple variables and when getting out of TLAB allocations these get out of 
sync.

The proposed patch is the result of a restructuring and fixing of the the code 
that me and @kstefanj did to fix this issue.

The idea is to better account how much we have allocated in three different 
buckets:
* Outside of TLAB allocations
* Accounted TLAB allocations
* The last bit of TLAB allocations that hasn't been accounted yet

And then use the sum of that to compare against a *non-changing* threshold to 
see if it is time to take a sample.

There are a few things to think about when reading this code:
* The thread can allocate and retire multiple TLABs before we cross the sample 
threshold.
* The sampling can take multiple samples in a single TLAB
* Any overshooting of the sample threshold triggers only one sample and the 
extra bytes are ignored when checking for the next sample.

There are some restructuring in the patch to confine the ThreadHeapSampler 
variables and code. For example:

1) Moved accounting variables out of TLAB and into the ThreadHeapSampler
2) Moved thread allocation accounting and sampling code out of the TLAB
3) Moved retiring of TLABs out of make_parseable (needed to support (2))

Some of that could be extracted into a separate PR if that's deemed worthwhile.

Tested with the HeapMonitor tests various TLAB sizes.

If there are anyone using these APIs it would be nice to get feedback if these 
changes work well for you.

-------------

Commit messages:
 - 8356372: JVMTI heap sampling not working properly with outside TLAB 
allocations

Changes: https://git.openjdk.org/jdk/pull/25114/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25114&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356372
  Stats: 224 lines in 14 files changed: 136 ins; 42 del; 46 mod
  Patch: https://git.openjdk.org/jdk/pull/25114.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25114/head:pull/25114

PR: https://git.openjdk.org/jdk/pull/25114

Reply via email to