ClhsdbScanOops.java assumes no GC should occur

Joel Sikström Fri, 05 Dec 2025 03:08:01 -0800

On Thu, 4 Dec 2025 17:12:44 GMT, Chris Plummer wrote:


>> Hello,
>> 
>> If the initial heap size is set too low in 
>> serviceability/sa/ClhsdbScanOops.java, a GC might run, which will interfere 
>> with the test and might cause it to fail. 
>> 
>> The test is scanning the oops in a region of the heap, and after a GC that 
>> region appears to be empty, so the output that the test expects is not 
>> present. Running the test with a larger explicit InitialHeapSize gives 
>> enough headroom to not run a GC.
>> 
>> Testing: 
>> * serviceability/sa/ClhsdbScanOops.java originally failed when run with 
>> `-XX:InitialRAMPercentage=0` (which is the new default). We now explicitly 
>> set `-XX:InitialHeapSize=100M`. I've rerun the test 10 times with Serial and 
>> Parallel for each test and they all pass.
>
>> > It's probably just the timing of the GC that determines whether the 
>> > initial small heap is a problem or not. If you want the SA tests to be 
>> > reliable with something like InitialRAMPercentage=0, probably all of the 
>> > tests should be updated. However, personally I don't think this type of 
>> > fix should be necessary unless you feel testing in this manner is something 
>> > we want to support. There are plenty of tests that start failing when 
>> > non-standard command line options are used.
>> 
>> FYI we just integrated a change that sets InitialRAMPercentage=0 for JDK 26 
>> that we've been working on (see #28641). We've run up to Oracle's tier8 
>> twice now, and apart from the tests that are included in this PR, we've not 
>> seen any other SA failures.
>> 
>> Of course there might be other intermittent failures in the future, in which 
>> case I see two approaches moving forward: problem listing or bumping the 
>> initial heap size for the affected tests, or going over all SA tests and 
>> making sure that they all run with a "large" initial heap size (like 100MB). 
>> Unless we start seeing many (for some definition of many) test failures from 
>> now, a pragmatic compromise is to selectively bump the initial heap size of 
>> such tests, like I do in this PR.
>> 
>> Of course, the optimal approach would be to make any affected SA tests more 
>> robust to GC timings. But, since I'm not sure how much time we want to invest 
>> in improving SA tests, bumping the heap size is likely a good compromise 
>> here.
> 
> For the most part the SA tests are fine if there is a GC. The way they 
> usually work is to launch the debuggee, wait for it to reach a stable point 
> (all threads idle), and then start to query the debuggee. If a GC happens 
> before reaching stability, that should be fine, and after stability we 
> wouldn't expect any GCs no matter what the heap size is. There are some SA 
> tests that run on active processes where GCs can happen, but they are written 
> to allow for errors.

Thank you @plummercj for analysing the tests. I've removed the explicit 
InitialHeapSize from test/jdk/com/sun/jdi/MethodInvokeWithTraceOnTest.java in 
favor of https://github.com/openjdk/jdk/pull/28666. I've also removed 
test/hotspot/jtreg/serviceability/sa/ClhsdbScanOops.java from the ProblemList.

The two sets of tests that are associated with this issue but not addressed in 
this PR are about to be solved in https://github.com/openjdk/jdk/pull/28666 and 
https://github.com/openjdk/jdk/pull/28655. I'm holding off on this PR until 
those changes are integrated.

I've updated both the issue and this PR to reflect the new changes.
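
As a rough illustration of the GC-timing concern discussed in this thread, the sketch below shows one way to observe whether any GC ran during a piece of work, using the standard `GarbageCollectorMXBean` API. It is not taken from the SA tests or from this PR; the class name `GcCountCheck`, the helper `totalGcCount`, and the small allocation loop are hypothetical and only stand in for a debuggee's startup activity.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the SA tests: reports how many GCs
// ran while a small workload executed.
final class GcCountCheck {

    // Sums the collection counts of all collectors in this JVM.
    // getCollectionCount() returns -1 when the count is undefined,
    // so non-positive values are skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcCount();

        // Stand-in workload (an assumption): allocate about 4 MB and keep it live.
        byte[][] data = new byte[64][];
        for (int i = 0; i < data.length; i++) {
            data[i] = new byte[64 * 1024];
        }

        long after = totalGcCount();
        System.out.println("GCs during workload: " + (after - before)
                + " (live arrays: " + data.length + ")");
    }
}
```

Running it once with a small initial heap and once with `-XX:InitialHeapSize=100M` is one way to compare how the initial heap size affects whether a GC occurs; the actual counts depend on the collector and the machine.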

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28637#issuecomment-3616404117

Re: RFR: 8373022: serviceability/sa/ClhsdbScanOops.java assumes no GC should occur
