On Tue, 11 Apr 2023 11:47:46 GMT, Roman Kennke <[email protected]> wrote:
>> src/hotspot/share/runtime/lockStack.inline.hpp line 111:
>>
>>> 109: int end = to_index(_top);
>>> 110: for (int i = end - 1; i >= 0; i--) {
>>> 111: if (NativeAccess<>::oop_load(&_base[i]) == o) {
>>
>> The use of NativeAccess here will break Generational ZGC. For other GCs it's
>> just a redundant GC barrier. The actual GC barrier for the oops in the
>> thread header is the start_processing() call.
>>
>> I was going to propose that you changed this to a plain load (as opposed to
>> using RawAccess), but @fisk pointed out that it looks like this code is used
>> from one thread looking into the data structures of another thread, which
>> would make such a load potentially racing. And that makes us also question
>> the plain load of `_top`. Is there anything that ensures that these are not
>> racy loads?
>
> The NativeAccess is a left-over from an earlier attempt, and yes I think the
> start_processing() is the actual barrier. There is a single call-path where
> we inspect another thread's lock-stack outside of a safepoint (from
> management/JMX code). We had some arguments back and forth with David about
> that (somewhere up in this PR) and the conclusion so far is that yes, it is
> racy, but it doesn't seem to be a problem. We might be getting wrong results
> in the sense that the other thread could change the state of locking in the
> moment right after we inspect it, but this doesn't look like a correctness
> problem in the code that's calling it and the problem is pre-existing with
> current stack-locking, too. See jmm_GetThreadInfo() in management.cpp around
> lines 1129ff.
It looks to me like the code could read racingly read the element just above
`_top`, which could contain a stale oop. If the address of the stale oop
matches the address of `o` then `contains` would incorrectly return true.
Did you consider rewriting the racing code to use thread-local handshakes to
remove the race?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1162729676