Re: Question about GetObjectMonitorUsage() JVMTI function

David Holmes Mon, 15 Jun 2020 18:32:14 -0700

On 16/06/2020 10:57 am, Daniel D. Daugherty wrote:

On 6/15/20 7:19 PM, David Holmes wrote:
On 16/06/2020 8:40 am, Daniel D. Daugherty wrote:
On 6/15/20 6:14 PM, David Holmes wrote:
Hi Dan,
On 15/06/2020 11:38 pm, Daniel D. Daugherty wrote:
On 6/15/20 3:26 AM, David Holmes wrote:
On 15/06/2020 4:02 pm, Yasumasa Suenaga wrote:
Hi David,

On 2020/06/15 14:15, David Holmes wrote:
Hi Yasumasa,

On 15/06/2020 2:49 pm, Yasumasa Suenaga wrote:
Hi all,
I wonder why JvmtiEnvBase::get_object_monitor_usage()(implementation of GetObjectMonitorUsage()) does not perform atsafepoint.
GetObjectMonitorUsage will use a safepoint if the target is notsuspended:
jvmtiError
JvmtiEnv::GetObjectMonitorUsage(jobject object,jvmtiMonitorUsage* info_ptr) {
   JavaThread* calling_thread = JavaThread::current();
jvmtiError err = get_object_monitor_usage(calling_thread,object, info_ptr);
   if (err == JVMTI_ERROR_THREAD_NOT_SUSPENDED) {
// Some of the critical threads were not suspended. go to asafepoint and try again VM_GetObjectMonitorUsage op(this, calling_thread, object,info_ptr);
     VMThread::execute(&op);
     err = op.result();
   }
   return err;
} /* end GetObject */
I saw this code, so I guess there are some cases whenJVMTI_ERROR_THREAD_NOT_SUSPENDED is not returned fromget_object_monitor_usage().
Monitor owner would be acquired from monitor object at first[1], but it would perform concurrently.If owner thread is not suspended, the owner might be changed toothers in subsequent code.
For example, the owner might release the monitor before [2].
The expectation is that when we find an owner thread it iseither suspended or not. If it is suspended then it cannotrelease the monitor. If it is not suspended we detect that andredo the whole query at a safepoint.
I think the owner thread might resume unfortunately aftersuspending check.
Yes you are right. I was thinking resuming also required asafepoint but it only requires the Threads_lock. So yes the codeis wrong.
Which code is wrong?

Yes, a rogue resume can happen when the GetObjectMonitorUsage() caller
has started the process of gathering the information while not at a
safepoint. Thus the information returned by GetObjectMonitorUsage()
might be stale, but that's a bug in the agent code.
The code tries to make sure that it either collects data about amonitor owned by a thread that is suspended, or else it collectsthat data at a safepoint. But the owning thread can be resumed justafter the code determined it was suspended. The monitor can then bereleased and the information gathered not only stale but potentiallycompletely wrong as it could now be owned by a different thread andwill report that thread's entry count.
If the agent is not using SuspendThread(), then as soon as
GetObjectMonitorUsage() returns to the caller the information
can be stale. In fact as soon as the implementation returns
from the safepoint that gathered the info, the target thread
could have moved on.
That isn't the issue. That the info is stale is fine. But theexpectation is that the information was actually an accurate snapshotof the state of the monitor at some point in time. The current codedoes not ensure that.
Please explain. I clearly don't understand why you think the info
returned isn't "an accurate snapshot of the state of the monitor
at some point in time".

Because it may not be a "snapshot" at all. There is no atomicity**. Thereported owner thread may not own it any longer when the entry count isread, so straight away you may have the wrong entry count information.The set of threads trying to acquire the monitor, or wait on the monitorcan change in unexpected ways. It would be possible for instance toreport the same thread as being the owner, being blocked trying to enterthe monitor, and being in the wait-set of the monitor - apparently allat the same time!

** even if the owner is suspended we don't have complete atomicitybecause threads can join the set of threads trying to enter the monitor(unless they are all suspended).


David
-----

The only way to make sure you don't have stale information is
to use SuspendThread(), but it's not required. Perhaps the doc
should have more clear about the possibility of returning stale
info. That's a question for Robert F.
GetObjectMonitorUsage says nothing about thread's being suspended soI can't see how this could be construed as an agent bug.
In your scenario above, you mention that the target thread was
suspended, GetObjectMonitorUsage() was called while the target
was suspended, and then the target thread was resumed after
GetObjectMonitorUsage() checked for suspension, but before
GetObjectMonitorUsage() was able to gather the info.

All three of those calls: SuspendThread(), GetObjectMonitorUsage()
and ResumeThread() are made by the agent and the agent should not
resume the target thread while also calling GetObjectMonitorUsage().
The calls were allowed to be made out of order so agent bug.
Perhaps. I was thinking more generally about an independent resume,but you're right that doesn't really make a lot of sense. But when thespec says nothing about suspension ...
And it is intentional that suspension is not required. JVM/DI and JVM/PI
used to require suspension for these kinds of get-the-info APIs. JVM/TI
intentionally was designed to not require suspension.

As I've said before, we could add a note about the data being potentially
stale unless SuspendThread is used. I think of it like stat(2). You can
fetch the file's info, but there's no guarantee that the info is current
by the time you process what you got back. Is it too much motherhood to
state that the data might be stale? I could go either way...
Using a handshake on the owner thread will allow this to be fixed inthe future without forcing/using any safepoints.
I have to think about that which is why I'm avoiding talking about
handshakes in this thread.
Effectively the handshake can "suspend" the thread whilst the monitoris queried. In effect the operation would create a per-thread safepoint.
I "know" that, but I still need time to think about it and probably
see the code to see if there are holes...
Semantically it is no different to the code actually suspending theowner thread, but it can't actually do that because suspends/resumedon't nest.
Yeah... we used have a suspend count back when we tracked internal and
external suspends separately. That was a nightmare...

Dan
Cheers,
David
Dan
Cheers,
David
Dan
JavaThread::is_ext_suspend_completed() is used to check threadstate, it returns `true` when the thread is sleeping [3], or whenit performs in native [4].
Sure but if the thread is actually suspended it can't continueexecution in the VM or in Java code.
This appears to be an optimisation for the assumed common casewhere threads are first suspended and then the monitors arequeried.
I agree with this, but I could find out it from JVMTI spec - itjust says "Get information about the object's monitor."
Yes it was just an implementation optimisation, nothing to do withthe spec.
GetObjectMonitorUsage() might return incorrect information insome case.
It starts with finding owner thread, but the owner might be justbefore wakeup.So I think it is more safe if GetObjectMonitorUsage() is calledat safepoint in any case.
Except we're moving away from safepoints to using Handshakes, sothis particular operation will require that the apparent owner isHandshake-safe (by entering a handshake with it) before queryingthe monitor. This would still be preferable I think to alwaysusing a safepoint for the entire operation.
Cheers,
David
-----
Thanks,

Yasumasa
[3]http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/runtime/thread.cpp#l671[4]http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/runtime/thread.cpp#l684
However there is still a potential bug as the thread reported asthe owner may not be suspended at the time we first see it, andmay release the monitor, but then it may get suspended before wecall:
owning_thread =Threads::owning_thread_from_monitor_owner(tlh.list(), owner);
and so we think it is still the monitor owner and proceed toquery the monitor information in a racy way. This can't happenwhen suspension itself requires a safepoint as the currentthread won't go to that safepoint during this code. However, ifsuspension is implemented via a direct handshake with the targetthread then we have a problem.
David
-----
Thanks,

Yasumasa
[1]http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/prims/jvmtiEnvBase.cpp#l973[2]http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/prims/jvmtiEnvBase.cpp#l996

Re: Question about GetObjectMonitorUsage() JVMTI function

Reply via email to