On 12/3/19 11:39 PM, David Holmes wrote:


On 3/12/2019 11:35 pm, coleen.phillim...@oracle.com wrote:


On 12/3/19 8:31 AM, David Holmes wrote:
On 3/12/2019 11:08 pm, coleen.phillim...@oracle.com wrote:


On 12/2/19 11:52 PM, David Holmes wrote:
Hi Coleen,

On 3/12/2019 12:43 am, coleen.phillim...@oracle.com wrote:


On 11/26/19 7:03 PM, David Holmes wrote:
(adding runtime as well)

Hi Coleen,

On 27/11/2019 12:22 am, coleen.phillim...@oracle.com wrote:
Summary: Add local deferred event list to thread to post events outside CodeCache_lock.

This patch builds on the patch for JDK-8173361. With this patch, I made the JvmtiDeferredEventQueue an instance class (not AllStatic) and have one per thread. The CodeBlob event code, which used to drop the CodeCache_lock and race with the sweeper thread, now adds the events it wants to post to its thread-local list and processes them outside the lock.  The list is walked during GC and by the sweeper to keep the nmethods from being unloaded and zombied, respectively.
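Roughly, the shape of the change is as follows (just a sketch, not the webrev code; the queue accessor and posting method names here are approximations):

  // Collect events under the CodeCache_lock, post them after it is dropped.
  void JvmtiCodeBlobEvents::generate_compiled_method_load_events(JvmtiEnv* env) {
    JavaThread* thread = JavaThread::current();
    {
      MutexLocker mu(CodeCache_lock, Mutex::_no_safepoint_check_flag);
      NMethodIterator iter(NMethodIterator::only_alive_and_not_unloading);
      while (iter.next()) {
        // Enqueue on the thread-local list; do not post under the lock.
        thread->jvmti_event_queue()->enqueue(
            JvmtiDeferredEvent::compiled_method_load_event(iter.method()));
      }
    } // CodeCache_lock released here
    // Callbacks may safepoint or reenter the VM, so post outside the lock.
    thread->post_deferred_jvmti_events(env);
  }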

Sorry I don't understand why we would want/need a deferred event queue for every JavaThread? Isn't this only relevant for non-JavaThreads that need to have the ServiceThread process the deferred event?

I thought I'd written this in the bug, but I had only discussed it with Erik.  I've added a comment to the bug to explain why I added the per-JavaThread queue.  In order to process these events after the CodeCache_lock is dropped, I have to queue them somewhere safe. The ServiceThread queue is safe, *but* the ServiceThread can't keep up with the events, especially from this test case.  So the test case gets a native OOM.

So I've added the safe queue as a field to each JavaThread because multiple JavaThreads could be posting these events at the same time, and there didn't seem to be a better safe place to cache them without adding another layer of queuing code.

I think I'm getting the picture now. At the time the events are generated we can't post them directly because the current thread is inside compiler code. Hence the events must be deferred. Using the ServiceThread to handle the deferred events is one way to deal with this - but it can't keep up in this scenario. So instead we store the events in the current thread and when the current thread returns to code where it is safe to post the events, it does so itself. Is that generally correct?

Yes.

I admit I'm not keen on adding this additional field per-thread just for temporary usage. Some kind of stack-allocated helper would be preferable, but it would need to be passed through the call chain so that the events could be added to it.

Right, and the GC and nmethods_do have to find it somehow. It wasn't my first choice of where to put it either, because there are too many things in JavaThread.  Might be time for a future cleanup of Thread.

I see.


Also I'm not clear why we aggressively delete the _jvmti_event_queue after posting the events. I'd be worried about the overhead we are introducing for creating and deleting this queue. When the JvmtiDeferredEventQueue data structure was intended only for use by the ServiceThread its dynamic node allocation may have made more sense. But now that seems like a liability to me - if JvmtiDeferredEvents could be linked directly we wouldn't need dynamic nodes, nor dynamic per-thread queues (just a per-thread pointer).

I'm not following.  The queue is for multiple events that might be posted while holding the CodeCache_lock, so they need to be kept in order and linked together.  While we post them and take them off, if the callback safepoints (or maybe calls back into the JVM), we don't want GC or nmethods_do to walk the ones that have already been posted. So a queue seems to make sense.

Yes but you can make a queue just by having each event have a _next pointer, rather than dynamically creating nodes to hold the event. Each event is its own queue node implicitly.
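To be concrete, I'm imagining something like this (just a sketch, with the event payload elided):

  // Each event carries its own link, so no separate node allocation is needed.
  class JvmtiDeferredEvent {
    friend class JvmtiDeferredEventQueue;
    JvmtiDeferredEvent* _next;  // intrusive link: the event is its own node
    // ... event kind and payload ...
  };

  class JvmtiDeferredEventQueue {
    JvmtiDeferredEvent* _head;
    JvmtiDeferredEvent* _tail;
   public:
    JvmtiDeferredEventQueue() : _head(NULL), _tail(NULL) {}
    void enqueue(JvmtiDeferredEvent* e) {
      e->_next = NULL;
      if (_tail == NULL) { _head = e; } else { _tail->_next = e; }
      _tail = e;
    }
    JvmtiDeferredEvent* dequeue() {
      JvmtiDeferredEvent* e = _head;
      if (e != NULL) {
        _head = e->_next;
        if (_head == NULL) _tail = NULL;  // head is NULL after the last event
      }
      return e;
    }
  };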

One thing that I experimented with was to have the ServiceThread take ownership of the thread's local queue and post all the events, which could be a future enhancement.  It didn't help my OOM situation.

Your OOM situation seems to be a basic case of overwhelming the ServiceThread. A single ServiceThread will always have a limit on how many events it can handle. Maybe this test is being too unrealistic in its expectations of the current design?

I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD event for every compiled method in the code cache is going to be overwhelming unless it waits for the events to be posted.

Taking things off the service thread would seem to be a good thing then :)


Deleting the queue after all the events are posted means JavaThread::oops_do and nmethods_do need only a null check to deal with this jvmti wart.
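i.e. all the GC walkers need is something like this (sketch; the field name and queue methods are approximate):

  void JavaThread::oops_do(OopClosure* f, CodeBlobClosure* cf) {
    Thread::oops_do(f, cf);
    // Only pay for the queue while a deferred post is actually in progress.
    if (_jvmti_event_queue != NULL) {
      _jvmti_event_queue->oops_do(f, cf);
    }
    // ... the rest of the thread's oops ...
  }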

If the nodes are not dynamically allocated you don't need to delete anything; you just set the queue-head pointer to NULL - actually it will already be NULL once the last event has been processed.

I could revisit the data structure as a future RFE.  The goal was to reuse code that's already there, and I don't think there's a significant difference in performance.  I did some measurements of the stress case and the times were equivalent, actually better in the new code.

Okay.

Is this a code review then?  I think Serguei promised to review the code too.

thanks,
Coleen

Thanks,
David


Thanks,
Coleen

David
-----

Thanks,
Coleen

Just some thoughts.

Thanks,
David

I did write comments to this effect here:

http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html

Thanks,
Coleen


David

Also, the jmethod_id field in nmethod was only used as a boolean, so don't create a jmethod_id until it's needed for post_compiled_method_unload.
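That is, the jmethodID can be created at the point the unload event is queued, along these lines (sketch only; the exact signatures differ in the webrev):

  // Create the jmethodID lazily when the unload event is queued, instead
  // of caching one in every nmethod for its whole lifetime.
  void nmethod::post_compiled_method_unload() {
    // jmethod_id() may allocate, but now only unloaded methods pay for it.
    jmethodID mid = method()->jmethod_id();
    JvmtiDeferredEvent event =
        JvmtiDeferredEvent::compiled_method_unload_event(mid, insts_begin());
    ServiceThread::enqueue_deferred_event(&event);
  }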

Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in the original bug report.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8212160

Thanks,
Coleen



