Hi Roman,

Yes. Thank you for the explanation.

Thanks,
Serguei

On 3/26/20 01:44, Roman Kennke wrote:
That was in the previous implementation: I got a condition wrong in the
table lookup (as noted by Serguei), and this prevented any
class-unload-events from getting out. I have fixed this, but found other
problems in that implementation (deadlocks and a crash).

  The current implementation has none of these problems: we don't need
table-lookups - we simply pass-through the signatures, and locking is
much simpler and in particular we don't need a lock around the JVMTI
call (SetTag) which was the cause of the deadlock.

Does that answer your questions?

Thanks,
Roman

Hi Roman,

It passed all my testing. I think before you push Serguei has a question
regarding an issue you brought up a while back. You mentioned that you
weren't getting some events, and suddenly started seeing them. We were
discussing it today and it was unclear if this was an issue you were
seeing before your changes, and your changes resolved it, or it was
initially caused by an earlier version of your changes, and you later
fixed it. We just want to better understand what this issue was and how
it was fixed.

thanks,

Chris

On 3/25/20 3:22 PM, Roman Kennke wrote:
The new job finished, its ID is:

   mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289

Thank you,
Roman


Yes, please submit a new job. I'll start my testing once I see that the
builds are done.

Chris

On 3/25/20 12:59 PM, Roman Kennke wrote:
Hi Chris,

Apparently we can get into classTrack_reset() before calling
activate(),
and we're seeing a null deletedSignatureBag. A simple NULL-check around
the cleaning routine fixes the problem for me.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/

Should I post another submit-repo job with that fix?

Thanks,
Roman


Hi Roman,

com/sun/jdi/JdwpAllowTest.java crashed on many runs:

Stack: [0x00007fbb790f9000,0x00007fbb791fa000],
sp=0x00007fbb791f8af0,  free space=1022k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
C  [libjdwp.so+0xe365]  classTrack_reset+0x25
C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
C  [libjdwp.so+0x25700]  acceptThread+0x80
V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e


This happened during a test task run of open/test/jdk/:jdk_jdi. There
doesn't seem to be anything magic on the command line that might be
triggering. Pretty much I see it with all the various VM configs we
test.

I'm also seeing crashes in the following tests, but not as often:

serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java


vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java


vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java


vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java



thanks,

Chris


On 3/25/20 11:37 AM, Roman Kennke wrote:
Hi Chris,

Regarding the new assert:

    105     if (gdata && gdata->assertOn) {
    106         // Check this is not already tagged.
    107         jlong tag;
    108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env,
klass, &tag);
    109         if (error != JVMTI_ERROR_NONE) {
    110             EXIT_ERROR(error, "Unable to GetTag with class
trackingEnv");
    111         }
    112         JDI_ASSERT(tag == NOT_TAGGED);
    113     }

I think you should remove the gdata check. gdata should never be
NULL
when you get to this code. If it is ever NULL then there's a bug,
and
the check will hide the bug.
Ok, will remove this.

Regarding testing, after you do the submit repo testing let me know
the
jobID and I'll do additional testing on it.
I did the submit repo earlier today, and it came back green:

mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762

Thanks,
Roman

thanks,

Chris

On 3/25/20 6:00 AM, Roman Kennke wrote:
Hi Sergei,

The fix looks pretty clean now.
I also like new name of the lock.:)
Thank you!

Just one comment below.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html




110 if (tag != 0l) {
111 return; // Already added
     112     }

     It is better to use a named constant or macro instead.
     Also, it'd be nice to add a short comment about this value is.
As I replied to Chris earlier, this whole block can be turned
into an
assert. I also made a constant for the value 0, which should be
pretty
much self-explaining.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/

How do you test the fix?
I am using a manual test that is provided in this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1751985

"Script to compare performance of GC with and without debugger,
when
many classes are loaded and classes are being unloaded":

https://bugzilla.redhat.com/attachment.cgi?id=1640688

I am also using this test and manually attach/detach jdb a
couple of
times in a row to check that disconnecting and reconnecting works
well
(this tended to deadlock or crash with an earlier version of the
patch,
and is now looking good).

I am also running tier1 and tier2 tests locally, and as soon as we
all
agree that the fix is reasonable, I will push it to the submit
repo. I
am not sure if any of those tests actually exercise that code,
though.
Let me know if you want me to run any specific tests.

Thank you,
Roman



Thanks,
Serguei


On 3/20/20 08:30, Roman Kennke wrote:
I believe I came up with a much simpler solution that also
solves the
problems of the existing one, and the ones I proposed earlier.

It turns out that we can take advantage of the fact that we
can use
*anything* as tags in JVMTI, even pointers to stuff (this is
explicitely
mentioned in the JVMTI spec). This means we can simply stick a
pointer
to the signature of a class into the tag, and pull it out again
when we
get notified that the class gets unloaded.

This means we don't need an extra data-structure to keep track of
classes and signatures, and it also makes the story around
locking
*much* simpler. Performance-wise this is O(1), i.e. no scanning
of all
classes needed (as in the current implementation) and no
searching of
table needed (like in my previous attempts).

Please review this new revision:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/

(Notice that there still appears to be a performance bottleneck
with
class-unloading when an actual debugger is attached. This
doesn't seem
to be related to the classTrack.c implementation though, but
looks like
a consequence of getting all those class-unload notifications
over the
wire. My testcase generates 1000s of them, and it's clogging
up the
buffers.)

I am not sure why jdb needs to enable class-unload listener
always. A
simple hack disables it, and performance is brilliant, even when
jdb is
attached:
http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch


But this is not in the scope of this bug.)

Roman


On 3/16/20 8:05 AM, serguei.spit...@oracle.com wrote:
Sorry, forgot to complete my comments at the end (see below).


On 3/15/20 23:57, serguei.spit...@oracle.com wrote:
Hi Roman,

Thank you for the update and sorry for the latency in review.

Some comments are below.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html




87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
      88 {
89 debugMonitorEnter(deletedSignatureLock);
90 if (currentClassTag == -1) {
91 // Class tracking not initialized, nobody's interested
92 debugMonitorExit(deletedSignatureLock);
93 return;
      94     }
Just a question:
      Q1: Should the ObjectFree events be disabled for the
jvmtiEnv
that does
          the class tracking if class tracking has not been
initialized?

70 static jlong currentClassTag; I'm thinking if the name is
better to
be something like: lastClassTag or highestClassTag.

99 KlassNode* klass = *klass_ptr;
100 102 while (klass != NULL && klass->klass_tag != tag) { 103
klass_ptr = &klass->next; 104 klass = *klass_ptr;
105 } 106 if (klass != NULL || klass->klass_tag != tag) { //
klass
not
found - ignore.
107 debugMonitorExit(deletedSignatureLock);
108 return;
     109     }
     It seems to me, something is wrong in the condition at L106
above.
     Should it be? :
        if (klass == NULL || klass->klass_tag != tag)

     Otherwise, how can the second check ever work correctly
as the
return
will always happen when (klass != NULL)?

     There are several places in this file with the the indent:
90 if (currentClassTag == -1) {
91 // Class tracking not initialized, nobody's interested
92 debugMonitorExit(deletedSignatureLock);
93 return;
      94     }
     ...
152 if (currentClassTag == -1) {
153 // Class tracking not initialized yet, nobody's interested
154 debugMonitorExit(deletedSignatureLock);
155 return;
     156     }
     ...
161 if (error != JVMTI_ERROR_NONE) {
162 EXIT_ERROR(error, "Unable to GetTag with class
trackingEnv");
     163     }
164 if (tag != 0l) {
165 debugMonitorExit(deletedSignatureLock);
166 return; // Already added
     167     }
     ...
281 cleanDeleted(void *signatureVoid, void *arg)
282 {
283 char* sig = (char*)signatureVoid;
284 jvmtiDeallocate(sig);
285 return JNI_TRUE;
     286 }
     ...
     291 void
     292 classTrack_reset(void)
     293 {
294 int idx;
295 debugMonitorEnter(deletedSignatureLock);
296
297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
298 KlassNode* node = table[idx];
299 while (node != NULL) {
300 KlassNode* next = node->next;
301 jvmtiDeallocate(node->signature);
302 jvmtiDeallocate(node);
303 node = next;
304 }
305 }
306 jvmtiDeallocate(table);
307
308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
309 bagDestroyBag(deletedSignatureBag);
310
311 currentClassTag = -1;
312
313
(void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);


314 trackingEnv = NULL;
315
316 debugMonitorExit(deletedSignatureLock);

Could you, please, fix several comments below?
63 * The JVMTI tracking env to keep track of klass tags, for
class-unloads
     The comma is not needed.
     Would it better to replace: klass tags => klass_tag's ?


73 * Lock to keep table, currentClassTag and
deletedSignatureBag
consistent
     Maybe: Lock to guard ... or lock to keep integrity of ...

84 * Callback when classes are freed, Finds the signature and
remembers it in deletedSignatureBag. Would be better to use
words
like
"store" or "record", "Find" should not start from capital
letter:
Invoke the callback when classes are freed, find and record the
signature in deletedSignatureBag.

96 // Find deleted KlassNode 133 // Class tracking not
initialized,
nobody's interested 153 // Class tracking not initialized yet,
nobody's interested 158 /* Check this is not a duplicate */
Missed
dot
at the end. 106 if (klass != NULL || klass->klass_tag != tag)
{ //
klass not found - ignore. In opposite, dot is not needed as the
comment does not start from a capital letter. 111 // At this
point we
have the KlassNode corresponding to the tag
112 // in klass, and the pointer to it in klass_node.
     The comment above can be better. Maybe, something like:
       " At this point, we found the KlassNode matching the klass
tag(and it is
linked).

113 // Remember the unloaded signature.
     Better: Record the signature of the unloaded class and
unlink it.

Thanks,
Serguei

Thanks,
Serguei

On 3/9/20 05:39, Roman Kennke wrote:
Hello all,

Can I please get reviews of this change? In the meantime,
we've done
more testing and also field-/torture-testing by a customer
who is
happy
now. :-)

Thanks,
Roman


Hi Serguei,

Thanks for reviewing!

I updated the patch to reflect your suggestions, very good!
It also includes a fix to allow re-connecting an agent after
disconnect,
namely move setup of the trackingEnv and
deletedSignatureBag to
_activate() to ensure have those structures after re-connect.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/

Let me know what you think!
Roman

Hi Roman,

Thank you for taking care about this scalability issue!

I have a couple of quick comments.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html




72 /*
73 * Lock to protect deletedSignatureBag
74 */
75 static jrawMonitorID deletedSignatureLock; 76 77 /*
78 * A bag containing all the deleted classes' signatures.
Must be
accessed under
79 * deletedTagLock,
      80  */
81 struct bag* deletedSignatureBag;

      The comments contradict to each other.
      I guess, the lock name at line 79 has to be
deletedSignatureLock
instead of deletedTagLock.
      Also, comma at the end must be replaced with dot.


101 // Tag not found? Ignore.
102 if (klass == NULL) {
103 debugMonitorExit(deletedSignatureLock);
104 return;
105 }
     106
107 // Scan linked-list.
108 jlong found_tag = klass->klass_tag;
109 while (klass != NULL && found_tag != tag) {
110 klass_ptr = &klass->next;
111 klass = *klass_ptr;
112 found_tag = klass->klass_tag;
     113     }
114
115 // Tag not found? Ignore.
116 if (found_tag != tag) {
117 debugMonitorExit(deletedSignatureLock);
118 return;
     119     }


     The code above can be simplified, so that the lines
101-105
are not
needed anymore.
     It can be something like this:

// Scan linked-list.
while (klass != NULL && klass->klass_tag != tag) {
klass_ptr = &klass->next;
klass = *klass_ptr;
         }
if (klass == NULL || klass->klass_tag != tag) { // klass not
found - ignore.
debugMonitorExit(deletedSignatureLock);
return;
         }

It will take more time when I get a chance to look at the
rest.


Thanks,
Serguei




On 12/21/19 13:24, Roman Kennke wrote:
Here comes an update that resolves some races that happen
when
disconnecting an agent. In particular, we need to take the
lock on
basically every operation, and also need to check whether
or not
class-tracking is active and return an appropriate result
(e.g. an empty
list) when we're not.

Updated webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/

Thanks,
Roman


So, here comes the O(1) implementation:

- Whenever a class is 'prepared', it is registered with a
tag, and we
set-up a listener to get notified when it is unloaded.
- Prepared classes are kept in a datastructure that is a
table, which
each entry being the head of a linked-list of KlassNode*.
The
table is
indexed by tag % slot-count, and then simply prepend
the new
KlassNode*.
This is O(1) operation.
- When we get notified of unloading a class, we look up
the
signature of
the reported tag in that table, and remember it in a bag.
The
KlassNode*
is then unlinked from the table and deallocated. This is
~O(1) operation
too, depending on the depth of the table. In my testcase
which hammered
the code with class-loads and unloads, I usually see
depths
of like 2-3,
but not usually more. It should be ok.
- when processUnloads() gets called, we simply hand out
that
bag, and
allocate a new one.
- I also added cleanup-code in classTrack_reset() to avoid
leaking the
signatures and KlassNode* etc when debug agent gets
detached
and/or
re-attached (was missing before).
- I also added locks around data-structure-manipulation
(was
missing
before).
- Also, I only activate this whole process when an actual
listener gets
registered on EI_GC_FINISH. This seems to happen right
when
attaching a
jdb, not sure why jdb does that though. This may be
something
to improve
in the future?

In my tests, the performance of class-tracking itself
looks
really good.
The bottleneck now is clearly actual synthesizing the
class-unload
events. I don't see how this can be helped when the debug
agent asks for it?

Updated webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/

Please let me know what you think of it.

Thanks,
Roman


Alright, the perfectionist in me got me. I am
implementing
the even more
efficient ~O(1) class tracking. Please hold off reviewing
for now.

Thanks,Roman

     Hi Chris,
I'll have a look at this, although it might not be
for a
few days. In
the meantime, maybe you can describe your new
implementation in
classTrack.c so it's easier to look through the
changes.
Sure.

The purpose of this class-tracking is to be able to
determine the
signatures of unloaded classes when GC/class-unloading
happened, so that
we can generate the appropriate JDWP event.

The current implementation does so by maintaining a
table
of currently
prepared classes by building that table when
classTrack is
initialized,
and then add new classes whenever a class gets loaded.
When
unloading
occurs, that cache is rebuilt into a new table, and
compared with the
old table, and whatever is in the old, but not in the
new
table gets
returned. The problem is that when GCs happen frequently
and/or many
classes get loaded+unloaded, this amounts to
O(classCount*gcCount)
complexity.

The new implementation keeps a linked-list of prepared
classes, and also
tracks unloads via the listener cbTrackingObjectFree().
Whenever an
unload/GC occurs, the list of prepared classes is
scanned,
and classes
that are also in the deletedTagBag are unlinked (thus
maintaining the
prepared-classes-list) and its signature put in the list
that gets returned.

The implementation is not perfect. In order to determine
whether or not
a class is unloaded, it needs to scan the deletedTagBag.
That process is
therefore still O(unloadedClassCount). The assumption
here
is that
unloadedClassCount << classCount. In my experiments this
seems to be
true, and also reasonable to expect.

(I have some ideas how to improve the implementation to
~O(1) but it
would be considerably more complex: have to maintain a
(hash)table that
maps tags -> KlassNode*, unlink them directly upon
unload,
and build the
unloaded-signatures list there, but I don't currently
see
that it's
worth the effort).

In addition to all that, this process is only activated
when there's an
actual listener registered for EI_GC_FINISH.

Thanks,
Roman


Chris

On 12/18/19 5:05 AM, Roman Kennke wrote:
Hello all,

Issue:
https://bugs.openjdk.java.net/browse/JDK-8227269

I am proposing what amounts to a rewrite of
classTrack.c.
It avoids
throwing away the class cache on GC, and instead keeps
track of
loaded/unloaded classes one-by-one.

In addition to that, it avoids this whole dance
until an
agent
registers interest in EI_GC_FINISH.

Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/



Testing: manual testing of provided test scenarios and
timing.

Eg with the testcase provided here:
https://bugzilla.redhat.com/show_bug.cgi?id=1751985

I am getting those numbers:
unpatched: no debug: 84s with debug: 225s
patched:   no debug: 85s with debug: 95s

I also tested successfully through jdk/submit repo

Can I please get a review?

Thanks,
Roman



Reply via email to