That was in the previous implementation: I got a condition wrong in the table lookup (as noted by Serguei), and this prevented any class-unload-events from getting out. I have fixed this, but found other problems in that implementation (deadlocks and a crash).
The current implementation has none of these problems: we don't need table-lookups - we simply pass-through the signatures, and locking is much simpler and in particular we don't need a lock around the JVMTI call (SetTag) which was the cause of the deadlock. Does that answer your questions? Thanks, Roman > Hi Roman, > > It passed all my testing. I think before you push Serguei has a question > regarding an issue you brought up a while back. You mentioned that you > weren't getting some events, and suddenly started seeing them. We were > discussing it today and it was unclear if this was an issue you were > seeing before your changes, and your changes resolved it, or it was > initially caused by an earlier version of your changes, and you later > fixed it. We just want to better understand what this issue was and how > it was fixed. > > thanks, > > Chris > > On 3/25/20 3:22 PM, Roman Kennke wrote: >> The new job finished, its ID is: >> >> mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289 >> >> Thank you, >> Roman >> >> >>> Yes, please submit a new job. I'll start my testing once I see that the >>> builds are done. >>> >>> Chris >>> >>> On 3/25/20 12:59 PM, Roman Kennke wrote: >>>> Hi Chris, >>>> >>>> Apparently we can get into classTrack_reset() before calling >>>> activate(), >>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around >>>> the cleaning routine fixes the problem for me. >>>> >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/ >>>> >>>> Should I post another submit-repo job with that fix? >>>> >>>> Thanks, >>>> Roman >>>> >>>> >>>>> Hi Roman, >>>>> >>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs: >>>>> >>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000], >>>>> sp=0x00007fbb791f8af0, free space=1022k >>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, >>>>> j=interpreted, Vv=VM code, C=native code) >>>>> C [libjdwp.so+0xdb71] bagEnumerateOver+0x11 >>>>> C [libjdwp.so+0xe365] classTrack_reset+0x25 >>>>> C [libjdwp.so+0xfca1] debugInit_reset+0x71 >>>>> C [libjdwp.so+0x12e0d] debugLoop_run+0x38d >>>>> C [libjdwp.so+0x25700] acceptThread+0x80 >>>>> V [libjvm.so+0xf4b5a7] JvmtiAgentThread::call_start_function()+0x1c7 >>>>> V [libjvm.so+0x15215c6] JavaThread::thread_main_inner()+0x226 >>>>> V [libjvm.so+0x1527736] Thread::call_run()+0xf6 >>>>> V [libjvm.so+0x1250ade] thread_native_entry(Thread*)+0x10e >>>>> >>>>> >>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There >>>>> doesn't seem to be anything magic on the command line that might be >>>>> triggering. Pretty much I see it with all the various VM configs we >>>>> test. >>>>> >>>>> I'm also seeing crashes in the following tests, but not as often: >>>>> >>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java >>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java >>>>> >>>>> >>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java >>>>> >>>>> >>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java >>>>> >>>>> >>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java >>>>> >>>>> >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> >>>>> On 3/25/20 11:37 AM, Roman Kennke wrote: >>>>>> Hi Chris, >>>>>> >>>>>>> Regarding the new assert: >>>>>>> >>>>>>> 105 if (gdata && gdata->assertOn) { >>>>>>> 106 // Check this is not already tagged. >>>>>>> 107 jlong tag; >>>>>>> 108 error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, >>>>>>> klass, &tag); >>>>>>> 109 if (error != JVMTI_ERROR_NONE) { >>>>>>> 110 EXIT_ERROR(error, "Unable to GetTag with class >>>>>>> trackingEnv"); >>>>>>> 111 } >>>>>>> 112 JDI_ASSERT(tag == NOT_TAGGED); >>>>>>> 113 } >>>>>>> >>>>>>> I think you should remove the gdata check. gdata should never be >>>>>>> NULL >>>>>>> when you get to this code. If it is ever NULL then there's a bug, >>>>>>> and >>>>>>> the check will hide the bug. >>>>>> Ok, will remove this. >>>>>> >>>>>>> Regarding testing, after you do the submit repo testing let me know >>>>>>> the >>>>>>> jobID and I'll do additional testing on it. >>>>>> I did the submit repo earlier today, and it came back green: >>>>>> >>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762 >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote: >>>>>>>> Hi Sergei, >>>>>>>> >>>>>>>>> The fix looks pretty clean now. >>>>>>>>> I also like new name of the lock.:) >>>>>>>> Thank you! >>>>>>>> >>>>>>>>> Just one comment below. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 110 if (tag != 0l) { >>>>>>>>> 111 return; // Already added >>>>>>>>> 112 } >>>>>>>>> >>>>>>>>> It is better to use a named constant or macro instead. >>>>>>>>> Also, it'd be nice to add a short comment about this value is. >>>>>>>> As I replied to Chris earlier, this whole block can be turned >>>>>>>> into an >>>>>>>> assert. I also made a constant for the value 0, which should be >>>>>>>> pretty >>>>>>>> much self-explaining. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/ >>>>>>>> >>>>>>>>> How do you test the fix? >>>>>>>> I am using a manual test that is provided in this bug report: >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>> >>>>>>>> "Script to compare performance of GC with and without debugger, >>>>>>>> when >>>>>>>> many classes are loaded and classes are being unloaded": >>>>>>>> >>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688 >>>>>>>> >>>>>>>> I am also using this test and manually attach/detach jdb a >>>>>>>> couple of >>>>>>>> times in a row to check that disconnecting and reconnecting works >>>>>>>> well >>>>>>>> (this tended to deadlock or crash with an earlier version of the >>>>>>>> patch, >>>>>>>> and is now looking good). >>>>>>>> >>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we >>>>>>>> all >>>>>>>> agree that the fix is reasonable, I will push it to the submit >>>>>>>> repo. I >>>>>>>> am not sure if any of those tests actually exercise that code, >>>>>>>> though. >>>>>>>> Let me know if you want me to run any specific tests. >>>>>>>> >>>>>>>> Thank you, >>>>>>>> Roman >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote: >>>>>>>>>> I believe I came up with a much simpler solution that also >>>>>>>>>> solves the >>>>>>>>>> problems of the existing one, and the ones I proposed earlier. >>>>>>>>>> >>>>>>>>>> It turns out that we can take advantage of the fact that we >>>>>>>>>> can use >>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>>>>>> explicitely >>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a >>>>>>>>>> pointer >>>>>>>>>> to the signature of a class into the tag, and pull it out again >>>>>>>>>> when we >>>>>>>>>> get notified that the class gets unloaded. >>>>>>>>>> >>>>>>>>>> This means we don't need an extra data-structure to keep track of >>>>>>>>>> classes and signatures, and it also makes the story around >>>>>>>>>> locking >>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning >>>>>>>>>> of all >>>>>>>>>> classes needed (as in the current implementation) and no >>>>>>>>>> searching of >>>>>>>>>> table needed (like in my previous attempts). >>>>>>>>>> >>>>>>>>>> Please review this new revision: >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>>>>> >>>>>>>>>> (Notice that there still appears to be a performance bottleneck >>>>>>>>>> with >>>>>>>>>> class-unloading when an actual debugger is attached. This >>>>>>>>>> doesn't seem >>>>>>>>>> to be related to the classTrack.c implementation though, but >>>>>>>>>> looks like >>>>>>>>>> a consequence of getting all those class-unload notifications >>>>>>>>>> over the >>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging >>>>>>>>>> up the >>>>>>>>>> buffers.) >>>>>>>>>> >>>>>>>>>> I am not sure why jdb needs to enable class-unload listener >>>>>>>>>> always. A >>>>>>>>>> simple hack disables it, and performance is brilliant, even when >>>>>>>>>> jdb is >>>>>>>>>> attached: >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> But this is not in the scope of this bug.) >>>>>>>>>> >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 3/16/20 8:05 AM, serguei.spit...@oracle.com wrote: >>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 3/15/20 23:57, serguei.spit...@oracle.com wrote: >>>>>>>>>>>> Hi Roman, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>>>>>> >>>>>>>>>>>> Some comments are below. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>>>>>> 88 { >>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 93 return; >>>>>>>>>>>> 94 } >>>>>>>>>>>> Just a question: >>>>>>>>>>>> Q1: Should the ObjectFree events be disabled for the >>>>>>>>>>>> jvmtiEnv >>>>>>>>>>>> that does >>>>>>>>>>>> the class tracking if class tracking has not been >>>>>>>>>>>> initialized? >>>>>>>>>>>> >>>>>>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>>>>>>>> better to >>>>>>>>>>>> be something like: lastClassTag or highestClassTag. >>>>>>>>>>>> >>>>>>>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>>>>>> klass >>>>>>>>>>>> not >>>>>>>>>>>> found - ignore. >>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 108 return; >>>>>>>>>>>> 109 } >>>>>>>>>>>> It seems to me, something is wrong in the condition at L106 >>>>>>>>>>>> above. >>>>>>>>>>>> Should it be? : >>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) >>>>>>>>>>>> >>>>>>>>>>>> Otherwise, how can the second check ever work correctly >>>>>>>>>>>> as the >>>>>>>>>>>> return >>>>>>>>>>>> will always happen when (klass != NULL)? >>>>>>>>>>>> >>>>>>>>>>>> There are several places in this file with the the indent: >>>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 93 return; >>>>>>>>>>>> 94 } >>>>>>>>>>>> ... >>>>>>>>>>>> 152 if (currentClassTag == -1) { >>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 155 return; >>>>>>>>>>>> 156 } >>>>>>>>>>>> ... >>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class >>>>>>>>>>>> trackingEnv"); >>>>>>>>>>>> 163 } >>>>>>>>>>>> 164 if (tag != 0l) { >>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 166 return; // Already added >>>>>>>>>>>> 167 } >>>>>>>>>>>> ... >>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>>>>>> 282 { >>>>>>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>>>>>> 285 return JNI_TRUE; >>>>>>>>>>>> 286 } >>>>>>>>>>>> ... >>>>>>>>>>>> 291 void >>>>>>>>>>>> 292 classTrack_reset(void) >>>>>>>>>>>> 293 { >>>>>>>>>>>> 294 int idx; >>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>>> 296 >>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>>>>>> 299 while (node != NULL) { >>>>>>>>>>>> 300 KlassNode* next = node->next; >>>>>>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>>>>>> 303 node = next; >>>>>>>>>>>> 304 } >>>>>>>>>>>> 305 } >>>>>>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>>>>>> 307 >>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>>>>>> 310 >>>>>>>>>>>> 311 currentClassTag = -1; >>>>>>>>>>>> 312 >>>>>>>>>>>> 313 >>>>>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 314 trackingEnv = NULL; >>>>>>>>>>>> 315 >>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> >>>>>>>>>>>> Could you, please, fix several comments below? >>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>>>>>> class-unloads >>>>>>>>>>>> The comma is not needed. >>>>>>>>>>>> Would it better to replace: klass tags => klass_tag's ? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and >>>>>>>>>>>> deletedSignatureBag >>>>>>>>>>>> consistent >>>>>>>>>>>> Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>>>>>>>> >>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>>>>>>> remembers it in deletedSignatureBag. Would be better to use >>>>>>>>>>>> words >>>>>>>>>>>> like >>>>>>>>>>>> "store" or "record", "Find" should not start from capital >>>>>>>>>>>> letter: >>>>>>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>>>>>> signature in deletedSignatureBag. >>>>>>>>>>>> >>>>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not >>>>>>>>>>>> initialized, >>>>>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>>>>>> Missed >>>>>>>>>>>> dot >>>>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) >>>>>>>>>>>> { // >>>>>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>>>>>> point we >>>>>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>>>>> The comment above can be better. Maybe, something like: >>>>>>>>>>> " At this point, we found the KlassNode matching the klass >>>>>>>>>>> tag(and it is >>>>>>>>>>> linked). >>>>>>>>>>> >>>>>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>>>>> Better: Record the signature of the unloaded class and >>>>>>>>>>> unlink it. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>>>>> Hello all, >>>>>>>>>>>>> >>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, >>>>>>>>>>>>> we've done >>>>>>>>>>>>> more testing and also field-/torture-testing by a customer >>>>>>>>>>>>> who is >>>>>>>>>>>>> happy >>>>>>>>>>>>> now. :-) >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>>>>> >>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>>>>> disconnect, >>>>>>>>>>>>>> namely move setup of the trackingEnv and >>>>>>>>>>>>>> deletedSignatureBag to >>>>>>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 72 /* >>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>>>> 74 */ >>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>>>>> Must be >>>>>>>>>>>>>>> accessed under >>>>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>>>> 80 */ >>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The comments contradict to each other. >>>>>>>>>>>>>>> I guess, the lock name at line 79 has to be >>>>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>>>> Also, comma at the end must be replaced with dot. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>> 104 return; >>>>>>>>>>>>>>> 105 } >>>>>>>>>>>>>>> 106 >>>>>>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>>>> 113 } >>>>>>>>>>>>>>> 114 >>>>>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>> 118 return; >>>>>>>>>>>>>>> 119 } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The code above can be simplified, so that the lines >>>>>>>>>>>>>>> 101-105 >>>>>>>>>>>>>>> are not >>>>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>>>> It can be something like this: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>> return; >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>>>> rest. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen >>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>>>>> lock on >>>>>>>>>>>>>>>> basically every operation, and also need to check whether >>>>>>>>>>>>>>>> or not >>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>>>>>> list) when we're not. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>>>>>> tag, and we >>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. >>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend >>>>>>>>>>>>>>>>> the new >>>>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. >>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase >>>>>>>>>>>>>>>>> which hammered >>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see >>>>>>>>>>>>>>>>> depths >>>>>>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out >>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets >>>>>>>>>>>>>>>>> detached >>>>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation >>>>>>>>>>>>>>>>> (was >>>>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>>>> before). >>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right >>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself >>>>>>>>>>>>>>>>> looks >>>>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am >>>>>>>>>>>>>>>>>> implementing >>>>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>>>>>>>> for now. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be >>>>>>>>>>>>>>>>>>>> for a >>>>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the >>>>>>>>>>>>>>>>>>>> changes. >>>>>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a >>>>>>>>>>>>>>>>>>> table >>>>>>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>>>>>> prepared classes by building that table when >>>>>>>>>>>>>>>>>>> classTrack is >>>>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. >>>>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the >>>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is >>>>>>>>>>>>>>>>>>> scanned, >>>>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption >>>>>>>>>>>>>>>>>>> here >>>>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon >>>>>>>>>>>>>>>>>>> unload, >>>>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently >>>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of >>>>>>>>>>>>>>>>>>>>> classTrack.c. >>>>>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance >>>>>>>>>>>>>>>>>>>>> until an >>>>>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>>>>>> timing. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here: >>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>>>>>>>>> patched: no debug: 85s with debug: 95s >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Can I please get a review? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>>>> >>> > >
signature.asc
Description: OpenPGP digital signature