Hello all, Can I please get reviews of this change? In the meantime, we've done more testing and also field-/torture-testing by a customer who is happy now. :-)
Thanks, Roman > Hi Serguei, > > Thanks for reviewing! > > I updated the patch to reflect your suggestions, very good! > It also includes a fix to allow re-connecting an agent after disconnect, > namely move setup of the trackingEnv and deletedSignatureBag to > _activate() to ensure have those structures after re-connect. > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ > > Let me know what you think! > Roman > >> Hi Roman, >> >> Thank you for taking care about this scalability issue! >> >> I have a couple of quick comments. >> >> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >> >> 72 /* >> 73 * Lock to protect deletedSignatureBag >> 74 */ >> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >> 78 * A bag containing all the deleted classes' signatures. Must be >> accessed under >> 79 * deletedTagLock, >> 80 */ >> 81 struct bag* deletedSignatureBag; >> >> The comments contradict to each other. >> I guess, the lock name at line 79 has to be deletedSignatureLock >> instead of deletedTagLock. >> Also, comma at the end must be replaced with dot. >> >> >> 101 // Tag not found? Ignore. >> 102 if (klass == NULL) { >> 103 debugMonitorExit(deletedSignatureLock); >> 104 return; >> 105 } >> 106 >> 107 // Scan linked-list. >> 108 jlong found_tag = klass->klass_tag; >> 109 while (klass != NULL && found_tag != tag) { >> 110 klass_ptr = &klass->next; >> 111 klass = *klass_ptr; >> 112 found_tag = klass->klass_tag; >> 113 } >> 114 >> 115 // Tag not found? Ignore. >> 116 if (found_tag != tag) { >> 117 debugMonitorExit(deletedSignatureLock); >> 118 return; >> 119 } >> >> >> The code above can be simplified, so that the lines 101-105 are not >> needed anymore. >> It can be something like this: >> >> // Scan linked-list. >> while (klass != NULL && klass->klass_tag != tag) { >> klass_ptr = &klass->next; >> klass = *klass_ptr; >> } >> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore. >> debugMonitorExit(deletedSignatureLock); >> return; >> } >> >> It will take more time when I get a chance to look at the rest. >> >> >> Thanks, >> Serguei >> >> >> >> >> On 12/21/19 13:24, Roman Kennke wrote: >>> Here comes an update that resolves some races that happen when >>> disconnecting an agent. In particular, we need to take the lock on >>> basically every operation, and also need to check whether or not >>> class-tracking is active and return an appropriate result (e.g. an empty >>> list) when we're not. >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>> >>> Thanks, >>> Roman >>> >>> >>>> So, here comes the O(1) implementation: >>>> >>>> - Whenever a class is 'prepared', it is registered with a tag, and we >>>> set-up a listener to get notified when it is unloaded. >>>> - Prepared classes are kept in a datastructure that is a table, which >>>> each entry being the head of a linked-list of KlassNode*. The table is >>>> indexed by tag % slot-count, and then simply prepend the new KlassNode*. >>>> This is O(1) operation. >>>> - When we get notified of unloading a class, we look up the signature of >>>> the reported tag in that table, and remember it in a bag. The KlassNode* >>>> is then unlinked from the table and deallocated. This is ~O(1) operation >>>> too, depending on the depth of the table. In my testcase which hammered >>>> the code with class-loads and unloads, I usually see depths of like 2-3, >>>> but not usually more. It should be ok. >>>> - when processUnloads() gets called, we simply hand out that bag, and >>>> allocate a new one. >>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the >>>> signatures and KlassNode* etc when debug agent gets detached and/or >>>> re-attached (was missing before). >>>> - I also added locks around data-structure-manipulation (was missing >>>> before). >>>> - Also, I only activate this whole process when an actual listener gets >>>> registered on EI_GC_FINISH. This seems to happen right when attaching a >>>> jdb, not sure why jdb does that though. This may be something to improve >>>> in the future? >>>> >>>> In my tests, the performance of class-tracking itself looks really good. >>>> The bottleneck now is clearly actual synthesizing the class-unload >>>> events. I don't see how this can be helped when the debug agent asks for >>>> it? >>>> >>>> Updated webrev: >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>> >>>> Please let me know what you think of it. >>>> >>>> Thanks, >>>> Roman >>>> >>>> >>>>> Alright, the perfectionist in me got me. I am implementing the even more >>>>> efficient ~O(1) class tracking. Please hold off reviewing for now. >>>>> >>>>> Thanks,Roman >>>>> >>>>> Hi Chris, >>>>>>> I'll have a look at this, although it might not be for a few days. In >>>>>>> the meantime, maybe you can describe your new implementation in >>>>>>> classTrack.c so it's easier to look through the changes. >>>>>> Sure. >>>>>> >>>>>> The purpose of this class-tracking is to be able to determine the >>>>>> signatures of unloaded classes when GC/class-unloading happened, so that >>>>>> we can generate the appropriate JDWP event. >>>>>> >>>>>> The current implementation does so by maintaining a table of currently >>>>>> prepared classes by building that table when classTrack is initialized, >>>>>> and then add new classes whenever a class gets loaded. When unloading >>>>>> occurs, that cache is rebuilt into a new table, and compared with the >>>>>> old table, and whatever is in the old, but not in the new table gets >>>>>> returned. The problem is that when GCs happen frequently and/or many >>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount) >>>>>> complexity. >>>>>> >>>>>> The new implementation keeps a linked-list of prepared classes, and also >>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an >>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes >>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the >>>>>> prepared-classes-list) and its signature put in the list that gets >>>>>> returned. >>>>>> >>>>>> The implementation is not perfect. In order to determine whether or not >>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is >>>>>> therefore still O(unloadedClassCount). The assumption here is that >>>>>> unloadedClassCount << classCount. In my experiments this seems to be >>>>>> true, and also reasonable to expect. >>>>>> >>>>>> (I have some ideas how to improve the implementation to ~O(1) but it >>>>>> would be considerably more complex: have to maintain a (hash)table that >>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the >>>>>> unloaded-signatures list there, but I don't currently see that it's >>>>>> worth the effort). >>>>>> >>>>>> In addition to all that, this process is only activated when there's an >>>>>> actual listener registered for EI_GC_FINISH. >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>> Hello all, >>>>>>>> >>>>>>>> Issue: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>> >>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids >>>>>>>> throwing away the class cache on GC, and instead keeps track of >>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>> >>>>>>>> In addition to that, it avoids this whole dance until an agent >>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>> >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>> >>>>>>>> Testing: manual testing of provided test scenarios and timing. >>>>>>>> >>>>>>>> Eg with the testcase provided here: >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>> >>>>>>>> I am getting those numbers: >>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>> patched: no debug: 85s with debug: 95s >>>>>>>> >>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>> >>>>>>>> Can I please get a review? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Roman >>>>>>>> >> >
signature.asc
Description: OpenPGP digital signature