On Thu, 2006-11-09 at 02:01 +0300, Ivan Volosyuk wrote:
> Robin,
>
> thank you for detailed description of the algorithm. IMHO, this was
> the most complicated place of the whole story: how to have a weak
> reference to classloader and still be able to get it alive again. This
> shouldn't be performance critical part and is quite doable. I
> absolutely agree with your estimations about tracing extra reference
> per object. The approach you propose is more efficient and quite
> elegant.
> --
> Ivan
Thanks :)

> On 11/8/06, Robin Garner <[EMAIL PROTECTED]> wrote:
> > Robin Garner wrote:
> > > Aleksey Ignatenko wrote:
> > >> Robin.
> > >>
> > >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> > >>> object instead of a strong one. When the reference becomes (strong)ly
> > >>> unreachable, invoke the class-unloading phase.
> > >>
> > >>
> > >> If you have weak reference to j.l.Classloader - GC will collect it
> > >> (with all appropriate jlClasses) as soon as there are no references to
> > >> j.l.Classloader and appropriate classes. But there is possible
> > >> situation when there are some live objects of that classes and no
> > >> references to jlClassloader and jlClasses. This will lead to
> > >> unpredictable consequences (crash, etc).
> > >>
> > >> I want to remind that there are 3 mandatory conditions of class
> > >> unloading:
> > >>
> > >> 1. j.l.Classloader instance is unreachable.
> > >>
> > >> 2. Appropriate j.l.Class instances are unreachable.
> > >>
> > >> 3. No object of any class loaded by appropriate class loader exists.
> > >
> > > Let me repeat. I offer an efficient solution to (3). I don't purport
> > > to have a solution to (1) and (2).
> >
> > Let me just add: This is because I don't think (1) or (2) are
> > particularly difficult from a performance point of view, although I'm
> > happy to accept that there may still be some subtle engineering
> > challenges.
> >
> > Now this is just off the top of my head, but what about this for a
> > design:
> > - A j.l.ClassLoader maintains a collection of each of the classes it has
> >   loaded
> > - A j.l.Class contains a pointer to its j.l.ClassLoader
> > - A j.l.Class maintains a collection of its vtable(s) (or a pointer if
> >   1:1).
> > The point of this is that a class loader and its classes are a 'self
> > sustaining' data structure - if one element in it is reachable the whole
> > thing is reachable.
> >
> > The VM maintains a weak reference to all its j.l.ClassLoader instances,
> > and maintains a ReferenceQueue for weakly-reachable classloaders.
> > ClassLoaders are placed on the ReferenceQueue if and only if they are
> > unreachable from the heap (including via their j.l.Class objects). Note
> > this is an irreversible condition: objects that are unreachable can
> > never become reachable again, except through very specific methods.
> >
> > When it sweeps the ReferenceQueue for unreachable classloaders, the VM
> > places the unreachable classloaders in a queue of classloaders that are
> > candidates for unloading. This queue is part of the root set of the VM.
> > A classloader in this queue is unreachable from the heap, and can be
> > unloaded when there are no objects of any class it has loaded.
> >
> > This is where my mechanism comes into play.
> >
> > If an object executes getClass() then its classloader is removed from
> > the unloadable classloader queue, its weak reference gets recreated and
> > we're back at the initial state. My guess is that this is a pretty
> > infrequent method call.
> >
> > I think this stage of the algorithm is easy in performance terms -
> > difficult in terms of proving correctness, but if you have an efficient
> > reachability mechanism for classes I think the building blocks are
> > there, and the subtleties are nothing that a talented engineer can't
> > solve.
> >
> >
> > I'm not 100% sure what your counter-proposal is: I recall 2 approaches
> > from the mailing list:
> > 1) Each object has an additional word in its header that points back to
> > its j.l.Class object, and we proceed from here.
> >
> > Given that the mean object size is ~28 bytes, this proposal adds 14% to
> > each object size. This increases the frequency of GC by 14% and incurs
> > a 14% slowdown. Of course this is an oversimplification but a 14%
> > slowdown is a pretty lousy starting point to argue from.
> >
> > 2) The existing pointer in the GC header is traced during GC time.
> >
> > The average number of pointers per object (excluding the vtable) is
> > between 1.5 and 2 for the majority of benchmarks I have looked at
> > (footnote: if you know something different, drop me a line) (geometric
> > mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}). Tracing one
> > additional reference per object will therefore increase the cost of GC
> > by ~60% on average. Again oversimplification but indicative. If we
> > assume that GC accounts for 10% of runtime (more or less depending on
> > heap size), this is a runtime overhead of 6%.
> >
> > My proposal has been measured at ~1% overhead in GC time, or 0.1% in
> > execution time (caveats as above). If there is some complexity in
> > establishing classloader reachability from this basis, I would assume it
> > can easily be absorbed.
> >
> > Therefore I think my proposal, while not complete, can form the basis of
> > an efficient complete system for class unloading.
> >
> > (PS: I'd *love* to be proven wrong)
> >
> > cheers,
> > Robin
> >
> > > Regards,
> > > Robin
> > >
> > >>
> > >>
> > >> Aleksey.
> > >>
> > >>
> > >> On 11/8/06, Robin Garner <[EMAIL PROTECTED]> wrote:
> > >>>
> > >>> Pavel Pervov wrote:
> > >>> > Robin,
> > >>> >
> > >>> > The kind of model I had in mind was along the lines of:
> > >>> >> - VM maintains a linked list (or other collection type) of the
> > >>> >> currently loaded classloaders, each of which in turn maintains
> > >>> >> the collection of classes loaded by that type. The sweep of
> > >>> >> classloaders goes something like:
> > >>> >>
> > >>> >> for (ClassLoader cl : classLoaders)
> > >>> >>   for (Class c : cl.classes)
> > >>> >>     cl.reachable |= c.vtable.reachable
> > >>> >
> > >>> >
> > >>> > This is not enough. There may be live j/l/Class'es and
> > >>> > j/l/Classloader's in the heap.
> > >>> > Even though no objects of any classes loaded by a particular
> > >>> > class loader are available in the heap, if we have live reference
> > >>> > to j/l/ClassLoader itself, it just can't be unloaded.
> > >>>
> > >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> > >>> object instead of a strong one. When the reference becomes (strong)ly
> > >>> unreachable, invoke the class-unloading phase.
> > >>>
> > >>> To me the key issue from a performance POV is the reachability of
> > >>> classes from objects in the heap. I don't pretend to have an answer to
> > >>> the other questions---the performance critical one is the one I have
> > >>> addressed, and I accept there may be many solutions to this part of
> > >>> the question.
> > >>>
> > >>> > I believe that a separate heap trace pass, different from the
> > >>> >> standard GC, that visited vtables and reachable resources from
> > >>> >> there would also be a viable solution. As mentioned in an earlier
> > >>> >> post, writing this in MMTk (where a heap trace operation is a
> > >>> >> class that you can easily subtype to do this) would be easy.
> > >>> >>
> > >>> >> One of the advantages of my other proposal is that it can be
> > >>> >> implemented in the VM independent of the GC to some extent. This
> > >>> >> additional mark/scan phase may or may not be easy to implement,
> > >>> >> depending on the structure of DRLVM GCs, which is something I
> > >>> >> haven't explored.
> > >>> >
> > >>> >
> > >>> > DRLVM may work with (potentially) any number of GCs. Designing class
> > >>> > unloading the way, which would require mark&scan cooperation from
> > >>> > GC, is not generally a good idea (from my HPOV).
> > >>>
> > >>> That's what I gathered. Hence my proposal.
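
For reference, the scheme discussed in this thread (the classloader sweep
plus the queue of unloading candidates) can be sketched in Java roughly as
below. This is only an illustration under assumptions: ClassRecord,
LoaderRecord, UnloadSketch and all of their fields are hypothetical names,
not DRLVM or MMTk APIs. The vtableReachable flag stands in for whatever the
GC would record when it marks a vtable, and heapReachable stands in for the
state of the VM's weak reference to the j.l.ClassLoader object.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a loaded class and its vtable mark bit.
class ClassRecord {
    final String name;
    boolean vtableReachable; // assumed to be set by the GC when it sees an instance
    ClassRecord(String name) { this.name = name; }
}

// Hypothetical stand-in for a classloader and the classes it has loaded.
class LoaderRecord {
    final String name;
    final List<ClassRecord> classes = new ArrayList<>();
    boolean heapReachable = true; // false once the VM's weak reference is cleared
    LoaderRecord(String name) { this.name = name; }

    // The sweep from the thread: reachable |= c.vtable.reachable for each class.
    boolean hasLiveInstances() {
        boolean reachable = false;
        for (ClassRecord c : classes) {
            reachable |= c.vtableReachable;
        }
        return reachable;
    }
}

class UnloadSketch {
    final List<LoaderRecord> loaders = new ArrayList<>();
    // Candidates for unloading; in the proposal this queue is part of the root set.
    final List<LoaderRecord> candidates = new ArrayList<>();

    // Move loaders that are no longer reachable from the heap to the candidate queue.
    void enqueueUnreachableLoaders() {
        for (Iterator<LoaderRecord> it = loaders.iterator(); it.hasNext();) {
            LoaderRecord cl = it.next();
            if (!cl.heapReachable) {
                it.remove();
                candidates.add(cl);
            }
        }
    }

    // After a GC cycle: unload candidates that have no live instances.
    List<LoaderRecord> sweepCandidates() {
        List<LoaderRecord> unloaded = new ArrayList<>();
        for (Iterator<LoaderRecord> it = candidates.iterator(); it.hasNext();) {
            LoaderRecord cl = it.next();
            if (!cl.hasLiveInstances()) {
                it.remove();
                unloaded.add(cl);
            }
        }
        return unloaded;
    }

    // getClass() on a live instance puts the loader back in the initial state.
    void onGetClass(LoaderRecord cl) {
        if (candidates.remove(cl)) {
            cl.heapReachable = true;
            loaders.add(cl);
        }
    }
}
```

A real VM would also have to clear the vtable marks before each collection
and make onGetClass() safe against a concurrent sweep; the sketch leaves
both out.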