Re: Question in understanding ClassValue better
On 24.05.2016 15:33, Peter Levart wrote:
> On 05/24/2016 01:41 PM, Peter Levart wrote:
> Hm, It seems that my example will not help much. It also seems that the only problem with plain:
>
>     static final ClassValue<MetaClass> META_CLASS_CV = new ClassValue<MetaClass>() {
>         @Override
>         protected MetaClass computeValue(Class<?> type) {
>             return new MetaClass(type);
>         }
>     };
>
> ...is the fact that MetaClass is a class loaded by non-bootstrap class loader and that in case this is a Web app class loader, it prevents undeployment.
>
> Can you confirm that a MetaClass instance only references the 'type' Class it is derived from (its Methods, Fields, etc.) and never references objects from any child class loaders of the type's class loader?

If I have the meta class for Integer, then the metaclass itself is an object from a child loader of the loader of Integer. Which means "no" unless I understand the question wrongly.

> If that is the case, then you could replace MetaClass with a generic data structure, composed of instances of bootstrap classes (HashMap, ArrayList, Object[], ...). That way, Groovy runtime class loader will not be "captured" by a reference from an aClass loaded by bootstrap class loader.
>
> Is MetaClass a complicated data structure?
>
> ...peeking at Groovy sources, very much so.

yes... probably more complex than needed.

> There's a solution though. Various Meta* classes in Groovy runtime reference at some point the reflective objects (Class, Method, Constructor, Field) describing the 'type' they are derived from. Every reference to a Class object from such Meta* object should be wrapped in something like the following:
>
>     public final class ClassReference extends WeakReference<Class<?>>
>         implements Supplier<Class<?>> {
>
>         private static final ConcurrentHashMap<ClassReference, ClassReference> MAP
>             = new ConcurrentHashMap<>();
>         private static final ReferenceQueue<Class<?>> QUEUE = new ReferenceQueue<>();
>
>         public static ClassReference forClass(Class<?> clazz) {
>             ClassReference oldRef;
>             while ((oldRef = (ClassReference) QUEUE.poll()) != null) {
>                 MAP.remove(oldRef);
>             }
>             ClassReference newRef = new ClassReference(clazz);
>             oldRef = MAP.putIfAbsent(newRef, newRef);
>             return oldRef == null ? newRef : oldRef;
>         }
>
>         private final String name;
>         private final int hash;
>
>         private ClassReference(Class<?> clazz) {
>             super(clazz, QUEUE);
>             name = clazz.getName();
>             hash = clazz.hashCode();
>         }
>
>         @Override
>         public Class<?> get() {
>             Class<?> clazz = super.get();
>             if (clazz == null) {
>                 throw new IllegalStateException(
>                     "Class " + name + " has already been unloaded");
>             }
>             return clazz;
>         }
>
>         @Override
>         public int hashCode() {
>             return hash;
>         }
>
>         @Override
>         public boolean equals(Object obj) {
>             Class<?> clazz;
>             return obj == this ||
>                    (obj instanceof ClassReference &&
>                     (clazz = get()) != null &&
>                     clazz == ((ClassReference) obj).get());
>         }
>     }
>
> Every reference to a Method should be wrapped in something like this:
>
>     public final class MethodReference implements Supplier<Method> {
>
>         private static final ClassValue<Method[]> DECLARED_METHODS_CV =
>             new ClassValue<Method[]>() {
>                 @Override
>                 protected Method[] computeValue(Class<?> type) {
>                     return type.getDeclaredMethods();
>                 }
>             };
>
>         private final ClassReference declaringClassRef;
>         private final int index;
>
>         public MethodReference(Method method) {
>             Class<?> declaringClass = method.getDeclaringClass();
>             declaringClassRef = ClassReference.forClass(declaringClass);
>             Method[] methods = DECLARED_METHODS_CV.get(declaringClass);
>             index = Arrays.asList(methods).indexOf(method);
>         }
>
>         @Override
>         public Method get() {
>             return DECLARED_METHODS_CV.get(declaringClassRef.get())[index];
>         }
>
>         @Override
>         public int hashCode() {
>             return declaringClassRef.hashCode() * 31 + index;
>         }
>
>         @Override
>         public boolean equals(Object obj) {
>             return obj == this || (
>                 obj instanceof MethodReference &&
>                 ((MethodReference) obj).declaringClassRef == this.declaringClassRef &&
>                 ((MethodReference) obj).index == this.index
>             );
>         }
>     }
>
> And similar with every reference to a Constructor or Field.
>
> In addition, the MetaClass structure should be isolated from the class it is derived from with what I presented in the previous message (ClassValue<WeakReference<MetaClass>> + ArrayList<MetaClass> referenced from MetaClass static field)
>
> Would that work?

So the rule of thumb would be to either use only bootstrap classes as AV (and values strongly reachable from it) if they reference aClass strongly. Or to store a WeakReference as AV, which then can have a value that makes aClass strongly reachable from there, since it would realize a weak reachability
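The first half of that rule of thumb (an AV built only from bootstrap classes) can be sketched as follows. This is a minimal illustration, not Groovy code: the class name BootstrapOnlyValues and the choice of caching method names are assumptions made for the example.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class, not part of Groovy or the JDK.
final class BootstrapOnlyValues {

    // The associated value is a List<String>. Every object reachable from it
    // is an instance of a class defined by the bootstrap loader, so the value
    // itself does not pin the class loader that defined BootstrapOnlyValues.
    static final ClassValue<List<String>> METHOD_NAMES = new ClassValue<List<String>>() {
        @Override
        protected List<String> computeValue(Class<?> type) {
            List<String> names = new ArrayList<>();
            for (Method m : type.getDeclaredMethods()) {
                names.add(m.getName());
            }
            return Collections.unmodifiableList(names);
        }
    };

    public static void main(String[] args) {
        List<String> names = METHOD_NAMES.get(Integer.class);
        // The value is computed once per (Class, ClassValue) pair and cached.
        System.out.println(names == METHOD_NAMES.get(Integer.class));
        // A null class loader denotes the bootstrap loader.
        System.out.println(names.getClass().getClassLoader());
    }
}
```

The trade-off is that such a value carries no behaviour of its own; any logic has to live in the runtime classes that interpret the plain data.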
Re: Question in understanding ClassValue better
On 05/24/2016 01:41 PM, Peter Levart wrote:

Hm, It seems that my example will not help much. It also seems that the only problem with plain:

    static final ClassValue<MetaClass> META_CLASS_CV = new ClassValue<MetaClass>() {
        @Override
        protected MetaClass computeValue(Class<?> type) {
            return new MetaClass(type);
        }
    };

...is the fact that MetaClass is a class loaded by non-bootstrap class loader and that in case this is a Web app class loader, it prevents undeployment.

Can you confirm that a MetaClass instance only references the 'type' Class it is derived from (its Methods, Fields, etc.) and never references objects from any child class loaders of the type's class loader?

If that is the case, then you could replace MetaClass with a generic data structure, composed of instances of bootstrap classes (HashMap, ArrayList, Object[], ...). That way, Groovy runtime class loader will not be "captured" by a reference from an aClass loaded by bootstrap class loader.

Is MetaClass a complicated data structure?

...peeking at Groovy sources, very much so.

There's a solution though. Various Meta* classes in Groovy runtime reference at some point the reflective objects (Class, Method, Constructor, Field) describing the 'type' they are derived from. Every reference to a Class object from such Meta* object should be wrapped in something like the following:

    public final class ClassReference extends WeakReference<Class<?>>
        implements Supplier<Class<?>> {

        private static final ConcurrentHashMap<ClassReference, ClassReference> MAP
            = new ConcurrentHashMap<>();
        private static final ReferenceQueue<Class<?>> QUEUE = new ReferenceQueue<>();

        public static ClassReference forClass(Class<?> clazz) {
            ClassReference oldRef;
            while ((oldRef = (ClassReference) QUEUE.poll()) != null) {
                MAP.remove(oldRef);
            }
            ClassReference newRef = new ClassReference(clazz);
            oldRef = MAP.putIfAbsent(newRef, newRef);
            return oldRef == null ? newRef : oldRef;
        }

        private final String name;
        private final int hash;

        private ClassReference(Class<?> clazz) {
            super(clazz, QUEUE);
            name = clazz.getName();
            hash = clazz.hashCode();
        }

        @Override
        public Class<?> get() {
            Class<?> clazz = super.get();
            if (clazz == null) {
                throw new IllegalStateException(
                    "Class " + name + " has already been unloaded");
            }
            return clazz;
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object obj) {
            Class<?> clazz;
            return obj == this ||
                   (obj instanceof ClassReference &&
                    (clazz = get()) != null &&
                    clazz == ((ClassReference) obj).get());
        }
    }

Every reference to a Method should be wrapped in something like this:

    public final class MethodReference implements Supplier<Method> {

        private static final ClassValue<Method[]> DECLARED_METHODS_CV =
            new ClassValue<Method[]>() {
                @Override
                protected Method[] computeValue(Class<?> type) {
                    return type.getDeclaredMethods();
                }
            };

        private final ClassReference declaringClassRef;
        private final int index;

        public MethodReference(Method method) {
            Class<?> declaringClass = method.getDeclaringClass();
            declaringClassRef = ClassReference.forClass(declaringClass);
            Method[] methods = DECLARED_METHODS_CV.get(declaringClass);
            index = Arrays.asList(methods).indexOf(method);
        }

        @Override
        public Method get() {
            return DECLARED_METHODS_CV.get(declaringClassRef.get())[index];
        }

        @Override
        public int hashCode() {
            return declaringClassRef.hashCode() * 31 + index;
        }

        @Override
        public boolean equals(Object obj) {
            return obj == this || (
                obj instanceof MethodReference &&
                ((MethodReference) obj).declaringClassRef == this.declaringClassRef &&
                ((MethodReference) obj).index == this.index
            );
        }
    }

And similar with every reference to a Constructor or Field.

In addition, the MetaClass structure should be isolated from the class it is derived from with what I presented in the previous message (ClassValue<WeakReference<MetaClass>> + ArrayList<MetaClass> referenced from MetaClass static field)

Would that work?

Regards, Peter ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
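The message says a Field should be wrapped "similarly" to a Method but shows no code for it. A minimal sketch of what that could look like follows. The name FieldReference is hypothetical, and to stay self-contained it uses a plain WeakReference to the declaring class instead of the interned ClassReference shown above, so it gives up that class's identity-based equality.

```java
import java.lang.ref.WeakReference;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical analogue of MethodReference; a simplified sketch only.
final class FieldReference implements Supplier<Field> {

    // The Field[] is an instance of a bootstrap class and is strongly
    // reachable only from the declaring class it describes.
    private static final ClassValue<Field[]> DECLARED_FIELDS_CV = new ClassValue<Field[]>() {
        @Override
        protected Field[] computeValue(Class<?> type) {
            return type.getDeclaredFields();
        }
    };

    private final WeakReference<Class<?>> declaringClassRef;
    private final int index;

    FieldReference(Field field) {
        Class<?> declaringClass = field.getDeclaringClass();
        declaringClassRef = new WeakReference<>(declaringClass);
        // Remember only the position of the field in the cached array.
        index = Arrays.asList(DECLARED_FIELDS_CV.get(declaringClass)).indexOf(field);
    }

    @Override
    public Field get() {
        Class<?> declaringClass = declaringClassRef.get();
        if (declaringClass == null) {
            throw new IllegalStateException("Declaring class has already been unloaded");
        }
        return DECLARED_FIELDS_CV.get(declaringClass)[index];
    }
}
```

As with MethodReference, the wrapper stores only a weak class reference and an index, so holding a FieldReference does not keep the declaring class alive.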
Re: Question in understanding ClassValue better
On 05/24/2016 10:26 AM, Jochen Theodorou wrote:
> Peter, I fully understand if you cannot reply to this mail easily, just wanted to ping to ensure this is not forgotten ;)

Sorry Jochen, I forgot about it... Thanks for reminding me!

> On 20.05.2016 01:33, Jochen Theodorou wrote:
>> On 19.05.2016 21:32, Peter Levart wrote:
>> [...]
>>> a ClassValue instance can be thought of as a component of a compound key. Together with a Class, they form a tuple (aClass, aClassValue) that can be associated with an "associated value", AV. And yes, the AVs associated with tuple containing a particular Class are strongly reachable from that Class.
>>
>> I see, I was mixing ClassValue and AV, I was more talking about AV, than ClassValue itself, though ClassValue surely plays an important role in here. Anyway, I am going for that AV is the value computed by aClassValue. You said that the AV is strongly reachable from aClass. Can I further assume, that aClassValue is not strongly reachable from aClass? And that aClassValue can be collected independent of aClass? Can I further assume, that aClassValue can be collected even if AVs for it continue to exist?

ClassValue implementation is such that aClassValue is not strongly reachable from aClass as a consequence of (aClassValue, aClass) -> AV association. It can be reachable because of other unrelated references. Therefore, the answer is YES for all above questions.

>> [...]
>>> An AV associated with a tuple (Integer.TYPE, aClassValue) -> AV can be garbage collected. But only if aClassValue can be garbage collected 1st.
>>
>> hmm... so in (aClass, aClassValue)->AV if aClassValue can be collected, AV can, but not the other way around...

It is a two-stage process with a GC cycle between the stages. In 1st stage aClassValue is unreferenced, then GC kicks in, collects aClassValue and enqueues a WeakReference that was pointing to aClassValue, then the enqueued WeakReference triggers expunging of associated AV on next access to some association (???, aClass) for the same aClass. That's how it is implemented in current ClassValue. If there's no "next access", then there's no expunging and the AV remains reachable.

>> what about aClass? if nothing but AV is referencing aClass, can AV be garbage collected, even if aClassValue cannot? Can I extend your statement to AV can be collected only if either aClass or aClassValue can be garbage collected first?

aClass is different from aClassValue in that aClass and all associated AVs can be collected at the same time. AVs are reachable from associated aClass, but if aClass is not strongly reachable from anywhere, they can all be collected together. aClassValue reachability does not play a role here.

>> Let us assume this is the case for now.
>>
>>> This is the most tricky part to get right in order to prevent leaks. If in above example, aClassValue is reachable from the AV, then we have a leak. The reachability of a ClassValue instance from the associated value AV is not always obvious. One has to take into account the following non-obvious references:
>>>
>>> 1 - each object instance has an implicit reference to its implementing class
>>> 2 - each class has a reference to its defining ClassLoader
>>> 3 - each ClassLoader has a reference to all classes defined by it (except VM anonymous classes)
>>> 4 - each ClassLoader has a reference to all its predecessors (that it delegates to)
>>>
>>> Since a ClassValue instance is typically assigned to a static final field, such instance is reachable from the class that declares the field. I think you can get the picture from that.
>>
>> yeah... that is actually problematic. Because if I keep no hard reference the ClassValue can be collected, even if the AVs still exist...

you need aClassValue to look up the AVs. Without aClassValue, they are not retrievable. So you better keep a reference to aClassValue as long as you need it to look up the AVs...

>> meaning they would become unreachable. And if I keep one I have a memory leak... well more about in the program you have shown me later on.
>>
>> [...]

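The non-obvious references listed above (object to class, class to defining loader, loader to its parents) can be walked with public reflection calls. The following sketch does that; the class name ReferenceChain is hypothetical, and rule 3 (a loader's references to the classes it defined) is not shown because it is not exposed through a public API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the JDK or Groovy.
final class ReferenceChain {

    // Lists what an object implicitly keeps reachable through its class
    // and the class loader delegation chain.
    static List<String> chain(Object o) {
        List<String> steps = new ArrayList<>();
        Class<?> c = o.getClass();               // rule 1: instance -> class
        steps.add("class " + c.getName());
        ClassLoader cl = c.getClassLoader();     // rule 2: class -> defining loader
        while (cl != null) {                     // null denotes the bootstrap loader
            steps.add("loader " + cl.getClass().getName());
            cl = cl.getParent();                 // rule 4: loader -> parent loader
        }
        steps.add("bootstrap loader");
        return steps;
    }

    public static void main(String[] args) {
        // An instance of a bootstrap class pins no application loader.
        System.out.println(chain(Integer.valueOf(1)));
        // An instance of an application class pins its whole loader chain.
        System.out.println(chain(new ReferenceChain()));
    }
}
```

If an AV is an instance of an application class, the second kind of chain applies to it, and any ClassValue held in a static field of a class from one of those loaders becomes reachable from the AV.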
>>> Ok, let's set up the stage. If I understand you correctly, then:
>>>
>>> Groovy runtime is loaded by whatever class loader is loading the application (see the comment in MetaClass constructor if this is not true). This is either the ClassLoader.getSystemClassLoader() (the APP class loader) if started from command line or for example Web App class loader in a Web container.
>>
>> Well, actually... if you start a script on the command line, the loader is a child to the app class loader, when used as library it could be the app loader (for example if the groovy program is precompiled) and in a tomcat like scenario it could be either the class loader for the web app, or the loader for all web apps. But let's go with the cases you mentioned first ;)
>>
>>> MetaClass(es) are objects implemented by Groovy runtime class(es). Let's call them simply MetaClass.
>>
>> good
>>
>>> Here's how I would do that:
>>>
>>>     public class MetaClass {
>>>         // this list keeps MetaCl
Re: Question in understanding ClassValue better
Peter, I fully understand if you cannot reply to this mail easily, just wanted to ping to ensure this is not forgotten ;)

On 20.05.2016 01:33, Jochen Theodorou wrote:
> On 19.05.2016 21:32, Peter Levart wrote:
> [...]
>> a ClassValue instance can be thought of as a component of a compound key. Together with a Class, they form a tuple (aClass, aClassValue) that can be associated with an "associated value", AV. And yes, the AVs associated with tuple containing a particular Class are strongly reachable from that Class.
>
> I see, I was mixing ClassValue and AV, I was more talking about AV, than ClassValue itself, though ClassValue surely plays an important role in here. Anyway, I am going for that AV is the value computed by aClassValue. You said that the AV is strongly reachable from aClass. Can I further assume, that aClassValue is not strongly reachable from aClass? And that aClassValue can be collected independent of aClass? Can I further assume, that aClassValue can be collected even if AVs for it continue to exist?
>
> [...]
>> An AV associated with a tuple (Integer.TYPE, aClassValue) -> AV can be garbage collected. But only if aClassValue can be garbage collected 1st.
>
> hmm... so in (aClass, aClassValue)->AV if aClassValue can be collected, AV can, but not the other way around... what about aClass? if nothing but AV is referencing aClass, can AV be garbage collected, even if aClassValue cannot? Can I extend your statement to AV can be collected only if either aClass or aClassValue can be garbage collected first? Let us assume this is the case for now.
>
>> This is the most tricky part to get right in order to prevent leaks. If in above example, aClassValue is reachable from the AV, then we have a leak. The reachability of a ClassValue instance from the associated value AV is not always obvious. One has to take into account the following non-obvious references:
>>
>> 1 - each object instance has an implicit reference to its implementing class
>> 2 - each class has a reference to its defining ClassLoader
>> 3 - each ClassLoader has a reference to all classes defined by it (except VM anonymous classes)
>> 4 - each ClassLoader has a reference to all its predecessors (that it delegates to)
>>
>> Since a ClassValue instance is typically assigned to a static final field, such instance is reachable from the class that declares the field. I think you can get the picture from that.
>
> yeah... that is actually problematic. Because if I keep no hard reference the ClassValue can be collected, even if the AVs still exist... meaning they would become unreachable. And if I keep one I have a memory leak... well more about in the program you have shown me later on.
>
> [...]
>> Ok, let's set up the stage. If I understand you correctly, then:
>>
>> Groovy runtime is loaded by whatever class loader is loading the application (see the comment in MetaClass constructor if this is not true). This is either the ClassLoader.getSystemClassLoader() (the APP class loader) if started from command line or for example Web App class loader in a Web container.
>
> Well, actually... if you start a script on the command line, the loader is a child to the app class loader, when used as library it could be the app loader (for example if the groovy program is precompiled) and in a tomcat like scenario it could be either the class loader for the web app, or the loader for all web apps. But let's go with the cases you mentioned first ;)
>
>> MetaClass(es) are objects implemented by Groovy runtime class(es). Let's call them simply MetaClass.
>
> good
>
>> Here's how I would do that:
>>
>>     public class MetaClass {
>>
>>         // this list keeps MetaClass instances strongly reachable from the MetaClass
>>         // class(loader) since they are only weakly reachable from their associated
>>         // Class(es)
>>         private static final ArrayList<MetaClass> META_CLASS_LIST = new ArrayList<>();
>>
>>         // this WeakReference is constructed so that it keeps a strong reference
>>         // to a referent until releaseStrong() is called
>>         private static final class WeakEntry extends WeakReference<MetaClass> {
>>             private final AtomicReference<MetaClass> strong;
>>
>>             WeakEntry(MetaClass mc) {
>>                 super(mc);
>>                 strong = new AtomicReference<>(mc);
>>             }
>>
>>             boolean releaseStrong() {
>>                 MetaClass mc = strong.get();
>>                 return mc != null && strong.compareAndSet(mc, null);
>>             }
>>         }
>>
>>         private static final ClassValue<WeakEntry> WEAK_ENTRY_CV =
>>             new ClassValue<WeakEntry>() {
>>                 @Override
>>                 protected WeakEntry computeValue(Class<?> type) {
>>                     return new WeakEntry(new MetaClass(type));
>>                 }
>>             };
>>
>>         // the public API
>>         public MetaClass getInstanceFor(Class<?> type) {
>>             WeakEntry entry = WEAK_ENTRY_CV.get(type);
>>             MetaClass mc = entry.get();
>>             if (entry.releaseStrong()) {
>>                 synchronized (META_CLASS_LIST) {
>>                     META_CLASS_LIST.add(mc);
>>                 }
>>             }
>>             return mc;
>>         }
>>
>>         MetaClass(Class<?> type) {
Re: Question in understanding ClassValue better
On 19.05.2016 21:32, Peter Levart wrote:
[...]
> a ClassValue instance can be thought of as a component of a compound key. Together with a Class, they form a tuple (aClass, aClassValue) that can be associated with an "associated value", AV. And yes, the AVs associated with tuple containing a particular Class are strongly reachable from that Class.

I see, I was mixing ClassValue and AV, I was more talking about AV, than ClassValue itself, though ClassValue surely plays an important role in here. Anyway, I am going for that AV is the value computed by aClassValue. You said that the AV is strongly reachable from aClass. Can I further assume, that aClassValue is not strongly reachable from aClass? And that aClassValue can be collected independent of aClass? Can I further assume, that aClassValue can be collected even if AVs for it continue to exist?

[...]
> An AV associated with a tuple (Integer.TYPE, aClassValue) -> AV can be garbage collected. But only if aClassValue can be garbage collected 1st.

hmm... so in (aClass, aClassValue)->AV if aClassValue can be collected, AV can, but not the other way around... what about aClass? if nothing but AV is referencing aClass, can AV be garbage collected, even if aClassValue cannot? Can I extend your statement to AV can be collected only if either aClass or aClassValue can be garbage collected first? Let us assume this is the case for now.

> This is the most tricky part to get right in order to prevent leaks. If in above example, aClassValue is reachable from the AV, then we have a leak. The reachability of a ClassValue instance from the associated value AV is not always obvious. One has to take into account the following non-obvious references:
>
> 1 - each object instance has an implicit reference to its implementing class
> 2 - each class has a reference to its defining ClassLoader
> 3 - each ClassLoader has a reference to all classes defined by it (except VM anonymous classes)
> 4 - each ClassLoader has a reference to all its predecessors (that it delegates to)
>
> Since a ClassValue instance is typically assigned to a static final field, such instance is reachable from the class that declares the field. I think you can get the picture from that.

yeah... that is actually problematic. Because if I keep no hard reference the ClassValue can be collected, even if the AVs still exist... meaning they would become unreachable. And if I keep one I have a memory leak... well more about in the program you have shown me later on.

[...]
> Ok, let's set up the stage. If I understand you correctly, then:
>
> Groovy runtime is loaded by whatever class loader is loading the application (see the comment in MetaClass constructor if this is not true). This is either the ClassLoader.getSystemClassLoader() (the APP class loader) if started from command line or for example Web App class loader in a Web container.

Well, actually... if you start a script on the command line, the loader is a child to the app class loader, when used as library it could be the app loader (for example if the groovy program is precompiled) and in a tomcat like scenario it could be either the class loader for the web app, or the loader for all web apps. But let's go with the cases you mentioned first ;)

> MetaClass(es) are objects implemented by Groovy runtime class(es). Let's call them simply MetaClass.

good

> Here's how I would do that:
>
>     public class MetaClass {
>
>         // this list keeps MetaClass instances strongly reachable from the MetaClass
>         // class(loader) since they are only weakly reachable from their associated
>         // Class(es)
>         private static final ArrayList<MetaClass> META_CLASS_LIST = new ArrayList<>();
>
>         // this WeakReference is constructed so that it keeps a strong reference
>         // to a referent until releaseStrong() is called
>         private static final class WeakEntry extends WeakReference<MetaClass> {
>             private final AtomicReference<MetaClass> strong;
>
>             WeakEntry(MetaClass mc) {
>                 super(mc);
>                 strong = new AtomicReference<>(mc);
>             }
>
>             boolean releaseStrong() {
>                 MetaClass mc = strong.get();
>                 return mc != null && strong.compareAndSet(mc, null);
>             }
>         }
>
>         private static final ClassValue<WeakEntry> WEAK_ENTRY_CV =
>             new ClassValue<WeakEntry>() {
>                 @Override
>                 protected WeakEntry computeValue(Class<?> type) {
>                     return new WeakEntry(new MetaClass(type));
>                 }
>             };
>
>         // the public API
>         public MetaClass getInstanceFor(Class<?> type) {
>             WeakEntry entry = WEAK_ENTRY_CV.get(type);
>             MetaClass mc = entry.get();
>             if (entry.releaseStrong()) {
>                 synchronized (META_CLASS_LIST) {
>                     META_CLASS_LIST.add(mc);
>                 }
>             }
>             return mc;
>         }
>
>         MetaClass(Class<?> type) {
>             // derive it from 'type', but don't reference it
>             // strongly if Groovy runtime is loaded by a parent
>             // class loader
Re: Question in understanding ClassValue better
A small correction...

On 05/19/2016 09:32 PM, Peter Levart wrote:
>     // the public API
>     public MetaClass getInstanceFor(Class<?> type) {

This should of course read:

    // the public API
    public *static* MetaClass getInstanceFor(Class<?> type) {

Regards, Peter
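With that correction applied, the whole pattern can be condensed into a small runnable sketch. The class MetaInfo is a hypothetical stand-in for Groovy's MetaClass, and keeping only the type name (rather than the Class) is an assumption made so that the value does not reference its type strongly.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for Groovy's MetaClass; a sketch, not Groovy code.
final class MetaInfo {

    // Keeps instances strongly reachable from this class (and so from its
    // loader), because the ClassValue only reaches them through a WeakEntry.
    private static final ArrayList<MetaInfo> INSTANCES = new ArrayList<>();

    // Weak reference that also holds its referent strongly until
    // releaseStrong() is called, so it cannot be cleared before publication.
    private static final class WeakEntry extends WeakReference<MetaInfo> {
        private final AtomicReference<MetaInfo> strong;

        WeakEntry(MetaInfo mi) {
            super(mi);
            strong = new AtomicReference<>(mi);
        }

        boolean releaseStrong() {
            MetaInfo mi = strong.get();
            return mi != null && strong.compareAndSet(mi, null);
        }
    }

    private static final ClassValue<WeakEntry> WEAK_ENTRY_CV = new ClassValue<WeakEntry>() {
        @Override
        protected WeakEntry computeValue(Class<?> type) {
            return new WeakEntry(new MetaInfo(type.getName()));
        }
    };

    // The public API, static as in the correction above.
    public static MetaInfo getInstanceFor(Class<?> type) {
        WeakEntry entry = WEAK_ENTRY_CV.get(type);
        MetaInfo mi = entry.get();
        if (entry.releaseStrong()) {
            // Only the caller that wins the compare-and-set adds the instance.
            synchronized (INSTANCES) {
                INSTANCES.add(mi);
            }
        }
        return mi;
    }

    final String typeName;

    private MetaInfo(String typeName) {
        this.typeName = typeName;
    }
}
```

Once the MetaInfo class itself becomes unreachable, the static list goes with it, and the instances are then only weakly reachable from the classes they were computed for.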
Re: Question in understanding ClassValue better
Hi Jochen,

I'll try to answer your questions as profoundly as I can...

On 05/19/2016 04:27 PM, Jochen Theodorou wrote:
> Hi,
>
> at the beginning of this year I had an exchange with Peter Levart about JDK-8136353 (see http://mail.openjdk.java.net/pipermail/mlvm-dev/2016-January/006563.html), and that is probably based on wrong assumptions. But I must confess I still have trouble understanding ClassValue semantics. And since the ClassValue problem in Groovy came up again, I thought I'd make another try based on a list of assumptions, asking if they are wrong or right
>
> 1) ClassValue can be basically understood as a strong reference of a class to a class value

a ClassValue instance can be thought of as a component of a compound key. Together with a Class, they form a tuple (aClass, aClassValue) that can be associated with an "associated value", AV. And yes, the AVs associated with tuple containing a particular Class are strongly reachable from that Class.

> 2) a ClassValue associated with a system class (for example Integer.TYPE) is never garbage collected

An AV associated with a tuple (Integer.TYPE, aClassValue) -> AV can be garbage collected. But only if aClassValue can be garbage collected 1st. This is the most tricky part to get right in order to prevent leaks. If in above example, aClassValue is reachable from the AV, then we have a leak. The reachability of a ClassValue instance from the associated value AV is not always obvious. One has to take into account the following non-obvious references:

1 - each object instance has an implicit reference to its implementing class
2 - each class has a reference to its defining ClassLoader
3 - each ClassLoader has a reference to all classes defined by it (except VM anonymous classes)
4 - each ClassLoader has a reference to all its predecessors (that it delegates to)

Since a ClassValue instance is typically assigned to a static final field, such instance is reachable from the class that declares the field. I think you can get the picture from that.

> 3) a ClassValue from a different loader than the system loader, associated with a system class, will prevent that loader to unload

You mean an instance of a ClassValue subclass loaded by a non-system class loader? I don't think this matters much. Any object instance with a runtime class that is not a system class, while being reachable, holds the non-system ClassLoader non-reclaimable (this follows from the 1st and 2nd rules of non-obvious references above).

a ClassValue instance is not associated with a Class. (aClass, aClassValue) tuple is associated with an associated value AV. Such association is implemented in a way where aClassValue is not strongly reachable from aClass BECAUSE OF THE ASSOCIATION ITSELF, if that is what you wanted to know.

> 4) a ClassValue referencing to the class it is associated with, does not prevent the collection of that class

An association (aClass, aClassValue) -> AV is implemented in a way where aClass is not strongly reachable from aClassValue BECAUSE OF THE ASSOCIATION ITSELF. It can still be reachable because of other non-obvious references mentioned above.

> Point 2 and 3 are kind of problematic for me and I wish them wrong, but they would follow from 1. The exchange with Peter makes me think assumption 4 is wrong... just I don't understand why.
>
> If those assumptions are right, then I actually wonder in what cases I should use ClassValue without causing memory leaks. What I wanted to use it for is to associate a meta class with every class I need a meta class for. This includes system classes. If 3 is right, then doing so would prevent the Groovy runtime from being unloaded. Even if the meta classes are able to unload, the implementation of the ClassValue would still be there. And since that comes from the same loader, that loaded the runtime, that loader will stay. Now loading and (trying to) unload the Groovy runtime countless times would end up in an OOME at some point (permgen problem in older JDKs). And even if I would do something else for class from the standard loaders, I would still get into trouble on for example Tomcat. Not to forget that having two parallel structures for this raises the question as of why to use ClassValue at all.
>
> I think what it boils down to in the end is: When (under what conditions) for what to use ClassValue at all.
>
> bye Jochen

Ok, let's set up the stage. If I understand you correctly, then:

Groovy runtime is loaded by whatever class loader is loading the application (see the comment in MetaClass constructor if this is not true). This is either the ClassLoader.getSystemClassLoader() (the APP class loader) if started from command line or for example Web App class loader in a Web container.

MetaClass(es) are objects implemented by Groovy runtime class(es). Let's call them simply MetaClass.

Here's how I would do that:

    public class MetaClass {
        // this list keeps