Re: Improve Groovy class loading performance and memory management

2016-05-29 Thread Jochen Theodorou

On 29.05.2016 08:44, Alain Stalder wrote:
[...]

If I use a WeakHashMap or Collections.synchronizedMap(WeakHashMap)
instead of the map from the Spring Framework, classes are not
collectable in the script-running-use-case (OutOfMemoryError) unless I
replace klazz in ClassInfo with a WeakReference and then it becomes
(only) softly-collectable.


which is no wonder if the value strongly references the key
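
A minimal Java sketch of that remark, assuming nothing about Groovy's actual ClassInfo code (WeakKeyDemo, StrongHolder and WeakHolder are hypothetical names used only for illustration). An entry in a WeakHashMap can never be cleared while its value holds a strong reference to the key, because the map itself keeps the value alive. Wrapping the back-reference in a WeakReference removes that chain.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class WeakKeyDemo {
    /** Value that keeps a strong reference to its own key. */
    static final class StrongHolder {
        final Object key;
        StrongHolder(Object key) { this.key = key; }
    }

    /** Value that only keeps a weak reference to its key. */
    static final class WeakHolder {
        final WeakReference<Object> key;
        WeakHolder(Object key) { this.key = new WeakReference<Object>(key); }
    }

    /** The key stays reachable via map -> value -> key, so the entry never goes away. */
    static Map<Object, Object> mapWithStrongValue() {
        Map<Object, Object> map = new WeakHashMap<Object, Object>();
        Object key = new Object();
        map.put(key, new StrongHolder(key));
        return map;
    }

    /** Nothing references the key strongly after this method returns. */
    static Map<Object, Object> mapWithWeakValue() {
        Map<Object, Object> map = new WeakHashMap<Object, Object>();
        Object key = new Object();
        map.put(key, new WeakHolder(key));
        return map;
    }

    public static void main(String[] args) {
        Map<Object, Object> strong = mapWithStrongValue();
        Map<Object, Object> weak = mapWithWeakValue();
        System.gc(); // only a hint; when the weak entry disappears is up to the VM
        System.out.println("entries with strong value: " + strong.size());
        System.out.println("entries with weak value:   " + weak.size());
    }
}
```

The first map always reports one entry. The second one may report zero after a collection, but the VM gives no guarantee about when.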


All in all, I think this means that universally weakly-collectable
Groovy classes are more of a dream at the moment, at least before a
Groovy 3, and the merge requests for GROOVY-7683 (weak reference to
Class in ClassInfo) and GROOVY-7646 (explicit cleanup after running
scripts in GroovyShell) seem to be the best that can be done at the moment?


at least the weak reference to Class in ClassInfo is something I think 
we can do without too much danger. Of course we may have to handle the 
case in which a ClassInfo still exists, but the class is collected...
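
A rough Java sketch of what handling that case could look like. WeakClassHolder and its methods are hypothetical and only stand in for the idea; they are not the code from the GROOVY-7683 merge request. The point is that every access to the class has to go through a null check once the reference is weak.

```java
import java.lang.ref.WeakReference;

final class WeakClassHolder {
    private final WeakReference<Class<?>> klazzRef;
    private final String className; // kept for diagnostics after the class is gone

    WeakClassHolder(Class<?> klazz) {
        this.klazzRef = new WeakReference<Class<?>>(klazz);
        this.className = klazz.getName();
    }

    /** Returns the class, or null if it has already been collected. */
    Class<?> getClassOrNull() {
        return klazzRef.get();
    }

    /** Returns the class or fails with a descriptive error if it is gone. */
    Class<?> requireClass() {
        Class<?> c = klazzRef.get();
        if (c == null) {
            throw new IllegalStateException("Class " + className + " has been unloaded");
        }
        return c;
    }

    /** True if this holder outlived its class and should be discarded. */
    boolean isStale() {
        return klazzRef.get() == null;
    }

    /** Demonstration only: pretends the class was collected. */
    void simulateUnload() {
        klazzRef.clear();
    }
}
```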


As for the Introspector: my suggestion would be to flush the specific 
class from the Introspector at the end of MetaClassImpl#addProperties



PS: Just for fun, I wrote a version where the map from the Spring
Framework is using soft references for
java.*/javax.*/groovy.*/org.codehaus.groovy.* classes and weak
references for all others, and that passed all my test scripts.


yeah, I think that's not good enough for us. You can have extension 
methods on other classes, so it is no good if these get collected without 
a way to recover the extension methods


bye Jochen



Re: Improve Groovy class loading performance and memory management

2016-05-28 Thread Alain Stalder

I am running in circles:

If I keep using the map from the Spring Framework, but with a soft 
reference instead of a weak reference inside, classes (of course) become 
only softly-collectable and performance degrades somewhat, becoming 
almost identical to the current implementation (2.5.0 master).


If I use a WeakHashMap or Collections.synchronizedMap(WeakHashMap) 
instead of the map from the Spring Framework, classes are not 
collectable in the script-running-use-case (OutOfMemoryError) unless I 
replace klazz in ClassInfo with a WeakReference and then it becomes 
(only) softly-collectable.


All in all, I think this means that universally weakly-collectable 
Groovy classes are more of a dream at the moment, at least before a 
Groovy 3, and the merge requests for GROOVY-7683 (weak reference to 
Class in ClassInfo) and GROOVY-7646 (explicit cleanup after 
running scripts in GroovyShell) seem to be the best that can be done at 
the moment?


Maybe let time find a solution...

Alain

PS: Just for fun, I wrote a version where the map from the Spring 
Framework is using soft references for 
java.*/javax.*/groovy.*/org.codehaus.groovy.* classes and weak 
references for all others, and that passed all my test scripts.



On 28.05.16 20:34, Alain Stalder wrote:

[...]






Re: Improve Groovy class loading performance and memory management

2016-05-28 Thread Alain Stalder

On 28.05.16 19:38, Alain Stalder wrote:
Hmn, not sure yet, but looks like the map from the Spring Framework I 
am using is treating both keys (Class) and values (ClassInfo) as weak 
references, not sure yet if this could easily be changed...


No, at least no indication of that so far, seems only to determine 
ClassInfo once per class.


Yes, that is the issue, map entries can be garbage collected even if the 
class is still loaded (e.g. Integer)...


e.g.

def scriptText = """
class Script1 extends Script {
   static class Inner {
   int x = 1;
   }

   Object run() {
   print "."
   return new Inner().x + new Parallel().y
   return x+y
   }
}

class Parallel {
int y = 2;
}
"""

def shell = new GroovyShell()
//for (int i=0; i<1000; i++) {
//   long t0 = System.nanoTime()
   for (int j=0; j<1000; j++) {
  shell.run(scriptText, "script", [])
   }
//   long t1 = System.nanoTime()
//   printf("%3d: %3.1fs%n", i, ((double)(t1-t0))/10)
//}

=>

Exception in thread "main" groovy.lang.MissingMethodException: No 
signature of method: java.lang.Integer.plus() is applicable for argument 
types: (java.lang.Integer) values: [2]
Possible solutions: sum(int, int), wait(), equals(java.lang.Object), 
wait(long), wait(long, int), equals(java.lang.Object)
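
The failure above fits the pattern of a cache whose values are only weakly referenced. A small Java analogy follows, with hypothetical names (WeakValueCacheDemo, Info) and no claim that Groovy's ClassInfo works exactly like this. State attached to a cached value is silently lost when the value is collected and recreated, even though the key class (here Integer) never goes away. The collection is simulated with Reference.clear() so that the effect is reproducible.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class WeakValueCacheDemo {
    /** Stand-in for per-class metadata, e.g. methods registered at runtime. */
    static final class Info {
        final Set<String> methods = new HashSet<String>();
    }

    private final Map<Class<?>, WeakReference<Info>> cache =
            new HashMap<Class<?>, WeakReference<Info>>();

    /** Returns the cached Info, or silently creates a fresh, empty one. */
    synchronized Info get(Class<?> type) {
        WeakReference<Info> ref = cache.get(type);
        Info info = (ref == null) ? null : ref.get();
        if (info == null) {
            info = new Info();
            cache.put(type, new WeakReference<Info>(info));
        }
        return info;
    }

    /** Demonstration only: does what the garbage collector may do at any time. */
    synchronized void simulateCollection(Class<?> type) {
        WeakReference<Info> ref = cache.get(type);
        if (ref != null) {
            ref.clear();
        }
    }

    public static void main(String[] args) {
        WeakValueCacheDemo demo = new WeakValueCacheDemo();
        demo.get(Integer.class).methods.add("plus");
        demo.simulateCollection(Integer.class);
        // prints "false": the recreated Info no longer knows about "plus"
        System.out.println(demo.get(Integer.class).methods.contains("plus"));
    }
}
```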




Re: Improve Groovy class loading performance and memory management

2016-05-28 Thread Alain Stalder


On 28.05.16 19:15, Alain Stalder wrote:
Hmn, not sure yet, but looks like the map from the Spring Framework I 
am using is treating both keys (Class) and values (ClassInfo) as weak 
references, not sure yet if this could easily be changed...


No, at least no indication of that so far, seems only to determine 
ClassInfo once per class.


But I have an issue where something involving Grape (and grengine) is 
not working with the PoC (but works with 2.5.0 master), and it is not 
related to Introspector cleanups and most likely also not to concurrency.


Maybe not enough context, but that's essentially what I get:

Cause: org.codehaus.groovy.control.MultipleCompilationErrorsException: 
startup failed:
General error during conversion: No signature of method: 
java.util.ArrayList.inject() is applicable for argument types: 
(java.util.LinkedHashMap, groovy.grape.GrapeIvy$_closure1) values: [[:], 
groovy.grape.GrapeIvy$_closure1@6d16b877]


and ClassInfos created before include:

System.out.println("class: " + System.identityHashCode(klazz) + " (" + klazz.getName() + ")");

class: 106639514 (groovy.lang.Binding)
class: 801851616 (groovy.lang.GroovyObjectSupport)
class: 154013 (groovy.lang.GroovyObject)
class: 1037324811 (java.lang.String)
class: 546597920 (java.lang.Comparable)
class: 177443754 (java.lang.CharSequence)
class: 1243769281 (groovy.lang.MetaClass)
class: 1883500253 (groovy.lang.MetaObjectProtocol)
class: 1084454171 (groovy.grape.GrapeIvy$_closure1)
class: 545910928 (org.codehaus.groovy.runtime.GeneratedClosure)
class: 784639758 (groovy.lang.Closure)
class: 1270168553 (java.lang.Runnable)
class: 1534969475 (groovy.lang.GroovyCallable)
class: 2054932187 (java.util.concurrent.Callable)

No idea so far what causes this and how relevant/bad it might be...



Re: Improve Groovy class loading performance and memory management

2016-05-28 Thread Alain Stalder
Hmn, not sure yet, but looks like the map from the Spring Framework I am 
using is treating both keys (Class) and values (ClassInfo) as weak 
references, not sure yet if this could easily be changed...
Too bad, this time I really thought I had done enough tests before 
posting...


On 28.05.16 16:49, Alain Stalder wrote:
[...]

Re: Improve Groovy class loading performance and memory management

2016-05-28 Thread Alain Stalder
This is going to be a *very* long mail, but I think it is probably worth 
it! :)


First of all, although I am not 100% sure, I think I agree with Jochen 
regarding ClassValue - in any case, I find ClassValue is not a viable 
option to count on in the immediately foreseeable future.


Instead I wrote a PoC based on the Groovy (2.5.0) master with the 
following highlights:


- In most use cases, classes that are no longer used become immediately 
available for garbage collection.
- In all cases, garbage collection is possible once the limit on 
Metaspace/PermGen resp. Heap is reached, i.e. no more OutOfMemoryErrors.
- Appears in some quick initial tests to be generally even a bit 
*faster*(!) than the current implementation.

- (Not using ClassValue at all.)
- (The two merge requests by John Wagenleitner for GROOVY-7683 (weak 
reference to Class in ClassInfo) and GROOVY-7646 (explicit cleanup after 
running scripts in GroovyShell) would become obsolete.)


Let me first define two things:

I will call a class "weakly-collectable" if it can be collected while 
the VM is running normally, i.e. before any limit on PermGen/Metaspace 
or Heap is reached, and "softly-collectable" if that only happens when 
such a limit is reached, but is still possible then, i.e. no 
OutOfMemoryError.


I will call the (maybe most typical?) use case where a Java VM 
dynamically compiles and runs some Groovy scripts the 
"script-running-use-case", generally including the case where 
scripts were precompiled and are loaded by a dedicated class loader 
(i.e. not the same class loader as Groovy itself), and I will call the 
use case where both Groovy and compiled scripts are loaded by the same 
class loader the "gradle-use-case", like when the Gradle daemon keeps 
running and reloads Groovy and build scripts (as I understand how this 
works - correct me if I got that wrong).


The status quo with Groovy 2.4.6 is as follows:

- script-running-use-case, use ClassValue: softly-collectable
- script-running-use-case, don't use ClassValue: not collectable 
(OutOfMemoryError)

- gradle-use-case, use ClassValue: not collectable (OutOfMemoryError)
- gradle-use-case, don't use ClassValue: softly-collectable

Now for the PoC...

Here are the PoC branch and diff to master:
- https://github.com/jexler/groovy/tree/weak-gc-poc
- https://github.com/jexler/groovy/compare/master...jexler:weak-gc-poc

The core new thing is the class 
org.codehaus.groovy.reflection.ClassInfoMap, which is based on 
ConcurrentReferenceHashMap from the Spring Framework (which in turn 
appears to have originated from JBoss). It basically implements a 
WeakHashMap with thread-safe read/write access.


In ClassInfo, that new ClassInfoMap is used within GlobalClassSet. 
(Detail: I have left the ManagedLinkedList items in the 
GlobalClassSet class because at least some Gradle versions seem to 
access it directly via reflection.)


GroovyClassValue (both the real one based on ClassValue and the pre Java 
7 emulation based on ManagedConcurrentMap) is not used at all any more.


The other "half" of the PoC concerns the java.beans.Introspector, 
because its caches are now the last thing that prevents 
weakly-collecting unused classes (as I will show a bit later on).


The basic approach here is to cache BeanInfo as a new private member 
"beanInfo" of ClassInfo and to remove it immediately after creation from 
Introspector caches. There is also a new public getter 
classInfo.getBeanInfo() that lazily initializes BeanInfo and returns it.


I provide 4 options for this PoC how to clean up Introspector caches, 
via a system property "weak-gc-poc.cleanup":


- "none": No cleanup, as today
- "class": The default, call Introspector.flushFromCaches(theClass) 
after getting beanInfo and storing it in ClassInfo
- "super": Same as class, but do the cleanup for the class and all of 
its superclasses (except java.* and javax.*)
- "all": Clean Introspector caches for all classes, i.e. call 
Introspector.flushCaches()


In the end I suspect only "none" and "class" would be viable options 
because the others probably have too much impact on performance (more 
creations of BeanInfo for same classes), potentially also influencing 
performance of outside code that is also using Introspector.
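
The four cleanup options can be sketched in Java roughly as follows. This is not the PoC code from the branch. BeanInfoCleanup, Mode, Sample and loadAndFlush are hypothetical names; only the system property name and the calls Introspector.getBeanInfo, Introspector.flushFromCaches and Introspector.flushCaches are taken from the description above and from java.beans.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.util.Locale;

final class BeanInfoCleanup {
    enum Mode { NONE, CLASS, SUPER, ALL }

    /** Small bean used only for the demonstration. */
    public static final class Sample {
        public String getName() { return "sample"; }
    }

    /** Reads the mode from the system property; "class" is the default. */
    static Mode modeFromProperty() {
        String value = System.getProperty("weak-gc-poc.cleanup", "class");
        return Mode.valueOf(value.toUpperCase(Locale.ROOT));
    }

    /** Fetches the BeanInfo once, then removes it from the Introspector caches. */
    static BeanInfo loadAndFlush(Class<?> type, Mode mode) {
        BeanInfo info;
        try {
            info = Introspector.getBeanInfo(type);
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        switch (mode) {
            case CLASS:
                Introspector.flushFromCaches(type);
                break;
            case SUPER:
                // the class and all superclasses, except java.* and javax.*
                for (Class<?> c = type; c != null; c = c.getSuperclass()) {
                    String name = c.getName();
                    if (!name.startsWith("java.") && !name.startsWith("javax.")) {
                        Introspector.flushFromCaches(c);
                    }
                }
                break;
            case ALL:
                Introspector.flushCaches();
                break;
            default: // NONE: leave the Introspector caches alone
                break;
        }
        return info; // the caller would store this, e.g. in a field of ClassInfo
    }

    public static void main(String[] args) {
        BeanInfo info = loadAndFlush(Sample.class, modeFromProperty());
        System.out.println(info.getBeanDescriptor().getBeanClass().getName());
    }
}
```

The returned BeanInfo stays usable after the flush, since flushing only drops the Introspector's own reference to it.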


First some results based on classgctest ( 
https://github.com/jexler/classgc ).


script-running-use-case, with the default "weak-gc-poc.cleanup" setting 
of "class":


$ java -XX:MaxMetaspaceSize=256m -Xmx512m -cp 
.:groovy-2.5.0-weak-gc-poc.jar ClassGCTester -cp filling/ -parent tester 
-classes GroovyFilling


Secs  Test classes         Metaspace/PermGen  Heap               Load time  Create time
      #loaded  #remaining    used  committed    used  committed    average      average
   0        1           1    6.4m       6.5m   14.1m     245.5m    1.226ms     11.831ms
   1      482         482    9.1m      10.5m   25.9m     245.5m    0.343ms      1.650ms
   2     1356        1356   12.5m  [...]

Re: Improve Groovy class loading performance and memory management

2016-05-24 Thread Jochen Theodorou

hi all,

I thought I would try to give an update on the ClassValue issue. Well... 
my first suggestion is to really not activate it by default. That is 
because ClassValue has quite special semantics.


I will try to explain them a bit here.

Let us define AV as the value computed by a ClassValue aClassValue, 
and aClass as the class we want an AV for.


So there is a relation (aClass, aClassValue) -> AV.

In this relation there is a strong reference from aClass to AV. 
aClass and aClassValue can (in theory) be collected independently of AV; 
AV can only be collected after aClass or aClassValue has been 
collected. Even if AV references aClass it can still be collected, 
under certain conditions.


Now imagine AV is actually a ClassInfo from our Groovy runtime 
and aClass is Integer. Since Integer is a system class, aClass will never 
be collected. Since aClassValue is an instance from our runtime, its class 
and ClassInfo have the same class loader. As there is a strong 
reference to ClassInfo, that ClassInfo instance will not be collected. 
And as ClassInfo stays loaded, so does every class of the Groovy 
runtime. And since the class value is usually held in a static field, that 
means the class value stays too. In conclusion, the runtime will not be 
unloaded at all, class space gets used up and the final result is an OOME.


There is no chance our approach in the current implementation can work. 
To avoid the problem we would have to do a lot of things. First of all, 
we have to avoid having an AV that comes from our runtime. So at the very 
least we would need something like a WeakReference<ClassInfo> computed 
by the ClassValue, instead of ClassInfo directly. Next we would have 
to keep a list of ClassInfo instances to avoid their garbage collection 
right away. And then we would have to find a way to remove entries from 
that list upon class removal.
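
A hedged Java sketch of the first two steps, to make the shape of the idea concrete. WeakClassValueSketch, Info and trackedCount are hypothetical names; this is not Groovy's GroovyClassValue. The ClassValue only hands out a java.lang.ref.WeakReference (a JDK class), while a separate collection keeps the values alive. The third step, removing entries when a class goes away, is deliberately left open here, just as it is in the text.

```java
import java.lang.ref.WeakReference;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class WeakClassValueSketch {
    /** Stand-in for ClassInfo. */
    static final class Info {
        final String name;
        Info(String name) { this.name = name; }
    }

    // Keeps Info instances alive; entries would have to be removed upon
    // class removal, which this sketch does not attempt.
    private final Queue<Info> strongRefs = new ConcurrentLinkedQueue<Info>();

    // The value stored per class is a JDK object, not a runtime class instance.
    private final ClassValue<WeakReference<Info>> values =
            new ClassValue<WeakReference<Info>>() {
                @Override
                protected WeakReference<Info> computeValue(Class<?> type) {
                    Info info = new Info(type.getName());
                    strongRefs.add(info);
                    return new WeakReference<Info>(info);
                }
            };

    Info get(Class<?> type) {
        Info info = values.get(type).get();
        if (info == null) {
            // Reference was cleared: drop the stale entry and compute again.
            values.remove(type);
            info = values.get(type).get();
        }
        return info;
    }

    int trackedCount() {
        return strongRefs.size();
    }

    public static void main(String[] args) {
        WeakClassValueSketch sketch = new WeakClassValueSketch();
        System.out.println(sketch.get(Integer.class).name);
    }
}
```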


And that's not all of it :(

bye Jochen




Re: Improve Groovy class loading performance and memory management

2016-05-19 Thread Alain Stalder

On 19.05.16 10:23, Alain Stalder wrote:
So maybe refactoring to always using WeakReference in all 
objects stored in ClassInfo (meta classes, caches, ...) would be 
sufficient to get "on-the-fly" garbage collection (i.e. before the 
maximum on Metaspace or Heap is reached)? And would the additional 
weakRef.get() calls maybe have again a noticeable effect on 
performance? I won't try 
this refactoring myself, but if someone else wants to try this?
Looking at the source for WeakReference resp. Reference, get() should be 
as fast as it possibly can be; it just returns a member field.


Alain




Re: Improve Groovy class loading performance and memory management

2016-05-19 Thread Alain Stalder
The version with a WeakHashMap and first getting ClassInfo without 
synchronization (which could not be used in the end anyway) 
appeared in some quick (probably not representative) tests with 
concurrent access to be about 10-20% slower than the current 
implementation. I got a similar result when I split the map up into 4096 
WeakHashMap instances in an array indexed by clazz.hashCode()%4096, 
always synchronizing on get().
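
For reference, the striped variant could look roughly like this in Java. It is a sketch with hypothetical names (StripedWeakMap, putIfAbsent), not the code that was measured. It uses Math.floorMod instead of a plain % because hashCode() may be negative, which would otherwise produce a negative array index.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class StripedWeakMap<V> {
    private static final int STRIPES = 4096;
    private final Map<Class<?>, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedWeakMap() {
        stripes = (Map<Class<?>, V>[]) new Map[STRIPES];
        for (int i = 0; i < STRIPES; i++) {
            stripes[i] = new WeakHashMap<Class<?>, V>();
        }
    }

    /** Picks the stripe for a key; floorMod keeps the index non-negative. */
    private Map<Class<?>, V> stripeFor(Class<?> key) {
        return stripes[Math.floorMod(key.hashCode(), STRIPES)];
    }

    /** Every read locks only the stripe that holds the key. */
    V get(Class<?> key) {
        Map<Class<?>, V> stripe = stripeFor(key);
        synchronized (stripe) {
            return stripe.get(key);
        }
    }

    /** Stores the value unless one is already present; returns the winner. */
    V putIfAbsent(Class<?> key, V value) {
        Map<Class<?>, V> stripe = stripeFor(key);
        synchronized (stripe) {
            V existing = stripe.get(key);
            if (existing != null) {
                return existing;
            }
            stripe.put(key, value);
            return value;
        }
    }
}
```

Striping reduces contention between threads that look up different classes, but each get() still pays for a monitor enter and exit.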


I looked into the source of ClassValue. I had naively expected it to 
contain some native code or otherwise interact more directly with the 
VM, but there is only a new field in the Class class that it uses. So 
there is a hard link from the class via that field to ClassInfo in the 
case of Groovy, which I guess prevents "on-the-fly" garbage collection 
because ClassInfo contains lots of stuff with hard references to the 
class. And with the pre Java 7 implementation I presume the situation is 
similar. (With a WeakHashMap that is different because the value does 
not count: the entry can be collected independently of the value. It 
could be argued that ClassValue should have been implemented in a way 
that would have had the same behavior, if my reasoning is correct.)


So maybe refactoring to always using WeakReference in all objects 
stored in ClassInfo (meta classes, caches, ...) would be sufficient to 
get "on-the-fly" garbage collection (i.e. before the maximum on 
Metaspace or Heap is reached)? And would the additional weakRef.get() 
calls maybe have again a noticeable effect on performance? I won't try 
this refactoring myself, but if someone else wants to try this?


Well, at least I think I might start to understand the problem... ;)

Alain


Re: Improve Groovy class loading performance and memory management

2016-05-18 Thread Jochen Theodorou

On 18.05.2016 11:59, Alain Stalder wrote:

Looking at that code for GlobalClassSet below now, the itemsMap is only
used for two things:
- put(), where performance is not crucial because it happens only once
per loaded class (which is relatively expensive anyway)
- get(), where performance is crucial
but no iterations or removals etc. necessary.


yes


Could it work to try get() first without synchronization and, if it returns
null or throws an exception, just try again in a synchronized(itemsMap)
block? I have no experience with doing such a thing with a
(Weak)HashMap... Could a synchronized put() fail if there is an
unsynchronized get() at the same time?

I think I will try that unless someone tells me it can't work...?


not safe, no. The trouble is that it can work... for you, in your test. 
But there is no guarantee it will still work on a different machine. And 
then things like this can happen: 
https://bz.apache.org/bugzilla/show_bug.cgi?id=50078


What we used to use instead is ManagedConcurrentMap. Or you could try 
Guava, which lets you build a concurrent hash map with weak keys. Only 
we cannot use Guava for Groovy; the library's payload is just too high



If it was worth a try:

Any tips on tests I could run to compare performance of Groovy master
with this branch (and verify that it is thread-safe)?
I know there are tests in the benchmark directory of the groovy sources
- which one(s) could I maybe run for this?


those benchmarks are largely number-oriented, they won't help you. You 
could for example try running many scripts concurrently, using the same 
Groovy runtime. That usually gives a rough idea about the performance of 
this part of the code. Verification that it is thread-safe you won't 
get from a benchmark, only from a stress test.
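
As an illustration of what such a stress test might look like on a small scale, here is a Java sketch with hypothetical names (ConcurrentLookupStress, run). It does not exercise Groovy at all. Several threads hammer a synchronized WeakHashMap with get-or-create lookups and the test checks that each key was created exactly once. Passing such a test only raises confidence; it does not prove thread safety.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class ConcurrentLookupStress {
    /** Returns how many values were created; 4 is expected for the 4 keys used. */
    static int run(int threads, final int iterations) {
        final Map<Class<?>, Object> cache =
                Collections.synchronizedMap(new WeakHashMap<Class<?>, Object>());
        // JDK classes are never unloaded, so their entries cannot vanish mid-test.
        final Class<?>[] keys = { String.class, Integer.class, Long.class, Object.class };
        final AtomicInteger created = new AtomicInteger();
        final CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> futures = new ArrayList<Future<Void>>();
            for (int t = 0; t < threads; t++) {
                Callable<Void> task = () -> {
                    start.await(); // release all threads at the same moment
                    for (int i = 0; i < iterations; i++) {
                        // atomic on a synchronizedMap: runs under the map's lock
                        cache.computeIfAbsent(keys[i % keys.length], k -> {
                            created.incrementAndGet();
                            return new Object();
                        });
                    }
                    return null;
                };
                futures.add(pool.submit(task));
            }
            start.countDown();
            for (Future<Void> f : futures) {
                f.get(); // rethrows any exception from a worker thread
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("values created: " + run(8, 100000));
    }
}
```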


bye Jochen


Re: Improve Groovy class loading performance and memory management

2016-05-18 Thread Alain Stalder
Looking at that code for GlobalClassSet below now, the itemsMap is only 
used for two things:
- put(), where performance is not crucial because it happens only once 
per loaded class (which is relatively expensive anyway)

- get(), where performance is crucial
but no iterations or removals etc. necessary.

Could it work to try get() first without synchronization and, if it 
returns null or throws an exception, just try again in a 
synchronized(itemsMap) block? I have no experience with doing such a 
thing with a (Weak)HashMap... Could a synchronized put() fail if there 
is an unsynchronized get() at the same time?


I think I will try that unless someone tells me it can't work...?

If it was worth a try:

Any tips on tests I could run to compare performance of Groovy master 
with this branch (and verify that it is thread-safe)?
I know there are tests in the benchmark directory of the groovy sources 
- which one(s) could I maybe run for this?


Actually, the fact that classes would be collected on-the-fly would 
reduce the size of itemsMap, which would in principle shorten access 
times, but maybe not significantly - would have to be seen then...


Alain


On 18.05.16 10:02, Alain Stalder wrote:


[...]

Re: Improve Groovy class loading performance and memory management

2016-05-18 Thread Alain Stalder


On 18.05.16 09:10, Jochen Theodorou wrote:

 private static class GlobalClassSet {

 //private final ManagedLinkedList<ClassInfo> items = new
 //        ManagedLinkedList<ClassInfo>(weakBundle);
 private final WeakHashMap<Class, WeakReference<ClassInfo>> items
         = new WeakHashMap<Class, WeakReference<ClassInfo>>();


would be actually interesting to keep the list and see if it can still 
garbage collect


Looks like it can. (As I would have expected because 
ClassInfo.remove(clazz) did not touch that list before and that was 
sufficient to get GC on-the-fly provided you also do 
Introspector.flushFromCaches(clazz) ):


--
private static class GlobalClassSet {

    private final ManagedLinkedList<ClassInfo> itemsList =
            new ManagedLinkedList<ClassInfo>(weakBundle);
    // weak keys (Class) mapped to weakly referenced ClassInfo values
    private final WeakHashMap<Class, WeakReference<ClassInfo>> itemsMap =
            new WeakHashMap<Class, WeakReference<ClassInfo>>();

    public int size() {
        return values().size();
    }

    public int fullSize() {
        return values().size();
    }

    public Collection<ClassInfo> values() {
        synchronized (itemsList) {
            return Arrays.asList(itemsList.toArray(new ClassInfo[0]));
        }
    }

    public void add(ClassInfo value) {
        synchronized (itemsList) {
            itemsList.add(value);
        }
        synchronized (itemsMap) {
            itemsMap.put(value.klazz, new WeakReference<ClassInfo>(value));
        }
    }

    public ClassInfo get(Class cls) {
        WeakReference<ClassInfo> ref;
        synchronized (itemsMap) {
            ref = itemsMap.get(cls);
        }
        ClassInfo info;
        if (ref == null) {
            //System.out.println("ClassInfo Ref is null: " + cls.getName());
            info = new ClassInfo(cls);
            synchronized (itemsMap) {
                itemsMap.put(cls, new WeakReference<ClassInfo>(info));
            }
            return info;
        }
        info = ref.get();
        if (info == null) {
            //System.out.println("ClassInfo is null: " + cls.getName());
            info = new ClassInfo(cls);
            // synchronized like the other accesses; WeakHashMap is not thread-safe
            synchronized (itemsMap) {
                itemsMap.put(cls, new WeakReference<ClassInfo>(info));
            }
            return info;
        }
        return info;
    }

}
--

$ java -XX:MaxMetaspaceSize=64m -Xmx512m -cp .:groovy-2.5.0-SNAPSHOT.jar 
ClassGCTester -cp filling/ -parent tester -classes GroovyFilling

(does a Introspector.flushFromCaches(clazz) for each loaded class)

Secs  Test classes         Metaspace/PermGen  Heap               Load time  Create time    Run time  Cleanup time
      #loaded  #remaining    used  committed    used  committed    average      average     average       average
   0        1           1    6.3m       6.5m   13.4m     245.5m    0.890ms     14.308ms  0.026168ms    0.019285ms
   1      435         435    8.9m      10.1m   22.1m     245.5m    0.365ms      1.825ms      0.64ms        0.09ms
   2     1202        1202   11.9m      14.6m   66.1m     245.5m    0.280ms      1.314ms      0.24ms        0.01ms
   3     2197        2197   15.7m      20.4m   83.8m     309.5m    0.240ms      1.070ms      0.10ms        0.01ms
   4     3247         966   11.0m      16.8m   16.5m     242.0m    0.226ms      0.959ms      0.06ms        0.00ms
   5     4396        2115   15.4m      20.3m   44.5m     238.0m    0.208ms      0.886ms      0.05ms        0.00ms
   6     5415        3134   19.3m      26.0m   54.4m     235.5m    0.202ms      0.863ms      0.09ms        0.00ms
   7     6458         667    9.8m      18.0m   94.7m     266.5m    0.203ms      0.839ms      0.03ms        0.00ms
   8     7550        1759   14.0m      21.4m  122.0m     268.5m    0.198ms      0.821ms      0.03ms        0.00ms
   9     8748        2957   18.6m      25.9m   46.3m     268.5m    0.191ms      0.799ms      0.03ms        0.00ms

[...]

Very interesting because the list contains references to the class and 
yet it can be garbage collected on-the-fly... Maybe that could help to 
find a solution?


Alain



Re: Improve Groovy class loading performance and memory management

2016-05-18 Thread Jochen Theodorou

On 17.05.2016 21:23, Alain Stalder wrote:
[...]

3) Value in the WeakHashMap is a Wrapper with a WeakReference to the class:

private static WeakHashMap<Class<?>, Wrapper> weakFillingClassesMap =
new WeakHashMap<Class<?>, Wrapper>();
private static class Wrapper { public WeakReference<Class<?>> clazz; }
...
Wrapper wrapper = new Wrapper();
wrapper.clazz = new WeakReference<Class<?>>(clazz);
weakFillingClassesMap.put(clazz, wrapper);

=> Can immediately be garbage collected (i.e. before limit on Metaspace
or Heap is reached)

--

4) Value in the WeakHashMap is a WeakReference<Wrapper> with a hard
reference to the class in the Wrapper:

private static WeakHashMap<Class<?>, WeakReference<Wrapper>>
weakFillingClassesMap = new WeakHashMap<Class<?>,
WeakReference<Wrapper>>();
private static class Wrapper { public Class<?> clazz; }
...
Wrapper wrapper = new Wrapper();
wrapper.clazz = clazz;
weakFillingClassesMap.put(clazz, new WeakReference<Wrapper>(wrapper));

=> Can immediately be garbage collected (i.e. before limit on Metaspace
or Heap is reached)

--

So, the basic idea would be to refactor ClassInfo caches to use 3) or 4)
and maybe to override Introspector...


since I am looking for something that works with ClassInfo in the end, I 
guess 3) is the way to go.


[...]

 private static class GlobalClassSet {

 //private final ManagedLinkedList<ClassInfo> items = new ManagedLinkedList<ClassInfo>(weakBundle);
 private final WeakHashMap<Class,WeakReference<ClassInfo>> items
     = new WeakHashMap<Class,WeakReference<ClassInfo>>();


would be actually interesting to keep the list and see if it can still 
garbage collect


[...]

What looks less than ideal is the first synchronize on items in get(),
but I don't know to what degree that would matter in practice, I don't
know how often that is called. In my tests this version appeared even to
be slightly faster than the one that is using Java 7 ClassValue, but
there was just a single thread...


in the worst case, this is called for about every dynamic method 
invocation... so it had better not block so much.
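
For comparison, the JDK's own per-class cache mentioned above, java.lang.ClassValue (Java 7+), needs no explicit lock in the calling code. The following is only a minimal sketch of that lookup pattern, not Groovy's ClassInfo code: ClassValueCacheSketch, Info and infoFor are made-up names, and Info deliberately stores just the class name so that the cached value does not refer back to its Class.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ClassValueCacheSketch {

    /** Hypothetical per-class metadata; it keeps the name only, not the Class itself. */
    static final class Info {
        final String name;
        Info(Class<?> type) { this.name = type.getName(); }
    }

    /** Counts how often a value had to be computed. */
    static final AtomicInteger computations = new AtomicInteger();

    private static final ClassValue<Info> CACHE = new ClassValue<Info>() {
        @Override
        protected Info computeValue(Class<?> type) {
            computations.incrementAndGet();
            return new Info(type);
        }
    };

    /** Lookup without a synchronized block in the caller's code. */
    static Info infoFor(Class<?> type) {
        return CACHE.get(type);
    }

    public static void main(String[] args) {
        Info first = infoFor(String.class);
        Info second = infoFor(String.class);
        System.out.println("same instance: " + (first == second));
        System.out.println("computations: " + computations.get());
    }
}
```

Repeated lookups for the same class return the cached instance. Whether this is fast enough on the dynamic call path, and how it behaves with respect to class unloading, would still have to be measured.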


[...]

As I said, so far rather a hack, probably better to reimplement the
GroovyClassValuePreJava7 class instead?


reimplement to what?


Performance under concurrent
use? Are other caches that apparently exist in ClassInfo also no issue
under different circumstances? (And at some point: does it work across
VMs and OSes etc.?)


that's to be tested


Would it make sense to implement a "GroovyIntrospector" which caches
things in a WeakHashMap<Class, WeakReference<Method[]>> instead of in a
WeakHashMap<Class, Method[]> as does Introspector, or something like
that? Not sure there, because it is all static and not sure how much
this has or will change from Java release to Java release, but maybe
that is not so important, just need an implementation that works? Or is
it sort of a public API for Groovy classes that is widely used?


I think we already have code for the bean stuff in Groovy itself. What 
it does not do is the bean info part. Removing the Introspector code 
(ideally making it optional) is quite high on my wishlist for a major 
version


bye Jochen



Re: Improve Groovy class loading performance and memory management

2016-05-18 Thread Alain Stalder

jwagenleitner wrote:
> I think performance in general and not just under concurrent use is
> extremely important for ClassInfo.  My understanding is that the static
> cache it holds of ClassInfo's is queried on every method call (at least
> in dynamic groovy).  That is probably why the current hash-based caches
> are used to save from the O(n) retrieval from a globalClassSet which is
> implemented as a linked list.


Too bad, I see no way at the moment how to get "on-the-fly" garbage 
collection of ClassInfo and fast concurrent access to it.


What I quickly tried was to change the type of object stored with 
globalClassValue from ClassInfo to other types, but all failed in some 
way (I did not test the full matrix of (Java 7 ClassValue or not) and 
(same class loader as Groovy for compiled Groovy script or not), but 
gave up once something failed):
- WeakReference<ClassInfo>: Garbage collected even if the class is still 
  referenced.
- SoftReference<ClassInfo>: OutOfMemoryError but took longer than usual.
- WeakHashMap<Class,ClassInfo>: i.e. a WeakHashMap with just a single 
  entry (so no synchronization needed) - that was actually my biggest 
  hope, but: OutOfMemoryError.


jwagenleitner wrote:
>>> So why not have the GroovyClassLoader keep a set of all classes it
>>> compiled itself and were loaded and offer a new ~
>>> GroovyClassLoader#finalCleanup() method that removes meta information
>>> for all these classes so that they would become immediately eligible
>>> for garbage collection? (I guess InvokerHelper.removeClass(clazz) and
>>> Introspector.flushFromCaches(clazz), but whatever is needed...)
>>
>> GroovyClassLoader (GCL) actually represents a tree of class loaders.
>> For each compilation GCL will spawn an instance of InnerLoader. Since
>> two different compilations are supposed to know each other's classes,
>> a list of classes is kept in GCL itself (see classCache). The inner
>> loader itself is not referenced by GCL. Because of that list GCL has
>> the clearCache method to remove classes from previous compilations.
>>
>> Why did we use this structure? GCL is supposed to offer you the
>> possibility to compile the same class multiple times. That means you
>> will get the same class multiple times. At the same time a class must
>> be defined under the same name only once in a given defining class
>> loader. As a result, trying to define a class that already exists
>> under that name results in an error. A classloading constraint is
>> actually to return the same class instance each time you request a
>> class with a certain name. This implies the error mentioned before; it
>> also means GCL is breaking those constraints knowingly.
>>
>> Anyway... I think such a cleanup method is misplaced in GCL, since it
>> spans beyond the classloader... how about GroovySystem?
>>
> I agree that if a method were added I don't think GCL is the right
> place and that something like GroovySystem#removeClass(Class) or
> GroovySystem#flushFromCaches(Class) would be good.


I guess this would help a little, but - again - as soon as you use e.g. 
a closure in a script, you have more than one class from a compilation, 
and I guess this happens often.


In cases where a script was compiled at runtime, that would have to be 
by a GroovyClassLoader$InnerLoader (which extends GroovyClassLoader). 
Now, that $InnerLoader could be obtained with 
script.getClass().getClassLoader() - you could tell so by whether it is 
a GroovyClassLoader$InnerLoader - and tell it to clean up all classes it 
loaded, which would all be related to the only script it compiled. If you 
want to offer that - which I think would probably make sense - you would 
have to add a new method to GroovyClassLoader or at least to 
$InnerLoader - like #cleanupCompiledScripts() or whatever - and then 
offer its functionality also from GroovySystem for convenience, probably 
in two variants, one that really only removes just the indicated class 
and one that extends to all classes compiled from the same script in 
case it was compiled at runtime from a Groovy script.
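
The second variant relies on all classes from one script compilation sharing a defining class loader. As a rough illustration in plain Java (no Groovy involved; SameLoaderSketch and definedBySameLoader are hypothetical names, not an existing or proposed Groovy API), sibling classes could be selected like this:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class SameLoaderSketch {

    /** Returns every candidate that was defined by the same class loader as {@code main}. */
    static List<Class<?>> definedBySameLoader(Class<?> main, Collection<Class<?>> candidates) {
        ClassLoader loader = main.getClassLoader(); // null means the bootstrap loader
        List<Class<?>> result = new ArrayList<>();
        for (Class<?> candidate : candidates) {
            if (candidate.getClassLoader() == loader) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

A cleanup method could then run the per-class removal on each returned class. Where the list of candidates would come from is left open here.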


This will certainly also not cover all use cases (Groovy classes loaded 
from the file system by an URLClassLoader, for example; there I see no 
way how to track which classes would belong to which main script etc.), 
but I think the use case of scripts compiled at runtime would still 
justify it. It would offer a relatively clean way to make all classes 
from a script compilation available for GC more quickly.


I would maybe also offer a similar method for ConfigSlurper, for 
convenience, because that also implicitly always compiles a Groovy 
script (the config), thus filling Metaspace/PermGen - with almost 
certainly nobody expecting this offhand - so that users would not have 
to explicitly get the class from the parsed object and then call the 
removal function of GroovySystem. (There a different type of loader 
seems to be used, RootLoader, but also usually compilation only gives a 
single class, I would estimate.)


Finally, Groovy class loading is so dynamic/flexible that

Re: Improve Groovy class loading performance and memory management

2016-05-17 Thread John Wagenleitner
On Tue, May 17, 2016 at 12:23 PM, Alain Stalder wrote:

> [...]
>
> As I said, so far rather a hack, probably better to reimplement the
> GroovyClassValuePreJava7 class instead? Performance under concurrent use?
> Are other caches that apparently exist in ClassInfo also no issue under
> different circumstances? (And at some point: does it work across VMs and
> OSes etc.?)
>


I think performance in general and not just under concurrent use is
extremely important for ClassInfo.  My understanding is that the static
cache it holds of ClassInfo's is queried on every method call (at least in
dynamic groovy).  That is probably why the current hash-based caches are
used to save from the O(n) retrieval from a globalClassSet which is
implemented as a linked list.
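
To make the cost difference concrete, here is a small sketch that counts how many entries a linear scan has to inspect, next to a hash lookup. It is not the ClassInfo code; LookupCostSketch, Info, listOf and mapOf are illustrative names only.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

final class LookupCostSketch {

    /** Hypothetical stand-in for per-class metadata. */
    static final class Info {
        final Class<?> type;
        Info(Class<?> type) { this.type = type; }
    }

    static List<Info> listOf(Class<?>... types) {
        List<Info> infos = new LinkedList<>();
        for (Class<?> type : types) infos.add(new Info(type));
        return infos;
    }

    static Map<Class<?>, Info> mapOf(Class<?>... types) {
        Map<Class<?>, Info> infos = new HashMap<>();
        for (Class<?> type : types) infos.put(type, new Info(type));
        return infos;
    }

    /** Linear scan: number of entries inspected until a match (the list size if there is none). */
    static int entriesInspected(List<Info> infos, Class<?> wanted) {
        int inspected = 0;
        for (Info info : infos) {
            inspected++;
            if (info.type == wanted) break;
        }
        return inspected;
    }

    public static void main(String[] args) {
        Class<?>[] types = { String.class, Integer.class, Long.class, Double.class };
        // The scan grows with the position of the entry; the map lookup does not scan at all.
        System.out.println(entriesInspected(listOf(types), Double.class));
        System.out.println(mapOf(types).get(Double.class).type.getName());
    }
}
```

With thousands of loaded classes and a lookup on every dynamic call, the linear scan is the part that would dominate.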


Re: Improve Groovy class loading performance and memory management

2016-05-17 Thread John Wagenleitner
On Tue, May 17, 2016 at 12:48 AM, Alain Stalder wrote:

>
> On 17.05.16 09:04, Alain Stalder wrote:
>
> PS: Note that Introspector.flushFromCaches(clazz) was experimentally
> really not necessary in this case, but maybe has to do with the simple
> nature of the test script ("42") and only calling a (no-args)
> constructor... In any case very promising...
>
>
> Ah, that's simply because it is already called in
> InvokerHelper.removeClass():
>
> public static void removeClass(Class clazz) {
> metaRegistry.removeMetaClass(clazz);
> ClassInfo.remove(clazz);
> Introspector.flushFromCaches(clazz);
> }
>
> Experimentally, for the test with ClassGCTester, the first call
> (metaRegistry.removeMetaClass(clazz)) was not necessary to have garbage
> collection before Metaspace reaches the maximum, the other two were.
>
>

I believe the removeMetaClass call is only there in case the metaclass
changed.  Any added methods cause the weak metaclass to be replaced by a
strong metaclass (ExpandoMetaClass) and that has a strong ref to the class
requiring removing the metaclass in order to allow GC to work.


Re: Improve Groovy class loading performance and memory management

2016-05-17 Thread John Wagenleitner
On Mon, May 16, 2016 at 11:56 PM, Jochen Theodorou wrote:

> On 17.05.2016 07:53, Alain Stalder wrote:
>
>> That looks very good to me :)
>>
>> I will definitely try out the InvokerHelper.removeClass(clazz) with
>> added ClassInfo removal plus Introspector.flushFromCaches(clazz) and see
>> if I can get garbage collection before reaching the limit on Metaspace
>> or Heap.
>>
>> And, maybe something like the following could be added to the
>> GroovyClassLoader? Thinking aloud:
>>
>> Assuming the following is true: Any class can only be garbage collected
>> once its ClassLoader can be garbage collected, because each class keeps
>> a reference to its ClassLoader (so that it can use it to load further
>> classes when needed when running methods).
>>
>
> not only the class, the classloader internals also keep such a reference.
> And I mean java.lang.ClassLoader here.
>
>> So why not have the GroovyClassLoader keep a set of all classes it
>> compiled itself and were loaded and offer a new ~
>> GroovyClassLoader#finalCleanup() method that removes meta information
>> for all these classes so that they would become immediately eligible for
>> garbage collection? (I guess InvokerHelper.removeClass(clazz) and
>> Introspector.flushFromCaches(clazz), but whatever is needed...)
>>
>
> GroovyClassLoader (GCL) actually represents a tree of class loaders. For
> each compilation GCL will spawn an instance of InnerLoader. Since two
> different compilations are supposed to know each other's classes, a list of
> classes is kept in GCL itself (see classCache). The inner loader itself is
> not referenced by GCL. Because of that list GCL has the clearCache method
> to remove classes from previous compilations.
>
> Why did we use this structure? GCL is supposed to offer you the
> possibility to compile the same class multiple times. That means you will
> get the same class multiple times. At the same time a class must be defined
> under the same name only once in a given defining class loader. As a result,
> trying to define a class that already exists under that name results in an
> error. A classloading constraint is actually to return the same class
> instance each time you request a class with a certain name. This implies
> the error mentioned before; it also means GCL is breaking those
> constraints knowingly.
>
> Anyway... I think such a cleanup method is misplaced in GCL, since it
> spans beyond the classloader... how about GroovySystem?
>
>

I agree that if a method were added I don't think GCL is the right place
and that something like GroovySystem#removeClass(Class) or
GroovySystem#flushFromCaches(Class) would be good.


Re: Improve Groovy class loading performance and memory management

2016-05-17 Thread Alain Stalder


On 17.05.16 21:23, Alain Stalder wrote:
I managed to make a Groovy version where garbage collection of 
ClassInfo happens before the limit in Metaspace or Heap is reached (!) 
- so far it is just a hack, but maybe it can contribute to a solution...
Branch/diff with the hack: 
https://github.com/apache/groovy/compare/master...jexler:classinfo-gc-hack
Compiled JAR: 
https://www.jexler.net/groovy-2.5.0-SNAPSHOT-classinf-gc-hack.jar





Re: Improve Groovy class loading performance and memory management

2016-05-17 Thread Alain Stalder
I managed to make a Groovy version where garbage collection of ClassInfo 
happens before the limit in Metaspace or Heap is reached (!) - so far it 
is just a hack, but maybe it can contribute to a solution...


First some "basic research" on when the Java VM can garbage collect a 
class, performed with a slightly modified ClassGCTester and the simple 
"JavaFilling" class from previous tests. Again Oracle JDK 8 (Mac).


--

1) Original test setup (class reference as key, constant String "" as 
value):


private static WeakHashMap<Class<?>, String> weakFillingClassesMap =
    new WeakHashMap<Class<?>, String>();
...
weakFillingClassesMap.put(clazz, "");

=> Can immediately be garbage collected (i.e. before limit on Metaspace 
or Heap is reached), as expected, of course


--

2) Value in the WeakHashMap is a Wrapper with a hard reference to the class:

private static WeakHashMap<Class<?>, Wrapper> weakFillingClassesMap =
    new WeakHashMap<Class<?>, Wrapper>();
private static class Wrapper { public Class<?> clazz; }
...
Wrapper wrapper = new Wrapper();
wrapper.clazz = clazz;
weakFillingClassesMap.put(clazz, wrapper);

=> Cannot be garbage collected, OutOfMemoryError once limit on Metaspace 
or Heap is reached


--

3) Value in the WeakHashMap is a Wrapper with a WeakReference to the class:

private static WeakHashMap<Class<?>, Wrapper> weakFillingClassesMap =
    new WeakHashMap<Class<?>, Wrapper>();
private static class Wrapper { public WeakReference<Class<?>> clazz; }
...
Wrapper wrapper = new Wrapper();
wrapper.clazz = new WeakReference<Class<?>>(clazz);
weakFillingClassesMap.put(clazz, wrapper);

=> Can immediately be garbage collected (i.e. before limit on Metaspace 
or Heap is reached)


--

4) Value in the WeakHashMap is a WeakReference<Wrapper> with a hard 
reference to the class in the Wrapper:

private static WeakHashMap<Class<?>, WeakReference<Wrapper>> weakFillingClassesMap =
    new WeakHashMap<Class<?>, WeakReference<Wrapper>>();
private static class Wrapper { public Class<?> clazz; }
...
Wrapper wrapper = new Wrapper();
wrapper.clazz = clazz;
weakFillingClassesMap.put(clazz, new WeakReference<Wrapper>(wrapper));

=> Can immediately be garbage collected (i.e. before limit on Metaspace 
or Heap is reached)


--

So, the basic idea would be to refactor ClassInfo caches to use 3) or 4) 
and maybe to override Introspector...
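
Cases 2) and 3) can be reproduced in miniature with plain objects as keys instead of classes. The sketch below is not the ClassGCTester code; all names in it are made up for illustration. Since System.gc() is only a request, the outcome of the weak case depends on the VM, whereas in the hard case the key stays strongly reachable through the map's own value and can never be collected.

```java
import java.lang.ref.WeakReference;
import java.util.WeakHashMap;

final class WeakValueWrapperSketch {

    /** Case 2: the map value holds a hard reference back to its key. */
    static final class HardWrapper { Object key; }

    /** Case 3: the map value holds only a weak reference back to its key. */
    static final class WeakWrapper { WeakReference<Object> key; }

    static final WeakHashMap<Object, HardWrapper> hardMap = new WeakHashMap<>();
    static final WeakHashMap<Object, WeakWrapper> weakMap = new WeakHashMap<>();

    /** Registers a fresh key as in case 2 and returns a probe for it. */
    static WeakReference<Object> putHard() {
        Object key = new Object();
        HardWrapper wrapper = new HardWrapper();
        wrapper.key = key;
        hardMap.put(key, wrapper);
        return new WeakReference<>(key);
    }

    /** Registers a fresh key as in case 3 and returns a probe for it. */
    static WeakReference<Object> putWeak() {
        Object key = new Object();
        WeakWrapper wrapper = new WeakWrapper();
        wrapper.key = new WeakReference<>(key);
        weakMap.put(key, wrapper);
        return new WeakReference<>(key);
    }

    /** Requests a GC a few times; returns true if the probed key was collected. */
    static boolean clearedAfterGc(WeakReference<?> probe) {
        for (int i = 0; i < 20 && probe.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return probe.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("case 2 collected: " + clearedAfterGc(putHard()));
        System.out.println("case 3 collected: " + clearedAfterGc(putWeak()));
    }
}
```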


Here's the hack I made for ClassInfo, based on the master branch (note 
that in that branch there is even still a hard reference to the Class in 
ClassInfo):


Not using ClassValue stuff at all:

/*private static final GroovyClassValue<ClassInfo> globalClassValue
    = GroovyClassValueFactory.createGroovyClassValue(new ComputeValue<ClassInfo>(){
    @Override
    public ClassInfo computeValue(Class<?> type) {
        ClassInfo ret = new ClassInfo(type);
        globalClassSet.add(ret);
        return ret;
    }
});*/

Instead getting ClassInfo from class from a refactored GlobalClassSet:

public static ClassInfo getClassInfo (Class cls) {
    return globalClassSet.get(cls);
    //return globalClassValue.get(cls);
}

and here is the refactored GlobalClassSet, now based on a WeakHashMap:

private static class GlobalClassSet {

    //private final ManagedLinkedList<ClassInfo> items = new ManagedLinkedList<ClassInfo>(weakBundle);
    private final WeakHashMap<Class,WeakReference<ClassInfo>> items
        = new WeakHashMap<Class,WeakReference<ClassInfo>>();

    public int size(){
        return values().size();
    }

    public int fullSize(){
        return values().size();
    }

    public Collection<ClassInfo> values(){
        synchronized(items){
            Collection<WeakReference<ClassInfo>> values = items.values();
            List<ClassInfo> list = new ArrayList<ClassInfo>();
            for (WeakReference<ClassInfo> value : values) {
                ClassInfo info = value.get();
                if (info != null) {
                    //System.out.println("ClassInfo is null");
                    list.add(info);
                }
            }
            return list;
            //return Arrays.asList(items.toArray(new ClassInfo[0]));
        }
    }

    public void add(ClassInfo value){
        synchronized(items){
            //items.add(value);
            items.put(value.klazz, new WeakReference<ClassInfo>(value));
        }
    }

    public ClassInfo get(Class cls) {
        WeakReference<ClassInfo> ref;
        synchronized(items) {
            ref = items.get(cls);
        }
        ClassInfo info;
        if (ref == null) {
            //System.out.println("ClassInfo Ref is null: " + cls.getName());
            info = new ClassInfo(cls);
            synchronized (items) {
                items.put(cls, new WeakReference<ClassInfo>(info));
            }
            return info;
        }
        info = ref.get();
        if (info == null) {
            //System.out.println("ClassInfo is null: " + cls.getName());
            info = new ClassInfo(cls);
            items.put(cls, new WeakReference<ClassInfo>(info));
            return info;
        }
retu

Re: Improve Groovy class loading performance and memory management

2016-05-17 Thread Alain Stalder

On 17.05.16 09:48, Alain Stalder wrote:
Experimentally, for the test with ClassGCTester, the first call 
(metaRegistry.removeMetaClass(clazz)) was not necessary to have 
garbage collection before Metaspace reaches the maximum, the other two 
were.

Makes sense:
metaRegistry keeps no reference to the class, instead it gets ClassInfo 
and stores the MetaClass there.
Introspector has a reference to the class: It contains a WeakHashMap 
(resp. a class derived from it) with the class as the key and an array of 
java.lang.reflect.Method as the value, each of which, in turn, has the 
class reference in a private field.
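
The same shape can be imitated outside of Introspector. The sketch below is only an analogy under that assumption (MethodCacheSketch and its methods are hypothetical and do not show Introspector's actual internals): a weak-keyed map whose values are Method arrays, where every Method leads back to the key through getDeclaringClass(), so only an explicit removal releases the entry.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.WeakHashMap;

final class MethodCacheSketch {

    /** Stand-in for a loaded script class. */
    static final class Sample {
        public int answer() { return 42; }
    }

    /** Weak keys, but each cached Method refers back to its declaring class. */
    static final Map<Class<?>, Method[]> cache = new WeakHashMap<>();

    static void remember(Class<?> type) {
        cache.put(type, type.getDeclaredMethods());
    }

    /** True if any cached Method of the given class points back to that class. */
    static boolean valueReferencesKey(Class<?> type) {
        Method[] methods = cache.get(type);
        if (methods == null) return false;
        for (Method method : methods) {
            if (method.getDeclaringClass() == type) return true;
        }
        return false;
    }

    /** Explicit removal, comparable in spirit to Introspector.flushFromCaches(clazz). */
    static void forget(Class<?> type) {
        cache.remove(type);
    }
}
```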


Alain


Re: Improve Groovy class loading performance and memory management

2016-05-17 Thread Alain Stalder


On 17.05.16 09:04, Alain Stalder wrote:
PS: Note that Introspector.flushFromCaches(clazz) was experimentally 
really not necessary in this case, but maybe has to do with the simple 
nature of the test script ("42") and only calling a (no-args) 
constructor... In any case very promising...


Ah, that's simply because it is already called in 
InvokerHelper.removeClass():


public static void removeClass(Class clazz) {
metaRegistry.removeMetaClass(clazz);
ClassInfo.remove(clazz);
Introspector.flushFromCaches(clazz);
}

Experimentally, for the test with ClassGCTester, the first call 
(metaRegistry.removeMetaClass(clazz)) was not necessary to have garbage 
collection before Metaspace reaches the maximum, the other two were.
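
For callers that cannot depend on Groovy at compile time, the call can be made reflectively, as in the modified test. A hedged sketch follows: GroovyCleanupSketch and removeFromGroovyCaches are made-up names, the only Groovy method assumed is the InvokerHelper.removeClass(Class) quoted above, and the helper simply reports false when Groovy is not on the classpath.

```java
import java.lang.reflect.Method;

final class GroovyCleanupSketch {

    /**
     * Calls org.codehaus.groovy.runtime.InvokerHelper.removeClass(clazz) reflectively.
     * Returns false if Groovy is not on the classpath, true if the call was made.
     */
    static boolean removeFromGroovyCaches(Class<?> clazz) {
        try {
            Class<?> helper = Class.forName("org.codehaus.groovy.runtime.InvokerHelper");
            Method removeClass = helper.getMethod("removeClass", Class.class);
            removeClass.invoke(null, clazz); // static method, so no receiver
            return true;
        } catch (ClassNotFoundException e) {
            return false; // Groovy is not available in this VM
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("InvokerHelper.removeClass failed", e);
        }
    }
}
```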


Alain


Re: Improve Groovy class loading performance and memory management

2016-05-17 Thread Alain Stalder


On 17.05.16 07:53, Alain Stalder wrote:
I will definitely try out the InvokerHelper.removeClass(clazz) with 
added ClassInfo removal plus Introspector.flushFromCaches(clazz) and 
see if I can get garbage collection before reaching the limit on 
Metaspace or Heap.

Fantastic, I really got it to work :)

I took groovy master plus changes for GROOVY-7646 and GROOVY-7683 and 
modified the test in ClassGCTester as follows:


--
long nanoT0 = System.nanoTime();
Class clazz = classLoader.loadClass(testClassName);
long nanoT1 = System.nanoTime();
clazz.newInstance();
long nanoT2 = System.nanoTime();
loadTimeTotal += (nanoT1 - nanoT0);
createTimeTotal += (nanoT2 - nanoT1);
weakFillingClassesMap.put(clazz, "");

// added:
Class invokerHelperClass =
    ClassGCTester.class.getClassLoader().loadClass("org.codehaus.groovy.runtime.InvokerHelper");
Method removeClassMethod =
    invokerHelperClass.getDeclaredMethod("removeClass", Class.class);
removeClassMethod.invoke(null, clazz);
//Introspector.flushFromCaches(clazz);
--

And then ran the test as follows (not using ClassValue, loading the test 
class not from the same ClassLoader as Groovy itself):


$ java -XX:MaxMetaspaceSize=64m -Xmx512m -cp .:groovy-2.5.0-SNAPSHOT.jar 
ClassGCTester -cp filling/ -parent tester -classes GroovyFilling


--
Secs  Test classes         Metaspace/PermGen  Heap                Load time  Create time
      #loaded  #remaining    used  committed     used  committed    average      average
   0        1           1    6.3m       6.5m    12.8m     245.5m    0.838ms     11.112ms
   1      486         486    9.2m      10.5m    27.6m     245.5m    0.326ms      1.624ms
   2     1404        1404   12.7m      16.1m    30.3m     245.5m    0.244ms      1.124ms
   3     2308          61    7.6m      16.8m    12.2m     228.5m    0.241ms      1.014ms
   4     3461        1214   12.0m      16.8m    46.4m     244.0m    0.212ms      0.908ms
   5     4577        2330   16.3m      21.4m    69.1m     240.0m    0.200ms      0.860ms
   6     5581        3334   20.1m      27.4m    75.1m     237.5m    0.197ms      0.847ms
   7     6703         974   11.1m      18.2m    11.3m     268.0m    0.197ms      0.818ms
   8     7996        2267   16.0m      23.4m    65.7m     253.5m    0.188ms      0.786ms
   9     9261        3532   20.9m      28.5m   115.1m     267.5m    0.182ms      0.765ms
  10    10518         960   11.0m      20.0m    10.9m     285.0m    0.181ms      0.747ms
  11    11841        2283   16.1m      24.0m    68.9m     285.0m    0.177ms      0.730ms
  12    13097        3539   20.9m      29.0m   113.9m     285.0m    0.173ms      0.722ms
  13    14288         331    8.7m      21.3m    49.6m     314.0m    0.174ms      0.715ms
  14    15640        1683   13.8m      22.9m   105.5m     316.0m    0.170ms      0.705ms
  15    16923        2966   18.7m      27.9m   150.4m     315.5m    0.168ms      0.699ms
  16    18128        4171   23.3m      32.6m    54.6m     316.0m    0.166ms      0.697ms
  17    19360         628    9.8m      21.5m    88.6m     347.0m    0.167ms      0.692ms
  18    20707        1975   15.0m      24.6m   137.6m     346.5m    0.165ms      0.685ms
  19    21958        3226   19.7m      29.6m    39.1m     347.5m    0.164ms      0.683ms
  20    23172        4440   24.4m      34.4m    66.4m     349.5m    0.163ms      0.682ms
  21    24430         861   10.7m      21.9m   120.7m     383.5m    0.163ms      0.678ms
  22    25725        2156   15.7m      25.5m    19.0m     381.0m    0.162ms      0.675ms
  23    26986        3417   20.5m      30.5m    49.3m     385.0m    0.161ms      0.674ms
  24    28192        4623   25.1m      35.3m    70.8m     384.5m    0.161ms      0.673ms
  25    29454         956   11.1m      21.0m   134.6m     416.0m    0.161ms      0.671ms
  26    30737        2239   16.0m      26.3m    25.2m     416.5m    0.160ms      0.669ms

[...]
--

Nicely garbage collects repeatedly, Metaspace stays well below the 
configured maximum of 64m...


PS: Note that Introspector.flushFromCaches(clazz) was experimentally 
really not necessary in this case, but maybe has to do with the simple 
nature of the test script ("42") and only calling a (no-args) 
constructor... In any case very promising...


Alain





Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Jochen Theodorou

On 17.05.2016 07:53, Alain Stalder wrote:

That looks very good to me :)

I will definitely try out the InvokerHelper.removeClass(clazz) with
added ClassInfo removal plus Introspector.flushFromCaches(clazz) and see
if I can get garbage collection before reaching the limit on Metaspace
or Heap.

And, maybe something like the following could be added to the
GroovyClassLoader? Thinking aloud:

Assuming the following is true: Any class can only be garbage collected
once its ClassLoader can be garbage collected, because each class keeps
a reference to its ClassLoader (so that it can use it to load further
classes when needed when running methods).


not only the class, the classloader internals also keep such a 
reference. And I mean java.lang.ClassLoader here.



So why not have the GroovyClassLoader keep a set of all classes it
compiled itself and were loaded and offer a new ~
GroovyClassLoader#finalCleanup() method that removes meta information
for all these classes so that they would become immediately eligible for
garbage collection? (I guess InvokerHelper.removeClass(clazz) and
Introspector.flushFromCaches(clazz), but whatever is needed...)


GroovyClassLoader (GCL) actually represents a tree of class loaders. For 
each compilation GCL will spawn an instance of InnerLoader. Since two 
different compilations are supposed to know each other's classes, a list 
of classes is kept in GCL itself (see classCache). The inner loader 
itself is not referenced by GCL. Because of that list GCL has the 
clearCache method to remove classes from previous compilations.


Why did we use this structure? GCL is supposed to offer you the 
possibility to compile the same class multiple times. That means you 
will get the same class multiple times. At the same time a class must be 
defined under the same name only once in a given defining class loader. 
As a result, trying to define a class that already exists under that 
name results in an error. A classloading constraint is actually to 
return the same class instance each time you request a class with a 
certain name. This implies the error mentioned before; it also means GCL 
is breaking those constraints knowingly.
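
The "only once per defining class loader" rule can be observed with plain JDK class loaders. The following sketch is not GroovyClassLoader code: OneShotLoader is a hypothetical stand-in for the InnerLoader idea, SketchPayload stands in for a compiled script class, and the example assumes the compiled SketchPayload.class file is reachable as a classpath resource.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Stand-in for a class produced by one script compilation. */
final class SketchPayload { }

final class DefiningLoaderSketch {

    /** Child loader that defines exactly the bytes it is handed. */
    static final class OneShotLoader extends ClassLoader {
        OneShotLoader(ClassLoader parent) { super(parent); }

        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Reads the class file of SketchPayload from the classpath. */
    static byte[] payloadBytes() {
        String resource = SketchPayload.class.getSimpleName() + ".class";
        try (InputStream in = SketchPayload.class.getResourceAsStream(resource)) {
            if (in == null) throw new IOException("class file not found: " + resource);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            for (int n; (n = in.read(buffer)) != -1; ) out.write(buffer, 0, n);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = payloadBytes();
        String name = SketchPayload.class.getName();
        ClassLoader parent = SketchPayload.class.getClassLoader();

        // Two defining loaders: the same name may be defined once in each of them.
        Class<?> a = new OneShotLoader(parent).define(name, bytes);
        OneShotLoader second = new OneShotLoader(parent);
        Class<?> b = second.define(name, bytes);
        System.out.println("distinct classes: " + (a != b));

        // Same defining loader again: the VM rejects the duplicate definition.
        try {
            second.define(name, bytes);
        } catch (LinkageError e) {
            System.out.println("duplicate rejected: " + e.getClass().getSimpleName());
        }
    }
}
```

Each defined class reports its own loader from getClassLoader(), which is also why it can only be unloaded together with that loader.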


Anyway... I think such a cleanup method is misplaced in GCL, since it 
spans beyond the classloader... how about GroovySystem?


bye Jochen


Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Alain Stalder

That looks very good to me :)

I will definitely try out the InvokerHelper.removeClass(clazz) with 
added ClassInfo removal plus Introspector.flushFromCaches(clazz) and see 
if I can get garbage collection before reaching the limit on Metaspace 
or Heap.


And, maybe something like the following could be added to the 
GroovyClassLoader? Thinking aloud:


Assuming the following is true: Any class can only be garbage collected 
once its ClassLoader can be garbage collected, because each class keeps 
a reference to its ClassLoader (so that it can use it to load further 
classes when needed when running methods).


So why not have the GroovyClassLoader keep a set of all classes it 
compiled itself and were loaded and offer a new ~ 
GroovyClassLoader#finalCleanup() method that removes meta information 
for all these classes so that they would become immediately eligible for 
garbage collection? (I guess InvokerHelper.removeClass(clazz) and 
Introspector.flushFromCaches(clazz), but whatever is needed...)


This would not help with Groovy classes that were precompiled and 
loaded, say, with an URLClassLoader, but would help with ones that were 
dynamically compiled at runtime.


Alain


On 17.05.16 00:35, John Wagenleitner wrote:



On Mon, May 16, 2016 at 1:34 PM, Alain Stalder wrote:


Thanks, I had not looked at the master branch, ClassInfo source
looks quite a bit cleaner there already :)

Regarding programmatic cleanup (GROOVY-7646), I think that is a
good idea, but in the details there might be some obstacles.


I definitely agree, many obstacles usually present themselves once you 
scratch the surface. :)




For example this sequence of calls to GroovyShell:

    def shell = new GroovyShell()
    def script1 = shell.parse("42")
    assert script1.class.name == "Script1"
    def script2 = shell.parse("new Script1().run()")
    assert script2.run() == 42

    def script3 = shell.parse("99", "Nintetynine")
    assert script3.class.name == "Nintetynine"

    def file = new File("Fiftyfive.groovy")
    file.setText("55")
    def script4 = shell.parse(file)
    assert script4.class.name == "Fiftyfive"

So, classes accumulate (in the GroovyClassLoader) and can be
addressed by their names in subsequent scripts. (And for more
complex script expressions, more than one class might be the
result of compilation, e.g. with closures or inner classes, enums,
etc.)



This passes with the changes in place for GROOVY-7646, though calls to 
parse don't call the added clean-up code. It still passes if I change 
parse to run.




I would estimate that in the case where a script is run with a
name given automatically by the GroovyShell ("Script1", "Script2",
...) it would be OK to do the cleanup (and I guess using
GroovyShell that way might be a very common case?), but when it
comes to explicitly named scripts, doing so might change behaviour
of existing code.



For quite some time GroovyShell#evaluate(Reader,String filename) was 
doing this kind of cleanup [1].  Cleanup meaning removing the 
metaClass, the ClassInfo from the cache and the Introspector 
beanInfoCache.  As long as the ClassLoader and any of its classes are 
still referenced, the Classes that result from parse/run calls would 
still be available.  But you are right, there are so many ways the 
shell can be used that it is difficult to tell what it might break.



I just took a look at GroovyScriptEngine which also has run()
methods and, if I remember correctly, it recompiles all scripts if
one of them changes (to get dependencies right), so in principle
lots of classes to cleanup for each time this happens. But I am
not sure if that is possible there, because there is also a
createScript() method, so possibly still objects/classes that are
in use around.

(And I have also just started to think about Grengine in that
context, my open source library for using Groovy in a Java VM (and
which almost nobody uses ;), there it might be easier to build in
such automatic removal because the approach is more structured,
although a bit less dynamic.)

Hmn, would really be great if there was a way to achieve constant
garbage collection of Groovy classes.



I take constant to mean not waiting until heap/metaspace is filled 
before collection.  If so, from what I've seen that would still 
require some user intervention 
(java.beans.Introspector.flushFromCaches(clazz)) in order to clear the 
Introspector cache, which keeps a Soft Reference to the main method of a 
Script class, which in turn references the Class.




Alain



Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread John Wagenleitner
On Mon, May 16, 2016 at 1:34 PM, Alain Stalder wrote:

> Thanks, I had not looked at the master branch, ClassInfo source looks
> quite a bit cleaner there already :)
>
> Regarding programmatic cleanup (GROOVY-7646), I think that is a good idea,
> but in the details there might be some obstacles.
>

I definitely agree, many obstacles usually present themselves once you
scratch the surface. :)



>
> For example this sequence of calls to GroovyShell:
>
> def shell = new GroovyShell()
> def script1 = shell.parse("42")
> assert script1.class.name == "Script1"
> def script2 = shell.parse("new Script1().run()")
> assert script2.run() == 42
>
> def script3 = shell.parse("99", "Nintetynine")
> assert script3.class.name == "Nintetynine"
>
> def file = new File("Fiftyfive.groovy")
> file.setText("55")
> def script4 = shell.parse(file)
> assert script4.class.name == "Fiftyfive"
>
> So, classes accumulate (in the GroovyClassLoader) and can be addressed by
> their names in subsequent scripts. (And for more complex script
> expressions, more than one class might be the result of compilation, e.g.
> with closures or inner classes, enums, etc.)
>


This passes with the changes in place for GROOVY-7646, though calls to
parse don't call the added clean-up code.  It still passes if I change
parse to run.



> I would estimate that in the case where a script is run with a name given
> automatically by the GroovyShell ("Script1", "Script2", ...) it would be OK
> to do the cleanup (and I guess using GroovyShell that way might be a very
> common case?), but when it comes to explicitly named scripts, doing so
> might change behaviour of existing code.
>


For quite some time GroovyShell#evaluate(Reader,String filename) was doing
this kind of cleanup [1].  Cleanup meaning removing the metaClass, the
ClassInfo from the cache and the Introspector beanInfoCache.  As long as
the ClassLoader and any of its classes are still referenced, the Classes
that result from parse/run calls would still be available.  But you are
right, there are so many ways the shell can be used that it is difficult
to tell what it might break.



>
> I just took a look at GroovyScriptEngine which also has run() methods and,
> if I remember correctly, it recompiles all scripts if one of them changes
> (to get dependencies right), so in principle lots of classes to cleanup for
> each time this happens. But I am not sure if that is possible there,
> because there is also a createScript() method, so possibly still
> objects/classes that are in use around.
>
> (And I have also just started to think about Grengine in that context, my
> open source library for using Groovy in a Java VM (and which almost nobody
> uses ;), there it might be easier to build in such automatic removal
> because the approach is more structured, although a bit less dynamic.)
>
> Hmn, would really be great if there was a way to achieve constant garbage
> collection of Groovy classes.
>


I take constant to mean not waiting until heap/metaspace is filled before
collection.  If so, from what I've seen that would still require some user
intervention (java.beans.Introspector.flushFromCaches(clazz)) in order to
clear the Introspector cache, which keeps a SoftReference to the main method
of a Script class, which in turn references the Class.
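
To make the manual intervention mentioned above concrete, here is a minimal Java sketch. The class IntrospectorCleanup and its nested SampleScript are hypothetical names made up for illustration (they are not part of Groovy); only java.beans.Introspector.getBeanInfo and flushFromCaches are real JDK calls. It populates the Introspector cache for a class and then flushes that class from it again:

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;

// Hypothetical helper for illustration only; not part of Groovy.
class IntrospectorCleanup {

    // Stand-in for a class that resulted from compiling a script.
    public static class SampleScript {
        public String getName() { return "sample"; }
    }

    /**
     * Introspects the class (which populates the Introspector cache),
     * then flushes that class from the cache so the cache no longer
     * holds on to it. Returns the number of bean properties found.
     */
    static int inspectThenFlush(Class<?> clazz) {
        try {
            BeanInfo info = Introspector.getBeanInfo(clazz);
            int propertyCount = info.getPropertyDescriptors().length;
            Introspector.flushFromCaches(clazz);
            return propertyCount;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(inspectThenFlush(SampleScript.class));
    }
}
```

Whether this alone is enough to make a script class collectable depends on what else still references it (metaclass, ClassInfo, class loader), as discussed in this thread.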



>
>
> Alain
>
>
>

[1]
https://github.com/apache/groovy/commit/5724870025c25622015ba13c0310def5742d0b2f#diff-62f4f9c1bd5efea3ddcfe563c25f953eR459



> On 16.05.16 18:18, John Wagenleitner wrote:
>
> Just catching up on this thread, very interesting discussion and will have
> to give the posted test code a try.
>
> You are right about the PhantomReference and it has been removed in
> master [1] along with the local cache that used it.  Due to some
> refactorings that were not in 2_4_X at the time it wasn't removed from that
> branch.  But probably should be cleaned up if any fixes for the memory
> issues to ClassInfo are merged into that branch in the near future.
>
> I think the suggestion of referencing the Class via the ClassInfo from
> metaclasses/cachedclasses would be a good one; the fewer places the Class is
> kept the better.  Unfortunately, since it's a protected field on
> MetaClassImpl, that would be a breaking change and so something for a 3.0,
> as you pointed out.
>
> So far, I have found that keeping a WeakReference to the Class in
> ClassInfo allows it to be collected (mostly tested with non-ClassValue
> version of ClassInfo).  At least one exception is if methods are added to
> the metaclass then it's required to setMetaClass(null) since the
> ExpandoMetaClass is a strong reference on ClassInfo and EMC has a strong
> reference to the Class.  What is difficult to determine is if keeping a
> WeakReference can cause any potential issues.  Only possible problem I can
> see is if the methods of the Class A were referenced in the MetaMethodIndex
> for Class B, but I think in that case as long as the Class B was strongly
> referenced the index itself would ke

Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Alain Stalder
Thanks, I had not looked at the master branch, ClassInfo source looks 
quite a bit cleaner there already :)


Regarding programmatic cleanup (GROOVY-7646), I think that is a good 
idea, but in the details there might be some obstacles.


For example this sequence of calls to GroovyShell:

def shell = new GroovyShell()
def script1 = shell.parse("42")
assert script1.class.name == "Script1"
def script2 = shell.parse("new Script1().run()")
assert script2.run() == 42

def script3 = shell.parse("99", "Nintetynine")
assert script3.class.name == "Nintetynine"

def file = new File("Fiftyfive.groovy")
file.setText("55")
def script4 = shell.parse(file)
assert script4.class.name == "Fiftyfive"

So, classes accumulate (in the GroovyClassLoader) and can be addressed 
by their names in subsequent scripts. (And for more complex script 
expressions, more than one class might be the result of compilation, 
e.g. with closures or inner classes, enums, etc.)


I would estimate that in the case where a script is run with a name 
given automatically by the GroovyShell ("Script1", "Script2", ...) it 
would be OK to do the cleanup (and I guess using GroovyShell that way 
might be a very common case?), but when it comes to explicitly named 
scripts, doing so might change behaviour of existing code.


I just took a look at GroovyScriptEngine which also has run() methods 
and, if I remember correctly, it recompiles all scripts if one of them 
changes (to get dependencies right), so in principle lots of classes to 
cleanup for each time this happens. But I am not sure if that is 
possible there, because there is also a createScript() method, so 
possibly still objects/classes that are in use around.


(And I have also just started to think about Grengine in that context, 
my open source library for using Groovy in a Java VM (and which almost 
nobody uses ;), there it might be easier to build in such automatic 
removal because the approach is more structured, although a bit less 
dynamic.)


Hmn, would really be great if there was a way to achieve constant 
garbage collection of Groovy classes.


Alain

On 16.05.16 18:18, John Wagenleitner wrote:
Just catching up on this thread, very interesting discussion and will 
have to give the posted test code a try.


You are right about the PhantomReference and it has been removed in 
master [1] along with the local cache that used it.  Due to some 
refactorings that were not in 2_4_X at the time it wasn't removed from 
that branch. But probably should be cleaned up if any fixes for the 
memory issues to ClassInfo are merged into that branch in the near future.


I think the suggestion of referencing the Class via the ClassInfo from
metaclasses/cachedclasses would be a good one; the fewer places the
Class is kept the better.  Unfortunately, since it's a protected field
on MetaClassImpl, that would be a breaking change and so something
for a 3.0, as you pointed out.


So far, I have found that keeping a WeakReference to the Class in 
ClassInfo allows it to be collected (mostly tested with non-ClassValue 
version of ClassInfo).  At least one exception is if methods are added 
to the metaclass then it's required to setMetaClass(null) since the 
ExpandoMetaClass is a strong reference on ClassInfo and EMC has a 
strong reference to the Class.  What is difficult to determine is if 
keeping a WeakReference can cause any potential issues.  Only possible 
problem I can see is if the methods of the Class A were referenced in 
the MetaMethodIndex for Class B, but I think in that case as long as 
the Class B was strongly referenced the index itself would keep Class 
A referenced.


In environments where lots of scripts are being parsed and run and 
references to the Class are not retained, it might be worth having a 
way to programmatically initiate the cleanup so as not to have to wait 
for the Soft References to be collected.  The extra performance cost
of clearing a few references might not be as high as that of constantly
hitting the upper heap limit.  It is something I have
looked at for GROOVY-7646 [2].  Parsed groovy classes should be
collectible by default without any intervention, but there may be 
cases where an API to help speed the removal might be useful too.



[1] 
https://github.com/apache/groovy/commit/e967039222dc01a59824f95d9313a3b2e7aa9f50 


[2] https://github.com/apache/groovy/pull/325


On Mon, May 16, 2016 at 8:01 AM, Alain Stalder > wrote:


My time here is running out, other things to attend to, so here is what
I wrote about the current state of class loading and garbage collection
in Groovy in the just updated user manual of Grengine:

https://www.grengine.ch/manual.html#the-cost-of-session-separation
--
 The Cost of Session Separation

Although loading classes from bytecode obtained from compiling
Groovy scripts
is a lot less expensive than compiling them (plus afterwards also
loading the
res

Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread John Wagenleitner
Just catching up on this thread, very interesting discussion and will have
to give the posted test code a try.

You are right about the PhantomReference and it has been removed in master
[1] along with the local cache that used it.  Due to some refactorings that
were not in 2_4_X at the time it wasn't removed from that branch.  But
probably should be cleaned up if any fixes for the memory issues to
ClassInfo are merged into that branch in the near future.

I think the suggestion of referencing the Class via the ClassInfo from
metaclasses/cachedclasses would be a good one; the fewer places the Class is
kept the better.  Unfortunately, since it's a protected field on
MetaClassImpl, that would be a breaking change and so something for a 3.0,
as you pointed out.

So far, I have found that keeping a WeakReference to the Class in ClassInfo
allows it to be collected (mostly tested with non-ClassValue version of
ClassInfo).  At least one exception is if methods are added to the
metaclass then it's required to setMetaClass(null) since the
ExpandoMetaClass is a strong reference on ClassInfo and EMC has a strong
reference to the Class.  What is difficult to determine is if keeping a
WeakReference can cause any potential issues.  Only possible problem I can
see is if the methods of the Class A were referenced in the MetaMethodIndex
for Class B, but I think in that case as long as the Class B was strongly
referenced the index itself would keep Class A referenced.

In environments where lots of scripts are being parsed and run and
references to the Class are not retained, it might be worth having a way to
programmatically initiate the cleanup so as not to have to wait for the
Soft References to be collected.  The extra performance cost of clearing a
few references might not be as high as that of constantly hitting the upper
heap limit.  It is something I have looked at for GROOVY-7646 [2].
Parsed groovy classes should be collectible by default without any
intervention, but there may be cases where an API to help speed the removal
might be useful too.


[1]
https://github.com/apache/groovy/commit/e967039222dc01a59824f95d9313a3b2e7aa9f50

[2] https://github.com/apache/groovy/pull/325


On Mon, May 16, 2016 at 8:01 AM, Alain Stalder  wrote:

> My time here is running out, other things to attend to, so here is what
> I wrote about the current state of class loading and garbage collection
> in Groovy in the just updated user manual of Grengine:
>
> https://www.grengine.ch/manual.html#the-cost-of-session-separation
> --
>  The Cost of Session Separation
>
> Although loading classes from bytecode obtained from compiling Groovy scripts
> is a lot less expensive than compiling them (plus afterwards also loading the
> resulting bytecode), it is still somewhat more expensive than one might naively
> expect and there are a few things to be aware of when operating that way.
>
> In the following, I will simply call classes compiled by the Groovy compiler
> from Groovy scripts/sources _Groovy classes_ and classes compiled by the Java
> compiler from Java sources _Java classes_.
>
> * *Class Loading* +
>   Experimentally, loading of a typical Groovy class is often about 10 times
>   slower than loading a Java class with similarly complex source code, but
>   both are relatively expensive operations (of the order of a millisecond
>   for a small Groovy class, to give a rough indication). For Java classes,
>   this is apparently mainly expensive because some security checks have to
>   be made on the bytecode. For Groovy classes, it is mainly expensive
>   because some meta information is needed to later efficiently call methods
>   dynamically, and the like.
> * *Garbage Collection* +
>   Classes are stored in _PermGen_ (up to Java 7) resp. _Metaspace_ (Java 8
>   and later) plus some associated data on the Heap, at least for Groovy
>   classes the latter is normally the case (meta information). Whereas for
>   Java classes, unused classes appear to be usually garbage collected from
>   PermGen/Metaspace continuously, with Groovy classes this experimentally
>   does not happen before PermGen/Metaspace or the Heap reach a configured
>   limit. Why exactly this is so and whether it is easy to change and whether
>   it will change in the future, is difficult to answer for me, I find the
>   code around it is rather convoluted, hard to untangle. Note that by default
>   on Java VMs there is typically no limit set for Metaspace (but there is
>   for PermGen), so setting a limit is crucial in practice when using Groovy.
> * *Garbage Collection Bugs* +
>   In the past, several Groovy versions had failed at garbage collecting
>   Groovy classes and their class loaders, resulting finally in an
>   `OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the Heap,
>   whichever limit was reached first. If when you are reading this, Groovy
>   2.4.6 is (still) the newest version, make sure you set the system property
>

Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Alain Stalder

My time here is running out, other things to attend to, so here is what
I wrote about the current state of class loading and garbage collection
in Groovy in the just updated user manual of Grengine:

https://www.grengine.ch/manual.html#the-cost-of-session-separation
--
 The Cost of Session Separation

Although loading classes from bytecode obtained from compiling Groovy scripts
is a lot less expensive than compiling them (plus afterwards also loading the
resulting bytecode), it is still somewhat more expensive than one might naively
expect and there are a few things to be aware of when operating that way.

In the following, I will simply call classes compiled by the Groovy compiler
from Groovy scripts/sources _Groovy classes_ and classes compiled by the Java
compiler from Java sources _Java classes_.

* *Class Loading* +
  Experimentally, loading of a typical Groovy class is often about 10 times
  slower than loading a Java class with similarly complex source code, but
  both are relatively expensive operations (of the order of a millisecond
  for a small Groovy class, to give a rough indication). For Java classes,
  this is apparently mainly expensive because some security checks have to
  be made on the bytecode. For Groovy classes, it is mainly expensive
  because some meta information is needed to later efficiently call methods
  dynamically, and the like.
* *Garbage Collection* +
  Classes are stored in _PermGen_ (up to Java 7) resp. _Metaspace_ (Java 8
  and later) plus some associated data on the Heap, at least for Groovy
  classes the latter is normally the case (meta information). Whereas for
  Java classes, unused classes appear to be usually garbage collected from
  PermGen/Metaspace continuously, with Groovy classes this experimentally
  does not happen before PermGen/Metaspace or the Heap reach a configured
  limit. Why exactly this is so and whether it is easy to change and whether
  it will change in the future, is difficult to answer for me, I find the
  code around it is rather convoluted, hard to untangle. Note that by default
  on Java VMs there is typically no limit set for Metaspace (but there is
  for PermGen), so setting a limit is crucial in practice when using Groovy.
* *Garbage Collection Bugs* +
  In the past, several Groovy versions had failed at garbage collecting
  Groovy classes and their class loaders, resulting finally in an
  `OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the Heap,
  whichever limit was reached first. If, when you are reading this, Groovy
  2.4.6 is (still) the newest version, make sure you set the system property
  `groovy.use.classvalue=true` in the context of Grengine. Note that under
  different circumstances, like the one described in
  https://issues.apache.org/jira/browse/GROOVY-7591[GROOVY-7591:
  Use of ClassValue causes major memory leak] you would instead have had to
  set it to false! That Groovy bug is actually in turn due to a bug in
  Oracle/OpenJDK Java VMs regarding garbage collection under some
  circumstances, more precisely a bug in a new feature (`ClassValue`)
  introduced in order to make things easier(!) for dynamic languages in the
  Java VM, see https://bugs.openjdk.java.net/browse/JDK-8136353[JDK-8136353].

So, if you want to use session separation with Grengine (or otherwise want
to load many Groovy classes repeatedly), first set a limit on PermGen/Metaspace,
then verify that classes can be garbage collected in an environment close to
production and that throughput under load would be sufficient (despite the
relatively slow class loading performance of Groovy (and Java) classes in the
Java VM) and then use it. And don't forget to repeat this at least when you
upgrade Groovy to a new version, but possibly also when you upgrade Java.

Or see the next section for an alternative...
--

PS: By the way, very funny how Jochen Theodorou "garbage collected" what I
wrote about PhantomReference to a "[...]"...

Good luck with Groovy garbage collection.

Alain


Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Alain Stalder

Maybe that explains the general idea better although a bit abstractly:

classValue.get(clazz).getOrCreateClassInfo(clazz).getOrCreateMetaClass(clazz).doWhatever(clazz-if-needed).getSomeMoreIfNeeded(clazz-if-needed)

Don't store the class in any objects of that chain, get the associated 
object only from ClassValue. Maybe that is too limiting in practice, 
maybe it just requires to think a bit differently, I don't know.
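
A minimal Java sketch of that idea, under stated assumptions: ClassValueSketch and Info are hypothetical names invented here (they are not Groovy's ClassInfo), and only java.lang.ClassValue is a real JDK API. The metadata object stores derived data only, and every operation that needs the class gets it passed in by the caller:

```java
// Hypothetical illustration; not Groovy's actual ClassInfo.
class ClassValueSketch {

    // Metadata that keeps derived data only, no field of type Class.
    static final class Info {
        final String simpleName;
        final int publicMethodCount;

        Info(Class<?> type) {
            // The class is used during construction but not stored.
            this.simpleName = type.getSimpleName();
            this.publicMethodCount = type.getMethods().length;
        }

        // Operations that need the class receive it from the caller.
        String describe(Class<?> type) {
            return type.getName() + " has " + publicMethodCount + " public methods";
        }
    }

    // ClassValue computes the metadata lazily, once per class.
    private static final ClassValue<Info> INFO = new ClassValue<Info>() {
        @Override
        protected Info computeValue(Class<?> type) {
            return new Info(type);
        }
    };

    static Info infoFor(Class<?> type) {
        return INFO.get(type);
    }

    public static void main(String[] args) {
        Class<?> clazz = StringBuilder.class;
        System.out.println(infoFor(clazz).describe(clazz));
    }
}
```

This only shows the shape of the API. As pointed out in the replies, real metaclasses and method accessors tend to reference the class again, so the sketch says nothing about whether Groovy's structures could be reworked this way.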


Alain

On 16.05.16 14:01, Jochen Theodorou wrote:

Which brings my mind back to my question regarding whether it is "good
architecture" to have a reference to the class in ClassInfo (or any
other metadata associated with a class) - again, I mean fundamentally,
independently of whether this is an option for a Groovy 2.4.7 or even
anything before a Groovy 3, because I fear it would likely require to
change several Groovy APIs and internals.


ok, let's assume the ClassInfo does not reference the class, then as
soon as you have a MetaClass, you have a reference to the class again. 
If not there, then in the method accessors...






Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Jochen Theodorou

On 16.05.2016 12:46, Alain Stalder wrote:

In order to get a better understanding, I made two configurable changes
in ClassInfo, in a branch from the GROOVY_2_4_6 tag (ClassInfo is still
practically the same in the GROOVY_2_4_X branch):

- -Dgctest.classreftype=(hard|soft|weak|phantom), where hard=as today,
soft=SoftReference
- -Dgctest.cacheclassvalue=(true|false), if true and using ClassValue,
then do not cache it

See here:
https://github.com/jexler/groovy/compare/GROOVY_2_4_6...jexler:f92c2866653208ad68db5580b5bf9febc347fe1d


Compiled Groovy JAR:
https://www.jexler.net/groovy-2.4.6-gctest.jar

[...]

Then I ran a full matrix of tests:


  same loader | use class value | cache class value | hard | soft | weak
  ------------+-----------------+-------------------+------+------+-----
  YES         | YES             | YES               | FAIL | FAIL | FAIL
  YES         | YES             | NO                | FAIL | FAIL | FAIL
  YES         | NO              | --                | OK   | OK*  | OK*

  NO          | YES             | YES               | OK   | OK*  | OK*
  NO          | YES             | NO                | OK   | OK*  | OK*
  NO          | NO              | --                | FAIL | OK*  | OK*


- "same loader" <=> java [opts] -XX:MaxMetaspaceSize=64m -Xmx512m -cp .
ClassGCTester -cp groovy-2.4.6-gctest.jar:filling/ -parent null -classes
GroovyFilling
- not "same loader" <=> java [opts] -XX:MaxMetaspaceSize=64m -Xmx512m
-cp .:groovy-2.4.6-gctest.jar ClassGCTester -cp filling/ -parent tester
-classes GroovyFilling
- "use class value" <=> -Dgroovy.use.classvalue=
- "cache class value" <=> -Dgctest.cacheclassvalue=
- "hard"|"soft"|"weak" <=> -Dgctest.classreftype=

* Garbage collection in all cases still only when the limit on Metaspace
or Heap is reached.

So:
- Caching ClassValue or not made no difference.
- Using weak or soft references did not help when using ClassValue.
- When not using ClassValue, using weak or soft references helped. :)


Even if hard references had been used everywhere, it is only for a 
single iteration. It means unless data leaks into a more global 
structure, it must be collectable. So the non-ClassValue version working 
in this scenario is no sign of correctness for the memory awareness of 
the used structures at all.




Actually the latter is also reflected (as I noticed in retrospect) by
the pull request by John Wagenleitner for "GROOVY-7683 - Memory leak
when using Groovy as JSR-223 scripting language":
https://github.com/apache/groovy/pull/219/files

There a WeakReference is used.


yes, I think that is something we should do.


Which brings my mind back to my question regarding whether it is "good
architecture" to have a reference to the class in ClassInfo (or any
other metadata associated with a class) - again, I mean fundamentally,
independently of whether this is an option for a Groovy 2.4.7 or even
anything before a Groovy 3, because I fear it would likely require to
change several Groovy APIs and internals.


ok, let's assume the ClassInfo does not reference the class, then as
soon as you have a MetaClass, you have a reference to the class again. 
If not there, then in the method accessors...


bye Jochen



Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Alain Stalder

Ah, no, not with ClassValue, sorry for the spam...

On 16.05.16 13:37, Alain Stalder wrote:
Ah, I think now I am getting it... In order to cache ClassInfo you 
need a key that identifies the class...


On 16.05.16 12:46, Alain Stalder wrote:
My argument is still the same: ClassInfo (or other associated
metadata) only makes sense if you have your hands on a class (or an 
instance of it) to apply it to. The one who wants to do something 
with the class/instance has it and in principle can pass it down to 
ClassInfo in order to extract whatever is needed. If there is no 
"client" with a class/instance, there is no need to create ClassInfo 
(or similar). And if the class is garbage collected, automatically 
ClassInfo cannot be accessed with such queries any more, and then 
also the JVM bug with ClassValue would no longer affect Groovy, 
ClassValue could be used again by default. 







Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Alain Stalder
Ah, I think now I am getting it... In order to cache ClassInfo you need 
a key that identifies the class...


On 16.05.16 12:46, Alain Stalder wrote:
My argument is still the same: ClassInfo (or other associated
metadata) only makes sense if you have your hands on a class (or an 
instance of it) to apply it to. The one who wants to do something with 
the class/instance has it and in principle can pass it down to 
ClassInfo in order to extract whatever is needed. If there is no 
"client" with a class/instance, there is no need to create ClassInfo 
(or similar). And if the class is garbage collected, automatically 
ClassInfo cannot be accessed with such queries any more, and then also 
the JVM bug with ClassValue would no longer affect Groovy, ClassValue 
could be used again by default. 




Re: Improve Groovy class loading performance and memory management

2016-05-16 Thread Alain Stalder
In order to get a better understanding, I made two configurable changes 
in ClassInfo, in a branch from the GROOVY_2_4_6 tag (ClassInfo is still 
practically the same in the GROOVY_2_4_X branch):


- -Dgctest.classreftype=(hard|soft|weak|phantom), where hard=as today, 
soft=SoftReference
- -Dgctest.cacheclassvalue=(true|false), if true and using ClassValue, 
then do not cache it


See here:
https://github.com/jexler/groovy/compare/GROOVY_2_4_6...jexler:f92c2866653208ad68db5580b5bf9febc347fe1d

Compiled Groovy JAR:
https://www.jexler.net/groovy-2.4.6-gctest.jar

First thing I learned was that you cannot get the value of a 
PhantomReference, it always returns null, by design. From its Javadoc: 
"In order to ensure that a reclaimable object remains so, the referent 
of a phantom reference may not be retrieved: The get method of a phantom 
reference always returns null."
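
A small Java sketch that demonstrates this behaviour; the class name ReferenceGetDemo and its methods are made up for illustration, the reference classes are the standard java.lang.ref ones:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class; only the java.lang.ref types are real APIs.
class ReferenceGetDemo {

    /** get() on a PhantomReference returns null even while the referent is alive. */
    static boolean phantomGetIsNull(Object referent) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        return ref.get() == null;
    }

    /** get() on a WeakReference returns the referent while it is strongly reachable. */
    static boolean weakGetReturnsReferent(Object referent) {
        WeakReference<Object> ref = new WeakReference<>(referent);
        return ref.get() == referent;
    }

    public static void main(String[] args) {
        Object alive = new Object();
        System.out.println("phantom get() is null: " + phantomGetIsNull(alive));
        System.out.println("weak get() returns referent: " + weakGetReturnsReferent(alive));
    }
}
```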


(By the way, this very probably means that the already existing 
PhantomReference myThread in ClassInfo makes no sense.)


Then I ran a full matrix of tests:


  same loader | use class value | cache class value | hard | soft | weak
  ------------+-----------------+-------------------+------+------+-----
  YES         | YES             | YES               | FAIL | FAIL | FAIL
  YES         | YES             | NO                | FAIL | FAIL | FAIL
  YES         | NO              | --                | OK   | OK*  | OK*

  NO          | YES             | YES               | OK   | OK*  | OK*
  NO          | YES             | NO                | OK   | OK*  | OK*
  NO          | NO              | --                | FAIL | OK*  | OK*


- "same loader" <=> java [opts] -XX:MaxMetaspaceSize=64m -Xmx512m -cp . 
ClassGCTester -cp groovy-2.4.6-gctest.jar:filling/ -parent null -classes 
GroovyFilling
- not "same loader" <=> java [opts] -XX:MaxMetaspaceSize=64m -Xmx512m 
-cp .:groovy-2.4.6-gctest.jar ClassGCTester -cp filling/ -parent tester 
-classes GroovyFilling

- "use class value" <=> -Dgroovy.use.classvalue=
- "cache class value" <=> -Dgctest.cacheclassvalue=
- "hard"|"soft"|"weak" <=> -Dgctest.classreftype=

* Garbage collection in all cases still only when the limit on Metaspace 
or Heap is reached.


So:
- Caching ClassValue or not made no difference.
- Using weak or soft references did not help when using ClassValue.
- When not using ClassValue, using weak or soft references helped. :)

Actually the latter is also reflected (as I noticed in retrospect) by 
the pull request by John Wagenleitner for "GROOVY-7683 - Memory leak 
when using Groovy as JSR-223 scripting language": 
https://github.com/apache/groovy/pull/219/files


There a WeakReference is used.

Which brings my mind back to my question regarding whether it is "good 
architecture" to have a reference to the class in ClassInfo (or any 
other metadata associated with a class) - again, I mean fundamentally, 
independently of whether this is an option for a Groovy 2.4.7 or even 
anything before a Groovy 3, because I fear it would likely require to 
change several Groovy APIs and internals.


If you now use a WeakReference or SoftReference for the class reference in
ClassInfo instead of a hard reference, you have to handle the case
where the class is already null because it has been garbage collected.
(Actually this is in principle more likely with a WeakReference than
with a SoftReference, so I would rather tend to favor SoftReference
because class GC so far only kicks in when a memory limit is reached
anyway, but likely it makes no difference in practice exactly for the
same reason. Actually, this may even save the situation, maybe in
practice you never get the Reference to return null because classes and
ClassInfo are only garbage collected together when the memory limit is
reached in a Java VM that does nothing else then, but I am not sure...)
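
As a rough Java sketch of what handling a cleared reference could look like: WeakClassRef is a hypothetical holder invented for illustration (it is not Groovy's ClassInfo and not the code from the pull request). It keeps the class only weakly, remembers the name for diagnostics, and makes callers deal with a null result:

```java
import java.lang.ref.WeakReference;

// Hypothetical holder for illustration; not Groovy's ClassInfo.
class WeakClassRef {

    private final WeakReference<Class<?>> classRef;
    private final String className; // kept so messages still work after collection

    WeakClassRef(Class<?> type) {
        this.classRef = new WeakReference<>(type);
        this.className = type.getName();
    }

    /** Returns the class, or null if it has already been garbage collected. */
    Class<?> getTheClassOrNull() {
        return classRef.get();
    }

    /** Callers must be prepared for the cleared case. */
    String describe() {
        Class<?> type = classRef.get();
        return type != null ? "live: " + type.getName() : "collected: " + className;
    }

    public static void main(String[] args) {
        System.out.println(new WeakClassRef(String.class).describe());
    }
}
```

Swapping WeakReference for SoftReference in this sketch would only change when the referent may be cleared; the null handling stays the same.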


My argument is still the same: ClassInfo (or other associated metadata)
only makes sense if you have your hands on a class (or an instance of 
it) to apply it to. The one who wants to do something with the 
class/instance has it and in principle can pass it down to ClassInfo in 
order to extract whatever is needed. If there is no "client" with a 
class/instance, there is no need to create ClassInfo (or similar). And 
if the class is garbage collected, automatically ClassInfo cannot be 
accessed with such queries any more, and then also the JVM bug with 
ClassValue would no longer affect Groovy, ClassValue could be used again 
by default.


But I don't want to make too much of this.

Using a WeakReference or SoftReference for the class reference in
ClassInfo would already be a step forward; at least I have no better
realistic ideas at the moment...


Alain

On 15.05.16 12:37, Jochen Theodorou wrote:

On 15.05.2016 10:3

Re: Improve Groovy class loading performance and memory management

2016-05-15 Thread Jochen Theodorou

On 15.05.2016 10:39, Alain Stalder wrote:

Thanks, that clarifies a lot to me, especially SoftReference.

So with Groovy it is only realistic to have GC of classes (and attached
ClassInfo) kick in once a limit on Metaspace/PermGen (or Heap) is
reached - fine with me, no point to try to "outrun the bear"... :)


well... I do think the ClassValue version should not have this
behaviour. But for this I think we would have to ensure not to keep any
references to the ClassValue anywhere in a global structure. Not even as
a WeakReference... PhantomReference would probably be ok... but I find
the usages for PhantomReferences quite rare... and not fitting here I guess



A general question (current implementation and most likely APIs to keep
aside): Why does ClassInfo need a reference to the class? To me the use
case would be that you have a Groovy object or a Groovy class and want
to do something with it (call a static or instance method, for example),
so you only need to find ClassInfo from the class and then maybe pass
the class temporarily just for doing things, but don't need a
reference back from ClassInfo.


ClassInfo represents cached reflective information about a Class, plus
some more internal stuff. To create that structure you need the Class.
And if you do not want to do it eagerly, you need to keep a reference...
at least till after init. Of course that does not have to be a
SoftReference.


[...]

This allows, for example, to produce two of the known
"OutOfMemoryError: Metaspace|PermGen" issues with Groovy 2.4.6, as follows.

[...]

good job

bye Jochen



Re: Improve Groovy class loading performance and memory management

2016-05-15 Thread Alain Stalder

Thanks, that clarifies a lot to me, especially SoftReference.

So with Groovy it is only realistic to have GC of classes (and attached 
ClassInfo) kick in once a limit on Metaspace/PermGen (or Heap) is 
reached - fine with me, no point to try to "outrun the bear"... :)


A general question (current implementation and most likely APIs to keep
aside): Why does ClassInfo need a reference to the class? To me the use
case would be that you have a Groovy object or a Groovy class and want
to do something with it (call a static or instance method, for example),
so you only need to find ClassInfo from the class and then maybe pass
the class temporarily just for doing things, but don't need a
reference back from ClassInfo.


I presume I am simply overlooking something rather obvious, but what 
would be the use case(s)? Or maybe historical reasons?


I have updated ClassGCTester ("version" 2.0.0) such that the class path 
for the URLClassLoader can be specified in more detail.


https://github.com/jexler/classgc
--
Usage: java [java-args] ClassGCTester -cp  -parent [tester|null] -classes  [-wait]

  -cp:      Class path for URLClassLoader that will load the test classes.
            Directories and JARs separated by ':' or ';'.
            Example: .:classes/:libs/groovy-2.4.6.jar
  -parent:  Parent class loader for URLClassLoader that will load the test classes:
            'tester': ClassGCTester.class.getClassLoader() - i.e. same as tester
            'null':   null - i.e. no parent
  -classes: Test classes to load from the URLClassLoader in a loop.
            Fully qualified class names separated by ':' or ';'.
            Example: SampleInDefaultPackage:net.sample.Sample
  -wait:    Optional, whether to wait for key pressed before starting the test.
            Allows to attach external tools from the start.
            (Like 'jvisualvm' or 'jstat -gc  ' etc.)

Note that this tool usually prints out its PID for convenience.
--

This allows, for example, to produce two of the known
"OutOfMemoryError: Metaspace|PermGen" issues with Groovy 2.4.6, as follows.


Directory Structure:
- ClassGCTester.class
- groovy-2.4.6.jar
- filling/GroovyFilling.class ("42" compiled, for example, but could be 
practically any Groovy class)


If you load Groovy only once with the class loader of ClassGCTester, and 
set groovy.use.classvalue=false (the default) you get the error:


$ java -XX:MaxMetaspaceSize=64m -Xmx512m -Dgroovy.use.classvalue=false 
-cp .:groovy-2.4.6.jar ClassGCTester -cp filling/ -parent tester 
-classes GroovyFilling

--
Secs  Test classes          Metaspace/PermGen   Heap                Load time  Create time
      #loaded  #remaining   used    committed   used    committed   average    average
[...]
  11     8107        8107   38.3m   55.1m       293.8m  455.5m      0.433ms    0.909ms
  12     8579        8579   40.1m   57.9m       298.0m  455.5m      0.581ms    0.909ms
  13     8665        8665   40.4m   58.5m       309.5m  455.5m      0.577ms    0.908ms
  14     9423        9423   43.3m   63.0m       327.0m  455.5m      0.545ms    0.940ms

Exception in thread "main" java.lang.OutOfMemoryError: Metaspace
[...]
--

With true GC kicks in at the limit of 64m:
--
Secs  Test classes          Metaspace/PermGen   Heap                Load time  Create time
      #loaded  #remaining   used    committed   used    committed   average    average
[...]
  12     8579        8579   40.0m   57.9m       300.7m  455.5m      0.485ms    0.915ms
  13     9424        9424   43.3m   62.8m       330.0m  455.5m      0.494ms    0.907ms
  14     9670          19   7.3m    30.0m       21.0m   367.0m      0.527ms    0.908ms
  15    10899        1248   12.0m   31.3m       103.2m  367.0m      0.484ms    0.880ms

[...]
--

If you instead load Groovy with the same URLClassLoader that loads 
GroovyFilling each time, it passes with groovy.use.classvalue=false (the 
default):


$ java -XX:MaxMetaspaceSize=64m -Xmx512m -Dgroovy.use.classvalue=false 
-cp . ClassGCTester -cp groovy-2.4.6.jar:filling/ -parent null -classes 
GroovyFilling

--
Secs  Test classes        Metaspace/PermGen  Heap               Load time  Create time
      #loaded #remaining  used    committed  used    committed  average    average

[...]
   1  9       9           22.4m   23.1m      26.9m   242.0m     1.622ms    112.702ms
   2  18      18          38.1m   39.4m      47.9m   308.0m     1.419ms    112.346ms
   3  27      27          53.7m   55.4m      112.4m  308.0m     1.367ms    111.185ms
   4  36      5           14.0m   28.0m      41.5m   457.0m     1.315ms    111.633ms
   5  47      16          33.0m   37.4m      49.0m   474.5m     1.242ms    106.555ms

[...]
--

Whereas with groovy.use.classvalue=true you get the error:
--
[...]
Secs Test classes  Metaspace/PermGen Heap   Load time Create 
time
   #loaded  #remainingused committed   used 
committed ave

Re: Improve Groovy class loading performance and memory management

2016-05-15 Thread Jochen Theodorou

On 14.05.2016 09:54, Alain Stalder wrote:
[...]

Lazy initialization of MetaClass:

I could well imagine that this makes a noticeable difference, but you
are probably much better able to estimate this offhand.

If it made a noticeable difference, I guess it would also impact heap
consumption due to loaded Groovy classes (less MetaClass instances) to a
similar degree.


less meta classes is always good.


Maybe measure in some way for how many and which classes MetaClass is
actually ever needed/used? Maybe a modified Groovy test version that
somehow records this and then use this Groovy test version in some
"realistic" setups where lots of Groovy classes are loaded and
instantiated? (Just thinking aloud...)


well... as soon as you access a property or call a method, you will most 
likely need a meta class of that class. What could profit are small 
scripts. In the case of Closure you lose that already, because even if 
it only returns something like 42 and thus does not need to make a 
dynamic call, the method containing that code is itself called 
dynamically, so you need the meta class. On the other hand, we have 
ClosureMetaClass here, with a smaller footprint exactly for that.



Garbage collection without setting a limit on Metaspace:

For the simple Java test class JavaFilling, the VM collected unused
classes without having to set MaxMetaspaceSize, in the case of
GroovyFilling this was not the case.


because of the use of SoftReference. The problem with using 
WeakReference, for example, is that you will then get lots of meta class 
initializations, which can have a big impact on performance. ClassValue 
is supposed to make this better, though.



I lack experience with garbage collection of "pure" Java classes, so I
am not sure if the behavior observed with Java is only like this for
very simple classes with little dependencies on other loaded classes. If
that was so, I guess there would be very little that could be gained
from trying to change Groovy behavior.


Well, in Java the classes are not referenced by structures as complex 
as the ones we have to use, like the table/ClassInfo/meta class system. 
ClassValue is supposed to help here.



One approach could be to look closer into that, maybe run ClassGCTester
with some "pure Java" library JARs and load classes from there and
observe Metaspace?


frankly, most Java applications do not even use more than one 
class loader, so class unloading is not an issue there. That goes 
as far as the standard garbage collector not being usable if 
you want classes to be unloaded properly. Well, this changed with JDK 8. 
Even a JSP web app uses a very limited number of classes (but of 
course more class loaders than a standard Java app).



Or maybe approach it the other way: Use a modified Groovy test version
that *does* use a WeakHashMap, just to see if that would make a
difference here and if investing more effort into that direction could
amount to anything?


WeakHashMap cannot handle concurrency. Each and every access would have 
to be synchronized. Even in a simple application with a GUI you can 
already have more than one thread running, each potentially causing the 
creation of a meta class.
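
To illustrate the point about synchronizing every access, here is a minimal Java sketch. The class SyncWeakTable and its String values are hypothetical and only stand in for a real class-to-metadata table; this is not the map Groovy uses. It wraps a WeakHashMap in Collections.synchronizedMap, so all threads funnel through a single lock, which is the cost being described. The values deliberately do not reference their keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical table for illustration only; not Groovy's implementation.
final class SyncWeakTable {
    // WeakHashMap itself is not thread-safe, so every call goes through
    // the single lock of the synchronized wrapper.
    private final Map<Object, String> table =
            Collections.synchronizedMap(new WeakHashMap<Object, String>());

    // Returns the cached value for the key, creating it on first use.
    // The value must not reference the key, or the entry could never be collected.
    String infoFor(Object key) {
        return table.computeIfAbsent(key, k -> "info-" + System.identityHashCode(k));
    }

    int size() {
        return table.size();
    }

    // Lets several threads request values for the same keys concurrently.
    static int fillFromThreads(SyncWeakTable t, List<Object> keys, int threads)
            throws InterruptedException {
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread worker = new Thread(() -> {
                for (Object key : keys) {
                    t.infoFor(key);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return t.size();
    }

    public static void main(String[] args) throws InterruptedException {
        List<Object> keys = new ArrayList<>();   // strong references keep the keys alive
        for (int i = 0; i < 1000; i++) {
            keys.add(new Object());
        }
        System.out.println(fillFromThreads(new SyncWeakTable(), keys, 4));
    }
}
```

The sketch stays correct under concurrency, but only by serializing all lookups, which is presumably why a dedicated concurrent map was preferred.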


bye Jochen


Re: Improve Groovy class loading performance and memory management

2016-05-14 Thread Alain Stalder

After reading a few times, some more feedback...

CompileStatic:

Cool that this thread (and maybe ClassGCTester?) has already yielded an 
issue that impacts performance (and I presume might likely be relatively 
straightforward to fix)  :)


Lazy initialization of MetaClass:

I could well imagine that this makes a noticeable difference, but you 
are probably much better able to estimate this offhand.


If it made a noticeable difference, I guess it would also impact heap 
consumption due to loaded Groovy classes (less MetaClass instances) to a 
similar degree.


Maybe measure in some way for how many and which classes MetaClass is 
actually ever needed/used? Maybe a modified Groovy test version that 
somehow records this and then use this Groovy test version in some 
"realistic" setups where lots of Groovy classes are loaded and 
instantiated? (Just thinking aloud...)


Garbage collection without setting a limit on Metaspace:

For the simple Java test class JavaFilling, the VM collected unused 
classes without having to set MaxMetaspaceSize, in the case of 
GroovyFilling this was not the case.


I lack experience with garbage collection of "pure" Java classes, so I 
am not sure if the behavior observed with Java is only like this for 
very simple classes with little dependencies on other loaded classes. If 
that was so, I guess there would be very little that could be gained 
from trying to change Groovy behavior.


One approach could be to look closer into that, maybe run ClassGCTester 
with some "pure Java" library JARs and load classes from there and 
observe Metaspace?


Or maybe approach it the other way: Use a modified Groovy test version 
that *does* use a WeakHashMap, just to see if that would make a 
difference here and if investing more effort into that direction could 
amount to anything? I am not familiar enough with the implementation to 
know if such a test change would be trivial or not, but interested 
enough to take at least a closer look, and I guess also to try to build 
Groovy myself for the first time, because this could be handy sometime 
anyway...


Alain

On 13.05.16 13:03, Jochen Theodorou wrote:

On 13.05.2016 02:22, Alain Stalder wrote:
[...]

Qualitatively this often has the following result in the Java VM:
Metaspace resp. PermGen, and Heap in parallel, just grow until a
configured limit is reached (and note that there is none by default for
Metaspace in Java 8 and later), often only then is it garbage collected.
With Java classes, at least with simple ones, this looks often
different, those appear to be garbage collected much more quickly.

Another qualitative difference is that loading a Groovy class and
instantiating it seems typically to be considerably slower than
instantiating a Java class with similar functionality, even quite
drastically so, more than one would expect even considering the need to
create metadata for dynamic function calls etc.

At least that has been my experience over the past few years.


this is going to be a long mail ;)

so let us make three things to discuss here:

1) Object initialization performance
2) class verification/complexity
3) garbage collection of unused classes

And we have to distinguish here between usages of ClassValue, 
invokedynamic and the traditional version of those... that makes 4 
aspects.


So I will write several things you probably already know, but others 
reading here might not. And even though I simplify a bit, please bear 
with me ;) And first of all, let us talk about the meta class 
system... that mostly targets ClassValue then.


The old version of the meta class system uses a big global table for 
all meta classes, with a class key and ClassInfo as value. In 
ClassInfo we have the meta class, which might be either recreate-able 
(in which case the meta class is soft referenced) or not (in which 
case it is a strong reference). The idea being, that as soon as the 
class can be garbage collected, the ClassInfo can as well, and with it 
the meta class.


Problem 1 here is, the meta class holds a strong reference to the 
class, so if the ClassInfo holds a strong reference to the meta class, 
this entry in the table will never be collected. I mention this only 
for completeness, since you did not set a permanent meta class in your 
test


Problem 2 here is, the code is concurrent, which rules out WeakHashMap 
and forced us to implement our own map.


In the ClassValue version we do not have our own table anymore and let 
the JVM manage this.


To avoid the lookup cost of the meta class, every Groovy class has a 
reference to its ClassInfo. The meta class and ClassInfo are lazily 
initialized (well, "populated with actual data" in the case of ClassInfo).


Some classes extend GroovyObjectSupport, which does the initialization 
in the constructor already, groovy.lang.Script is one of them. That 
means every time you create an instance of a new script class, you get 
the meta class already, even if the meta class is not used. 

Re: Improve Groovy class loading performance and memory management

2016-05-13 Thread Alain Stalder
First, about the bug that classes were loaded only once: this did not 
happen in my case for GroovyFilling, because I used different 
directories for ClassGCTester and GroovyFilling, so the parent of the 
URLClassLoader could not load the test class. But it happened 
(unintentionally) in my tests for JavaFilling, which is why I thought that 
loading Java classes was so incredibly fast... :(


With that fixed, I got much more reasonable output, about 
0.1ms+0.02ms=0.12ms for loading JavaFilling and about 0.4ms + 0.9ms = 
1.3ms for the original GroovyFilling. So, there is not all that much 
room for improvement, at least a lot less than I feared, considering 
that Groovy classes do have more capabilities.


$ java -cp . ClassGCTester filling/ JavaFilling

Secs  Test classes        Metaspace/PermGen  Heap               Load time  Create time
      #loaded #remaining  used    committed  used    committed  average    average
   0  1       1           2.9m    4.8m       2.6m    245.5m     0.812ms    0.098ms
   1  5577    2803        7.4m    21.3m      37.9m   177.5m     0.136ms    0.028ms
   2  13610   3437        8.2m    25.9m      45.6m   361.0m     0.114ms    0.024ms
   3  21579   2836        7.4m    24.0m      38.3m   513.5m     0.109ms    0.023ms
   4  29582   2269        6.7m    21.8m      31.6m   733.5m     0.106ms    0.022ms
   5  37792   1909        6.3m    20.5m      28.7m   990.5m     0.104ms    0.022ms
   6  45894   1441        5.7m    18.6m      22.3m   1257.0m    0.103ms    0.022ms
   7  54049   1026        5.2m    17.6m      19.1m   1575.5m    0.102ms    0.021ms


$ java -XX:MaxMetaspaceSize=64m -Xmx512m -Dgroovy.use.classvalue=true 
-cp .:groovy-2.4.6.jar ClassGCTester filling/ GroovyFilling


Secs  Test classes        Metaspace/PermGen  Heap               Load time  Create time
      #loaded #remaining  used    committed  used    committed  average    average
   0  1       1           6.3m    6.5m       13.4m   245.5m     0.875ms    11.525ms
   1  474     474         9.0m    10.3m      36.5m   245.5m     0.350ms    1.694ms
   2  1312    1312        12.2m   15.1m      103.7m  309.5m     0.265ms    1.224ms
   3  2291    2291        16.0m   21.0m      87.4m   389.0m     0.417ms    1.034ms
   4  2977    2977        18.6m   25.1m      178.4m  389.0m     0.360ms    0.962ms
   5  4065    4065        22.8m   31.5m      216.0m  389.0m     0.307ms    0.905ms
   6  4641    4641        25.0m   34.9m      164.0m  455.5m     0.444ms    0.892ms
   7  5314    5314        27.6m   38.8m      213.2m  455.5m     0.412ms    0.888ms


What I still observed was that for loading JavaFilling, Metaspace does 
not grow indefinitely even without a limit (see above), but it does for 
GroovyFilling, which I can also understand. Would be nice if it was 
possible that Groovy classes were also collected so quickly, but in 
practice I guess once you know that, you just have to set a reasonable 
maximum for Metaspace, and when you operate a "server environment" you 
have to take care of these kinds of things anyway, I would say.


So, essentially I am quite happy with the results and with how Groovy 
fares :)


I have made a "release" 1.1.0 of ClassGCTester with an added check that 
the test class cannot be loaded from the classpath of ClassGCTester 
alone and with a fix for the display of Metaspace/PermGen (this matches 
now roughly the output of "jstat -gc ..." for MC and MU, resp. the 
equivalents for PermGen), plus updated the readme and examples.
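
For readers who want to observe similar numbers themselves, the following is a rough Java sketch of reading class-space usage from inside the VM. It is not taken from ClassGCTester, whose implementation is not shown in this thread; the class name ClassSpaceProbe is hypothetical, and the pool names "Metaspace" and "Perm Gen" are assumptions about how HotSpot labels its memory pools. On other VMs the probe may simply return null.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

// Hypothetical helper for illustration; not part of ClassGCTester.
final class ClassSpaceProbe {

    // Looks for the memory pool holding class metadata.
    // Pool names are VM-specific; these two are assumed HotSpot names.
    static MemoryUsage classSpaceUsage() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.equals("Metaspace") || name.contains("Perm Gen")) {
                return pool.getUsage();
            }
        }
        return null;   // no matching pool on this VM
    }

    // Formats a byte count in the "38.3m" style used in the tables above.
    static String megabytes(long bytes) {
        return String.format(Locale.ROOT, "%.1fm", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        ClassLoadingMXBean loading = ManagementFactory.getClassLoadingMXBean();
        System.out.println("classes currently loaded: " + loading.getLoadedClassCount());
        System.out.println("classes unloaded so far:  " + loading.getUnloadedClassCount());

        MemoryUsage usage = classSpaceUsage();
        if (usage != null) {
            System.out.println("used " + megabytes(usage.getUsed())
                    + ", committed " + megabytes(usage.getCommitted()));
        }
    }
}
```

The values reported this way need not match "jstat -gc ..." exactly, since the two may aggregate pools differently.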


So much for the "broad picture"; some feedback regarding your detailed 
analysis will hopefully follow a bit later. I think I will first read it 
a few more times - very interesting... :)


Alain


On 13.05.16 13:03, Jochen Theodorou wrote:

On 13.05.2016 02:22, Alain Stalder wrote:
[...]

Qualitatively this often has the following result in the Java VM:
Metaspace resp. PermGen, and Heap in parallel, just grow until a
configured limit is reached (and note that there is none by default for
Metaspace in Java 8 and later), often only then is it garbage collected.
With Java classes, at least with simple ones, this looks often
different, those appear to be garbage collected much more quickly.

Another qualitative difference is that loading a Groovy class and
instantiating it seems typically to be considerably slower than
instantiating a Java class with similar functionality, even quite
drastically so, more than one would expect even considering the need to
create metadata for dynamic function calls etc.

At least that has been my experience over the past few years.


this is going to be a long mail ;)

so let us make three things to discuss here:

1) Object initialization performance
2) class verification/complexity
3) garbage collection of unused classes

And we have to distinguish here between usages of ClassValue, 
invokedynamic and the

Re: Improve Groovy class loading performance and memory management

2016-05-13 Thread Jochen Theodorou

On 13.05.2016 02:22, Alain Stalder wrote:
[...]

Qualitatively this often has the following result in the Java VM:
Metaspace resp. PermGen, and Heap in parallel, just grow until a
configured limit is reached (and note that there is none by default for
Metaspace in Java 8 and later), often only then is it garbage collected.
With Java classes, at least with simple ones, this looks often
different, those appear to be garbage collected much more quickly.

Another qualitative difference is that loading a Groovy class and
instantiating it seems typically to be considerably slower than
instantiating a Java class with similar functionality, even quite
drastically so, more than one would expect even considering the need to
create metadata for dynamic function calls etc.

At least that has been my experience over the past few years.


this is going to be a long mail ;)

so let us make three things to discuss here:

1) Object initialization performance
2) class verification/complexity
3) garbage collection of unused classes

And we have to distinguish here between usages of ClassValue, 
invokedynamic and the traditional version of those... that makes 4 aspects.


So I will write several things you probably already know, but others 
reading here might not. And even though I simplify a bit, please bear 
with me ;) And first of all, let us talk about the meta class system... 
that mostly targets ClassValue then.


The old version of the meta class system uses a big global table for all 
meta classes, with a class key and ClassInfo as value. In ClassInfo we 
have the meta class, which might be either recreate-able (in which case 
the meta class is soft referenced) or not (in which case it is a strong 
reference). The idea being, that as soon as the class can be garbage 
collected, the ClassInfo can as well, and with it the meta class.


Problem 1 here is, the meta class holds a strong reference to the class, 
so if the ClassInfo holds a strong reference to the meta class, this 
entry in the table will never be collected. I mention this only for 
completeness, since you did not set a permanent meta class in your test
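
The split between a softly referenced, recreate-able meta class and a strongly referenced permanent one can be sketched in Java roughly as follows. MetaHolder is a hypothetical class written for this illustration and is not Groovy's ClassInfo; the method simulateCollection() only exists to imitate the collector clearing the soft reference.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

// Hypothetical holder for illustration; not Groovy's ClassInfo.
final class MetaHolder<M> {
    private final Supplier<M> factory;   // knows how to recreate the value
    private M strong;                    // set only for a "permanent" value
    private SoftReference<M> soft;       // used for a recreate-able value

    MetaHolder(Supplier<M> factory) {
        this.factory = factory;
    }

    // A permanent value cannot be recreated, so it is held strongly.
    synchronized void setPermanent(M value) {
        strong = value;
        soft = null;
    }

    // Returns the permanent value if set; otherwise the softly held one,
    // recreating it if the collector has cleared the reference.
    synchronized M get() {
        if (strong != null) {
            return strong;
        }
        M value = (soft == null) ? null : soft.get();
        if (value == null) {
            value = factory.get();
            soft = new SoftReference<>(value);
        }
        return value;
    }

    // Imitates the garbage collector clearing the soft reference.
    synchronized void simulateCollection() {
        if (soft != null) {
            soft.clear();
        }
    }

    public static void main(String[] args) {
        MetaHolder<StringBuilder> holder = new MetaHolder<>(StringBuilder::new);
        StringBuilder first = holder.get();
        holder.simulateCollection();
        StringBuilder second = holder.get();
        System.out.println(first != second);   // the value was recreated
    }
}
```

If the strongly held value in turn references the key of the table entry, the entry stays reachable through the table itself, which is the situation Problem 1 describes.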


Problem 2 here is, the code is concurrent, which rules out WeakHashMap 
and forced us to implement our own map.


In the ClassValue version we do not have our own table anymore and let 
the JVM manage this.
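
As a rough illustration of what "let the JVM manage this" means, here is a small Java sketch using java.lang.ClassValue. PerClassInfo and ClassValueSketch are hypothetical names for this example; the real ClassInfo carries far more data. The sketch stores only the class name in the value, so the value does not point back at its Class.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for per-class metadata; this is not Groovy's ClassInfo.
final class PerClassInfo {
    final String className;   // keeps the name only, not the Class itself

    PerClassInfo(Class<?> type) {
        this.className = type.getName();
    }
}

final class ClassValueSketch {
    // Counts how often metadata had to be computed.
    static final AtomicInteger COMPUTED = new AtomicInteger();

    // ClassValue associates one lazily computed value with each Class;
    // the JVM manages the association instead of an application-level table.
    static final ClassValue<PerClassInfo> INFO = new ClassValue<PerClassInfo>() {
        @Override
        protected PerClassInfo computeValue(Class<?> type) {
            COMPUTED.incrementAndGet();
            return new PerClassInfo(type);
        }
    };

    public static void main(String[] args) {
        PerClassInfo first = INFO.get(StringBuilder.class);
        PerClassInfo second = INFO.get(StringBuilder.class);
        System.out.println(first == second);   // the cached instance is reused
        System.out.println(first.className);
    }
}
```

No map, no locking and no reference queue handling appear in user code here, which is the attraction compared with a hand-written table.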


To avoid the lookup cost of the meta class, every Groovy class has a 
reference to its ClassInfo. The meta class and ClassInfo are lazily 
initialized (well, "populated with actual data" in the case of ClassInfo).


Some classes extend GroovyObjectSupport, which does the initialization 
in the constructor already; groovy.lang.Script is one of them. That 
means every time you create an instance of a new script class, you get 
the meta class already, even if the meta class is not used. Let us have 
a small look at the bytecode of the constructors of such a script 
(x.groovy, which only returns 42):



  // access flags 0x1
  public <init>()V
    ALOAD 0
    INVOKESPECIAL groovy/lang/Script.<init> ()V
   L0
    INVOKESTATIC x.$getCallSiteArray ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
    ASTORE 1
   L1
    RETURN

  // access flags 0x1
  public <init>(Lgroovy/lang/Binding;)V
   L0
    INVOKESTATIC x.$getCallSiteArray ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
    ASTORE 2
    ALOAD 0
    ALOAD 1
    INVOKESPECIAL groovy/lang/Script.<init> (Lgroovy/lang/Binding;)V
   L1
    RETURN


as you can see here, there is a getCallSiteArray call and there are direct 
method calls for initialization. The getCallSiteArray call is actually 
superfluous here, but it is difficult for the compiler to decide early on 
whether we will need it or not, because the callsite array is basically 
an array wrapper which supplies method names for method calls as well 
as specialized code for doing dynamic method calls. If the constructor 
contained, for example, a call to a method foo(), you would see 
something like getting the CallSite and executing "call" on it.


Why are we doing static method calls in the constructor here? Because in 
several cases the compiler optimizes the dynamic call away here. 
Basically you cannot provide a super constructor in Groovy, which means 
only the statically defined one counts. And as long as the given 
types match well enough for the compiler to decide, we can create 
direct method calls.


So what does this mean for the object initialization performance in 
the case of scripts so far? We eagerly initialize ClassInfo and MetaClass 
for each script. That means a lookup by reflection of the complete 
structure of the class and its super classes... which is cached, so the 
super class lookup will be much cheaper the next time. Initializing the 
first meta class will also initialize the extension method lookup, which 
can also take a bit of time... again, a one-time cost here.
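
The difference between eager and lazy initialization described here can be reduced to a small Java sketch. All three classes are hypothetical: ExpensiveMeta merely stands in for a meta class whose construction is costly, EagerObject mimics a class that builds it in the constructor (as described for GroovyObjectSupport), and LazyObject defers it until first use.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for something expensive to build, such as a meta class.
final class ExpensiveMeta {
    static final AtomicInteger BUILT = new AtomicInteger();

    ExpensiveMeta() {
        BUILT.incrementAndGet();   // records every construction
    }
}

// Builds the metadata for every instance, whether it is used or not.
class EagerObject {
    private final ExpensiveMeta meta = new ExpensiveMeta();

    ExpensiveMeta meta() {
        return meta;
    }

    int answer() {
        return 42;
    }
}

// Builds the metadata only when something asks for it.
// Not thread-safe; kept minimal for illustration.
class LazyObject {
    private ExpensiveMeta meta;

    ExpensiveMeta meta() {
        if (meta == null) {
            meta = new ExpensiveMeta();
        }
        return meta;
    }

    int answer() {
        return 42;   // needs no metadata at all
    }
}

final class InitDemo {
    public static void main(String[] args) {
        int start = ExpensiveMeta.BUILT.get();
        new LazyObject().answer();
        System.out.println(ExpensiveMeta.BUILT.get() - start);   // lazy: nothing built yet
        new EagerObject().answer();
        System.out.println(ExpensiveMeta.BUILT.get() - start);   // eager: built in the constructor
    }
}
```

For a script that only returns 42, the lazy variant would never pay the construction cost, which is the potential saving being discussed.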


The invokedynamic version will again use the jdk internals for the 
callsites, thus $getCallSiteArray is never called nor are