blackdrag commented on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570829416

Frankly, I think I am missing something completely right now. You say setTarget is causing the bad performance... there is a hit involved with that, yes, but in my experience creating all those MethodTypes is just as bad. Maybe that part got improved. In my opinion your implementation should be much slower now for the regular case... Especially hitting the cache every time should cause some bad effects... I guess there is only testing.

So let us call the normal Groovy callsite caching we have "cs", the current indy on master "indy1", and master with your changes, including the removal of the ThreadLocal, "indy2". It is also better to run multiple times and only take the best results (using java11u28).

```groovy
def m(i){1}
for (int i = 0; i < 100_000_000; i++) {
    if (m(i)!=1) throw new Error("??")
}
```

I ran this up to 20 times to get the best stable time:

CS: ~510ms
indy1: ~256ms
indy2: ~12680ms

This more or less shows what I expected. indy1 is faster than CS because the JVM actually has more options to optimize in indy and the guards here are very simple. With more parameters in the call, indy would lose performance compared to CS. For indy2 I did expect it to be worse than indy1, but not by that much. On the other hand, I was already expecting from the code that going to the cache every time is going to hurt.

```groovy
def same(String obj) { return obj }
def same(int obj) { return obj }
def same(float obj) { return obj }

for (int i = 0; i < 100_000; i++) {
    [1, 1.0f, '1.0'].each { same(it) }
}
```

CS: ~64ms
indy1: ~2250ms
indy2: ~2000ms

This was the case I was really curious about. There is an improvement, but it is still too small. So I did an experiment...
```diff
@@ -883,7 +884,7 @@ public abstract class Selector {
             if (handle == null) return;
             if (!cache) return;
 
-            MethodHandle fallback = makeFallBack(callSite, sender, name, callType.ordinal(), targetType, safeNavigationOrig, thisCall, spread);
+            MethodHandle fallback = callSite.getTarget();
 
             // special guards for receiver
             if (receiver instanceof GroovyObject) {
```

This means that instead of always falling back to the select method, I reuse the existing handle in the callsite as the fallback. Each select call therefore produces a bigger handle, with the newest on top and the older one as its fallback. If I then have a callsite `[1, 1.0f, '1.0'].each { same(it) }`, I will get one handle for the String, which falls back to the float, which falls back to the int, which falls back to the select method. I did not test this for correctness, and even if it is correct it still has overhead it does not need, but well... let's compare (I will call this version indy3).

In the first test the performance of my version and the old indy version are about the same, because the handle that is produced is the same. Anything else would have been strange. In the second test I get 65ms, so roughly the same as CS. Sure, there may be a deopt because of setTarget, which happens 3 times in the inner loop. But after that the callsite is stable and still has 99999 iterations to run and optimize.

I was wondering why your version is so extremely slow, and I think it is because of the receiver-based cache you use. Sure, you cache the handle, but the key is the receiver.

```groovy
for (int i = 0; i < 1_000_000; i++) {
    [1, 1.0f, '1.0'].each { it.toString() }
}
```

In this variant the receiver changes, giving your caching a chance to actually do its work instead of failing with cache misses all the time:

CS: ~770ms
indy1: ~14600ms (I stopped after 4 iterations)
indy2: ~639ms
indy3: ~215ms
(static compiled Groovy: ~110ms)

Here I can see the factor 3 you talked about earlier.
And the indy3 version shows that there is even more to gain. One last case I'd like to showcase:

```groovy
class Cat {
    static m(Script s){println "x"}
}
def m(){}
for (int i = 0; i < 100_000; i++) {
    use(Cat) { m() }
}
```

In this version we have a category active. This category is not even going to be used, but the effect is really bad:

CS: starts with ~180ms, ends with ~2000ms
indy1: starts with ~2500ms, ends with ~7000ms
indy2: starts with ~2300ms, ends with ~6650ms
indy3: StackOverflowError

I consider indy1 and indy2 to be about equal here, but you may ask why there is a StackOverflowError for indy3. That is because at least one of the guards (I think it is the exception guard, which also degrades performance a lot) causes additional stack usage. That is already the case for the handle we produce in the selector. Older versions of Java still showed these frames; it seems that in Java 11 they are now hidden... well, considering https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8016334 this has been the case for much longer, but with -XX:+UnlockDiagnosticVMOptions -XX:+ShowHiddenFrames you can make them visible again:

```
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1027)
    at java.base/java.lang.invoke.LambdaForm$DMH/0x00000008002a4440.invokeInterface(LambdaForm$DMH:1000029)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bf040.invoke(LambdaForm$MH:1000027)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002be840.invoke(LambdaForm$MH:1000059)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002be040.guardWithCatch(LambdaForm$MH:1000040)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bc840.invoke(LambdaForm$MH:1000113)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002b4840.invokeExact_MT(LambdaForm$MH:1000019)
    at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
    at java.base/java.lang.invoke.LambdaForm$DMH/0x0000000800294840.invokeStatic(LambdaForm$DMH:1000026)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bfc40.invoke(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000094)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bd840.reinvoke(LambdaForm$MH:1000020)
    at java.base/java.lang.invoke.LambdaForm$MH/0x00000008002bdc40.guard(LambdaForm$MH:1000074)
    at java.base/java.lang.invoke.Invokers$Holder.linkToCallSite(Invokers$Holder:1000014)
    at simple5$_run_closure1.doCall(simple5.groovy:11)
```

This part of the trace is from indy on master; the indy3 version is really, really big. What is most likely happening, though, is that the handle grows with each method call. So I think it is an actual bug in the indy3 version, but I did not claim it is production-ready code either ;)
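To make the fallback-chaining idea from the experiment concrete, here is a minimal, self-contained sketch in plain java.lang.invoke. It is not the Groovy Selector code: the class `ChainedCallSiteSketch`, the `select` and `sameClass` helpers, and the `same` overloads are all hypothetical stand-ins, and the only guard is an exact class check on the single argument. It only shows the shape of "newest guard on top, current call site target as fallback".

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical sketch, not Groovy's Selector: every name here is made up for illustration.
class ChainedCallSiteSketch {
    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    static final MethodType SITE_TYPE = MethodType.methodType(String.class, Object.class);
    static final MutableCallSite SITE = new MutableCallSite(SITE_TYPE);

    static int selections; // counts how often the slow path (select) ran

    // Stand-ins for the overloaded same(...) methods of the benchmark.
    static String same(String s)  { return "String"; }
    static String same(Integer i) { return "Integer"; }
    static String same(Float f)   { return "Float"; }

    // Guard: does the argument have exactly the class this handle was built for?
    static boolean sameClass(Class<?> expected, Object arg) {
        return arg != null && arg.getClass() == expected;
    }

    // Slow path: pick the overload, then put a guarded handle on top of the current target.
    static String select(Object arg) throws Throwable {
        selections++;
        Class<?> type = arg.getClass();
        MethodHandle target = LOOKUP
                .findStatic(ChainedCallSiteSketch.class, "same",
                        MethodType.methodType(String.class, type))
                .asType(SITE_TYPE);
        MethodHandle test = LOOKUP
                .findStatic(ChainedCallSiteSketch.class, "sameClass",
                        MethodType.methodType(boolean.class, Class.class, Object.class))
                .bindTo(type);
        // Key line: the fallback is whatever the call site holds right now,
        // not a freshly built handle that goes straight back to select.
        MethodHandle fallback = SITE.getTarget();
        SITE.setTarget(MethodHandles.guardWithTest(test, target, fallback));
        return (String) target.invokeExact(arg);
    }

    // Runs the polymorphic loop and reports how many times select was needed.
    static int run(int iterations) {
        try {
            selections = 0;
            // Start with a call site whose only target is the slow path.
            SITE.setTarget(LOOKUP.findStatic(ChainedCallSiteSketch.class, "select", SITE_TYPE));
            MethodHandle invoker = SITE.dynamicInvoker();
            Object[] args = {1, 1.0f, "1.0"};
            for (int i = 0; i < iterations; i++) {
                for (Object arg : args) {
                    String ignored = (String) invoker.invokeExact(arg);
                }
            }
            return selections;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("slow-path selections: " + run(100_000));
    }
}
```

In this sketch the slow path runs three times, once per argument class, and every later call is served by the guard chain. The chain stays bounded only because a newly installed guard matches the next time the same class shows up; a guard that keeps failing would add one layer per call, which is the kind of growth suspected for indy3 in the category case.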