On 14.01.26 15:32, Gianluca Sartori wrote:
[...]
* *Groovy 5* moved most dynamic calls to |CallSite| caching through
  |indy|.
Correct, though call site caching already existed before, in the
equivalent non-indy code for the bytecode generation.
* *Groovy 3* often inlined certain calls more aggressively, sometimes
  relying on slower reflection but faster in microbenchmarks due to
  simpler call chains.
Which type of calls though? Maybe it was faster in some micro-benchmarks
because we had a full primitive path parallel to the non-primitive path.
This worked only with unchanged meta classes, so I am not sure Grails
had much of it. That also shows a big difference: a micro-benchmark is
not an app, and you have to know what you actually test.
* *Result:* |invokedynamic| can introduce overhead for short-lived
  calls or highly polymorphic code because of frequent call site
  relinking.
that would have impacted performance on non-indy as well
Key reasons for slowness:
1. *Polymorphic call sites:* If your code calls many different methods
   dynamically at the same call site, the |invokedynamic| bootstrap has
   to relink repeatedly.
2. *CallSite cache invalidation:* Changes to the meta-class or dynamic
   method addition can invalidate call sites.
As I said, that applies to both variants.
3. *Boxing/unboxing overhead:* Primitive-heavy code may suffer due to
   dynamic dispatch.
well, that is not supposed to be an issue in indy. It was in the old
callsite code in Groovy 3
4. *JIT warmup issues:* The JVM may take longer to optimize
   |indy|-based dispatch.
that is actually a factor. But a proper micro-benchmark will not measure
warmup times, right?
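To make the warmup point concrete, here is a minimal sketch in plain Java. The class WarmupSketch and its measure method are made up for this mail; a real benchmark would normally use a harness such as JMH instead of hand-rolled timing. The only thing it shows is that the warmup rounds run before the clock starts.

```java
import java.util.function.IntSupplier;

// Hypothetical hand-rolled harness, only to show warmup being kept out of the measurement.
final class WarmupSketch {
    static int sink; // consumes results so the work is not trivially dead code

    // Returns the average nanoseconds per call over the measured rounds only.
    // Assumes measuredRounds > 0.
    static long measure(IntSupplier work, int warmupRounds, int measuredRounds) {
        for (int i = 0; i < warmupRounds; i++) {
            sink += work.getAsInt();   // not timed: lets call sites link and the JIT compile
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRounds; i++) {
            sink += work.getAsInt();   // timed: closer to steady-state behaviour
        }
        return (System.nanoTime() - start) / measuredRounds;
    }

    public static void main(String[] args) {
        long ns = measure(() -> "groovy".hashCode(), 100_000, 100_000);
        System.out.println("avg ns/call after warmup: " + ns);
    }
}
```

If the warmup loop is dropped, the same number mixes linking and compilation cost into the result, which is a different test.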
------------------------------------------------------------------------
*2. Optimization Strategies for a Next Groovy Version*
*A. Improve CallSite Caching*
* Implement *multi-level caching* for polymorphic call sites.
* Use *polymorphic inline caches (PICs)* like modern JavaScript
  engines (V8) to avoid relinking.
basically agree
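For reference, a PIC on top of java.lang.invoke could look roughly like the sketch below. This is not Groovy's call site code; PicSketch, MAX_DEPTH and the use of toString as the "selected method" are assumptions made for illustration. Each receiver class seen for the first time gets its own guard in front of the existing chain, so classes seen earlier stay cached.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical sketch of a polymorphic inline cache; not Groovy's actual call site code.
final class PicSketch extends MutableCallSite {
    static final int MAX_DEPTH = 4; // assumed limit; beyond it unseen classes always take the slow path
    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    private static final MethodHandle FALLBACK;
    private static final MethodHandle SAME_CLASS;
    static {
        try {
            FALLBACK = LOOKUP.findVirtual(PicSketch.class, "fallback",
                    MethodType.methodType(Object.class, Object.class));
            SAME_CLASS = LOOKUP.findStatic(PicSketch.class, "sameClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    int depth;   // receiver classes cached so far
    int relinks; // how often the slow path had to run

    PicSketch() {
        super(MethodType.methodType(Object.class, Object.class));
        setTarget(FALLBACK.bindTo(this));
    }

    static boolean sameClass(Class<?> expected, Object receiver) {
        return receiver != null && receiver.getClass() == expected;
    }

    // Slow path: pick a target for this receiver class and put a guard for it in front of the chain.
    // The sketch ignores null receivers.
    Object fallback(Object receiver) throws Throwable {
        relinks++;
        // Stand-in for real method selection; works for public classes such as String or Integer.
        MethodHandle target = LOOKUP
                .findVirtual(receiver.getClass(), "toString", MethodType.methodType(String.class))
                .asType(type());
        if (depth < MAX_DEPTH) {
            MethodHandle guard = SAME_CLASS.bindTo(receiver.getClass());
            setTarget(MethodHandles.guardWithTest(guard, target, getTarget()));
            depth++;
        }
        return target.invoke(receiver);
    }

    // Convenience wrapper so callers need not deal with Throwable.
    Object call(Object receiver) {
        try {
            return getTarget().invoke(receiver);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        PicSketch site = new PicSketch();
        for (Object o : new Object[] {"a", 1, "b", 2}) {
            site.call(o);
        }
        System.out.println("relinks: " + site.relinks); // String and Integer each linked once
    }
}
```

The cost is that every additional guard sits in front of the older ones, so the chain has to stay short to pay off.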
* Avoid global call site invalidations when meta-classes are updated;
  make call site invalidation more local.
how?
*B. Reduce Relinking*
* Track method signatures more strictly. Many |invokedynamic| relinks
  happen because Groovy tries to handle any dynamic call, even when
  the call target is stable.
True. If you call foo(Object) with a String and then with an Integer it
will cause relinking, even though it is not required
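The foo(Object) case can be modelled in plain Java as follows. This is only a model of the effect; the guards in GuardSketch are made up and are not the ones Groovy installs. With a guard on the exact argument class, String, Integer, String relinks three times. With a guard on the parameter type of the selected method, the first link stays valid.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical model of the effect; the guards here are not Groovy's real ones.
final class GuardSketch extends MutableCallSite {
    private static final MethodHandle FOO, FALLBACK, SAME_CLASS, IS_INSTANCE;
    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType objToObj = MethodType.methodType(Object.class, Object.class);
            FOO = lookup.findStatic(GuardSketch.class, "foo", objToObj);
            FALLBACK = lookup.findVirtual(GuardSketch.class, "fallback", objToObj);
            SAME_CLASS = lookup.findStatic(GuardSketch.class, "sameClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
            IS_INSTANCE = lookup.findVirtual(Class.class, "isInstance",
                    MethodType.methodType(boolean.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    final boolean guardOnArgumentClass;
    int relinks; // how often the slow path had to run

    GuardSketch(boolean guardOnArgumentClass) {
        super(MethodType.methodType(Object.class, Object.class));
        this.guardOnArgumentClass = guardOnArgumentClass;
        setTarget(FALLBACK.bindTo(this));
    }

    // The only candidate method, so the selected target never changes.
    static Object foo(Object o) {
        return "foo(" + o + ")";
    }

    static boolean sameClass(Class<?> expected, Object arg) {
        return arg != null && arg.getClass() == expected;
    }

    // Slow path: link foo again behind a fresh guard (non-null arguments only in this sketch).
    Object fallback(Object arg) throws Throwable {
        relinks++;
        MethodHandle test = guardOnArgumentClass
                ? SAME_CLASS.bindTo(arg.getClass())   // valid only for this exact argument class
                : IS_INSTANCE.bindTo(Object.class);   // valid for anything foo(Object) accepts
        setTarget(MethodHandles.guardWithTest(test, FOO, FALLBACK.bindTo(this)));
        return FOO.invoke(arg);
    }

    // Convenience wrapper so callers need not deal with Throwable.
    Object call(Object arg) {
        try {
            return getTarget().invoke(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

In a real dispatch the wider guard is only safe if no more specific overload like foo(String) could be selected for a later argument, which is where the stricter tracking of signatures would come in.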
* Consider *specialized bootstrap methods* for common call patterns:
    o e.g., frequent calls to |String| methods, |List|/|Map| operations.
What are we actually saving here?
*C. Optimize Primitive Handling*
* Introduce *primitive specialization* for arithmetic and collection
  operations.
* Reduce boxing/unboxing by generating specialized call site versions
  for primitives (like what Kotlin/JVM or Scala do with inline functions).
We have that.
*D. Optional Static Call Optimization*
* Provide *hybrid static/dynamic dispatch*:
    o Use static compilation (|@CompileStatic|) when possible.
on the compiler side directly? Well... when is it possible?
    o Use a *profiling-guided JIT* to replace call sites with direct
      method handles if a single target dominates.
well.. and how does the invalidation work if the target is suddenly
incorrect?
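For context, the building block the JDK offers for this kind of question is SwitchPoint. A minimal sketch follows; InvalidationSketch, direct and reselect are hypothetical names. The direct handle is used while an assumption holds, and a slower path takes over once the assumption is withdrawn.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

// Hypothetical sketch: a "stable target" assumption that can be withdrawn later.
final class InvalidationSketch {
    private final SwitchPoint targetStillValid = new SwitchPoint();
    private final MethodHandle handle;

    InvalidationSketch() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, Object.class);
            MethodHandle direct = lookup.findStatic(InvalidationSketch.class, "direct", type);
            MethodHandle reselect = lookup.findStatic(InvalidationSketch.class, "reselect", type);
            // Direct path while the assumption holds, slow path once it has been invalidated.
            handle = targetStillValid.guardWithTest(direct, reselect);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static String direct(Object receiver) {
        return "direct";   // stands for the profiled, dominating target
    }

    static String reselect(Object receiver) {
        return "reselect"; // stands for a full method selection
    }

    // Must be called by whoever notices that the target is no longer correct.
    void targetChanged() {
        SwitchPoint.invalidateAll(new SwitchPoint[] {targetStillValid});
    }

    // Convenience wrapper so callers need not deal with Throwable.
    String call(Object receiver) {
        try {
            return (String) handle.invoke(receiver);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

This covers only the mechanics. The open part of the question stays open: something has to detect that the target became incorrect and call targetChanged(), and an invalidated SwitchPoint cannot be reset, so the site has to be relinked with a new one.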
*E. Bytecode Generation Improvements*
* Investigate how Groovy 5 generates |invokedynamic| bytecode:
    o Avoid unnecessary |Object| casts.
    o Combine multiple small dynamic calls into a single bootstrap
      call to reduce overhead.
We have that, afaik.
* Possibly generate *direct method handles* for commonly called Groovy
  methods (|size()|, |get()|, etc.).
see above
*F. JIT-Friendly Bootstrap*
* Groovy could provide *simpler bootstrap methods* to allow JVM JIT
  inlining:
    o Reduce bootstrap method complexity to help HotSpot optimize the
      call site faster.
It is not about bootstrap complexity; rather, the resulting handle
should contain as little complexity as possible.
------------------------------------------------------------------------
*3. Micro-Optimizations for Library Authors*
If you are writing a library or code in Groovy that must be fast:
1. *Prefer static types* whenever possible; even without
   |@CompileStatic|, type hints help.
which leads to more casting... so no.
2. *Use |@CompileStatic|* selectively for hot loops.
possibly
3. *Avoid meta-class changes* at runtime in performance-critical code.
Well, there is currently no way to protect an area against meta class changes.
4. *Cache dynamic lookups manually* for very hot methods.
not understood
5. *Use primitive arrays* instead of boxed lists when dealing with numbers.
possibly.
6. *Minimize polymorphism at call sites*: repeated calls to the same
   method are much faster than alternating between multiple methods.
correct
------------------------------------------------------------------------
*4. Experimental Ideas for Groovy 6+*
* *CallSite specialization per type* (like Truffle/Graal dynamic
  languages).
I would like to make a MethodHandle part of the MetaMethod. I guess that
goes in that direction.
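Roughly what I mean, as a sketch in plain Java. HandleBackedMethod is a made-up stand-in and not Groovy's MetaMethod; it only shows the method object carrying its handle, so a call site can link against handle() directly instead of going through reflection each time.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;

// Hypothetical stand-in for a MetaMethod that carries a MethodHandle; not Groovy's MetaMethod.
final class HandleBackedMethod {
    private final Method method;
    private final MethodHandle handle;

    private HandleBackedMethod(Method method, MethodHandle handle) {
        this.method = method;
        this.handle = handle;
    }

    // Looks up a public instance method and creates its handle once, up front.
    static HandleBackedMethod of(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            Method m = owner.getMethod(name, parameterTypes);
            return new HandleBackedMethod(m, MethodHandles.publicLookup().unreflect(m));
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(e);
        }
    }

    String name() {
        return method.getName();
    }

    // What a call site could link against directly.
    MethodHandle handle() {
        return handle;
    }

    // Generic invocation path for callers that are not call sites (instance methods only).
    Object invoke(Object receiver, Object... args) {
        try {
            return handle.bindTo(receiver).invokeWithArguments(args);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

For example, HandleBackedMethod.of(String.class, "length").invoke("abc") goes through the handle and returns the boxed length.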
* *Inline small closures automatically* at compile-time.
It is not only the code of the Closure, it is also the code of the
method handling the Closure. In i=0; n.times{i++} it does not actually
help to "inline" i++, because we would have to inline it into the times
method. Instead we would have to inline times as well, of course with
the danger that a change to times will not be reflected without
recompilation.
* *Profile-guided call site replacement*: replace |invokedynamic|
  calls with direct |MethodHandle| or static calls at runtime if
  profiling shows a stable target.
I cannot replace it with static calls
* *Better JIT feedback*: provide hints to HotSpot that certain call
  sites are monomorphic or polymorphic.
How?
bye Jochen