Hi Jonny,

I used ChatGPT to raise questions rather than to propose a solution.
Unfortunately I am not in a position to work on this myself: I have no
experience in language development (I am sure the Assembly compiler I
developed when I was in high school does not qualify).

For now, the role I can play is to make sure we don't forget about
performance, because at the moment it is degrading instead of improving.

I am aware that this is not an easy task, and I don't want to sound rude.
It's just that I'm writing in a bit of a hurry.

Gianluca Sartori
--
https://dueuno.com


On Thu, 15 Jan 2026 at 22:20, Jonny <[email protected]> wrote:

> Before we go back and forth too much with Jochen fact-checking GPT output,
> may I recommend Gianluca try giving the agent a laboratory to work with?
> See https://brianlovin.com/writing/give-your-agent-a-laboratory (first
> example prompt in that doc is even about performance, though there'd
> obviously be tweaks since you're dealing with a language, not a backend).
>
> Regardless of the LLM, making the agent build the kinds of performance
> tests and investigate their output is the only way you're likely to get
> meaningful output. Otherwise, you are indeed in the predictive text spin
> cycle. It does require more time, but it's also the only way you're likely
> to get coherent, usable output.
>
> Best,
>
> Jonny
>
> On Wed, Jan 14, 2026 at 2:57 PM Jochen Theodorou <[email protected]>
> wrote:
>
>> On 14.01.26 15:32, Gianluca Sartori wrote:
>> [...]
>> >   *
>> >
>> >     *Groovy 5* moved most dynamic calls to |CallSite| caching through
>> >     |indy|.
>>
>> correct, though the same caching existed before, in the equivalent code
>> produced by the bytecode generation
>>
>> >   *
>> >
>> >     *Groovy 3* often inlined certain calls more aggressively, sometimes
>> >     relying on slower reflection but faster in microbenchmarks due to
>> >     simpler call chains.
>>
>> which type of calls though? It was maybe faster in some micro-benchmarks
>> because we had a full primitive path parallel to the non-primitive path.
>> This worked only with unchanged meta classes, so I am not sure Grails
>> got much out of it. That also shows a big difference: a micro-benchmark
>> is not an app, and you have to know what you actually test
>>
>> >   *
>> >
>> >     *Result:* |invokedynamic| can introduce overhead for short-lived
>> >     calls or highly polymorphic code because of frequent call site
>> >     relinking.
>>
>> that would have impacted performance on non-indy as well
>>
>>
>> > Key reasons for slowness:
>> >
>> >  1.
>> >
>> >     *Polymorphic call sites:* If your code calls many different methods
>> >     dynamically at the same call site, the |invokedynamic| bootstrap has
>> >     to relink repeatedly.
>> >
>> >  2.
>> >
>> >     *CallSite cache invalidation:* Changes to the meta-class or dynamic
>> >     method addition can invalidate call sites.
>>
>> as I said, that counts for both variants.
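A minimal Java sketch of what call site invalidation looks like with method handles, assuming a single `SwitchPoint` guards the cached target. The class `InvalidationSketch` and the methods `fast` and `slow` are hypothetical names for illustration, not Groovy's implementation; `SwitchPoint` itself is the standard `java.lang.invoke` API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

class InvalidationSketch {
    // Stands in for the cached target selected at link time (hypothetical).
    static String fast(Object o) { return "cached"; }

    // Stands in for the path that has to select the method again (hypothetical).
    static String slow(Object o) { return "relookup"; }

    static String[] demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, Object.class);
            MethodHandle fast = lookup.findStatic(InvalidationSketch.class, "fast", type);
            MethodHandle slow = lookup.findStatic(InvalidationSketch.class, "slow", type);

            // Assumption: one switch point represents "no meta-class has changed".
            SwitchPoint metaClassUnchanged = new SwitchPoint();
            MethodHandle guarded = metaClassUnchanged.guardWithTest(fast, slow);

            String before = (String) guarded.invoke((Object) "x");
            // A meta-class change would invalidate every handle guarded by it.
            SwitchPoint.invalidateAll(new SwitchPoint[] { metaClassUnchanged });
            String after = (String) guarded.invoke((Object) "x");
            return new String[] { before, after };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println(r[0] + " -> " + r[1]); // prints "cached -> relookup"
    }
}
```

Once invalidated, a switch point never becomes valid again, so every call site guarded by it has to be relinked against a fresh one.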
>>
>> >  3.
>> >
>> >     *Boxing/unboxing overhead:* Primitive-heavy code may suffer due to
>> >     dynamic dispatch.
>>
>> well, that is not supposed to be an issue in indy. It was in the old
>> callsite code in Groovy 3
>>
>> >  4.
>> >
>> >     *JIT warmup issues:* The JVM may take longer to optimize
>> >     |indy|-based dispatch.
>>
>> that is actually a factor. But a proper micro-benchmark will not measure
>> warmup times, right?
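A minimal Java sketch of keeping warmup out of the measurement, assuming a hand-rolled timing loop. The names `WarmupSketch`, `measure` and `sumTo` are hypothetical; a dedicated harness such as JMH handles this, and much more, far more rigorously.

```java
import java.util.function.IntSupplier;

class WarmupSketch {
    // Runs the workload unmeasured first, then returns one timing (ns) per measured round.
    static long[] measure(IntSupplier workload, int warmupRounds, int measuredRounds) {
        int sink = 0;
        for (int i = 0; i < warmupRounds; i++) {
            sink += workload.getAsInt(); // discarded: gives the JIT time to compile
        }
        long[] nanos = new long[measuredRounds];
        for (int i = 0; i < measuredRounds; i++) {
            long start = System.nanoTime();
            sink += workload.getAsInt();
            nanos[i] = System.nanoTime() - start;
        }
        // Use the accumulated result so the workload cannot be removed as dead code.
        if (sink == Integer.MIN_VALUE) System.out.println(sink);
        return nanos;
    }

    // Placeholder workload (hypothetical): sum of 1..n.
    static int sumTo(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    public static void main(String[] args) {
        long[] nanos = measure(() -> sumTo(100_000), 2_000, 10);
        for (long t : nanos) System.out.println(t + " ns");
    }
}
```

Comparing the first unmeasured rounds with the measured ones is also a cheap way to see how large the warmup effect is for a given workload.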
>>
>> > ------------------------------------------------------------------------
>> >
>> >
>> >       *2. Optimization Strategies for a Next Groovy Version*
>> >
>> >
>> >         *A. Improve CallSite Caching*
>> >
>> >   *
>> >
>> >     Implement *multi-level caching* for polymorphic call sites.
>> >
>> >   *
>> >
>> >     Use *polymorphic inline caches (PICs)* like modern JavaScript
>> >     engines (V8) to avoid relinking.
>>
>> basically agree
>>
>> >   *
>> >
>> >     Avoid global call site invalidations when meta-classes are updated —
>> >     make call site invalidation more local.
>>
>> how?
>>
>> >         *B. Reduce Relinking*
>> >
>> >   *
>> >
>> >     Track method signatures more strictly. Many |invokedynamic| relinks
>> >     happen because Groovy tries to handle any dynamic call, even when
>> >     the call target is stable.
>>
>> True. If you call foo(Object) with a String and then with an Integer it
>> will cause relinking, even though it is not required
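A minimal Java sketch of the foo(Object) case described above, assuming a call site whose guard is keyed on the runtime class of the argument. All names (`RelinkSketch`, `sameClass`, `relink`, `fallbackFor`) are hypothetical, and this is not Groovy's actual linking code; it only shows why such a guard relinks although the selected method never changes.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class RelinkSketch {
    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    static int relinks = 0;

    // The only candidate: every non-null argument resolves to this method.
    static String foo(Object o) { return "foo(" + o + ")"; }

    // Guard: true while the argument has the class seen at link time.
    static boolean sameClass(Class<?> expected, Object arg) {
        return arg != null && arg.getClass() == expected;
    }

    // Fallback: selects the target again and installs a new guarded handle.
    // Assumes a non-null argument to keep the sketch short.
    static Object relink(MutableCallSite site, Object arg) throws Throwable {
        relinks++;
        MethodHandle target = LOOKUP.findStatic(RelinkSketch.class, "foo",
                MethodType.methodType(String.class, Object.class)).asType(site.type());
        MethodHandle test = LOOKUP.findStatic(RelinkSketch.class, "sameClass",
                MethodType.methodType(boolean.class, Class.class, Object.class))
                .bindTo(arg.getClass());
        site.setTarget(MethodHandles.guardWithTest(test, target, fallbackFor(site)));
        return target.invoke(arg);
    }

    static MethodHandle fallbackFor(MutableCallSite site) throws Throwable {
        return LOOKUP.findStatic(RelinkSketch.class, "relink",
                MethodType.methodType(Object.class, MutableCallSite.class, Object.class))
                .bindTo(site);
    }

    // Calls the site with String, String, Integer, String and returns the relink count.
    static int demo() {
        try {
            relinks = 0;
            MutableCallSite site = new MutableCallSite(
                    MethodType.methodType(Object.class, Object.class));
            site.setTarget(fallbackFor(site));
            MethodHandle call = site.dynamicInvoker();
            call.invoke((Object) "a");                // first link
            call.invoke((Object) "b");                // guard holds, no relink
            call.invoke((Object) Integer.valueOf(1)); // guard fails, relink
            call.invoke((Object) "c");                // guard fails again, relink
            return relinks;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("relinks: " + demo()); // prints "relinks: 3"
    }
}
```

A guard that checked only what method selection actually depends on (here, nothing beyond the declared parameter type Object) would link once and stay linked.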
>>
>> >   *
>> >
>> >     Consider *specialized bootstrap methods* for common call patterns:
>> >
>> >       o
>> >
>> >         e.g., frequent calls to |String| methods, |List|/|Map|
>> >         operations.
>>
>> on what are we actually saving here?
>>
>> >         *C. Optimize Primitive Handling*
>> >
>> >   *
>> >
>> >     Introduce *primitive specialization* for arithmetic and collection
>> >     operations.
>> >
>> >   *
>> >
>> >     Reduce boxing/unboxing by generating specialized call site versions
>> >     for primitives (like what Kotlin/JVM or Scala do with inline
>> >     functions).
>>
>> we have
>>
>> >         *D. Optional Static Call Optimization*
>> >
>> >   *
>> >
>> >     Provide *hybrid static/dynamic dispatch*:
>> >
>> >       o
>> >
>> >         Use static compilation (|@CompileStatic|) when possible.
>>
>> on the compiler side directly? Well... when is it possible?
>>
>> >       o
>> >
>> >         Use a *profiling-guided JIT* to replace call sites with direct
>> >         method handles if a single target dominates.
>>
>> well.. and how does the invalidation work if the target is suddenly
>> incorrect?
>>
>> >         *E. Bytecode Generation Improvements*
>> >
>> >   *
>> >
>> >     Investigate how Groovy 5 generates |invokedynamic| bytecode:
>> >
>> >       o
>> >
>> >         Avoid unnecessary |Object| casts.
>> >
>> >       o
>> >
>> >         Combine multiple small dynamic calls into a single bootstrap
>> >         call to reduce overhead.
>>
>> Have that afaik
>>
>> >   *
>> >
>> >     Possibly generate *direct method handles* for commonly called Groovy
>> >     methods (|size()|, |get()|, etc.).
>>
>> see above
>>
>> >         *F. JIT-Friendly Bootstrap*
>> >
>> >   *
>> >
>> >     Groovy could provide *simpler bootstrap methods* to allow JVM JIT
>> >     inlining:
>> >
>> >       o
>> >
>> >         Reduce bootstrap method complexity to help HotSpot optimize the
>> >         call site faster.
>>
>> not bootstrap complexity, but the resulting handle should have as little
>> complexity in there as possible
>>
>> > ------------------------------------------------------------------------
>> >
>> >
>> >       *3. Micro-Optimizations for Library Authors*
>> >
>> > If you are writing a library or code in Groovy that must be fast:
>> >
>> >  1.
>> >
>> >     *Prefer static types* whenever possible — even without
>> >     |@CompileStatic|, type hints help.
>>
>> which leads to more casting... so no.
>>
>> >  2.
>> >
>> >     *Use |@CompileStatic|* selectively for hot loops.
>>
>> possibly
>>
>> >  3.
>> >
>> >     *Avoid meta-class changes* at runtime in performance-critical code.
>>
>> well, there is currently no way to protect an area against meta class
>> changes
>>
>> >  4.
>> >
>> >     *Cache dynamic lookups manually* for very hot methods.
>>
>> not understood
>>
>> >  5.
>> >
>> >     *Use primitive arrays* instead of boxed lists when dealing with
>> >     numbers.
>>
>> possibly.
>>
>> >  6.
>> >
>> >     *Minimize polymorphism at call sites* — repeated calls to the same
>> >     method are much faster than alternating between multiple methods.
>>
>> correct
>>
>> > ------------------------------------------------------------------------
>> >
>> >
>> >       *4. Experimental Ideas for Groovy 6+*
>> >
>> >   *
>> >
>> >     *CallSite specialization per type* (like Truffle/Graal dynamic
>> >     languages).
>>
>> I would like to make a MethodHandle part of the MetaMethod. I guess that
>> goes in that direction.
>>
>> >   *
>> >
>> >     *Inline small closures automatically* at compile-time.
>>
>> It is not only the code of the Closure, it is also the code of the
>> Closure handling method. In i=0; n.times{i++} it does not actually help
>> to "inline" i++, because we would have to inline it into the times method.
>> Instead we have to inline times as well, of course with the danger that
>> if we change times, the change will not be reflected without recompilation
>>
>> >   *
>> >
>> >     *Profile-guided call site replacement*: replace |invokedynamic|
>> >     calls with direct |MethodHandle| or static calls at runtime if
>> >     profiling shows a stable target.
>>
>> I cannot replace it with static calls
>>
>> >   *
>> >
>> >     *Better JIT feedback*: provide hints to HotSpot that certain call
>> >     sites are monomorphic or polymorphic.
>>
>> How?
>>
>> bye Jochen
>>
>>
