jamesfredley commented on issue #15293:
URL: https://github.com/apache/grails-core/issues/15293#issuecomment-3769803396
Classic Call Site (Non-Indy) - Key Characteristics
1. Call site is replaced on first call (CallSiteArray.defaultCall →
createCallSite → replaceCallSite)
- The call site object in the array is replaced with a specialized
version (e.g., PojoMetaMethodSite)
- This specialized site caches: receiver class, metaclass, method, and
expected parameter classes
2. Guard check is simple inline code (see PojoMetaMethodSite.checkCall):
   return receiver.getClass() == metaClass.getTheClass()   // receiver class matches
       && checkPojoMetaClass()                             // metaclass version unchanged
       && MetaClassHelper.sameClasses(params, args);       // argument classes match
3. On guard failure: falls back to CallSiteArray.defaultCall which will
re-resolve and replace the call site again
4. Direct method invocation: metaMethod.doMethodInvoke(receiver, args) or
direct Method.invoke
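The replace-on-miss behaviour described above can be sketched with a toy model. This is only an illustration under stated assumptions, not Groovy's actual code: `SiteArray`, `SpecializedSite` and `resolve` are hypothetical stand-ins for CallSiteArray, PojoMetaMethodSite and metaclass method resolution, and the guard checks only the receiver class (the real guard also checks the metaclass and argument classes).

```java
import java.util.function.Function;

/** Toy model of a classic (non-indy) call site. All names here are hypothetical. */
final class ClassicCallSiteSketch {

    /** One slot in the array: either the unspecialized default or a specialized site. */
    interface Site {
        Object call(Object receiver);
    }

    /** Stand-in for CallSiteArray: holds the slots and counts replacements. */
    static final class SiteArray {
        final Site[] sites;
        int replacements = 0;

        SiteArray(int size) {
            sites = new Site[size];
            for (int i = 0; i < size; i++) {
                final int index = i;
                // Unspecialized slot: every call goes straight to defaultCall.
                sites[i] = receiver -> defaultCall(index, receiver);
            }
        }

        /** Resolve for the receiver's class, replace the slot, then invoke the new site. */
        Object defaultCall(int index, Object receiver) {
            Class<?> type = receiver.getClass();
            Site specialized = new SpecializedSite(this, index, type, resolve(type));
            sites[index] = specialized;
            replacements++;
            return specialized.call(receiver);
        }
    }

    /** Stand-in for a specialized site: caches one receiver class and its target. */
    static final class SpecializedSite implements Site {
        private final SiteArray owner;
        private final int index;
        private final Class<?> expectedClass;
        private final Function<Object, Object> target;

        SpecializedSite(SiteArray owner, int index, Class<?> expectedClass,
                        Function<Object, Object> target) {
            this.owner = owner;
            this.index = index;
            this.expectedClass = expectedClass;
            this.target = target;
        }

        @Override
        public Object call(Object receiver) {
            if (receiver.getClass() == expectedClass) {   // simple inline guard
                return target.apply(receiver);            // direct invocation
            }
            return owner.defaultCall(index, receiver);    // guard failed: re-resolve and replace
        }
    }

    /** Hypothetical "method resolution": picks a target per receiver class. */
    static Function<Object, Object> resolve(Class<?> type) {
        if (type == String.class) return r -> "String:" + ((String) r).length();
        if (type == Integer.class) return r -> "Integer:" + r;
        return r -> "Object:" + r;
    }

    public static void main(String[] args) {
        SiteArray array = new SiteArray(1);
        Object[] receivers = {"abc", "hello", 7, "x"};
        for (Object receiver : receivers) {
            // The caller always re-reads the slot, so it sees the latest specialization.
            Object result = array.sites[0].call(receiver);
            System.out.println(result + " replacements=" + array.replacements);
        }
    }
}
```

Alternating receiver types on the same slot makes `replacements` grow on every type change, which is the thrashing behaviour discussed later in this comment.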
Indy Call Site - Key Differences
1. Uses MutableCallSite with MethodHandle target - more complex structure
2. Guards are MethodHandle chains: MethodHandles.guardWithTest(test, handle,
fallback)
- Each guard adds overhead in the method handle chain
3. Multiple guards are chained: metaclass guard → switchpoint guard →
argument type guards
4. Cache is a LinkedHashMap with soft references - synchronized access
5. On guard failure: goes through selectMethod which does full method
resolution
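To make the guard-chain structure concrete, here is a minimal sketch using only the standard java.lang.invoke API (`MethodHandles.guardWithTest`, `SwitchPoint.guardWithTest`, `MutableCallSite`). It is not Groovy's indy implementation: the class name and the `sameClass`, `stringLength` and `fallback` helpers are assumptions made for illustration, and the metaclass guard is left out.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.invoke.SwitchPoint;

/** Minimal guarded call site built from method handles. Helper names are hypothetical. */
final class IndyGuardChainSketch {
    static int fallbackCalls = 0;

    /** Receiver-class guard, comparable in spirit to a same-class check. */
    static boolean sameClass(Class<?> expected, Object receiver) {
        return receiver != null && receiver.getClass() == expected;
    }

    /** The "selected method" for String receivers. */
    static Object stringLength(Object receiver) {
        return ((String) receiver).length();
    }

    /** Stand-in for the slow path (full method selection in the real runtime). */
    static Object fallback(Object receiver) {
        fallbackCalls++;
        return "fallback:" + receiver;
    }

    /** Builds: switchPoint guard -> receiver-class guard -> target, each with a fallback. */
    static MutableCallSite build(SwitchPoint switchPoint) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType objToObj = MethodType.methodType(Object.class, Object.class);
            MethodHandle target = lookup.findStatic(IndyGuardChainSketch.class, "stringLength", objToObj);
            MethodHandle slowPath = lookup.findStatic(IndyGuardChainSketch.class, "fallback", objToObj);
            MethodHandle test = lookup.findStatic(IndyGuardChainSketch.class, "sameClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class))
                    .bindTo(String.class);
            MethodHandle classGuarded = MethodHandles.guardWithTest(test, target, slowPath);
            MethodHandle fullyGuarded = switchPoint.guardWithTest(classGuarded, slowPath);
            return new MutableCallSite(fullyGuarded);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes the site's current target, wrapping checked exceptions. */
    static Object call(MutableCallSite site, Object receiver) {
        try {
            return site.dynamicInvoker().invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        SwitchPoint switchPoint = new SwitchPoint();
        MutableCallSite site = build(switchPoint);
        System.out.println(call(site, "abcd"));   // passes both guards
        System.out.println(call(site, 42));       // fails the receiver-class guard
        SwitchPoint.invalidateAll(new SwitchPoint[] {switchPoint});
        System.out.println(call(site, "abcd"));   // switch point invalidated
    }
}
```

Every call walks the whole chain before reaching the target, and any failed guard lands on the same slow path. That is the structure the list above describes.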
Key Performance Differences
The classic approach has several advantages:
1. Simple polymorphic dispatch: If receiver class changes, the site is just
replaced with a new specialized one - no complex cache lookup
2. No MethodHandle overhead: Direct virtual calls vs method handle chains
3. Simpler guard checks: Inline if statements vs MethodHandles.guardWithTest
chains
4. Per-call-site specialization: Each call site becomes optimized for its
most common case
In PojoMetaMethodSite.java, when the guard fails (line 59), the site calls
CallSiteArray.defaultCall(this, receiver, args), which will:
1. Call createCallSite
2. Create a NEW specialized call site for the new receiver/args types
3. Replace the old call site in the array with the new one
4. Invoke the new call site
This means that for polymorphic call sites, the classic approach constantly
replaces the call site: each site is monomorphic but "thrashes" between
different specializations. This is less efficient than a proper polymorphic
inline cache (PIC), but the overhead of thrashing is apparently still lower
than indy's guard-chain overhead.
What makes indy's guards more expensive: see IndyGuardsFiltersAndSignatures.java.
Summary: Why Non-Indy is 5-6x Faster
Classic Call Site Approach (Non-Indy)
```
┌─────────────────────────────────────────────────────────────┐
│ CallSiteArray │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │CallSite 0│ │CallSite 1│ │CallSite 2│ ... │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ Specialized Specialized Specialized │
│ (replaced on (replaced on (replaced on │
│ first call) first call) first call) │
└─────────────────────────────────────────────────────────────┘
```
Flow:
1. call(receiver, args)
2. checkCall() → simple inline if-checks
3. If pass: invoke() → direct method call
4. If fail: CallSiteArray.defaultCall()
→ create NEW specialized site
→ REPLACE in array
→ invoke new site
Key characteristics:
- Monomorphic but adaptive: Each site is specialized for ONE type, replaced
when type changes
- Simple guard checks: Inline Java if statements
- Direct invocation: metaMethod.doMethodInvoke() or Method.invoke()
- No method handle overhead: Pure virtual dispatch
Indy Call Site Approach
```
┌─────────────────────────────────────────────────────────────┐
│ CacheableCallSite │
│ target: MethodHandle (guarded chain) │
│ fallbackTarget: MethodHandle → selectMethod │
│ lruCache: LinkedHashMap<String, SoftRef<MHWrapper>> │
│ latestClassName: volatile String (fast-path) │
└─────────────────────────────────────────────────────────────┘
```
Guard Chain:
```
┌──────────────────────────────────────────────────────────────┐
│ guardWithTest(metaclassGuard, │
│ guardWithTest(switchPointGuard, │
│ guardWithTest(sameClassesGuard, │
│ actualMethodHandle, │
│ fallback), │
│ fallback), │
│ fallback) │
└──────────────────────────────────────────────────────────────┘
```
Flow:
1. fromCache(callSite, sender, methodName, ...)
2. buildCacheKey(arguments) → String concatenation
3. synchronized(lruCache) → cache lookup
4. If miss: fallback() → full method resolution via Selector
5. On cache hit: mhw.getDirectMethodHandle().invokeExact(args)
→ But guards still in chain!
6. If guard fails: fallback to selectMethod()
Key problems:
1. Method handle chains are slow: Each guardWithTest adds overhead
2. Cache lookup overhead: Even on fast-path, involves volatile reads and
equals()
3. String concatenation for cache keys: buildCacheKey() creates strings
4. Synchronized map access: LRU cache requires synchronization
5. Guard failures cascade: One failed guard triggers full fallback
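The lookup costs listed above (key building, synchronization, soft-reference dereference) can be illustrated with a small model. This is a sketch under assumptions, not the actual CacheableCallSite code: the key format, the capacity of 4, the `Handler` interface (standing in for a cached method-handle wrapper) and the `resolve` method are all hypothetical.

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a string-keyed, synchronized LRU cache. All names are hypothetical. */
final class IndyCacheSketch {
    static final int CAPACITY = 4;   // assumed size, chosen only for the example

    /** Stand-in for the cached, already-resolved invocation target. */
    interface Handler {
        Object apply(Object[] args);
    }

    int resolutions = 0;

    /** Access-ordered LinkedHashMap that evicts the least recently used entry. */
    private final Map<String, SoftReference<Handler>> lruCache =
            new LinkedHashMap<String, SoftReference<Handler>>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, SoftReference<Handler>> eldest) {
                    return size() > CAPACITY;
                }
            };

    /** Builds a key by concatenating argument class names: allocates on every call. */
    static String buildCacheKey(Object[] args) {
        StringBuilder key = new StringBuilder();
        for (Object arg : args) {
            key.append(arg == null ? "null" : arg.getClass().getName()).append(';');
        }
        return key.toString();
    }

    /** Per-call path: build the key, lock the map, dereference, resolve on a miss. */
    Object invoke(Object[] args) {
        String key = buildCacheKey(args);
        Handler handler;
        synchronized (lruCache) {
            SoftReference<Handler> ref = lruCache.get(key);
            handler = (ref == null) ? null : ref.get();
            if (handler == null) {               // miss, or the soft reference was cleared
                handler = resolve(key);
                lruCache.put(key, new SoftReference<>(handler));
            }
        }
        return handler.apply(args);
    }

    /** Stand-in for full method selection. */
    private Handler resolve(String key) {
        resolutions++;
        return args -> "handled " + args.length + " args for " + key;
    }

    public static void main(String[] args) {
        IndyCacheSketch cache = new IndyCacheSketch();
        System.out.println(cache.invoke(new Object[] {"a", 1}));
        System.out.println(cache.invoke(new Object[] {"b", 2}));   // same classes: cache hit
        System.out.println(cache.invoke(new Object[] {1, "a"}));   // different key: resolves again
        System.out.println("resolutions=" + cache.resolutions);
    }
}
```

Even a cache hit pays for the string build, the monitor enter and exit, and the soft-reference dereference, none of which exist in the classic path where the site itself is the cache.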
Why the Difference is So Large
For polymorphic call sites (like collection operations on different domain
types):
```
| Aspect | Classic | Indy |
|--------|---------|------|
| Guard check | Inline if (class == expected) | MethodHandle chain traversal |
| On guard fail | Replace site, immediate re-invoke | Full selectMethod() path |
| Cache lookup | None (site IS the cache) | Map lookup + soft ref deref |
| Method invoke | Direct virtual call | MethodHandle.invokeExact() |
| Memory allocation | Minimal | String keys, MH adapters |
```
The classic approach "thrashes" by constantly replacing the call site, but
each replacement is cheap. The indy approach tries to be smarter with caching,
but for these polymorphic call sites the overhead of the mechanism exceeds the
benefit.
Potential Improvements for Indy
1. Polymorphic inline cache (PIC): Instead of single guarded target,
maintain multiple targets (like V8's PICs)
2. Megamorphic fallback: After threshold, switch to unguarded dispatch via
metaclass
3. Simpler guards: Receiver-only guard for non-overloaded methods
4. Remove guard chain: Use computed goto / tableswitch for type dispatch
5. JIT-friendly patterns: Structure code to help HotSpot optimize
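Ideas 1 and 2 from the list above could look roughly like the following sketch, which uses only the standard java.lang.invoke API. It is one possible shape, not a proposed patch: the class name, `MAX_DEPTH`, and the `describe*`, `resolve`, `generic` and `miss` helpers are hypothetical, receivers are assumed non-null, and thread safety is ignored.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

/** Sketch of a polymorphic inline cache with a megamorphic fallback. Names are hypothetical. */
final class PicSketch {
    static final int MAX_DEPTH = 2;   // assumed threshold before going megamorphic
    private static final MethodHandle SAME_CLASS, STRING, INTEGER, ANY, GENERIC, MISS;

    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType objToObj = MethodType.methodType(Object.class, Object.class);
            SAME_CLASS = lookup.findStatic(PicSketch.class, "sameClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
            STRING = lookup.findStatic(PicSketch.class, "describeString", objToObj);
            INTEGER = lookup.findStatic(PicSketch.class, "describeInteger", objToObj);
            ANY = lookup.findStatic(PicSketch.class, "describeAny", objToObj);
            GENERIC = lookup.findStatic(PicSketch.class, "generic", objToObj);
            MISS = lookup.findVirtual(PicSketch.class, "miss", objToObj);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    final MutableCallSite site = new MutableCallSite(MethodType.methodType(Object.class, Object.class));
    int depth = 0;
    int misses = 0;
    boolean megamorphic = false;

    PicSketch() {
        site.setTarget(MISS.bindTo(this));   // an empty cache always misses
    }

    static boolean sameClass(Class<?> expected, Object receiver) {
        return receiver != null && receiver.getClass() == expected;
    }
    static Object describeString(Object r) { return "String:" + r; }
    static Object describeInteger(Object r) { return "Integer:" + r; }
    static Object describeAny(Object r) { return "Object:" + r; }

    /** Hypothetical method selection by receiver class. */
    static MethodHandle resolve(Class<?> type) {
        if (type == String.class) return STRING;
        if (type == Integer.class) return INTEGER;
        return ANY;
    }

    /** Megamorphic path: unguarded, resolves on every call. */
    static Object generic(Object receiver) throws Throwable {
        return resolve(receiver.getClass()).invoke(receiver);
    }

    /** Cache miss: prepend a new guarded entry, or give up and go megamorphic. */
    Object miss(Object receiver) throws Throwable {
        misses++;
        MethodHandle resolved = resolve(receiver.getClass());
        if (depth < MAX_DEPTH) {
            MethodHandle test = SAME_CLASS.bindTo(receiver.getClass());
            // Earlier entries stay reachable as the fallback branch of the new guard.
            site.setTarget(MethodHandles.guardWithTest(test, resolved, site.getTarget()));
            depth++;
        } else {
            megamorphic = true;
            site.setTarget(GENERIC);
        }
        return resolved.invoke(receiver);
    }

    Object call(Object receiver) {
        try {
            return site.dynamicInvoker().invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        PicSketch pic = new PicSketch();
        for (Object receiver : new Object[] {"a", 1, "b", 2, 3.5, 4L}) {
            System.out.println(pic.call(receiver) + " misses=" + pic.misses
                    + " megamorphic=" + pic.megamorphic);
        }
    }
}
```

Unlike the thrashing classic site, the two cached types keep hitting once installed. Whether this beats the classic approach in practice would still need to be measured.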