jamesfredley commented on issue #15293:
URL: https://github.com/apache/grails-core/issues/15293#issuecomment-3720854215
I had Grok analyze those two 100MB logs.
## application startup and initial data loading log
### Analysis of the Indy Logging Output
The provided log (from `performance-test-app.log`) captures verbose
INFO-level output from Groovy's invokedynamic (indy) system during the
execution of the `/performanceTest/benchmark` endpoint in the Grails 7
performance regression app. This logging was enabled via
`-Dgroovy.indy.logging=true` and `-Dgroovy.indy.debug=true`, as per your
command. The log is extensive (truncated in the query but fully processed via
tools), showing repeated patterns of indy bootstrap operations. Below, I break
down key insights, why callsites remain uncached, and implications for
performance.
#### Key Statistics from the Log
- **Total "call site stays uncached" occurrences**: 1497. This indicates
that nearly every indy callsite bootstrapped during the test failed to cache,
leading to repeated overhead.
- **Invocation types**:
- CAST: 1306 (dominant; these are type casts, e.g., casting domain objects
to interfaces like `DirtyCheckable` for change tracking).
- METHOD: 148 (method calls, e.g., `hasChanged`, `getEntry`, or
GORM-specific methods like
`org_grails_datastore_gorm_GormEntityDirtyCheckabletrait$super$hasChanged`).
- **Guards added**: These are conditions checked before invoking the target.
Common ones include:
- `added meta class equality check` (checks if the metaclass hasn't
changed; frequent due to Groovy's dynamic MOP).
- `added switch point guard` (JVM-level invalidation hook for changes like
class redefinition).
- `added same-class argument check` (ensures argument types match expected
classes).
- `added GroovyRuntimeException unwrapper` (handles exception wrapping in
Groovy calls).
- Less common: `added class equality check`, `added null argument check at
pos 0`, `added category-in-current-thread-guard for category method` (for
thread-local categories).
- **Target types**: Primarily GORM-related, such as:
- `(DirtyCheckable)FieldHelper` (most common; for dirty checking in domain
objects).
- `(Object)DirtyCheckable`, `(GormValidateable)FieldHelper`,
`(Validator)boolean`, `(Object)Constraint`.
- Others like `(Object)GormInstanceApi`, `(Object)Validator`,
`(Object)Errors`, reflecting validation, persistence, and trait access.
- **Argument classes**: Focused on domain entities and GORM internals:
- Domain: `grails7.performance.Company`, `Department`, `Employee`,
`Project`, `Task`, `Milestone`.
- GORM/Hibernate: `org.grails.orm.hibernate.HibernateGormInstanceApi`,
`org.grails.datastore.gorm.validation.constraints.*` (e.g., `BlankConstraint`,
`MaxSizeConstraint`),
`org.hibernate.engine.internal.StatefulPersistenceContext`,
`org.grails.datastore.mapping.validation.ValidationErrors`.
The log starts at app startup/boot but explodes during the benchmark loop,
where GORM operations (e.g., creating/saving/querying interconnected domains
like Company → Department → Employee → Project → Task → Milestone) trigger
thousands of dynamic dispatches.
#### Why Do Call Sites Stay Uncached?
In Groovy 4's indy system, callsites are bootstrapped the first time an
invocation occurs (via `IndyInterface.bootstrap`). Guards are added to allow
caching if conditions (e.g., types, metaclasses) remain stable. However, if the
selector (in `org.codehaus.groovy.vmplugin.v8.Selector$MethodSelector`)
determines the callsite is too volatile, it "stays uncached," falling back to
full method selection on every call—similar to legacy call site behavior but
without the legacy caching benefits.
Reasons for uncaching based on this log:
- **High polymorphism and dynamic behavior**: GORM domains use traits (e.g.,
`DirtyCheckable`, `GormEntity`) that involve dynamic method resolution. Each
new instance (created per benchmark iteration) may have slight variations in
metaclasses or state, triggering guards like "meta class equality check." If
polymorphism exceeds thresholds (e.g., multiple subtypes or changes), caching
is skipped to avoid constant invalidations.
- **CAST-heavy workload**: 87% of invocations are casts (e.g., to
`DirtyCheckable` for change detection during saves). These are common in GORM
for property access/validation but aren't cacheable if types vary or if they're
one-off (low invocation count per unique site).
- **No optimizations triggered**: With your potential tunings (e.g., low
`groovy.indy.optimize.threshold`), sites should optimize after a few calls, but
the log shows no "call site optimized" or caching success. This suggests
inherent volatility—e.g., metaclass modifications from GORM plugins or
categories.
- **Benchmark design**: The test creates fresh objects each iteration (e.g.,
new `Task@...`, `Company@...`), leading to new callsites per run. If
types/arguments differ slightly, guards fail, preventing caching.
- **Source code insight**: From Groovy codebase (`IndyInterface.java`),
uncaching occurs if `doCallSiteTargetSet` decides the target isn't stable
(e.g., abstract methods, categories, or high fallback risk). Here, many targets
are abstract/interfaces (e.g., `PersistenceContext.getEntry`), which are
dynamic.
This uncaching is a key contributor to the 2-4x slowdown in Grails 7 (indy)
vs. legacy modes, as each invocation incurs bootstrap-like overhead without JVM
JIT benefits from repeated stable calls.
#### Performance Implications
- **Overhead**: Uncached sites mean slower dispatch (full MOP lookup each
time) vs. legacy's pre-cached call sites. In data-intensive loops like this
benchmark (e.g., 50 iterations of GORM saves/queries), this amplifies
regressions.
- **Comparison to legacy**: Legacy Groovy used explicit call site objects
with caching; indy relies on JVM for optimization but fails here due to
dynamics. Flame graphs from the original issue (#15293) show time in
`vmplugin.v8` paths, matching this log.
- **Why more logging than expected?**: You mentioned limited output
initially, but this log is bursty—startup has some, but the test triggers
thousands as new invocations hit. If you ran with default warmup/iterations,
expect ~1.3k CASTs alone.
#### Recommendations for Improvements
To make indy faster (approaching legacy):
1. **Static compilation**: Add `@CompileStatic` to domain classes (e.g.,
`Company.groovy`), the `PerformanceTestController`, or services. This bypasses
dynamic dispatch for casts/methods, reducing indy reliance.
2. **Tune thresholds aggressively**: Set `-Dgroovy.indy.optimize.threshold=0
-Dgroovy.indy.fallback.threshold=0` to force early optimization and disable
fallbacks. Rerun the benchmark and check if "stays uncached" decreases.
3. **Disable indy**: For testing, add to `build.gradle`:
```
tasks.withType(GroovyCompile) {
groovyOptions.optimizationOptions.indy = false
}
```
This reverts to legacy bytecode, potentially resolving the regression (as
per issue benchmarks).
4. **Codebase tweaks in Groovy**: Easy spots (from earlier analysis):
- In `IndyInterface.java`, lower hardcoded thresholds or add more
adaptive caching for CASTs.
- Optimize `linkCallSite` for GORM-like workloads by pre-caching common
trait interfaces.
- Contribute to GROOVY JIRA (e.g., extend GROOVY-8298 fixes).
5. **Diagnostics**: Run with flame graphs (e.g., via async-profiler) to
confirm indy overhead. If using low thresholds, monitor for invalidations (not
logged here, but possible with custom logging).
If you share more details (e.g., full command, iteration count, or flame
graphs), I can refine further. This log confirms indy isn't caching effectively
in this GORM-heavy scenario, but the tunables and static compilation can
mitigate it.
## log while running localhost:8080/performanceTest/benchmark?pretty=true
### Analysis of the New Indy Logging Output
This log (from `performance-test-app.log`) captures an even larger burst of
indy activity during the `/performanceTest/benchmark` endpoint execution, with
a total of ~243,549 lines (heavily repetitive). It focuses almost exclusively
on a single pattern in a closure
(`grails7.performance.PerformanceTestService$_getProjectMetrics_closure12$_closure61`),
likely part of a metrics calculation loop over `Task` domain objects in the
benchmark. Unlike the previous log (dominated by CASTs for dirty
checking/validation), this one highlights property access and type coercion in
a data-processing context.
#### Key Statistics from the Log
- **Total lines**: 243,549 (indicating high verbosity and many iterations;
the benchmark likely ran with higher iterations, e.g., 50+).
- **"call site stays uncached" occurrences**: 10,589 (every bootstrapped
callsite remains uncached, causing repeated overhead).
- **Invocation types**:
- GET: 8,129 (primarily for the `estimatedHours` property on `Task`
objects).
- CAST: 2,460 (exclusively casting the returned `Integer` from
`estimatedHours` to `boolean`, likely for a conditional like `if
(task.estimatedHours)`).
- **Specific patterns**:
- `estimatedHours` GET invocations: 2,461 (one extra possibly from an edge
case).
- Boolean CASTs: 2,460 (matches CAST count; each GET is followed by a CAST
to boolean via DefaultGroovyMethods `asBoolean` on `Number`).
- **Unique `Task` instances**: 75 (accesses are repeated across a small set
of objects, but the dynamic nature prevents reuse).
- **Guards added** (common checks before invocation; counts reflect
repetition):
- GroovyRuntimeException (unwrapper): 10,589
- switch (switch point for invalidations): 10,589
- same-class (argument class check): 10,589
- meta (meta class equality check): 8,128 (mostly on GETs for `Task`,
where metaclasses are dynamic).
The log spans ~11 seconds (17:47:50 to 17:48:01), with bursts corresponding
to loop iterations processing `Task` objects (e.g., summing hours for project
metrics). The sender is a closure in `PerformanceTestService`, suggesting code
like:
```groovy
// Hypothetical snippet from getProjectMetrics
tasks.each { task ->
if (task.estimatedHours) { // Triggers GET + CAST to boolean
// Accumulate metrics
}
}
```
#### Why Do Call Sites Stay Uncached?
Based on the Groovy 4 source code (`Selector.java` and
`IndyInterface.java`), callsites remain uncached due to a deliberate decision
in `MethodSelector.doCallSiteTargetSet()`:
- The `cache` flag is set to `false` if:
- `spreadCall` is true (argument spreading, e.g., `list*.method()`—not the
case here, as `spreadCall: false`).
- The receiver's class has per-instance metaclasses
(`ClassInfo.getClassInfo(receiver.getClass()).hasPerInstanceMetaClasses()`
returns true).
- For the GET on `estimatedHours`:
- Receiver is `grails7.performance.Task` (a GORM domain class).
- GORM enhances domains with dynamic traits and methods (e.g., via
`ExpandoMetaClass` or plugins), marking them as having per-instance
metaclasses. This makes caching unsafe, as metaclasses could change per object
instance, leading to invalidations.
- Meta class in log: `groovy.lang.MetaClassImpl` (standard, but the class
info flags it as per-instance due to GORM's runtime modifications).
- For the CAST to boolean:
- Receiver is `java.lang.Integer` (stable JDK class, no per-instance
metaclasses).
- However, it's using a DefaultGroovyMethods (DGM) helper (`asBoolean` on
`Number`), which is dynamic. The uncaching cascades from the context or is
treated similarly due to the enclosing dynamic dispatch.
- Other factors from source:
- No abstract methods here (`estimatedHours` is a concrete getter; log
shows "meta method is CachedMethod instance" and "successfully unreflected
method").
- No category methods (log: "meta method is category type method: false").
- Hit counts aren't triggering optimization (controlled by
`-Dgroovy.indy.optimize.threshold`, default 10,000)—but since `cache=false`
initially, it never attempts to set the target.
- Fallbacks aren't resetting here (no log of "call site target reset"),
but high volatility could exceed `-Dgroovy.indy.fallback.threshold` (default
10,000) in longer runs.
This results in every invocation falling back to full method selection
(heavyweight, blocking JVM threads as noted in related discussions), mimicking
legacy behavior but without its caching efficiency. With 2,461+ accesses across
only 75 unique `Task`s, stable caching could have reduced this to near-zero
overhead after the first few calls.
#### Performance Implications
- **Regression source**: Uncached sites mean repeated bootstrapping/lookups,
amplifying the 2-4x slowdown in GORM workloads. This log shows tighter focus on
property access in loops, where indy should excel (via JVM JIT) but fails due
to the per-instance metaclass flag.
- **Comparison to previous log**: Earlier log had broader CASTs for dirty
checking/validation (GORM internals); this is user-code level (service
closure). Together, they highlight indy struggles with GORM's dynamism.
- **Benchmark context**: The ~11-second span suggests the test loop (e.g.,
creating/querying tasks) is bottlenecked here. Flame graphs would show time in
`vmplugin.v8.Selector` and MOP lookups.
#### Easy-to-Spot Improvements in Groovy 4 Codebase
From source analysis:
- **Enhance caching for per-instance metaclasses**: In `Selector.java`, the
`cache &= !hasPerInstanceMetaClasses` is conservative. Add an optional guard
(e.g., instance-specific metaclass check) or a property
`-Dgroovy.indy.cache.perinstance=true` to enable caching with extra safeguards.
This could help GORM-like apps without breaking dynamics.
- **Lower defaults for thresholds**: Hardcoded defaults (10,000 for
optimize/fallback) are high for low-frequency calls. Make them configurable
per-class or lower to 100-1,000 in `IndyInterface.java` for better out-of-box
performance.
- **Special handling for DGM casts**: For common CASTs like `asBoolean` on
primitives, bypass uncaching if the receiver is a JDK class (e.g., add `if
(receiver instanceof Number && !hasPerInstanceMetaClasses)` in
`MethodSelector`).
- **Logging enhancements**: Add reasons to the "call site stays uncached"
log (e.g., "due to per-instance metaclasses") for easier debugging.
These are low-risk changes in `Selector.java` and `IndyInterface.java`,
testable via benchmarks like this app.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]