wu-sheng opened a new pull request, #13867:
URL: https://github.com/apache/skywalking/pull/13867

   ### Fix missing `attr0..5` / `entityId` columns in OAL debug capture
   
   **Bug.** A debug session against any OAL metric (e.g.
   `oal/core.oal/service_cpm`) shows the terminal `output` sample with only
   the family parent's value columns:
   
   ```json
   { "type": "ServiceCpmMetrics", "timeBucket": …, "id": …,
     "total": 1, "value": 0 }
   ```
   
   The source-derived storage columns `attr0..attr5` and `entityId` are
   absent — even though the OAL-generated subclass DOES carry them on the
   live instance, the entry function copied them from the Source, and the
   storage row in BanyanDB DOES persist them. Operators verifying e.g. the
   layer name written to `attr0` by `ServiceDecorator` cannot see it in
   the debugger; they have to cross-reference with a storage query.
   
   **Why the gap.** For OAL-generated metrics, `attr0..5` are NOT
   inherited from a fixed parent — they are declared on the source class
   via `@ScopeDefaultColumn.DefinedByField`, surfaced by
   `SourceColumnsFactory.getColumns()`, and copied to the generated
   metrics subclass at OAL boot by `OALClassGeneratorV2.generateMetricsClass`
   (line 220, the `for (CodeGenModel.SourceFieldV2 field :
   model.getFieldsFromSource()) metricsClass.addField(...)` loop). The
   family parent (CPMMetrics, SumMetrics, …) doesn't know the dynamic
   fields exist, so its `appendDebugFields(JsonObject)` override only
   emits its own value columns. `Metrics.toJson()` then renders just
   `{type, timeBucket, lastUpdateTimestamp, id} + family fields`.
   
   **The fix.** Codegen-side. The same OAL pass that materialises the
   dynamic fields now also emits an `appendDebugFields(JsonObject)`
   override on the generated subclass:
   ```java
   protected void appendDebugFields(com.google.gson.JsonObject obj) {
       super.appendDebugFields(obj);  // delegates to family override
                                      // (CPMMetrics → total + value, …)
       obj.addProperty("entity_id", entityId);
       obj.addProperty("attr0", attr0);
       // … each @ScopeDefaultColumn-declared field
   }
   ```
   Mechanically: a new FreeMarker template
   
`oap-server/oal-rt/src/main/resources/code-templates-v2/metrics/appendDebugFields.ftl`
   iterates `fieldsFromSource` and emits the right typed
   `obj.addProperty(...)` overload (Long / Integer / Double / Float /
   Boolean wrappers for primitives so Javassist resolves the
   `addProperty(String, Number)` overload unambiguously; null-safe
   `toString()` for object types). The generator picks it up by listing
   `"appendDebugFields"` in `METRICS_CLASS_METHODS`, the same array that
   already drives `id.ftl`, `serialize.ftl`, etc.
   
   **Why codegen, not reflection / a `Convert2Storage` shim.** The fields
   are codegen-materialised — putting their debug projection in the same
   codegen pass keeps the field list literally hard-coded in the bytecode
   (zero reflection at runtime, exact parity with what the generator
   declared). Recorder-side reflection or a JsonObject-backed
   `Convert2Storage` would also work but adds a runtime walker per probed
   emit; codegen is strictly cheaper.
   
   - [x] Add a unit test to verify that the fix works.
     - All 98 existing `oap-server/oal-rt` unit tests pass against the new
       METRICS_CLASS_METHODS entry — the existing
       `OALClassGeneratorV2Test` and `RuntimeOALGenerationTest` exercise
       the full template-driven generator. The generated method is loaded
       onto every `*Metrics` subclass at OAP boot, so a generator failure
       would surface as a class-load error before a single source event
       is dispatched.
     - Live-verified post-deploy: an OAL session against
       `oal/core.oal/service_cpm` now shows the dynamic columns on the
       terminal sample in addition to the family parent's value fields.
   - [x] Explain briefly why the bug exists and how to fix it.
     - See above. The bug is debug-output only; the storage row was
       always correct, only the wire payload missed the dynamic columns.
       The fix is debug-only and zero-cost when
       `SW_DSL_DEBUGGING_INJECTION_ENABLED=false` (the override is only
       invoked from `Metrics.toJson()`, which is only called by the
       gated dsl-debugging recorder).
   
   - [ ] If this pull request closes/resolves/fixes an existing issue, replace 
the issue number. Closes #<issue number>.
   - [x] Update the [`CHANGES` 
log](https://github.com/apache/skywalking/blob/master/docs/en/changes/changes.md).
     - Not required: this is a follow-up patch to the same unreleased
       SWIP-13 surface that PR #13865 just landed against, already
       covered by the 10.5.0 changelog entry.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to