wu-sheng opened a new pull request, #13865:
URL: https://github.com/apache/skywalking/pull/13865

   ### Fix two related issues in the live debugger surface for unreleased 
SWIP-13 code
   
   **Bug 1 — terminal MAL `output` sample shows the wrong shape in the UI.**
   Operators inspecting a `histogram_percentile([...])` rule see bucket-keyed
   samples (`(service_name=…, le=10)`, `…, le=20`, …) in the value column 
instead
   of the percentile result they expect to see in storage (`{p=50}`, `{p=90}`, 
`{p=99}`).
   Two reasons: (1) the captured `output` sample never carried the metric's 
actual
   reading on `payload.value` — the wire only had `metric` / `entity` / 
`valueType`
   / `timeBucket`, so the UI fell back to rendering the upstream
   `histogram_percentile` function-stage `SampleFamily`, which legitimately has
   `le=` keyed samples but isn't what the rule emits. (2) Even if we add 
`payload.value`,
   two-phase functions (`AvgHistogramPercentileFunction`, 
`SumHistogramPercentileFunction`)
   populate their user-visible field (`percentileValues`) lazily in 
`calculate()`,
   which the L1 streaming pipeline calls *after* the `captureMeterEmit` probe 
site,
   so a naive `getValue()` read returns the empty pre-calculate map.
   
   This PR fixes both:
   - New `appendValue(JsonObject, AcceptableValue<?>)` in `MALDebugRecorderImpl`
     renders `payload.value` per holder type — JSON number for `LongValueHolder`
     / `IntValueHolder` / `DoubleValueHolder` (with `\"NaN\"` / `\"Infinity\"`
     string fall-back for non-finite doubles so the wire stays valid JSON), and
     `JsonObject {key: long}` for `LabeledValueHolder` (label combos for
     `*Labeled` functions, `p=<rank>` entries for percentile functions).
   - Forces `Metrics.calculate()` once before the `getValue()` read. Safe 
because
     the function's `isCalculated` guard makes the streaming pipeline's later
     call a no-op, and cross-node `combine()` resets `isCalculated=false` so
     post-merge state still recomputes on read.
   
   Both edits are inside `MALDebugRecorderImpl`, which is only reached from the
   gated `captureMeterEmit` probe — when no operator has installed a session the
   probe gate is JIT-elided on hot paths, so steady-state runtime cost is 
**zero**.
   
   **Bug 2 — `recordCap` ceiling is too generous for an operator-driven 
debugger.**
   Hard cap was `10000` records / session, default `1000`. Both burn UI page
   real estate and heap (per-record JSON ≈ 10 KiB → ~100 MiB worst case per
   session). Operators inspect a handful of executions, not a paginated 
firehose.
   
   This PR drops both to `100` (default = hard cap = `MAX_RECORD_CAP`),
   yielding ~1 MiB worst-case heap per session. Out-of-range requests still get
   `400 invalid_limits` from the constructor.
   
   - [x] Add a unit test to verify that the fix works.
     - All 16 existing `dsl-debugging` unit tests pass against the new code; no
       fixture broke. The cap-aware test 
(`DebugSessionRegistryTest.reapExpired_…`)
       was switched to `SessionLimits.MAX_RECORD_CAP` so it tracks the cap going
       forward instead of pinning a stale literal.
     - The MAL e2e flow now asserts `payload.value` is present and numeric on
       every terminal `output` sample.
     - Live-verified against a local stack feeding a `histogram_percentile`
       rule: terminal sample now reports
       `value: {\"{p=50}\": 250000, \"{p=90}\": 5000000, \"{p=99}\": 10000000}`.
   - [x] Explain briefly why the bug exists and how to fix it.
     - See above. Both bugs are debug-only paths; fix is debug-only and gated.
   
   - [ ] If this pull request closes/resolves/fixes an existing issue, replace 
the issue number. Closes #<issue number>.
   - [x] Update the [`CHANGES` 
log](https://github.com/apache/skywalking/blob/master/docs/en/changes/changes.md).
     - Not required: this patches unreleased SWIP-13 code that's already in the
       10.5.0 changelog entry. The wire/cap changes won't be visible to any
       released version.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to