moonchen opened a new pull request, #13343:
URL: https://github.com/apache/trafficserver/pull/13343
## Summary
`RateLimiter::_metrics` (in `plugins/experimental/rate_limit/limiter.h`) is
value-initialized to `0` and is only populated by `initializeMetrics()`, which
`parseYaml()` calls **only when a selector carries a `metrics:` block**. But
`incrementMetric()` skips the stat update only when the entry equals `TS_ERROR`
(`-1`):
```cpp
std::array<int, RATE_LIMITER_METRIC_MAX> _metrics{}; // value-initialized
to 0
void incrementMetric(uint metric) {
if (_metrics[metric] != TS_ERROR) { // 0 != -1 → true →
does NOT skip
TSStatIntIncrement(_metrics[metric], 1); //
TSStatIntIncrement(0, 1)
}
}
```
So for any rate-limit selector configured **without** a `metrics:` block,
the guard never skips, and every `incrementMetric()` call on the queue / reject
/ expire / resume paths runs `TSStatIntIncrement(0, 1)`.
These call sites are unconditional (not gated on metrics being configured):
- `sni_limiter.cc` — `RATE_LIMITER_METRIC_QUEUED`,
`RATE_LIMITER_METRIC_REJECTED`
- `sni_selector.cc` — `RATE_LIMITER_METRIC_RESUMED`,
`RATE_LIMITER_METRIC_EXPIRED`
- `txn_limiter.cc` — `QUEUED`, `REJECTED`, `RESUMED`, `EXPIRED`
`TSStatIntIncrement(0)` does not assert, because `ts::Metrics::Storage`'s
constructor reserves id `0` as `proxy.process.api.metrics.bad_id` ("Reserve
slot 0 for errors, this should always be 0"), so `valid(0)` is always true. The
effect is therefore **silent**: that reserved id-0 metric is incremented on
every queue/reject/expire/resume for a selector with no `metrics:` block — an
unrelated global counter slowly drifts upward.
## Reproduction
A selector with a limit/queue but no `metrics:` block, driven past its limit:
```yaml
selector:
- name: example.com
limit: 1
queue: { size: 10, max_age: 1 }
```
Send more than `limit` concurrent handshakes to `example.com`; each
queued/rejected (and later expired/resumed) connection bumps
`proxy.process.api.metrics.bad_id`.
## Fix
Fill `_metrics` with `TS_ERROR` in the constructor — the same "not
registered" marker `metric_helper()` already writes per element — so
`incrementMetric()` is a true no-op until metrics are actually registered. One
line, robust to the metric count.
## Testing
- The equivalent fix (initialize the IDs to `TS_ERROR`) was validated under
AddressSanitizer on a 10.x branch: the same selector-without-`metrics:` path
aborts under ASAN before the change and passes cleanly after.
- This PR's change has not been built locally against `master`; relying on
CI for the compile.
---
_Unrelated observation while investigating (not addressed here): on the
configured-metrics path, `parseYaml()` calls
`initializeMetrics(RATE_LIMITER_TYPE_SNI, prefix, tag)` but the signature is
`initializeMetrics(uint type, std::string tag, std::string prefix = ...)`, so
`prefix` and `tag` appear to be passed swapped — affecting metric **names**
when a `metrics:` block is present. Happy to file separately if it's a real
issue._
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]