moonchen opened a new pull request, #13219:
URL: https://github.com/apache/trafficserver/pull/13219
## Summary
The `proxy.config.memory.max_usage` connection throttle and the
`proxy.process.traffic_server.memory.rss` gauge are both driven by the
`MemoryLimit` continuation in `src/traffic_server/traffic_server.cc`, and
both
were broken. This fixes all three defects.
### 1. The throttle was silently dead (~7 years)
`MemoryLimit` sets the global `net_memory_throttle` flag when RSS exceeds
`max_usage`, but the only reader was in the accept path and it was **removed
in
2018** (`4bfaf3645`, "Make throttling feature more useful."). Since then the
flag has been written and never consulted — `check_net_throttle()` only
honored
the connection-*count* throttle. So configuring
`proxy.config.memory.max_usage`
had no effect on connection acceptance at all.
Restored by honoring `net_memory_throttle` in `check_net_throttle(ACCEPT)`
(one
spot, so every accept site picks it up). Gated on `ACCEPT` only: outbound
`CONNECT`s keep flowing so in-flight transactions can complete and release
memory.
### 2. The metric was frozen on default configs (freeze bug)
`MemoryLimit` stopped updating the gauge whenever `max_usage <= 0` (the
default): the periodic handler did `e->cancel(); delete this;`, so the gauge
was
written exactly once early in startup and then **never updated again**. In
production it was observed frozen at ~1.4 GiB while real RSS was ~375 GiB.
Now the current RSS is published every period, decoupled from the throttle
feature, so the metric stays fresh regardless of whether the throttle is
enabled.
### 3. Wrong source / units / semantics (peak vs current, platform bug)
It stored `getrusage(RUSAGE_SELF).ru_maxrss << 10`:
- `ru_maxrss` is **peak** RSS (never decreases) — wrong for a gauge named
`.rss`, and it made the throttle **latch on** once peak crossed the limit
and
never release.
- `ru_maxrss` is **KiB on Linux** (so `<<10` is correct there) but **bytes on
macOS/BSD**, so the shift was wrong off-Linux.
Reads current RSS portably via a new `ink_get_current_rss()` helper in
`tscore/ink_sys_control` — Linux `/proc/self/statm` (`resident pages *
sysconf(_SC_PAGESIZE)`), macOS `task_info(... MACH_TASK_BASIC_INFO ...)`. The
throttle now compares current (not peak) RSS in bytes against `max_usage`
(which
is documented in bytes), so it both engages and **releases** correctly.
## Why the two fixes are coupled
Reviving the throttle requires the current-RSS fix: with peak RSS the flag
would
latch on forever and the server would refuse all new connections permanently.
Current RSS is what lets the throttle clear when memory actually drops.
## Testing
- New unit test `src/tscore/unit_tests/test_ink_sys_control.cc`: asserts
`ink_get_current_rss()` is plausible and **grows** after touching a 64 MiB
allocation.
- Builds clean (`traffic_server`, `inknet`, `test_tscore`); pre-commit format
hooks (clang-format / cmake-format / whitespace) pass.
- Docs updated for the metric (current vs peak, refreshed every 10s).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]