[PR] Add cache mechanism to certificate compression [trafficserver]

via GitHub Tue, 16 Jun 2026 08:53:28 -0700


maskit opened a new pull request, #13284:
URL: https://github.com/apache/trafficserver/pull/13284


   ## Summary
   
   Cache compressed certificate bytes so the cert chain is compressed once per 
certificate, not on every TLS handshake.
   
   ## Why
   
   BoringSSL has no internal cache for cert compression, so each handshake 
re-runs the compression callback on the same static cert chain, which wastes CPU.
   
   (OpenSSL builds aren't affected: `SSL_CTX_compress_certs` pre-compresses the 
chain and caches internally.)
   
   ## Approach
   
   A per-certificate cache attached to the SSL_CTX. Each algorithm has a slot 
holding an atomic pointer to an immutable compressed-bytes entry.
   
   - Reads (the hot path) are lock-free: acquire-load the slot, memcpy.
   - Misses compare-exchange a fresh entry into the slot — only the first 
publisher in a thundering herd wins; the rest discard their work.
   - Invalidation runs on every OCSP refresh attempt, since the staple is part 
of what gets compressed. It swaps the slot to null and retires the previous 
entry, which is freed on the *next* invalidation, so per-slot memory is bounded 
at two entries.
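   
   The slot protocol above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Entry`, `CompressedCertSlot`, and the method names are hypothetical. It assumes a single invalidating thread per slot and that any reader still copying from a retired entry finishes before the next invalidation frees it.
   
```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical names throughout; a sketch of the described protocol only.
struct Entry {
  std::vector<uint8_t> bytes; // never modified after publication
};

class CompressedCertSlot
{
public:
  ~CompressedCertSlot()
  {
    delete current_.load(std::memory_order_relaxed);
    delete retired_;
  }

  // Hot path: acquire-load the slot and memcpy the bytes out.
  // Returns the number of bytes copied, or 0 on a miss or if `cap` is too small.
  size_t
  lookup(uint8_t *out, size_t cap) const
  {
    assert(out != nullptr);
    const Entry *e = current_.load(std::memory_order_acquire);
    if (e == nullptr || e->bytes.empty() || e->bytes.size() > cap) {
      return 0;
    }
    std::memcpy(out, e->bytes.data(), e->bytes.size());
    return e->bytes.size();
  }

  // Miss path: try to install freshly compressed bytes.
  // Only the first publisher wins; a loser deletes its own entry.
  bool
  publish(std::vector<uint8_t> compressed)
  {
    Entry *fresh    = new Entry{std::move(compressed)};
    Entry *expected = nullptr;
    if (current_.compare_exchange_strong(expected, fresh, std::memory_order_release, std::memory_order_relaxed)) {
      return true;
    }
    delete fresh;
    return false;
  }

  // Invalidation (assumed to be called from one thread at a time):
  // swap the slot to null, free the entry retired last time, and retire
  // the one just removed. At most two entries exist per slot.
  void
  invalidate()
  {
    Entry *prev = current_.exchange(nullptr, std::memory_order_acq_rel);
    delete retired_;
    retired_ = prev;
  }

private:
  std::atomic<Entry *> current_{nullptr};
  Entry               *retired_ = nullptr; // touched only by the invalidator
};
```
   
   In this sketch a handshake would call `lookup()` first and, on a miss, compress the chain and call `publish()`. Deferring the free by one invalidation is what lets readers skip locks and reference counts, at the cost of relying on the timing assumption stated above.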
   
   ## Operator-facing
   
   - `proxy.config.ssl.server.cert_compression.cache` (INT, default `1`, 
reloadable). Disabling forces recompression on every handshake. No-op on 
OpenSSL builds (documented in records.yaml).
   - `proxy.process.ssl.cert_compress.cache_hit` (counter). Aggregated across 
algorithms — hit rate is identical per algorithm by construction.
   
   ## Validation
   
   - New autest `tls_cert_comp_cache` runs traffic through a mid-tier ATS with 
two full handshakes (session tickets disabled) and asserts the compress metric 
reaches 2 with caching on, and `cache_hit` stays at 0 with caching off.
   - New Catch2 microbench (`test_net "[!benchmark]"`) drives the production 
callbacks at cold / warm / disabled and at thread fan-out of 1, 2, 4, 8, 16, 
32. Single-thread warm hit is ~125 ns regardless of algorithm. Throughput 
scales positively with threads (zlib: 7.5M → 21.7M ops/sec at 1 → 8 threads on 
an 18-core box).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

