phongn opened a new pull request, #13259:
URL: https://github.com/apache/trafficserver/pull/13259

   ## Summary
   
   When the home-grown Huffman codec was replaced with vendored LiteSpeed 
ls-hpack code (#12357), only the conservative 4-bit FSM decoder 
(`lshpack_dec_huff_decode_full`) was ported, even though `huff-tables.h` has 
carried the 64K-entry table for upstream's fast decoder all along. This PR 
ports the fast decoder (`lshpack_dec_huff_decode`, from ls-hpack v2.3.5) and 
switches `huffman_decode()` to it.
   
   The fast decoder consumes 16 bits of input per table lookup and emits up to 
3 bytes, falling back to the FSM decoder for the rare codes longer than 16 
bits. HPACK and QPACK share the wrapper through `xpack_decode_string()`, so 
both HTTP/2 and HTTP/3 header decoding benefit. The change adds no memory footprint: the 
`hdecs` table has been compiled into the binary since the original vendoring.
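
   To illustrate the multi-symbol table lookup described above, here is a toy 
sketch. It is not the ls-hpack code: it assumes an invented three-symbol prefix 
code and 8-bit windows instead of the HPACK table and 16-bit windows, and all 
names (`WindowEntry`, `build_window_table`, `toy_decode`) are hypothetical. It 
requires C++17 for `std::optional`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Toy prefix code (NOT the HPACK table): 'a' = 0, 'b' = 10, 'c' = 110.
// A run of 1 bits is reserved for padding, loosely mirroring HPACK's EOS prefix.
struct WindowEntry {
  std::string out;   // symbols that lie entirely inside the 8-bit window
  unsigned used = 0; // number of window bits those symbols occupy
};

// Precompute one entry per possible 8-bit window value.
inline std::array<WindowEntry, 256> build_window_table()
{
  std::array<WindowEntry, 256> table;
  for (unsigned w = 0; w < 256; ++w) {
    unsigned pos = 0; // bits consumed so far, counted from the MSB
    while (pos < 8) {
      unsigned ones = 0;
      while (ones < 3 && pos + ones < 8 && ((w >> (7 - pos - ones)) & 1u)) {
        ++ones;
      }
      if (ones == 3 || pos + ones == 8) {
        break; // padding pattern, or a symbol cut off by the window edge
      }
      table[w].out.push_back("abc"[ones]); // ones 1 bits followed by a 0 bit
      pos += ones + 1;
    }
    table[w].used = pos;
  }
  return table;
}

// Decode `src`; returns std::nullopt on invalid input or invalid padding.
inline std::optional<std::string> toy_decode(const std::vector<uint8_t> &src)
{
  static const auto table = build_window_table();
  std::string out;
  uint32_t buf = 0;   // bit accumulator; the low `avail` bits are valid
  unsigned avail = 0;
  std::size_t i = 0;
  for (;;) {
    while (avail < 8 && i < src.size()) {
      buf = (buf << 8) | src[i++];
      avail += 8;
    }
    if (avail >= 8) {
      // Fast path: one lookup may emit several symbols at once.
      const WindowEntry &e = table[(buf >> (avail - 8)) & 0xFFu];
      if (e.used == 0) {
        return std::nullopt; // no symbol starts here although 8+ bits remain
      }
      out += e.out;
      avail -= e.used;
    } else {
      // Tail: fewer than 8 bits remain. Fill the window with 1 bits, which
      // can never complete a symbol because every toy code ends in a 0 bit.
      const unsigned w = ((buf << (8 - avail)) | (0xFFu >> avail)) & 0xFFu;
      const WindowEntry &e = table[w];
      out += e.out;
      avail -= e.used;
      const uint32_t mask = (1u << avail) - 1u;
      if ((buf & mask) != mask) {
        return std::nullopt; // leftover bits are not all-ones padding
      }
      return out;
    }
  }
}
```

   In this sketch, a window that yields no symbol is simply rejected. The real 
decoder, as described above, falls back to the FSM decoder for codes longer 
than its 16-bit window.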
   
   ## Performance
   
   `tools/benchmark/benchmark_HuffmanDecode.cc` (new; build with 
`-DENABLE_BENCHMARKS=ON`), release build, Ice Lake:
   
   | Input | FSM decoder | fast decoder | speedup |
   |---|---|---|---|
   | 8B value (`text/css`) | 103 ns | 73 ns | 1.4x |
   | 86B Accept value | 1003 ns | 505 ns | 2.0x |
   | 113B User-Agent | 1296 ns | 659 ns | 2.0x |
   | 10-value mixed corpus (459B) | 5.32 µs | 2.80 µs | 1.9x |
   
   ## RFC 7541 strictness (deliberate divergence from upstream ls-hpack)
   
   Differential testing revealed that upstream's fast decoder accepts padding 
of 8–10 bits when it follows the final symbol near the end of the input. RFC 
7541 §5.2 requires that "a padding strictly longer than 7 bits MUST be treated 
as a decoding error", and the FSM decoder has always enforced this. The ported 
decoder adds a guard in the tail check to keep the strict behavior, so this PR 
does not loosen what ATS accepts. The divergence is documented in 
`lib/ls-hpack/README.md` and pinned by the deterministic 
`decode_overlong_padding` test, which fails if a future re-sync drops the guard.
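
   For reference, the padding rule from RFC 7541 §5.2 can be sketched as a 
standalone predicate. This is a minimal illustration, not the guard added in 
this PR: the helper name `padding_is_valid` and its calling convention 
(leftover bits right-aligned in `bits`, with `nbits` of them remaining) are 
assumptions.

```cpp
#include <cstdint>

// Hypothetical helper, not an ls-hpack or ATS function.
// `bits` holds the undecoded trailing bits, right-aligned; `nbits` is their count.
inline bool padding_is_valid(uint32_t bits, unsigned nbits)
{
  if (nbits > 7) {
    return false; // padding strictly longer than 7 bits is a decoding error
  }
  const uint32_t mask = (1u << nbits) - 1u;
  // Padding must match the most significant bits of EOS, which are all ones.
  return (bits & mask) == mask;
}
```

   With this predicate, 8 to 10 trailing 1 bits are rejected even though every 
bit matches the EOS prefix, which is the case the strictness guard covers.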
   
   ## Semantics and compatibility
   
   - Validity and output are otherwise identical to the FSM decoder. Verified 
by: exhaustive parity over all 1- and 2-byte inputs, 100k seeded differential 
fuzz iterations (with out-of-bounds-write sentinel checks), 20k encode/decode 
roundtrips, and the RFC 7541 vectors. Offline, the port was additionally 
validated four ways against upstream's own two decoders over exhaustive 
1–3-byte inputs (16.8M cases) plus a 134M-case destination-size sweep: 
bit-identical to upstream's fast decoder apart from the strictness guard above.
   - The one observable edge case is an exactly sized destination (`dst_len ==` 
decoded length), where either decoder may report `LSHPACK_ERR_MORE_BUF` 
depending on how trailing padding falls on nibble boundaries. One byte of 
headroom guarantees success; ATS sizes destinations at 2× the encoded length 
(strictly larger than any decoded result, since the shortest Huffman code is 
5 bits and expansion is therefore at most 8/5), so callers are unaffected. The 
sizing contract is now documented on `huffman_decode()`.
   - Error returns remain negative-on-failure as callers expect 
(`xpack_decode_string()` checks `len < 0`).
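
   The sizing argument in the list above can be checked with a little 
arithmetic. The following is a sketch only: `max_decoded_len` and 
`dst_capacity` are hypothetical names, not ATS functions, and the bound assumes 
the shortest HPACK Huffman code is 5 bits long.

```cpp
#include <cstddef>

// Worst case: every decoded byte came from a 5-bit code, so n encoded bytes
// (8n bits) yield at most floor(8n / 5) decoded bytes.
constexpr std::size_t max_decoded_len(std::size_t src_len)
{
  return src_len * 8 / 5;
}

// Destination sizing described above: twice the encoded length.
constexpr std::size_t dst_capacity(std::size_t src_len)
{
  return 2 * src_len;
}

// For any non-empty input, the 2x buffer leaves at least one byte of headroom.
static_assert(dst_capacity(1) > max_decoded_len(1), "1-byte input has headroom");
static_assert(dst_capacity(5) > max_decoded_len(5), "5-byte input has headroom");
static_assert(dst_capacity(459) > max_decoded_len(459), "459-byte input has headroom");
```

   For example, 5 encoded bytes can decode to at most 8 bytes, while the 
destination holds 10.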
   

