Hi Ben,

On Tue, Jan 20, 2026 at 07:07:58PM +0000, Ben Kallus wrote:
> > No, because the chunks are removed on the input and recomposed on the
> > output (possibly of different sizes) and chunk extensions are entirely
> > lost. I was initially confused by Rajat's report because I didn't see
> > where the chunk-ext could have been stored to pass it on the other
> > side, and Christopher reminded me that I was just stupid, we indeed
> > entirely drop them.
> 
> I agree with every word of this paragraph except the first one (and
> the part where you call yourself stupid) :)
> The fact that haproxy removes chunk-exts before forwarding only
> ensures that the request is not misinterpreted *after* haproxy
> forwards it. But what if the request was *already* misinterpreted when
> haproxy received the request?
> 
> To be clear, the trailer-injection poc I gave in the last email was
> not hypothetical; I tested it against the latest versions of both
> pound and haproxy.

I totally understand and am not denying this.

> Does pound have the bigger bug? Certainly. But the
> poc only works because of *both* pound's behavior *and* haproxy's
> behavior, each of which I am arguing is not permitted by the RFCs.

On the opposite! Please look again: the spec says:

  chunk          = chunk-size [ chunk-ext ] CRLF
  chunk-ext      = *( BWS ";" BWS chunk-ext-name [ BWS "=" BWS chunk-ext-val ] )
  chunk-ext-name = token
  chunk-ext-val  = token / quoted-string

And:

  A recipient MUST ignore unrecognized chunk extensions.

The CRLF is the *chunk* delimitor, i.e. the outer one. Once you have
isolated your chunk before the CRLF, you end up with:
  - a chunk-size
  - some optional chunk extension

Here the spec is clear that you must ignore chunk extensions that you
don't recognize, and like many HTTP agents we don't recognize any, and
like many, we just trop them all. So what's ignored is this part:

  chunk-ext      = *( BWS ";" BWS chunk-ext-name [ BWS "=" BWS chunk-ext-val ] )

As such since we ignore it, it's not a requirement to go deep into its
syntax validation.

> If
> haproxy rejected messages with improperly quoted chunk-exts, the pound
> bug would not be exploitable.

I absolutely agree on this and am not denying it. It's a *reinforcement*
measure, like we've done in the past for various other things where we
detected that OK, here and there, we could improve the situation beyond
what we're using. But I'm saying that it's also a dangerous slippery
slope, because this is also by starting to add extra processing code
that you introduce parser bugs. Here if you want to process the chunk-ext
correctly, at least you need to validate quoting. And in order to validate
quoting, you have to validate backslash-escaping. I did it in the proposed
patch, in a lenient way (not verifying that there's an equal sign, that the
quotes are only for the value, that non-token chars do not appear in the
name nor that forbidden chars are not present inside the quotes). But this
is based on a supposed weakness of some external components. Maybe in one
week you'll tell me "this component mistakenly parses quoted printable
text in extensions and takes =0D=0A for a CRLF, could you please also
block them?".

> This same payload can't be used against
> a `client<->pound<->node.js` server chain, because node.js properly
> rejects malformed chunk-exts.

Maybe not node.js but other spec-compliant agents certainly for exactly
the same reason which is that they just drop anything past the semi-colon
(and nginx seems to be doing exactly this, just like golang etc) apparently
from the mentioned reports). 

> > Like any agent that will stop at the semi-colon like permitted
> > by the spec ("MUST ignore unknown extensions", and when you neither know nor
> > support any extension, you just ignore all of them).
> 
> This raises an important question to which I think we disagree on the answer:
> What is meant by the word "ignore" in that sentence in the RFC?
> 
> I believe that you are of the opinion that "ignore" means "do not look
> at," but I think "ignore" in this context instead means "do not
> interpret (but still validate)."

That's not what's done on anything else. You're not supposed to validate
the contents of user-agent if you're not using it. You're not validating
the syntax of "vary" if you're not using it, same for bytes-range, or
even "cookie". Each of these have their very specific syntax, and the
flexibility and the definition of the protocol precisely permits this:
you have to validate what you're consuming and can safely ignore the
rest that is properly bounded. For headers, a header field is what
ends at a CRLF (if you don't support folding, otherwise you can detect
that next line starts with an LWS and accept to continue the header on
the next line).

Here the chunk is perfectly defined as well:
 - first find the chunk till the CRLF
 - then parse it
 - look at extensions if you need them.

> I think my interpretation is supported by the rest of the document.
> Take, for example, RFC 9110 Section 5.1:
> > A proxy MUST forward unrecognized header fields unless the field name is
> > listed in the Connection header field (Section 7.6.1) or the proxy is
> > specifically configured to block, or otherwise transform, such fields.
> > Other recipients SHOULD ignore unrecognized header and trailer fields.
> > Adhering to these requirements allows HTTP's functionality to be extended
> > without updating or removing deployed intermediaries.

I totally agree and that's my point.

> If "ignore" meant "do not look at," then this would seem to suggest
> that endpoint servers are not responsible for validating header values
> from unrecognized headers.

There's a difference between reading a header field name to adhere to
the protocol (or here the chunk size) and validating the contents. I
can tell you for certain that mismatched quotes in header values are
not infrequent at all; you're just free to send what you want in these
headers provided that you don't send forbidden chars. Some even use
base85 which *does* contain the double-quote.

> That would mean that, for example, it would
> be valid to accept CR, LF, and NUL within header-values of
> unrecognized headers. To me, this seems incorrect.

It is incorrect only because LF marks the end of the header in H1, so
if you're parsing H1, by definition, either CRLF or LF do mark the end
of the header field and they are the only marker for this :-)

And we do check for CR & NUL not appearing in headers (and LF for H2/H3)
for the sake of not risking to improperly transcode H2/3 to H1 (and the
spec was improved on this front for the same reason). But you can't do
much more on headers you simply don't know.

> I argue that the word "ignore" in this context instead means the
> recipient should ensure that each received header is well-formed,

"Well-formed" is very hard to determine because in some cases it will
be improperly formed based on its own definition. Bytes-range is a good
example, there have been issues in the past with overlapping ranges that
were only exploiting the difficulty to convert it to a unified range
while parsing it on the fly, and such issues didn't require any CR nor
NUL character or even less qd-text. So no, we do not *validate* headers
that are not consumed because noone knows them nor can decide whether
they're correct or not, we only validate headers we use (e.g. content-
length etc), and that already requires a complex and expensive logic to
get these right.

> but
> well-formed unexpected headers should have no effect. That is, the
> result of parsing the validated header field should be ignored.
> 
> By my interpretation, this makes haproxy's behavior a (minor)
> violation of the spec.

By my explanation above, I sincerely disagree, because the chunk-ext
is not a protocol element we're using at all, we're supposed to ignore
it, and the chunk is properly delimited by the CRLF so the protocols
remains properly synchronized.

> > I disagree with this argument. We need to implement what we *use*. It's
> > even particularly dangerous to implement what you don't use because 1)
> > you can't test it, and 2) it's often done in too lenient a way to the
> > point that issues come from there. How many components have had issues
> > with trailers while they are unable to make use of them for example ?
> 
> I disagree that you can't test this effectively; the tests need only
> show that requests containing well-formed chunk-extensions are emitted
> with the chunk-extensions properly removed, and requests containing
> malformed chunk-extensions are rejected.

Well, yes but this remains limited. If you introduce a bug in that parser
it will not necessarily trigger, that's what I mean.

> > But as you pointed this one is not valid as per the RFC. We could always
> > say that "what if..." but the we cannot prevent lenient parsers from
> > shooting themselves in the foot.
> 
> I've heard this argument many times, but I don't think it's entirely
> fair. This basically reduces to "haproxy's spec violation wouldn't be
> exploitable if Pound hadn't violated the spec first,"

Except that here there is NO spec violation. Can we do more and act as
if we'd be using extensions except that we don't use them ? Sure, we can,
and my patch does that. Will it be sufficient to stop requests (and possibly
responses) passing through Pound ? Logically yes. Will it be sufficient
against next component that mistakes part of a chunk extension with the
delimiter ? not necessarily. After all, maybe some will look at the last
digits in the chunk line and will consider that only these ones define
the chunk size, e.g:

    0012; name=0038\r\n

and the component parsing it using a regex keeping only ([0-9a-f]*)$ would
take the wrong one. And then what's the next step ? It's exactly the same
problem here, one component by accident takes a protocol element (CRLF)
inside another one where it's not supposed to find it (chunk-ext-val).
I'm not blaming Pound at all for this, such bugs are easy to encounter
and we've also had our share of comparable ones. I just want to clarify
that on this point we're speaking about adding extra protection against
one *identified* bug in an external component. That's just reactive
security, not proactive. And that can be find to improve the ecosystem,
just like we've done with hearybleed to protect affected openssl users
in the past for example. It's just that it's not the one-size-fits-all.

> and of course
> the inverse could be claimed by pound.

Except that in this specific case it ignores the outer delimiter (CRLF).

> When an exploit is achievable
> only due to multiple bugs in multiple programs, both programs are at
> fault (though maybe not equally).

I totally agree with that point.

> > Maybe time to report it to them ;-) Because it will do the same with
> > about every other component since basically noone implements chunk-ext
> > in practice.
> 
> Already did :)

Great, thanks!
Willy


Reply via email to