Hi Robin,

On Thu, Apr 01, 2021 at 06:03:39PM +0000, Robin H. Johnson wrote:
> Hi,
> 
> I'm wondering if there is any ongoing development or improvement plans
> around the 'show errors' functionality?

We should improve it. There used to be a wish to keep a history in a
rotating buffer, but in the meantime H2 arrived with muxes, and errors
are now produced at a level where it's a bit more complicated to pass
contextual info. But at least they're still there.

> This has come out of cases where we upgraded HAProxy 1.8 -> 2.2, and
> $work customers started reporting requests that previously worked fine
> now return 400 Invalid Request errors.

That's never good. Often it indicates that they've long been doing
something wrong without ever noticing and that for various reasons
it's not possible anymore.

> Two things stand out:
> - way to reliably capture that output, not being limited to the last
>   error

Ideally I'd like to have a global ring buffer to replace all per-proxy
ones (which also consume a lot of RAM). We could imagine keeping the
ability to have a per-proxy copy based on a config option.

> - messaging about WHY a given position is an error
>   - partial list of reasons I've seen so far included below

In a capture, the indicated position is *always* an invalid character.
You may need to look at the standards to know why, but it seems
particularly difficult to me to start emitting the list of all
permitted characters whenever a forbidden one is met. I remember that
we emit the parser's state; maybe this is what should be turned into a
more human-readable form to indicate "in header field name" or "in
header field value" or something like this, which can help the reader
figure out why the char is not welcome (maybe because they expected
that a previous one had switched the state).
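To show what reporting a readable state next to the position could
look like, here is a minimal Python sketch of a scanner for a single
header line. It is not HAProxy's parser: the state labels and the
find_invalid() helper are hypothetical, and only the character classes
come from RFC 7230 (tchar for names; VCHAR, SP, HTAB and obs-text for
values).

```python
import string
from typing import Optional, Tuple

# tchar set from RFC 7230 section 3.2.6 (characters allowed in a field name)
TCHAR = set(string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~")

# Hypothetical human-readable state labels
IN_NAME = "in header field name"
IN_VALUE = "in header field value"


def find_invalid(line: str) -> Optional[Tuple[int, str]]:
    """Return (position, state) of the first invalid character, or None."""
    state = IN_NAME
    for pos, ch in enumerate(line):
        if state == IN_NAME:
            if ch == ":" and pos > 0:
                state = IN_VALUE          # the colon ends a non-empty name
            elif ch not in TCHAR:
                return pos, state
        else:
            code = ord(ch)
            # field value: SP, HTAB, visible ASCII, or obs-text (0x80-0xFF)
            if not (ch in " \t" or 0x21 <= code <= 0x7E or 0x80 <= code <= 0xFF):
                return pos, state
    if state == IN_NAME:
        # end of line reached without any colon
        return len(line), state
    return None
```

For "X Custom: 1" this reports position 1 "in header field name", which
tells the reader that the space was seen before any colon had switched
the state to the value.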

> Partial list of low-level invalid request reasons
> - path/queryparams has character that was supposed to be encoded
> - header key invalid character for given position
> - header line malformed (no colon!)
> - header value invalid relative to prior pieces of request**

For the last one we will not have a position because the request is
*semantically* invalid; it's not a parsing issue.

> ** This one is bugging me: user requests with an absolute URI as the
> urlpath, but the hostname in that URI does not match the hostname in the
> Host header.

This is mandated by the standard:

   https://tools.ietf.org/html/rfc7230#section-5.4

   If the target URI includes an authority component, then a
   client MUST send a field-value for Host that is identical to that
   authority component, excluding any userinfo subcomponent and its "@"
   delimiter (Section 2.7.1).
   ...
   A server MUST respond with a 400 (Bad Request) status code to any
   HTTP/1.1 request message that lacks a Host header field and to any
   request message that contains more than one Host header field or a
   Host header field with an invalid field-value.
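To make the rule concrete, here is a minimal Python sketch of the
comparison it describes. This is an illustration, not HAProxy's actual
check: host_matches_authority() is a hypothetical helper, the
case-insensitive comparison is an assumption on my side, and default
ports are not normalized.

```python
from urllib.parse import urlsplit


def host_matches_authority(target: str, host_header: str) -> bool:
    """True if the Host value is consistent with the request target."""
    authority = urlsplit(target).netloc
    if not authority:
        # origin-form target such as "/index.html": no authority to compare
        return True
    # exclude any userinfo subcomponent and its "@" delimiter
    authority = authority.rpartition("@")[2]
    # assumption: compare case-insensitively, without port normalization
    return authority.lower() == host_header.strip().lower()
```

For example, a request line "GET http://example.com/a HTTP/1.1" sent
with "Host: other.example.org" makes this return False, which is the
situation that ends in a 400.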

Do you regularly encounter this case? If so, maybe we could have an
option to relax the rule in certain cases. The standard allows proxies
to ignore the provided Host field and rewrite it from the authority.
Note that we're not a proxy but a gateway, and it's often the case that
a number of other gateways (and possibly proxies) have already been
crossed on the way from the client, so we wouldn't do that by default,
but with a compelling case I wouldn't find this problematic.

Regards,
Willy
