Re: 'show errors' - logging & reasons
On Fri, Apr 02, 2021 at 06:38:35PM +0200, Willy Tarreau wrote: > > This has come out of cases where we upgraded HAProxy 1.8 -> 2.2, and > > $work customers started reporting requests that previously worked fine > > now return 400 Invalid Request errors. > That's never good. Often it indicates that they've long been doing > something wrong without ever noticing and that for various reasons > it's not possible anymore. Yes, see below for details of what a tool used by customers has been doing wrong. > > Two things stand out: > > - way to reliably capture that output, not being limited to the last > > error > Ideally I'd like to have a global ring buffer to replace all per-proxy > ones (which also consume a lot of RAM). We could imagine keeping the > ability to have a per-proxy copy based on a config option. I'd keep per-proxy support, because I can see cases where different retention might be wanted. > > - messaging about WHY a given position is an error > > - partial list of reasons I've seen so far included below > In a capture, the indicated position is *always* an invalid character. > It may require to have a look at the standards to know why, but it > seems particularly difficult to me to start to emit the list of all > permitted characters whenever a forbidden one is met. I remember that > we emit the parser's state, maybe this is what should be turned to a > more human-readable form to indicate "in header field name" or "in > header field value" or stuff like this which can help the reader > figure why the char is not welcome (maybe because they expect that > a previous one had switched the state). I wouldn't emit an entire list of permitted characters, but certainly via the parser's state we can point to it by reference. E.g. in the target URI, non-encoded spaces or unicode. > > Partial list of low-level invalid request reasons > > - path/queryparams has character that was supposed to be encoded > > - header key invalid character for given position > > - header line malformed (no colon!) > > - header value invalid relative to prior pieces of request** > For the last one we will not have a position because the request is > *semantically* invalid, it's not a parsing issue. Hmm, I think it does have a position value, but one that didn't seem to make sense. > > ** This one is bugging me: user requests with an absolute URI as the > > urlpath, but the hostname in that URI does not match the hostname in the > > Host header. > > This is mandated by the standard: > >https://tools.ietf.org/html/rfc7230#section-5.4 > >If the target URI includes an authority component, then a >client MUST send a field-value for Host that is identical to that >authority component, excluding any userinfo subcomponent and its "@" >delimiter (Section 2.7.1). >... >A server MUST respond with a 400 (Bad Request) status code to any >HTTP/1.1 request message that lacks a Host header field and to any >request message that contains more than one Host header field or a >Host header field with an invalid field-value. > > Do you regularly encounter this case ? If so maybe we could have an > option to relax the rule in certain cases. The standard allows proxies > to ignore the provided Host field and rewrite it from the authority. > Note that we're not a proxy by a gateway and it's often the case that > a number of other gateways (and possibly proxies) have been met from > the client, so we wouldn't do that by default but with a compelling > case I woudln't find this problematic. If this changed in 1.8->2.2, then yes, it's absolutely the case. I have already asserted to $work customers, that the third-party tool that is being used is doing the wrong thing, but I haven't had much traction in getting them to change what they are doing, since it needs. I can't disclose the name of tool in question on the mailing list, but it choses to implement HTTPS by having users of the tool run a local stunnel, pointing to $work service, and point the tool to stunnel as an http proxy As for 'regularly', what is the metric to compare that to? It's a tiny fraction of overall requests, but has been loud at the support level. -- Robin Hugh Johnson Gentoo Linux: Dev, Infra Lead, Foundation Treasurer E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136 signature.asc Description: PGP signature
Re: 'show errors' - logging & reasons
Hi Robin, On Thu, Apr 01, 2021 at 06:03:39PM +, Robin H. Johnson wrote: > Hi, > > I'm wondering if there is any ongoing development or improvement plans > around the 'show errors' functionality? We should improve it. There used to be a wish of keeping a history in a rotating buffer, but in the mean time H2 arrived with muxes and errors are now produced at a level where it's a bit more complicated to pass contextual info. But at least they're still there. > This has come out of cases where we upgraded HAProxy 1.8 -> 2.2, and > $work customers started reporting requests that previously worked fine > now return 400 Invalid Request errors. That's never good. Often it indicates that they've long been doing something wrong without ever noticing and that for various reasons it's not possible anymore. > Two things stand out: > - way to reliably capture that output, not being limited to the last > error Ideally I'd like to have a global ring buffer to replace all per-proxy ones (which also consume a lot of RAM). We could imagine keeping the ability to have a per-proxy copy based on a config option. > - messaging about WHY a given position is an error > - partial list of reasons I've seen so far included below In a capture, the indicated position is *always* an invalid character. It may require to have a look at the standards to know why, but it seems particularly difficult to me to start to emit the list of all permitted characters whenever a forbidden one is met. I remember that we emit the parser's state, maybe this is what should be turned to a more human-readable form to indicate "in header field name" or "in header field value" or stuff like this which can help the reader figure why the char is not welcome (maybe because they expect that a previous one had switched the state). > Partial list of low-level invalid request reasons > - path/queryparams has character that was supposed to be encoded > - header key invalid character for given position > - header line malformed (no colon!) > - header value invalid relative to prior pieces of request** For the last one we will not have a position because the request is *semantically* invalid, it's not a parsing issue. > ** This one is bugging me: user requests with an absolute URI as the > urlpath, but the hostname in that URI does not match the hostname in the > Host header. This is mandated by the standard: https://tools.ietf.org/html/rfc7230#section-5.4 If the target URI includes an authority component, then a client MUST send a field-value for Host that is identical to that authority component, excluding any userinfo subcomponent and its "@" delimiter (Section 2.7.1). ... A server MUST respond with a 400 (Bad Request) status code to any HTTP/1.1 request message that lacks a Host header field and to any request message that contains more than one Host header field or a Host header field with an invalid field-value. Do you regularly encounter this case ? If so maybe we could have an option to relax the rule in certain cases. The standard allows proxies to ignore the provided Host field and rewrite it from the authority. Note that we're not a proxy by a gateway and it's often the case that a number of other gateways (and possibly proxies) have been met from the client, so we wouldn't do that by default but with a compelling case I woudln't find this problematic. Regards, Willy
'show errors' - logging & reasons
Hi, I'm wondering if there is any ongoing development or improvement plans around the 'show errors' functionality? This has come out of cases where we upgraded HAProxy 1.8 -> 2.2, and $work customers started reporting requests that previously worked fine now return 400 Invalid Request errors. Two things stand out: - way to reliably capture that output, not being limited to the last error - messaging about WHY a given position is an error - partial list of reasons I've seen so far included below Partial list of low-level invalid request reasons - path/queryparams has character that was supposed to be encoded - header key invalid character for given position - header line malformed (no colon!) - header value invalid relative to prior pieces of request** ** This one is bugging me: user requests with an absolute URI as the urlpath, but the hostname in that URI does not match the hostname in the Host header. -- Robin Hugh Johnson Gentoo Linux: Dev, Infra Lead, Foundation Treasurer E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136 signature.asc Description: PGP signature