On Sun, Mar 30, 2014 at 03:44:38PM -0500, Patrick Schless wrote:
> Very interesting, thanks for the tip. I only see two requests there, one of
> which seems like nonsense or a vulnerability scan
> ("\r\n\r\n\x00\x00\x00...."), and the other has a space in the path that's
> being requested due to improper escaping. Neither of those is a huge deal
> to me, though if the downstream server (nginx) would handle the space then
> I suppose I'd want to use the accept-invalid-request option.

Or, on the contrary, you certainly wouldn't want such a request to pass
through, since you have no idea how the various backend components will
handle it! How do you think the following request would be handled along
your chain:

    GET /some/dir/file HTTP/1.1 HTTP/1.0

As you can see, your downstream server might handle it as 1.0 while others
might stop at the first 1.1 and ignore the trailing part. One component may
then ignore the Host header, or worse, support chunked encoding or interim
responses that the other version does not, possibly resulting in a
desynchronization between both components known as an HTTP request smuggling
attack. This can quickly become a security issue if one of the components is
supposed to validate some parts of the request (eg: a WAF).

> I finally captured some 504s in the debug logging. 129 since yesterday
> afternoon. They all seem to look like this:
>
> Mar 30 14:46:19.000 haproxy-k49 haproxy[19450]: x.x.x.x:49638
> [30/Mar/2014:14:45:19.533] frontend_https~ tapp_http/tapp-m2t
> 77/0/4/60000/60081 504 343 - - ---- 1255/1255/17/4/0 0/0 "GET /data/?a=b
> HTTP/1.1"
>
> I'm guessing that the 60000/60081 means that 60s is some timeout threshold,
> and 60.081 seconds were reached, which caused the 504. Is that correct?

Yes, that's it.

> I am also guessing that this is caused by slowness on the downstream
> application servers. I do see a spike in the number of requests at the same
> times as these 504s, and I suspect adding more downstream servers (with
> balance leastconn) will help here.

You should also try to set a maxconn on your servers and lower it to the
point where you only have large queues and acceptable response times. Then
you'll simply have to divide the total queue size by the maxconn you use,
and that will tell you how many servers you need to optimally serve your
visitors without ever queuing.

> I'm a little confused by the 60s timeout, though.
> I have these (below) in my config, so I'd expect that the timeout causing
> the 504 is the "timeout server", which isn't 60s.
>
> defaults
>     log global
>     option httplog
>     retries 3
>     option redispatch
>     maxconn 8196
>     timeout connect 20s
>     timeout client 80s
>     timeout server 80s
>     timeout http-keep-alive 50s
>     stats enable
>     stats auth user:password
>     stats uri /stats
>
> Is my 80s server timeout not taking effect for some reason, or is the 60s
> some other setting? I'm mostly just curious, since 80s was artificially
> high (while I try to address these problems), and I'll probably lower it to
> 50s before I'm done.

Then it should be 80, unless you have an explicit "timeout server 60s" in
your backend section.

Willy
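The version ambiguity Willy describes can be sketched with two hypothetical
parsers (not real haproxy or nginx code) that read the same request line
differently: a strict one takes the third token as the version, a lenient one
takes the last token. Their disagreement is what opens the door to request
smuggling.

```python
# Illustrative only: two made-up parsers disagreeing on the HTTP version
# of the same malformed request line.
line = "GET /some/dir/file HTTP/1.1 HTTP/1.0"

def parse_strict(request_line):
    # Expects "METHOD SP PATH SP VERSION"; reads the third token as the
    # version and ignores anything after it.
    tokens = request_line.split(" ")
    return tokens[0], tokens[1], tokens[2]

def parse_lenient(request_line):
    # Takes the last whitespace-separated token as the version and treats
    # everything between the method and the version as the path.
    tokens = request_line.split()
    return tokens[0], " ".join(tokens[1:-1]), tokens[-1]

_, _, strict_version = parse_strict(line)    # "HTTP/1.1"
_, _, lenient_version = parse_lenient(line)  # "HTTP/1.0"
```

One component now believes it is talking HTTP/1.1 (keep-alive, chunked
encoding) while the other believes HTTP/1.0, which is exactly the
desynchronization described above.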
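For readers decoding the log line above: the five slash-separated numbers in
the httplog output are timers in milliseconds, which the haproxy
documentation names Tq (time to receive the request), Tw (time spent in
queues), Tc (time to connect to the server), Tr (time waiting for the server
response) and Tt (total session duration). A quick sketch of splitting them
out:

```python
# Split the timer field from the 504 log line into named values (ms).
# Field names follow the haproxy httplog documentation (Tq/Tw/Tc/Tr/Tt).
field = "77/0/4/60000/60081"
timers = dict(zip(("Tq", "Tw", "Tc", "Tr", "Tt"),
                  (int(t) for t in field.split("/"))))
# Tr == 60000: the server response timer hit exactly 60s, hence the 504.
```

Tr stopping at exactly 60000 ms is the strongest hint that a 60s server-side
timeout fired somewhere.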
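The capacity-planning rule of thumb Willy gives (divide the total queue size
by the per-server maxconn) is just arithmetic; the numbers below are
invented for illustration, not taken from this thread:

```python
import math

# Assumed figures: ~1200 requests observed queued at peak, and a
# per-server "maxconn 50" found to keep response times acceptable.
total_queue = 1200
maxconn_per_server = 50

# Servers needed to absorb the peak without queuing, rounded up.
servers_needed = math.ceil(total_queue / maxconn_per_server)  # 24
```

With leastconn balancing in front, this gives a first-order estimate of how
many backends are required before requests ever sit in a queue.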

