Hi

HA-Proxy version 1.5-dev25-a339395 2014/05/10

I have read many posts about 408 request timeouts and have tried most, if 
not all, of the suggested fixes. Currently about 5% of my traffic gets this 
error (~600,000 timeouts a day). The odd part is that the browser shows the 
timeout error immediately, which does not match the 12 seconds recorded in 
the logs. I have seen similar reports in other posts, but I still cannot 
solve it.

**My load balancers are dedicated physical servers and are not close to full 
utilization.**

I have tried the following in my config:
1. Increased timeout http-request to 12s, 60s, and 90s
2. Changed timeout http-keep-alive (1s, 5s, 10s, and 12s)
3. Increased the queue, server, and client timeouts to 32s each
4. Increased the connect timeout to 12s
5. Turned on accept-invalid-http-request, splice-auto, tcp-smart-connect, 
   and tcp-smart-accept
6. Turned off compression
7. Lowered mss to 1422

Here are the errors in my log:

May 22 13:48:20 localhost.localdomain haproxy[19963]: <IP>:49219|[22/May/2014:13:48:08.736]|non_ssl|non_ssl/<NOSRV>|-1/-1/-1/-1/+12000|408|+212|-|-|cR--|183/181/0/0/0|0/0|{||}||"<BADREQ>"|<IP>
May 22 13:48:20 localhost.localdomain haproxy[19963]: <IP>:49221|[22/May/2014:13:48:08.746]|non_ssl|non_ssl/<NOSRV>|-1/-1/-1/-1/+12001|408|+212|-|-|cR--|182/180/0/0/0|0/0|{||}||"<BADREQ>"|<IP>
May 22 13:48:20 localhost.localdomain haproxy[19963]: <IP>:18792|[22/May/2014:13:48:08.927]|non_ssl|non_ssl/<NOSRV>|-1/-1/-1/-1/+12000|408|+212|-|-|cR--|180/178/0/0/0|0/0|{||}||"<BADREQ>"|<IP>
May 22 13:48:20 localhost.localdomain haproxy[19963]: <IP>:18794|[22/May/2014:13:48:08.927]|non_ssl|non_ssl/<NOSRV>|-1/-1/-1/-1/+12000|408|+212|-|-|cR--|179/177/0/0/0|0/0|{||}||"<BADREQ>"|<IP>
May 22 13:48:20 localhost.localdomain haproxy[19963]: <IP>:18795|[22/May/2014:13:48:08.927]|non_ssl|non_ssl/<NOSRV>|-1/-1/-1/-1/+12000|408|+212|-|-|cR--|178/176/0/0/0|0/0|{||}||"<BADREQ>"|<IP>
May 22 13:48:20 localhost.localdomain haproxy[19963]: <IP>:18791|[22/May/2014:13:48:08.929]|non_ssl|non_ssl/<NOSRV>|-1/-1/-1/-1/+12000|408|+212|-|-|cR--|176/174/0/0/0|0/0|{||}||"<BADREQ>"|<IP>
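
To make the entries above easier to read, here is a minimal Python sketch 
(not part of my setup) that splits one entry into the leading fields of the 
log-format string from my config. The names parse_line, TIMER_NAMES and 
SAMPLE are made up for illustration. It assumes the entry is on a single 
line and only uses the fields before the captured headers, because the 
capture field itself contains "|" characters.

```python
import re

# Timer order taken from the log-format string: %Tq/%Tw/%Tc/%Tr/%Tt
TIMER_NAMES = ("Tq", "Tw", "Tc", "Tr", "Tt")


def parse_line(line):
    """Split one pipe-delimited HAProxy log entry into its leading fields."""
    # Drop the syslog prefix ("May 22 ... haproxy[19963]: ").
    payload = re.split(r"haproxy\[\d+\]: ", line, maxsplit=1)[1]
    fields = payload.split("|")
    # With "option logasap" the total time is logged as "+12000";
    # int() accepts the leading "+" sign.
    timers = dict(zip(TIMER_NAMES, (int(v) for v in fields[4].split("/"))))
    return {
        "client": fields[0],                  # %ci:%cp
        "accept_time": fields[1].strip("[]"),  # %t
        "frontend": fields[2],                # %ft
        "backend_server": fields[3],          # %b/%s
        "timers": timers,                     # milliseconds, -1 = not reached
        "status": int(fields[5]),             # %ST
        "bytes": int(fields[6]),              # %B
        "term_state": fields[9],              # %tsc
    }


SAMPLE = (
    "May 22 13:48:20 localhost.localdomain haproxy[19963]: <IP>:49219|"
    "[22/May/2014:13:48:08.736]|non_ssl|non_ssl/<NOSRV>|-1/-1/-1/-1/+12000|"
    '408|+212|-|-|cR--|183/181/0/0/0|0/0|{||}||"<BADREQ>"|<IP>'
)

if __name__ == "__main__":
    entry = parse_line(SAMPLE)
    print(entry["status"], entry["timers"], entry["term_state"])
```

For the first entry this gives status 408, a Tq of -1 (no complete request 
was received), a Tt of 12000 ms, and termination state cR--. As I read the 
HAProxy documentation, "cR" means the client-side timeout expired while 
HAProxy was still waiting for a complete request, and 12000 ms matches my 
"timeout http-request 12s".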


I have mixed and matched the above options, but no combination has worked. 
Here is my current config (without ACLs):

global
        log 127.0.0.1 local2    ##Log to the local rsyslog daemon
        user haproxy
        group haproxy
        pidfile /var/run/haproxy.pid
        stats socket /tmp/haproxy.socket user nobody group nobody mode 600 level admin
        node <NODENAME>
        description HAPROXY2-DL
        daemon
        maxconn 120000
        spread-checks 3
        ca-base /etc/ssl/certs/comb
        crt-base /etc/ssl/certs/comb
        quiet


defaults
        log    global
        mode    http
        option forwardfor
        compression algo gzip
        compression type text/html text/plain text/css text/xml text/javascript
        retries 5
        timeout http-request 12s
        timeout http-keep-alive 1s
        timeout queue   32s
        timeout connect 12s
        timeout server  32s
        timeout client  32s
        option http-server-close
        option accept-invalid-http-request
        option splice-auto
        option tcp-smart-connect
        option tcp-smart-accept
        log-format %ci:%cp|[%t]|%ft|%b/%s|%Tq/%Tw/%Tc/%Tr/%Tt|%ST|%B|%CC|%CS|%tsc|%ac/%fc/%bc/%sc/%rc|%sq/%bq|%hr|%hs|%{+Q}r|%fi


###PORT 80 LISTENER###

frontend non_ssl *:80 mss 1422
        ##Rate Limit, block ip for 10 minutes if true
        stick-table type ip size 400k expire 10m store gpc0
        acl whitelist src <ip1> <subnet1>
        acl akamai_user_agent hdr_sub(User-Agent) -i <user-agent-cdn>
        acl source_is_abuser src_get_gpc0(non_ssl) gt 0
        use_backend ease-up-y0 if source_is_abuser ! whitelist
        tcp-request connection track-sc1 src if ! source_is_abuser
        acl network_allowed src <IP1> <subnet1>
        acl restricted_page url_reg wp-admin
        acl restricted_page url_reg wp-login.php
        acl restricted_page url_reg cms/login-form.php
        block if restricted_page !network_allowed
        ############################
        ###OPTIONS
        maxconn 100000
        mode http
        option logasap
        option forwardfor
        reqadd X-Forwarded-Proto:\ http


I understand that some 408s are normal (possibly DDoS traffic), but I 
believe 5% is too high.

So far we have only been able to reproduce the problem in Chrome, but we 
cannot yet rule out that other browsers are affected. Lastly, clients have 
only recently started seeing these timeouts in their browsers, even though 
the logs have shown the 408s for months.

Any help would truly be appreciated.


M

