I'm currently working through an issue that manifested as random 408
errors to users. A packet capture analysis coupled with observation of
user behavior and experience indicated that client browsers were opening
sockets and not sending data for a number of seconds.

We had timeout http-request 2s set, which resulted in a TCP connect,
followed 2 seconds later by a 408 from haproxy, followed by an HTTP
request from the client, which didn't look for a 408 response before
sending its request (not unreasonable).  Most of the speculative
connections coming from browsers would actually result in receiving a
408, only to have the browser close the connection 5.5 seconds after it
was opened.
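
The idle-then-send pattern described above can be approximated without a
browser.  The following is only a rough sketch: the function name
delayed_request and its parameters are made up for illustration, and it
does not claim to model what any particular browser actually does.  It
opens a socket, stays silent for a chosen delay, sends a plain GET, and
returns the first line of whatever comes back.

```python
import socket
import time


def delayed_request(host, port, delay, path="/", recv_timeout=10.0):
    """Connect, stay silent for `delay` seconds, then send a GET.

    Returns the first line the peer sent back, or an empty string if the
    connection was closed or reset without any data being read.
    (Hypothetical helper, for illustration only.)
    """
    with socket.create_connection((host, port), timeout=recv_timeout) as sock:
        # Mimic an idle speculative connection: connected, nothing sent.
        time.sleep(delay)

        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n\r\n"
        ).encode("ascii")
        try:
            sock.sendall(request)
        except OSError:
            # The proxy may already have answered and closed its side.
            pass

        data = b""
        try:
            # Read until the end of the status line or until EOF.
            while b"\r\n" not in data:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                data += chunk
        except OSError:
            # Reset or receive timeout; keep whatever was read so far.
            pass

    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling delayed_request("172.27.1.111", 80, d) for a few values of d on
either side of the configured timeout (the address is the frontend from
the config below) should show at which delay the status line changes.
If the proxy has already responded and closed before the request is
sent, the result may be the early response or an empty string, depending
on timing.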

At this point I decided to set timeout http-request to the default of 6s
for public-facing listeners which may see this behavior.  In my
validation testing the 408 errors went away as expected, but I was
baffled to discover that they had been replaced with "400 Bad Request"
at 5 seconds.  I've been trying to figure out which different timeout I'm
running into that results in a 400, but I can't find anything in the
docs.  Incidentally, the haproxy docs at code.google.com are returning
403 unavailable so I've had some trouble reading the manual as the flat
.txt docs aren't as easy to navigate.

I'm not sure what I'm doing wrong.

I already tried increasing timeout connect from 5s to 15s with no change
in behavior.  See below for the pertinent sections of my config.

Shaun



global
    log 127.0.0.1:514 local2 notice emerg
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user root
    group haproxy
    daemon
    stats socket /tmp/haproxy.sock mode 777


defaults
    mode http
    log global
    option dontlognull
    option redispatch
    retries 3

    # set to >3s to allow first-retry on TCP retransmit to succeed
    timeout check 3100ms
    timeout connect 15s
    timeout client 120s
    timeout server 120s
    timeout queue 300s

    # Review this.  1s was too short and caused 408 errors, this is high
    # enough to be dangerous.
    timeout http-request 2s
    timeout http-keep-alive 10s
    maxconn 3000

    option http-server-close

frontend dmz_mysite_main 172.27.1.111:80
    timeout http-request 6s
    #log 127.0.0.1:514 local2 notice emerg
    #log 127.0.0.1:514 local2 debug
    option forwardfor except 172.27.1.111
    option httplog

    default_backend dmz_apache_error_bump

backend dmz_apache_error_bump
    balance roundrobin
    mode http
    log global
    server localhost_20980 127.0.0.1:20980 check inter 30000 weight 10



