Hi,
I work as a sysadmin for a company using HAProxy as HTTP reverse proxy for
Apache backends. The average session rate for this frontend is around 400
and the log (HTTP info level) shows no unexplainable errors. Stats average
connection and response errors are also very low. So far so good.
However, the frontend request error rate is very high: average of 65 req/s,
peak at 120 req/s, minimim of 10 req/s. We have 3 haproxy servers, they all
have the same issue (frontend error percentage is around 15%).
I've already analysed the haproxy log, captured and analysed packets with
tcpdump and wireshark, but I can't see any obvious reason why it's
happening. I tried to raise the client and server timeouts but it didn't
help and I can't keep it too high.
I have 3 questions about this:
1) What kind of tcp traffic can be counted as a request error by the
HAProxy stats page ? Only incomplete/wrong HTTP requests, or also aborted
tcp handshakes ? Is web browsers' preconnect activity also logged (and
counted as error when a connection is not used) ?
2) Why can't I see all errors in the log file ? Even in debug mode, I only
get a small percentage (around 20%) of all the frontend errors counted in
the stats page.
3) Do you have any tip about how I could continue to troubleshoot this ?
Here's my config for this frontend
#################
global
log /dev/log daemon info
stats socket /var/run/haproxy.sock group munin mode 770
maxconn 20000
user haproxy
group haproxy
defaults
log global
mode http
balance roundrobin
stats enable
stats auth user:password
stats refresh 5s
option httplog
option forwardfor
option httpclose
option httpchk GET /status.php HTTP/1.1\r\nHost:\
some.url.com\r\nAuthorization:\
Basic\ password
option redispatch
retries 1
maxconn 19500
timeout connect 5000
timeout client 50000
timeout server 50000
listen farm_http server_alias:80
bind 127.0.0.1:8082
no option redispatch
option forceclose
retries 0
timeout client 3000
timeout server 3000
acl is-ssl dst 127.0.0.1
reqadd X-Proto:\ SSL if is-ssl
acl is_staging hdr_sub(Cookie) -i some_acl_condition=_staging
use_backend staging if is_staging
acl is_beta url_dir beta
use_backend staging if is_beta
server backend1 backend1:80 check maxconn 30 slowstart 20s
server backend2 backend2:80 check maxconn 30 slowstart 20s
server backend3 backend3:80 check maxconn 30 slowstart 20s
...
server empty 127.0.0.1:81 check backup
################
Thanks for your help on this one !
Cheers,
Laurent