Hi,
I am running haproxy on Ubuntu 10.04 LTS on some HP DL360s, each with
one 8-core CPU (2.0GHz on some servers, 2.5GHz on others) and 8GB of
RAM. They talk HTTP to 8 nginx backends, and I am seeing the haproxy
process sit at around 100% CPU utilization, resulting in high latency.
Previously these machines sat behind hardware firewalls that were
experiencing capacity issues, so the machines were moved out from
behind them. In doing so iptables had to be enabled, but I have made
sure the nf_conntrack hashsize is sufficiently large. Prior to the
relocation each haproxy instance handled around 12k requests per
second; now they fluctuate between 9k and 11k requests per second. The
upstream firewall issues may have actually been masking the haproxy CPU
starvation problem, so it may just be a coincidence that this CPU
symptom was noticed after the relocation.
I've looked at a range of commonly known bottlenecks, including
nf_conntrack bucket sizes (I even removed conntrack and iptables
temporarily), pinning haproxy to a single CPU, and various ethtool and
sysctl options.
I see that people are able to sustain higher throughput with haproxy on
this sort of hardware, so I suspect I have a misconfiguration
somewhere, or a bug causing the high CPU utilization. I am having
difficulty isolating what exactly is causing the high CPU usage and
latency, so any advice on what else to look for or what additional
debugging I can do would be greatly appreciated.
Respective configs and info below:
# haproxy -vv
HA-Proxy version 1.4.19 2012/01/07
Copyright 2000-2011 Willy Tarreau <[email protected]>
Build options :
TARGET = linux26
CPU = amd64
CC = gcc
CFLAGS = -g -fno-strict-aliasing
OPTIONS = USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_PCRE=1
Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200
Encrypted password support via crypt(3): yes
Available polling systems :
sepoll : pref=400, test result OK
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 4 (4 usable), will use sepoll.
# my haproxy.cfg (yes, I know the maxconn values are quite high)
global
    daemon
    maxconn 320000
    spread-checks 3
    nbproc 1
    pidfile /var/run/haproxy/haproxy.pid
    stats socket /var/run/haproxy/haproxy.sock
    user haproxy
    group haproxy
defaults
    mode http
    maxconn 300000
    retries 2
    timeout connect 5000ms
    timeout client 5000ms
    timeout server 5000ms
    timeout queue 3000ms
    timeout check 2000ms
    timeout http-request 10s
    timeout http-keep-alive 10s
    errorfile 400 /etc/haproxy/errors/400.http
listen backends
    bind *:80
    balance leastconn
    option splice-auto
    option tcp-smart-accept
    option tcp-smart-connect
    option forwardfor except 127.0.0.1
    option http-server-close
    option redispatch
    option http-no-delay
    option httpchk GET /health?from=haproxy
    stats enable
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /xx?xx
    stats auth xx:xx
    # server localhost 127.0.0.1:8080 maxconn 40000 check fall 1 weight 75
    server backend1 122.200.x.x:8080 maxconn 40000 check fall 1 weight 50
    server backend2 122.200.x.x:8080 maxconn 40000 check fall 1 weight 50
    server backend3 122.200.x.x:8080 maxconn 40000 check fall 1 weight 50
    server backend4 122.200.x.x:8080 maxconn 40000 check fall 1 weight 50
    server backend5 122.200.x.x:8080 maxconn 40000 check fall 1 weight 70
    server backend6 122.200.x.x:8080 maxconn 40000 check fall 1 weight 70
    server backend7 122.200.x.x:8080 maxconn 40000 check fall 1 weight 70
    server backend8 122.200.x.x:8080 maxconn 40000 check fall 1 weight 85
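As a rough sanity check on the maxconn values, here is a back-of-the-envelope sketch in Python. It assumes each session holds two buffers of bufsize bytes (one per direction), which is an assumption about haproxy 1.4 rather than something shown in the output here, and it ignores per-session structures and kernel socket buffers. The function name buffer_bytes is purely illustrative.

```python
# Rough estimate of haproxy buffer memory at full connection load.
# Assumption: two buffers of `bufsize` bytes per session; session structs
# and kernel socket buffers are ignored, so the real figure would be higher.

BUFSIZE = 16384           # "bufsize" from the haproxy -vv default settings
GLOBAL_MAXCONN = 320000   # "maxconn" from the global section
RAM_KB = 8192404          # "Mem: ... total" as reported by top


def buffer_bytes(maxconn, bufsize=BUFSIZE, buffers_per_session=2):
    """Bytes of buffer memory if `maxconn` sessions were open at once."""
    return maxconn * bufsize * buffers_per_session


needed = buffer_bytes(GLOBAL_MAXCONN)
ram = RAM_KB * 1024

print(f"buffers at maxconn: {needed / 2**30:.2f} GiB")
print(f"physical RAM:       {ram / 2**30:.2f} GiB")
print("fits in RAM" if needed <= ram else "exceeds RAM")
```

Under that assumption, the global maxconn could in principle ask for more buffer memory than the machine has, although this only matters if the connection count ever approaches the limit.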
# top -b -n1 -U haproxy
top - 14:28:28 up 3:51, 4 users, load average: 1.03, 1.07, 1.01
Tasks: 195 total, 2 running, 193 sleeping, 0 stopped, 0 zombie
Cpu0 : 16.5%us, 38.5%sy, 0.0%ni, 40.5%id, 0.0%wa, 0.0%hi, 4.4%si, 0.0%st
Cpu1 : 0.4%us, 1.4%sy, 0.0%ni, 94.4%id, 0.1%wa, 0.0%hi, 3.7%si, 0.0%st
Cpu2 : 0.5%us, 1.3%sy, 0.0%ni, 94.5%id, 0.1%wa, 0.0%hi, 3.7%si, 0.0%st
Cpu3 : 1.0%us, 1.5%sy, 0.0%ni, 91.9%id, 0.1%wa, 0.0%hi, 5.6%si, 0.0%st
Cpu4 : 0.3%us, 1.2%sy, 0.0%ni, 93.1%id, 0.0%wa, 0.0%hi, 5.4%si, 0.0%st
Cpu5 : 0.3%us, 0.2%sy, 0.0%ni, 95.7%id, 0.0%wa, 0.0%hi, 3.8%si, 0.0%st
Cpu6 : 0.3%us, 0.4%sy, 0.0%ni, 95.9%id, 0.0%wa, 0.0%hi, 3.4%si, 0.0%st
Cpu7 : 0.4%us, 0.3%sy, 0.0%ni, 94.2%id, 0.0%wa, 0.0%hi, 5.1%si, 0.0%st
Mem: 8192404k total, 3547392k used, 4645012k free, 52668k buffers
Swap: 3878904k total, 0k used, 3878904k free, 183604k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15611 haproxy 20 0 6848m 2.7g 700 R 99 34.0 97:57.06 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy/haproxy.pid
# strace -c -p $(pidof haproxy)
Process 15611 attached - interrupt to quit
^CProcess 15611 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 26.29    0.002494           0     48144           close
 26.26    0.002491           0     78178     51496 connect
 10.77    0.001022           0     38527      4026 sendto
 10.16    0.000964           0    103831           setsockopt
  8.15    0.000773           0     70599           fcntl
  7.32    0.000694           0     51337           socket
  5.22    0.000495           0     19263           accept
  3.11    0.000295           0     34871       676 recvfrom
  2.49    0.000236           0      9766           shutdown
  0.22    0.000021           0      4532           epoll_ctl
  0.00    0.000000           0       300           epoll_wait
------ ----------- ----------- --------- --------- ----------------
100.00    0.009485                459348     56198 total
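To make the strace summary easier to reason about, here is a small Python sketch that derives a few ratios from it. The rows are copied from the table above; the names STRACE_ROWS and summarize are illustrative only. Note that strace counts a non-blocking connect() returning EINPROGRESS as an error, so the connect error count does not by itself indicate failed connections.

```python
# (syscall, calls, errors) rows copied from the strace -c summary.
STRACE_ROWS = [
    ("close", 48144, 0),
    ("connect", 78178, 51496),
    ("sendto", 38527, 4026),
    ("setsockopt", 103831, 0),
    ("fcntl", 70599, 0),
    ("socket", 51337, 0),
    ("accept", 19263, 0),
    ("recvfrom", 34871, 676),
    ("shutdown", 9766, 0),
    ("epoll_ctl", 4532, 0),
    ("epoll_wait", 300, 0),
]


def summarize(rows):
    """Derive a few ratios from (syscall, calls, errors) rows."""
    calls = {name: c for name, c, _ in rows}
    errors = {name: e for name, _, e in rows}
    total_calls = sum(calls.values())
    return {
        "total_calls": total_calls,
        "total_errors": sum(errors.values()),
        # Share of connect() calls that returned an error (incl. EINPROGRESS).
        "connect_error_ratio": errors["connect"] / calls["connect"],
        # Outbound sockets created per accepted client connection.
        "sockets_per_accept": calls["socket"] / calls["accept"],
        # Average number of syscalls issued between two epoll_wait() calls.
        "calls_per_epoll_wait": total_calls / calls["epoll_wait"],
    }


if __name__ == "__main__":
    for key, value in summarize(STRACE_ROWS).items():
        print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```

The totals reproduce the last row of the table, and the ratios work out to roughly 2.7 socket() calls per accept() and about 1,500 syscalls per epoll_wait(). Both might be worth comparing against a healthy instance, since some of the outbound sockets will be health checks rather than proxied requests.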
Regards,
Anton