Hi list...
I've noticed that the HAProxy processes occasionally jump to 100% CPU load,
while the load before and after these peaks is only 3-5%, and the traffic
during the peaks is the same as outside them.
I saw an earlier thread about this (April/May), which concluded that there
was a bug that was fixed in 1.5-dev19. Since we were running dev18 and
also experiencing this issue, we upgraded to dev19.
However, on dev19 I'm still seeing these CPU load peaks a few times
per day.
As a precaution, we currently have nbproc set to 7 (these boxes have
8 cores).
I've been able to get some straces on the processes eating 100%, but
usually they drop back to 4% after I start the strace.
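Since the peak tends to disappear once strace attaches, a less intrusive way to see which process is spinning might be to sample per-process CPU time from /proc. Below is a minimal sketch, assuming Linux and that the HAProxy PIDs are passed on the command line; the helper names cpu_ticks and cpu_percent are hypothetical and not part of any HAProxy tooling.

```python
import os
import sys
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from the content of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the LAST ')' before indexing.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the process state (field 3 in proc(5)); utime and stime
    # are fields 14 and 15, which end up at indexes 11 and 12 here.
    return int(fields[11]) + int(fields[12])


def cpu_percent(pid, interval=1.0):
    """Sample one process twice; return CPU usage as a percentage of one core."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return 100.0 * (after - before) / ticks_per_second / interval


if __name__ == "__main__":
    # Usage (assumed): python cpu_sample.py <pid> [<pid> ...]
    for pid in sys.argv[1:]:
        print(pid, f"{cpu_percent(int(pid)):.1f}%")
```

Run in a loop, something like this could log which of the 7 processes is at 100% without attaching a tracer to it.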
I did see large numbers of consecutive epoll_wait calls in the processes
with 100% CPU load, and not in the other processes.
epoll_wait(0, {}, 200, 0) = 0
(repeated 10-15 times)
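To make runs like this easier to spot in a longer capture, here is a minimal sketch that scans saved strace output for consecutive epoll_wait calls that have a zero timeout and return no events. The function name, the regular expression, and the sample lines are my own assumptions, modeled only on the single line quoted above.

```python
import re

# Matches lines such as "epoll_wait(0, {}, 200, 0) = 0":
# the last argument (timeout) is 0 and the call returns 0 events.
# A leading PID or timestamp (as added by strace -f / -tt) is tolerated.
BUSY_POLL = re.compile(r"epoll_wait\(\d+,\s*\{.*\},\s*\d+,\s*0\)\s*=\s*0\s*$")


def longest_busy_poll_run(lines):
    """Return the longest run of consecutive zero-timeout, zero-event epoll_wait calls."""
    longest = current = 0
    for line in lines:
        if BUSY_POLL.search(line.strip()):
            current += 1
            longest = max(longest, current)
        else:
            # Any other syscall (or an epoll_wait that blocks or returns
            # events) ends the current run.
            current = 0
    return longest


if __name__ == "__main__":
    # Hypothetical sample: three busy polls, one normal wait, two busy polls.
    sample = (
        ["epoll_wait(0, {}, 200, 0) = 0"] * 3
        + ["epoll_wait(0, {{EPOLLIN, {u32=5, u64=5}}}, 200, 1000) = 1"]
        + ["epoll_wait(0, {}, 200, 0) = 0"] * 2
    )
    print(longest_busy_poll_run(sample))
```

A long run here only indicates that the process polled repeatedly without sleeping; it does not say why.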
HAProxy config (edited):
# Defaults Section
defaults
    mode http
    timeout connect 5000ms
    timeout client 500000ms
    timeout server 500000ms
    option splice-auto
    option forwardfor
    option log-health-checks

# Global Options
global
    daemon
    maxconn 50000
    log 192.168.99.10:514 local1 info
    stats socket /var/run/haproxy.sock uid 0 gid 0 mode 0600 level admin
    chroot /var/empty/haproxy
    user haproxy
    group haproxy
    nbproc 7
    node HOSTNAME
    spread-checks 5

listen stats <deleted>

frontend in-10
    bind IPIPIPIP:80 defer-accept
    bind IPIPIPIP:443 ssl crt /etc/haproxy/ssl/CERT.pem defer-accept ciphers RC4:HIGH:!aNULL:!MD5
    maxconn 100000
    default_backend backend-10
    log global
    mode http
    option httplog
    option dontlog-normal
    acl SITE-DEAD nbsrv(backend-10) lt 1
    redirect location http://we-are-down.site.tld code 303 if SITE-DEAD

backend backend-10
    balance roundrobin
    option http-server-close
    option httpchk GET /test HTTP/1.0\nHost: site.tld\nConnection: close\n\n
    cookie JSESSIONID prefix
    server server1 IPIPIPIP:80 check inter 20000 fall 5 downinter 30000 maxconn 2000 cookie 4 weight 1
    server server2 IPIPIPIP:80 check inter 20000 fall 5 downinter 30000 maxconn 2000 cookie 5 weight 1
    server server3 IPIPIPIP:80 check inter 20000 fall 5 downinter 30000 maxconn 2000 cookie 6 weight 1
    server server4 IPIPIPIP:80 check inter 20000 fall 5 downinter 30000 maxconn 2000 cookie 0 weight 1
    appsession JSESSIONID len 64 timeout 3h request-learn mode path-parameters
    option redispatch
    option persist
    contimeout 2000
    log global
--
Mark Janssen -- maniac(at)maniac.nl
Unix / Linux Open-Source and Internet Consultant
Maniac.nl Sig-IO.nl Vps.Stoned-IT.com