Hi, I've been trying to use haproxy 1.3.15.7 in front of a couple of
Erlang mochiweb servers in EC2.

The server alone can deal with about 3000 req/sec and
I can hit it directly with ab or siege or tsung and see a
similar result.

I then tried putting nginx in front of the system, and it was
able to reach about the same numbers, although it apparently
couldn't improve performance as much as I expected and
instead increased latency quite a lot.

I then went on to try haproxy, but when I use ab to
benchmark 100k connections at a concurrency of 1000,
haproxy jumps to 100% CPU usage after about 30k requests.
Looking at an strace of what's going on, there are
many EADDRNOTAVAIL errors, which I suppose means that
the local ports are exhausted, even though I increased the
available ports with sysctl.
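
A rough back-of-the-envelope sketch of the port arithmetic behind that
guess. It assumes that every proxied request opens a new connection to the
backend (which is what httpclose implies), that the closed connections sit
in TIME_WAIT on the haproxy side for 60 seconds (the usual Linux value),
and that the source address is always 127.0.0.1, so only the source port
can vary. The function name is made up for illustration.

```python
# Port budget for connections from one source IP to one backend address.
# Assumption: a closed connection holds its source port in TIME_WAIT
# for 60 seconds before the port can be reused for the same destination.

def port_budget(low, high, time_wait_s=60):
    """Return (usable_ports, sustainable_new_connections_per_second)."""
    ports = high - low + 1  # the range is inclusive on both ends
    return ports, ports / time_wait_s


if __name__ == "__main__":
    # Range taken from net.ipv4.ip_local_port_range in sysctl.conf.
    ports, rate = port_budget(25000, 65535)
    print(f"{ports} ports, roughly {rate:.0f} new connections/sec sustained")
    # Time until the range is used up at the backend's ~3000 req/sec.
    print(f"exhausted after about {ports / 3000:.1f} s at 3000 req/sec")
```

Under these assumptions the configured range cannot sustain anything close
to 3000 new backend connections per second for longer than a few seconds.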

haproxy configuration is the following:

global
    maxconn 25000
    user haproxy
    group haproxy

defaults
    log global
    mode    http
    option  dontlognull
    option httpclose
    option forceclose
    option forwardfor
    maxconn 25000
    timeout connect      5000
    timeout client       2000
    timeout server       10000
    timeout http-request 15000
    balance roundrobin

listen adserver
    bind :80
    server ad1 127.0.0.1:8000 check inter 10000 fall 50 rise 1

    stats enable
    stats uri /lb?stats
    stats realm Haproxy\ Stats
    stats auth admin:pass
    stats refresh 5s

Reading this list's archives, I think I have some of the symptoms described
in these mails:

http://www.formilux.org/archives/haproxy/0901/1670.html
Here connect() fails with EADDRNOTAVAIL, and haproxy therefore considers
the server down.

http://www.formilux.org/archives/haproxy/0901/1735.html
I think I'm seeing exactly the same issue here.

A small strace excerpt:

socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 18
fcntl64(18, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(18, SOL_TCP, TCP_NODELAY, [1], 4) = 0
connect(18, {sa_family=AF_INET, sin_port=htons(8000), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
close(18)

or

recv(357, 0x9c1acb8, 16384, MSG_NOSIGNAL) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(0, EPOLL_CTL_ADD, 357, {EPOLLIN, {u32=357, u64=357}}) = 0

The last one is mostly to show that I'm using epoll, in fact speculative
epoll, but even turning it off doesn't solve the issue.

An interesting detail is that if I use mode tcp instead of mode http this
doesn't happen, but since tcp mode doesn't forward the client IP address
(and I can't patch an EC2 kernel) I can't use it.
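
To judge how much of the trace is EADDRNOTAVAIL versus harmless EAGAIN,
the failed calls can be tallied per syscall and errno. This is a minimal
sketch, assuming the trace was saved to a file (for example with strace's
-o option) in the same line format as the excerpts above; the function
name and the regular expression are my own, not part of any tool.

```python
import re
import sys
from collections import Counter

# Matches failed syscalls in strace output, for example:
#   connect(18, {...}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
# An optional leading pid (as printed by strace -f) is skipped.
FAILED_CALL = re.compile(r"^(?:\d+\s+)?(\w+)\(.*\)\s+=\s+-1\s+([A-Z0-9]+)\b")


def count_failures(lines):
    """Count failed syscalls, keyed by (syscall name, errno name)."""
    counts = Counter()
    for line in lines:
        match = FAILED_CALL.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_failures.py haproxy.trace
    with open(sys.argv[1]) as trace:
        for (call, errno_name), n in count_failures(trace).most_common():
            print(f"{n:8d}  {call}  {errno_name}")
```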

The ulimit-n shown by haproxy is 50k sockets, well above maxconn and well
above the 30k where it breaks.

sysctl.conf has the following settings:

# the following stops low-level messages on console
kernel.printk = 4 4 1 7
fs.inotify.max_user_watches = 524288
# some spoof protection
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.all.rp_filter=1
# General gigabit tuning:
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
net.ipv4.tcp_rmem = 4096 16384 33554432
net.ipv4.tcp_wmem = 4096 16384 33554432
net.ipv4.tcp_mem = 786432 1048576 26777216
net.ipv4.tcp_max_tw_buckets = 360000
net.core.netdev_max_backlog = 2500
vm.min_free_kbytes = 65536
vm.swappiness = 0
net.ipv4.ip_local_port_range = 25000 65535
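
One thing that may be worth ruling out is that the running kernel never
picked up the port range above, since values in sysctl.conf only take
effect once they are loaded (for example with sysctl -p or at boot). The
following is a small sketch that compares the configured value with the
live one in /proc; the function names are hypothetical and the default
file paths are assumptions about a typical Linux layout.

```python
import os

KEY = "net.ipv4.ip_local_port_range"
PROC_PATH = "/proc/sys/net/ipv4/ip_local_port_range"


def parse_port_range(text):
    """Parse 'LOW HIGH' (space or tab separated) into a tuple of ints."""
    low, high = text.split()
    return int(low), int(high)


def configured_port_range(conf_text):
    """Return the port range set in sysctl.conf text, or None if absent."""
    found = None
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == KEY:
            found = parse_port_range(value)  # the last occurrence wins
    return found


if __name__ == "__main__" and os.path.exists(PROC_PATH):
    with open(PROC_PATH) as f:
        live = parse_port_range(f.read())
    wanted = None
    if os.path.exists("/etc/sysctl.conf"):
        with open("/etc/sysctl.conf") as f:
            wanted = configured_port_range(f.read())
    print("live:", live, "configured:", wanted)
    if wanted is not None and wanted != live:
        print("the running kernel is not using the configured range")
```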

Everything runs on Ubuntu 8.04 with kernel 2.6.21.7. Is there anything that
I'm getting spectacularly wrong? Do you need more strace output?

--
Valentino Volonghi aka Dialtone
Now running MacOS X 10.5
Home Page: http://www.twisted.it
http://www.adroll.com

