On Wed, Jan 29, 2014 at 03:30:31PM +0100, Jens-Christian Fischer wrote:
> This is a rather complicated network setup. The VMs have an internal
> network (10.x) and the public IP addresses are on a physical host that
> NATs them to the physical host where the VM is running.
>
> pinging from the loadbalancer to one of the web servers isn't that much
> better though?
>
> root@box-lb1:~# ping 10.0.20.48
> PING 10.0.20.48 (10.0.20.48) 56(84) bytes of data.
> 64 bytes from 10.0.20.48: icmp_seq=1 ttl=64 time=1.15 ms
> 64 bytes from 10.0.20.48: icmp_seq=2 ttl=64 time=0.775 ms
Indeed.

> > Huh! Could you check the application directly first to confirm that the
> > 54 ms are the expected response time ? The SSL handshake adds some round
> > trips, so given your network latency, it can explain why it adds 10 ms
> > (though that would mean 10 round trips) over HTTP.
>
> the app indeed takes that long
>
> (now on one of the web servers)
> root@box-web3:~# ab -c 1 -n 200 "http://127.0.0.1/status.php"
> Time per request: 50.340 [ms] (mean)

OK.

> Now testing a "hello world" PHP script
>
> root@box-web3:~# ab -c 1 -n 200 "http://127.0.0.1/foo.php"
> Time per request: 1.148 [ms] (mean)
>
> and now a 10-byte HTML file
> root@box-web3:~# ab -c 1 -n 200 "http://127.0.0.1/foo.html"
> Time per request: 0.227 [ms] (mean)

Fine, this one can be used as a reference for further testing.

> > That's indeed abnormally slow. And I can't imagine the latency with these
> > numbers! Could you please run ab from the haproxy machine to the PHP
> > server to get a reference for this direction ? It will also tell you what
> > part of the response time the network represents.
>
> sure:
>
> root@box-lb1:~# ab -c 8 -n 500 "http://10.0.20.48/foo.php"
> Time per request: 4.874 [ms] (mean)

So the network has basically quadrupled the load time here.

> root@box-lb1:~# ab -c 8 -n 500 "http://10.0.20.48/status.php"
> Time per request: 58.299 [ms] (mean)
>
> root@box-lb1:~# ab -c 8 -n 500 "http://10.0.20.48/foo.html"
> Time per request: 2.690 [ms] (mean)

Overall we could say that you have about 2 ms of RTT in HTTP if we use
foo.html as a reference. That's equivalent to a 200 km WAN link. The positive
point is that you would not lose anything more by making it geographically
redundant :-)

> > Also, you can try ab -k in a second test (keep-alive), which will save
> > you the connection setup times. It saves two round trips per request, one
> > of which appears in the measure. On a sane local server, you should
> > almost not notice the difference. Here I already expect a difference.
>
> again, from a physical host:
>
> ab -c 8 -n 500 "https://example.com/foo.php"
> Time per request: 36.914 [ms] (mean)

I seem to remember that ab renegotiates SSL for every connection. I may be
wrong, but that's what I had in mind, so possibly you're benchmarking the
worst case (a new client for each connection), which is always important to
determine anyway.

> ab -kc 8 -n 500 "https://example.com/foo.php"
> Time per request: 36.833 [ms] (mean)

It seems strange that these numbers are the same, because once the connection
is established, it should remain fast. Did ab report that it used keep-alive
(it prints a "Keep-Alive requests" count in its summary) ? It's also possible
that for some reason the connection was closed after each response.

> Note: I have changed the cipher setup in haproxy since the last round of
> tests:

OK, no problem.

> --- cut ---
> frontend ssl # @zabbix_frontend(box-lb1)
>     bind 0.0.0.0:443 ssl crt /etc/haproxy/example.com.crt.pem ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AES:RSA+3DES:!ADH:!AECDH:!MD5:!DSS
> --- cut ---

(...)

> >> Are these numbers something that is expected? Should HAProxy be able to
> >> terminate more than 200 SSL requests per second?
> >
> > Clearly not, your numbers should be at least 10-100 times that. That
> > smells like the usual VM environment running on an overbooked system
> > capable of running 8 VMs and sold to run 200.
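By the way, if you want to see how much of those 200 conns/s is spent on raw
crypto and how much is lost elsewhere, a quick check run from inside the LB
VM could look like this (just a sketch, assuming an RSA-2048 key; adjust the
address and port to your bind line):

   # raw RSA-2048 signing speed of one vCPU, no network involved
   openssl speed rsa2048

   # full TLS handshakes per second against the local haproxy frontend
   openssl s_time -connect 127.0.0.1:443 -new -time 10

   # same with session resumption, closer to what returning clients get
   openssl s_time -connect 127.0.0.1:443 -reuse -time 10

If openssl can do several times more signatures per second than haproxy
terminates connections, the time is being lost outside the crypto work
(scheduling between vCPUs, network round trips), not in the cipher itself.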
> this is infrastructure we are building and providing ourselves: The load
> balancing VM is on a host that has 9 VMs running (33 virtual CPUs) on a
> system with 24 real cores.

OK, if you're building it yourself, at least you won't be a victim of your
provider's choices nor of noisy neighbours. That said, as you can see, making
33 virtual CPUs out of 24 real ones necessarily means you add a lot of
latency, because you need to schedule between them and you can't offer less
than a timeslice (generally 1 ms, but it could even be 10).

> This is a top from the physical server while running a test against the
> loadbalancer:
>
> top - 15:22:48 up 37 days, 3:19, 1 user, load average: 0.89, 0.79, 0.81
> Tasks: 396 total, 1 running, 395 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 5.0 us, 0.8 sy, 0.0 ni, 93.7 id, 0.1 wa, 0.0 hi, 0.3 si, 0.0 st
> MiB Mem:  128915 total, 128405 used,   509 free,     44 buffers
> MiB Swap:  65519 total,   5079 used, 60440 free, 113904 cached
>
>   PID USER     PR NI  VIRT  RES SHR S %CPU %MEM    TIME+ COMMAND
> 29297 libvirt- 20  0 22.7g 886m 13m S  137  0.7 52:27.96 qemu-system-x86

OK, the load is not that high.

> The VM of the load balancer is currently configured with 8 vCPUs and 16 GB
> of RAM.

You can safely remove 6 CPUs that you will probably never use. If you could
dedicate 2 real cores instead (one for the interrupts, one for haproxy), it
would be much more efficient because there would only be full timeslices
(a rough sketch of what that pinning could look like is in the P.S. below).

> We are using Ceph (http://ceph.com) as a distributed storage system that
> contains the volumes the VMs boot from. I am not 100% sure that some
> slowdowns aren't coming from that (as all disk access from a VM basically
> reaches out over the network to a bunch of servers that hold parts of the
> disk), but I assume that HAProxy is not using much disk IO.

HAProxy does not use *any* disk I/O once started. It does not even have this
possibility. It will only use the network, CPU and memory.

> > (...)
> >
> > Your configuration is fine and clean. You should definitely get much
> > better numbers. Recently one guy posted such ugly results on the list
> > when running in a virtualized environment, and moving to a sane place
> > got rid of all his trouble.
>
> Will keep that in mind :)
>
> I also checked that the VM has access to the AES-NI instruction set of the
> CPU: It does.

OK, but anyway at such low loads it doesn't matter at all. The benefit of
AES-NI is for large transfers. For example, you can easily use it to reach
multi-gigabit rates on streaming data, videos, etc.

> > The only thing I like with these dishonest people selling overloaded
> > VMs is that they make us spread load balancers where none should be
> > needed if what they sold matched what they advertise!
>
> Ah - we are not selling this - does that make us less dishonest? :)

:-)

You can tune whatever you want if you have control over everything, which is
good. Most cloud users do not have this chance and only have the option to
leave and try another provider which is less overloaded. But with today's
dedicated hosting prices, it makes no sense to be hosted outside on VMs;
it's only asking for trouble.

Cheers,
Willy
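P.S.: if you want to try the two-dedicated-cores setup, here is a rough
sketch of what the pinning could look like (untested here; the domain name,
core and vCPU numbers are placeholders to adapt to your host, and cpu-map
needs a recent 1.5-dev):

   # on the physical host: pin the LB VM's two vCPUs to two dedicated cores
   virsh vcpupin <lb-domain> 0 2
   virsh vcpupin <lb-domain> 1 3

   # in haproxy's global section: a single process bound to vCPU 1
   global
       nbproc 1
       cpu-map 1 1

Then steer the NIC interrupts to vCPU 0 inside the guest (via
/proc/irq/*/smp_affinity) so that haproxy and the interrupts each get full
timeslices on their own physical core.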

