I currently have Apache acting as a proxy/load balancer in front of several 
backend application servers. It works pretty well, but as our traffic is 
ramping up I feel we need a more intelligent proxy solution. So I've come 
across haproxy as what seems to be the standard for extremely high traffic load 
balancing. (Note: this is all set up on Rackspace cloud servers.) So far I've 
been able to set up all the functionality I need: sticky sessions via a 
cookie, and an ACL that routes certain requests to a different set of 
application servers.

The only problem I'm having is that the round-trip time for a request has gone 
up dramatically. With apache or nginx, over both HTTP and HTTPS connections, 
the average RTT per request is around 50ms (which I consider good). With 
haproxy I'm seeing an average HTTP RTT of a bit over 100ms, and adding 
stunnel4 in front to terminate the HTTPS connections adds another ~50ms, 
bringing the total upwards of 175-225ms per request.
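
For reference, I'm measuring with something along these lines, where 
www.example.com stands in for my real hostname:

curl -o /dev/null -s -w 'connect: %{time_connect}s  first-byte: %{time_starttransfer}s  total: %{time_total}s\n' http://www.example.com/
curl -o /dev/null -sk -w 'connect: %{time_connect}s  tls: %{time_appconnect}s  first-byte: %{time_starttransfer}s  total: %{time_total}s\n' https://www.example.com/

time_connect is the TCP handshake, time_appconnect includes the SSL handshake, 
and the gap from there to first byte is what the proxy chain plus backend add.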

This is strange, because when I prototyped this network layout at home on real 
hardware (i.e. no virtual machines), haproxy had a higher minimum RTT than 
nginx, but it did not add anywhere near 50ms, even with stunnel both in front 
of and behind haproxy (to keep traffic to the backend servers fully 
encrypted), and there was no significant increase under very high load either. 
Granted, that was all on a LAN, but when I see +50ms added to my requests in 
the cloud just for using haproxy instead of apache/nginx, that seems a little 
out of the ordinary. Another +50ms for terminating an HTTPS request with 
stunnel and routing it to a local static page? That's crazy.

My theory was that in a virtualized environment, routing a request through the 
localhost interface before sending it out via the internal LAN IP (which is 
how haproxy/stunnel pass on their connections) might hit a virtualization 
buffer/queue that pushes it out to the host OS network stack before it gets 
routed back as a localhost request. That could explain the latency added at 
every localhost-based hop. But when I ping localhost on a cloud server I get a 
0.000ms RTT, and the same for my eth0 and eth1 interfaces. That is how it 
should be: no added network latency, and in fact no measurable latency at all.
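
To chase this further, I can time each hop separately from the load balancer 
box itself (the backend IP below is a placeholder for my actual setup):

curl -o /dev/null -s -w '%{time_total}\n' http://10.x.x.x:80/       # backend direct
curl -o /dev/null -s -w '%{time_total}\n' http://127.0.0.1:80/      # through haproxy
curl -o /dev/null -sk -w '%{time_total}\n' https://127.0.0.1:443/   # through stunnel -> haproxy

If the theory is right, each extra localhost-based hop should show up as a 
distinct jump between those numbers.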

If a proxy configuration that accepts SSL at the proxy and then opens another 
SSL tunnel to a backend server adds a 250-300ms latency penalty right off the 
top, that is just not acceptable. My site will feel sluggish and unresponsive. 
(And I will need this end-to-end encryption to meet PCI compliance.)
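
To be concrete, that fully encrypted chain means haproxy's server lines would 
point at local stunnel client-mode tunnels, roughly like this (service name, 
local port and backend IP are placeholders):

; stunnel client-mode leg: haproxy connects to 127.0.0.1:8443 in the clear,
; stunnel re-encrypts and forwards to the backend's stunnel listener on 443
[to-backend]
client  = yes
accept  = 127.0.0.1:8443
connect = 10.x.x.x:443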

I've tried every option related to keepalive, and no keepalive at all... (the 
combinations I cycled through are sketched below).
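
These are the HAProxy 1.4 connection-mode options in question (they go in the 
defaults section, with only one close mode active at a time):

        option httpclose                 # add "Connection: close" in both directions
        option http-server-close         # keep-alive on the client side, close the server side
        option http-pretend-keepalive    # with http-server-close: don't announce the close to the server

...plus simply removing all of them for default keep-alive behavior.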

This is my current haproxy.cfg

global
        log 127.0.0.1   local0
        maxconn 4096
        user haproxy
        group haproxy
        daemon
# for hatop
        stats socket /var/run/haproxy.sock mode 0600 level admin

defaults
        log     global
        mode   http
        option  httplog
        option  logasap
        option  redispatch
        retries 3
        maxconn 2000
        timeout connect 10s
        timeout client  60s
        timeout server  60s

        option  http-server-close
        option http-pretend-keepalive
        option forwardfor
        cookie BALANCEID

frontend http_proxy
        bind 0.0.0.0:80
        default_backend cluster1
        acl is_gfish path_beg /gfish_url
        use_backend cluster2 if is_gfish

backend cluster1
        server swww1 :80 cookie balancer.www1
        server swww2 :80 cookie balancer.www2

backend cluster2
        # replace the HTTP version in the request line (from 1.1 to 1.0)
        # because glassfish will give a 0-byte response otherwise
        reqrep ^(.*)\ HTTP/[^\ ]+$ \1\ HTTP/1.0
        server gfish1 :8080 cookie balancer.www1
        server gfish2 :8080 cookie balancer.www2
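
The stunnel4 front end in front of haproxy is set up roughly like this (the 
cert path is a placeholder):

cert = /etc/stunnel/mysite.pem
[https]
accept  = 443
connect = 127.0.0.1:80
; stunnel also accepts low-level socket options, e.g.
; socket = l:TCP_NODELAY=1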


I was also wondering whether the reqrep header mangling adds latency, but I 
don't know how much, and I can't test from the backend side since I don't 
manage the glassfish servers. Then again, stunnel+haproxy is adding latency to 
all requests, not just the glassfish ones, so that can't be the whole story. 
Any ideas for debugging this issue would be greatly appreciated. (One 
experiment I do plan to try is sketched below.)
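
To isolate the reqrep cost without touching the backends, the experiment would 
be to apply the same rewrite to cluster1 (which I do control) and compare 
timings:

backend cluster1
        # same rewrite as cluster2, purely to see if the rewrite itself adds latency
        reqrep ^(.*)\ HTTP/[^\ ]+$ \1\ HTTP/1.0
        server swww1 :80 cookie balancer.www1
        server swww2 :80 cookie balancer.www2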
