Hi,
We have been using haproxy for a few months now and the benefits have
been immense. This list in particular is an indispensable resource.
We use haproxy 1.4.8 in the cloud to consistently distribute requests
among the squids.
We run N proxies in front of M squids in different availability zones
with the same configuration.
It also shields the clients from the volatile nature of Amazon instances
behind the proxies, as the proxies instantly redispatch requests when
squids go down.
By doing this we, of course, lose a portion of the cache, but that is
acceptable when only 1 or 2 squids are out.
This brings me to the biggest challenge we currently have: a cold or
mostly cold cache.
There is a drastic difference in the performance characteristics of the
system when caches are cold and when they are hot.
When we are hot we serve about 6,000-10,000 rps, queues to the backends
are zero, and the number of concurrent connections to the backends is
near zero.
This boils down to a range of 600-1,000 rps per haproxy instance for 10
instances.
With cold caches, or when the distribution is thrown off, latencies
shoot up by 2-3 orders of magnitude and the number of concurrent
connections to the squids goes up to hundreds. This leads to a flood of
client retries (being fixed now), often maxing out the number of
sockets, which leads haproxy to believe that the squids are not
reachable and to mark them down (flip/flop). The resulting redispatches
make the picture even worse.
The limiting factor here is latency and the number of concurrent
persistent connections that can be established from the squids back to
the database, which I believe to be around 500.
Naturally, that is the reason we have caches here in the first place.
This problem will require some time to address and is being actively
worked on.
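As a side note on sizing: if the ~500-connection squid-to-database limit
is the binding constraint, it can be turned into a rough per-proxy
budget. This is back-of-the-envelope only, assuming the 500 figure and
the 10 proxies above; the hostname is a placeholder:

```
# ~500 concurrent squid->database connections per squid is the ceiling.
# With N = 10 identical haproxy instances sharing each squid, an upper
# bound per instance is roughly 500 / 10 = 50, so a per-server maxconn
# somewhere below 50 should keep one squid from being pushed past its
# database limit even when all proxies are busy at once.
server squid-1 squid-1.example.internal:8080 check maxconn 50
```

Cache hits never reach the database, so this bound is conservative.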
While there are some fundamental problems here to work on, I was
wondering if I could quickly tweak haproxy's configuration to gracefully
support both modes of operation in the short term, since it is currently
the only place in the chain where powerful scripting can be done.
The objectives are:
1) allow maximum possible throughput when caches are hot
2) When caches are cold, sustain a level of throughput that allows the
caches to warm up without melting the system down.
3) detect a slowdown by checking one or more of:
- avg_queue size
- queue size
- the number of concurrent connections going up
4) Quickly reject requests that come in beyond a predetermined
cold-cache capacity. If possible, do it at the individual server level
rather than at the backend level (for cases when only some caches are
cold).
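For objective 3, the detection criteria can be expressed with ACL
fetches that 1.4 provides. A minimal sketch for the frontend; the
thresholds are purely speculative placeholders, not measured values:

```
# In frontend http-in: shunt to the overload backend when any
# slowdown signal trips. Thresholds here are made up.
acl q_avg_high  avg_queue(servers) gt 0
acl q_abs_high  queue(servers)     gt 5
acl conns_high  be_conn(servers)   gt 200
use_backend overload if q_avg_high or q_abs_high or conns_high
```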
One of the issues here is that if I specify maxconn for an individual
server, the connection is not rejected but goes into a queue. If I limit
the queue size, then when the timeout expires the request is
redispatched to another server. I want redispatches only when a squid is
down.
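The closest approximation to a per-server reject that I can think of in
1.4 is watching individual servers with srv_conn and shunting to the
overload backend while any one of them is saturated. This is only a
sketch (the threshold is made up, and the server name mirrors the
placeholder below), and it still rejects at the farm level, just
triggered by one server's state:

```
# Reject new work while any individual squid is saturated.
# srv_conn(<backend>/<server>) returns the number of currently
# established connections to that one server.
acl squid_hot srv_conn(servers/ec2-XXXX) gt 40
use_backend overload if squid_hot
```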
Below is a version of the config under construction, somewhat
simplified. I will work out the exact numbers later; right now the
server maxconn, slowstart timeout and queue threshold are pure
speculation.
I would appreciate any help, as I am trying to wrap my brain around a
lot of variables and available tuning knobs here.
--
Dmitri Smirnov
# This is a CE haproxy test config boilerplate
global
daemon
stats socket /apps/haproxy/var/stats level admin
maxconn 10000
defaults
mode http
balance uri
hash-type consistent
# local0 needs to be configured at /etc/syslog.conf
log /dev/log local0
option httplog
# Maximum number of concurrent connections on the frontend
# set to be the half of the total max in the global section above
maxconn 5000
# timeout client is the max time of client inactivity
# when the client is expected to ack or send data
# we do not want to tie up resources for a long time
timeout client 100ms
# This is a max time to wait for connection to a server to succeed
timeout connect 200ms
# This is the maximum time to wait in the queue at the backend.
# By default it is the same as timeout connect, but we set it explicitly.
# Below we do not allow the queue to grow beyond 1, as a longer queue
# indicates that the servers are slow and overloaded.
timeout queue 200ms
# Maximum inactivity timeout for the server to ack or send data.
# In other words, in situations of meltdown we are not going to wait for
# slow data to come back (not what is currently in prod),
# but this will still hopefully allow squid to refill.
# The max time is usually less than a second.
timeout server 1000ms
frontend http-in
bind *:8080
default_backend servers
# Problem: if one squid is cold, this rejects requests for the whole farm
acl q_too_long avg_queue(servers) gt 0
use_backend overload if q_too_long
backend overload
# HAproxy will issue a 503 because no servers are available for this backend
# Here we customize the response
errorfile 503 /apps/haproxy/etc/fe_503.http
backend servers
stats enable
stats uri /haproxy?status
stats refresh 5s
stats show-legends
stats show-node
option forceclose
option forwardfor
# Redispatch if the destination server is down. This option will also
# redispatch if the queue timeout expires; however, we do not want
# to redispatch in that case.
option redispatch
retries 1
# Dynamically generated section follows.
# Example
server ec2-XXXX ec2-XXXX.compute-1.amazonaws.com:8080 check inter 1000 rise 5 fall 3 maxconn 20 slowstart 30s