Re: Backend per-server rate limiting

Willy Tarreau Tue, 07 Aug 2012 23:11:37 -0700

Hi Andrew,

On Tue, Aug 07, 2012 at 11:44:53PM -0600, Andrew Davidoff wrote:
> Hi,
> 
> I'm trying to determine if haproxy can be configured to solve a rate
> limiting based problem I have. I believe that it can, but that I am not
> seeing how to put the configuration together to get it done. Here's what
> I'm trying to do:
> 
> I have a set of servers (backends) that can each handle a specific number
> of requests per second (the same rate for each backend). I'd like haproxy
> to accept requests and farm them out to these backends so that each request
> is sent to the first backend that isn't over its rate limit. If all
> backends are over their rate limits, ideally the client connection would
> just block and wait, but if haproxy has to return a rejection, I think I
> can deal with this.
> 
> My first thought was to use frontend's rate-limit sessions, setting it to
> n*rate-limit where n is the number of backends I have to serve these
> requests. Additionally, those backends would be balanced round-robin.
> 
> The problem with this is that if a backend falls out, the front end rate
> limit is then too high since there are fewer backends available than there
> were when it was originally configured. The only way I see that I could
> dynamically change the frontend rate-limit as backends rise and fall is to
> write something that watches the logs for rise/fall messages and uses the
> global rate limit setting via the haproxy socket. This might work, but the
> biggest drawback is that one instance of haproxy could only handle requests
> of a single rate limit, since modifications after starting would have to be
> global (not per frontend).
> 
> I guess in other words, I am trying to apply rate limits to individual
> backend servers, and to have a front end cycle through all available
> backend servers until it either finds one that can handle the request, or
> exhausts them all, at which time it'd ideally just block and keep trying,
> or less ideally send some sort of failure/rejection to the client.
> 
> I feel like there's a simple solution here that I'm not seeing. Any help is
> appreciated.


What you're asking for is in the 1.6 roadmap and the road will be long before
we reach this point.

Maybe in the meantime we could develop a new LB algorithm which considers
each server's request rate, and forwards the traffic to the least used one.
In parallel, having an ACL which computes the average per-server request
rate would allow requests to be rejected when there's a risk of overloading
the servers. But that doesn't seem trivial and I have doubts about its real
usefulness.

What is needed is to convert a rate into a concurrency in order to queue
excess requests. What you can do at the moment, if you don't have too many
servers, is to have one proxy per server, each with its own rate limit. This
way you will be able to smooth the load across all servers in the first
stage, and even reject requests when the load is too high. The health checks
must target the real servers though (hence the "track" below), otherwise
they would cause flapping whenever the second-level proxies are saturated.
This would basically look like this:

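   # First stage: spread the load across the per-server proxies below.
   listen front
      bind :80
      balance leastconn
      # "track" follows the health of the real server checked in backN
      server srv1 127.0.0.1:8000 maxconn 100 track back1/srv
      server srv2 127.0.0.2:8000 maxconn 100 track back2/srv
      server srv3 127.0.0.3:8000 maxconn 100 track back3/srv

   # Second stage: one proxy per real server, each with its own rate limit.
   listen back1
      bind 127.0.0.1:8000
      rate-limit sessions 10
      server srv 192.168.0.1:80 check

   listen back2
      bind 127.0.0.2:8000
      rate-limit sessions 10
      server srv 192.168.0.2:80 check

   listen back3
      bind 127.0.0.3:8000
      rate-limit sessions 10
      server srv 192.168.0.3:80 check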

Then you have to play with the maxconn, maxqueue and timeout queue in
order to evict requests that are queued for too long a time, but you
get the idea.
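
As a rough sketch of that tuning, the first stage could carry the queue
settings as follows. The values are arbitrary placeholders and not
recommendations; they have to be adjusted to the real traffic:

   listen front
      bind :80
      balance leastconn
      # return an error once a request has waited this long in a queue
      timeout queue 5s
      # maxconn : concurrent connections sent to this second-level proxy
      # maxqueue: requests allowed to wait for this server before being
      #           redispatched to another one
      server srv1 127.0.0.1:8000 maxconn 100 maxqueue 50 track back1/srv
      server srv2 127.0.0.2:8000 maxconn 100 maxqueue 50 track back2/srv
      server srv3 127.0.0.3:8000 maxconn 100 maxqueue 50 track back3/srv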

Could I know what use case makes your servers sensitive to the request rate?
This is something totally abnormal, since a request rate should necessarily
translate into a number of concurrent connections somewhere in the server.
If the server responds quickly, there should be no reason it cannot accept
high request rates. It's important to understand the complete model in order
to build a rock-solid configuration that will not just be a workaround for a
symptom.
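
To illustrate the rate-to-concurrency translation with made-up figures:
assuming a server should see about 10 requests per second and answers in
about 200 ms on average, the equivalent concurrency is roughly
10 x 0.2 = 2. Under those assumptions, a per-server maxconn would make
excess requests wait in the queue instead of being rejected:

   listen back1
      bind 127.0.0.1:8000
      # ~10 req/s * ~0.2 s per request = ~2 concurrent requests
      server srv 192.168.0.1:80 check maxconn 2

Note that this only caps the concurrency: if the server answers faster
than assumed, it will receive proportionally more requests per second.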

Regards,
Willy
