Re: Backend per-server rate limiting

David Birdsong Wed, 08 Aug 2012 00:23:47 -0700

On Tue, Aug 7, 2012 at 11:51 PM, Andrew Davidoff <[email protected]> wrote:
> Willy,
>
> Thanks for the quick response. I haven't fully digested your example
> suggestion yet but I will sit down with it and the haproxy configuration
> documentation and sort it out in my brain.
>
> Here's the basic idea of the use case. Let me go ahead and state that maybe
> haproxy just isn't the right solution here. There are many ways to solve
> this; it just seemed to me like haproxy might have been a magic answer.
>
> We make a bunch of requests to an API that rate limits based on source IP.
> To maximize our overall request rate, we utilize proxies to afford us more
> source IPs. Even if those proxies can handle a ton of work themselves, if we
> push them, individually, over the API's rate limits, they can be temporarily
> or permanently disallowed from accessing the API.


You could also consider solving this socially instead. Have you tried
reaching out to this service to ask for a higher rate limit? Your work
would be wasted if they improved their scraper detection beyond source
IP addresses.

What service is it?

>
> Right now our API clients (scripts) handle rate limiting themselves. The way
> they currently do this involves knowledge of the per-source-IP rate limit
> for the API they're talking to, and how many proxies live behind a squid
> instance that all their requests go through. That squid instance hands out
> proxies round-robin, which is what makes the request rate work.
>
> Based on how the scripts currently handle the rate limiting, we start
> running into problems if we want multiple scripts accessing the same API to
> run at the same time. Basically, each running script must then know about
> any other scripts that are running and talking to the same API, so it can
> adjust its request rate accordingly, and anything already running needs to
> be informed that more scripts accessing the same API have started up, so it
> can do the same.
>
> Additionally, we run into the problem of proxies failing. If a proxy fails
> and the scripts don't learn of it and adjust their rate limits, then the
> per-proxy request rate inadvertently increases across the remaining proxies.
>
> So, again, there are many ways to solve this and maybe haproxy just isn't
> the answer, but I thought maybe it would be. At the moment I'm very much in
> "don't reinvent the wheel" mode, and I thought maybe haproxy had solved
> this.
>
> Thanks again for your help.
> Andy
>
>
> On Wed, Aug 8, 2012 at 12:11 AM, Willy Tarreau <[email protected]> wrote:
>>
>> Hi Andrew,
>>
>> On Tue, Aug 07, 2012 at 11:44:53PM -0600, Andrew Davidoff wrote:
>> > Hi,
>> >
>> > I'm trying to determine if haproxy can be configured to solve a rate
>> > limiting based problem I have. I believe that it can, but that I am not
>> > seeing how to put the configuration together to get it done. Here's what
>> > I'm trying to do:
>> >
>> > I have a set of servers (backends) that can each handle a specific number
>> > of requests per second (the same rate for each backend). I'd like haproxy
>> > to accept requests and farm them out to these backends so that each
>> > request is sent to the first backend that isn't over its rate limit. If
>> > all backends are over their rate limits, ideally the client connection
>> > would just block and wait, but if haproxy has to return a rejection, I
>> > think I can deal with this.
>> >
>> > My first thought was to use frontend's rate-limit sessions, setting it to
>> > n*rate-limit where n is the number of backends I have to serve these
>> > requests. Additionally, those backends would be balanced round-robin.
>> >
>> > The problem with this is that if a backend falls out, the front end rate
>> > limit is then too high since there are fewer backends available than
>> > there were when it was originally configured. The only way I see that I
>> > could dynamically change the frontend rate-limit as backends rise and
>> > fall is to write something that watches the logs for rise/fall messages
>> > and uses the global rate limit setting via the haproxy socket. This might
>> > work, but the biggest drawback is that one instance of haproxy could only
>> > handle requests of a single rate limit, since modifications after
>> > starting would have to be global (not per frontend).
>> >
>> > I guess in other words, I am trying to apply rate limits to individual
>> > backend servers, and to have a front end cycle through all available
>> > backend servers until it either finds one that can handle the request, or
>> > exhausts them all, at which time it'd ideally just block and keep trying,
>> > or less ideally send some sort of failure/rejection to the client.
>> >
>> > I feel like there's a simple solution here that I'm not seeing. Any help
>> > is appreciated.
>>
>> What you're asking for is in the 1.6 roadmap and the road will be long
>> before we reach this point.
>>
>> Maybe in the meantime we could develop a new LB algorithm which considers
>> each server's request rate, and forwards the traffic to the least used one.
>> In parallel, having an ACL which computes the average per-server request
>> rate would allow requests to be rejected when there's a risk of overloading
>> the servers. But that doesn't seem trivial and I have doubts about its real
>> usefulness.
>>
>> What is needed is to convert a rate into a concurrency in order to queue
>> excess requests. What you can do at the moment, if you don't have too many
>> servers, is to have one proxy per server with its own rate limit. This way
>> you will be able to smooth the load in the first stage between all servers,
>> and even reject requests when the load is too high. You have to check the
>> real servers though, otherwise the health-checks would cause flapping when
>> the second level proxies are saturated. This would basically look like
>> this:
>>
>>    listen front
>>       bind :80
>>       balance leastconn
>>       server srv1 127.0.0.1:8000 maxconn 100 track back1/srv
>>       server srv2 127.0.0.2:8000 maxconn 100 track back2/srv
>>       server srv3 127.0.0.3:8000 maxconn 100 track back3/srv
>>
>>    listen back1
>>       bind 127.0.0.1:8000
>>       rate-limit sessions 10
>>       server srv 192.168.0.1:80 check
>>
>>    listen back2
>>       bind 127.0.0.2:8000
>>       rate-limit sessions 10
>>       server srv 192.168.0.2:80 check
>>
>>    listen back3
>>       bind 127.0.0.3:8000
>>       rate-limit sessions 10
>>       server srv 192.168.0.3:80 check
>>
>> Then you have to play with the maxconn, maxqueue and timeout queue in
>> order to evict requests that are queued for too long a time, but you
>> get the idea.
>>
>> Could I know what use case makes your servers sensitive to the request
>> rate? This is something totally abnormal since it should necessarily
>> translate into a concurrent number of connections at any place in the
>> server. If the server responds quickly, there should be no reason it cannot
>> accept high request rates. It's important to understand the complete model
>> in order to build a rock-solid configuration that will not just be a
>> workaround for a symptom.
>>
>> Regards,
>> Willy
>>
>
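
For reference, here is a rough sketch of what Willy's closing remark about
maxconn, maxqueue and timeout queue might look like when added to his
two-level example. The values 50 and 5s are placeholders I made up for
illustration, not tuned numbers, and the sketch assumes a defaults section
elsewhere that sets the mode and the other timeouts. Only one second-level
proxy is shown; back2 and back3 would follow the same pattern.

```haproxy
# Sketch only: 50 and 5s are illustrative assumptions, not recommendations.
listen front
   bind :80
   balance leastconn
   # A request that waits in a queue longer than this gets an error back
   # instead of waiting forever.
   timeout queue 5s
   # maxconn caps concurrent connections sent to each second-level proxy.
   # maxqueue caps how many requests may wait for that particular server;
   # beyond it, haproxy tries to redispatch the request to another server.
   server srv1 127.0.0.1:8000 maxconn 100 maxqueue 50 track back1/srv
   server srv2 127.0.0.2:8000 maxconn 100 maxqueue 50 track back2/srv
   server srv3 127.0.0.3:8000 maxconn 100 maxqueue 50 track back3/srv

listen back1
   bind 127.0.0.1:8000
   # Per-server limit on new sessions per second.
   rate-limit sessions 10
   server srv 192.168.0.1:80 check
```

Whether a blocked client ends up waiting or being rejected then depends
mostly on how timeout queue compares with the client's own timeout, so
those two are probably the first values to experiment with.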
