Re: Backend per-server rate limiting
> On 28 Sep 2016, at 10:49, Stephan Müller wrote:
>
> Hi,
>
> I want to configure a rate limit (say 100 HTTP req/sec) for each backend
> server, like this:
>
>     listen front
>         bind :80
>         balance leastconn
>         server srv1 127.0.0.1:8000 limit 100
>         server srv2 127.0.0.2:8000 limit 100
>
> As far as I can see, rate limiting is only supported for frontends [1].
> However, a long time ago someone asked the same question [2]. The
> proposed solution was multi-tier load balancing, with an extra proxy per
> backend server, like this:
>
>     listen front
>         bind :80
>         balance leastconn
>         server srv1 127.0.0.1:8000 maxconn 100 track back1/srv
>         server srv2 127.0.0.2:8000 maxconn 100 track back2/srv
>
>     listen back1
>         bind 127.0.0.1:8000
>         rate-limit sessions 10
>         server srv 192.168.0.1:80 check
>
>     listen back2
>         bind 127.0.0.2:8000
>         rate-limit sessions 10
>         server srv 192.168.0.2:80 check
>
> Is there a better (new) way to do that? The old thread mentioned it's on
> the roadmap for 1.6.

As far as I understand, "track" only affects health checks. Otherwise, servers with the same name in different backends work independently, so the servers in your first proxy (:80) will have no rate limit of their own.
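To illustrate the point about "track" (a minimal sketch, reusing the addresses from the quoted config): the tracked server only propagates its UP/DOWN check state to the tracking server; every other setting stays local to its own proxy.

    # Sketch only: "track" mirrors health-check status, nothing else.
    listen back1
        bind 127.0.0.1:8000
        rate-limit sessions 10             # enforced here, in back1 only
        server srv 192.168.0.1:80 check    # the only health-checked server

    listen front
        bind :80
        balance leastconn
        # srv1 goes up and down together with back1/srv, but "maxconn 100"
        # is front's own concurrency cap; no rate limit applies here.
        server srv1 127.0.0.1:8000 maxconn 100 track back1/srv

The rate limit still takes effect in the two-tier setup, but only because front's traffic physically passes through back1, not because of "track".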
Backend per-server rate limiting
Hi,

I want to configure a rate limit (say 100 HTTP req/sec) for each backend server, like this:

    listen front
        bind :80
        balance leastconn
        server srv1 127.0.0.1:8000 limit 100
        server srv2 127.0.0.2:8000 limit 100

As far as I can see, rate limiting is only supported for frontends [1]. However, a long time ago someone asked the same question [2]. The proposed solution was multi-tier load balancing, with an extra proxy per backend server, like this:

    listen front
        bind :80
        balance leastconn
        server srv1 127.0.0.1:8000 maxconn 100 track back1/srv
        server srv2 127.0.0.2:8000 maxconn 100 track back2/srv

    listen back1
        bind 127.0.0.1:8000
        rate-limit sessions 10
        server srv 192.168.0.1:80 check

    listen back2
        bind 127.0.0.2:8000
        rate-limit sessions 10
        server srv 192.168.0.2:80 check

Is there a better (new) way to do that? The old thread mentioned it's on the roadmap for 1.6.

Cheers
Stephan

--
[1] http://cbonte.github.io/haproxy-dconv/1.6/configuration.html#rate-limit
[2] http://comments.gmane.org/gmane.comp.web.haproxy/9199
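One partial option that postdates the 2012 thread below (a hedged sketch, not an answer given on the list): HAProxy exposes per-server session rates through the srv_sess_rate sample fetch, so an ACL can reject or divert traffic once a given server runs hot. This only measures and rejects; it cannot queue and smooth excess requests the way the two-tier setup does. Backend and server names here are illustrative.

    frontend front
        bind :80
        mode http
        # Per-server session creation rates, measured by HAProxy itself.
        acl srv1_busy srv_sess_rate(pool/srv1) gt 100
        acl srv2_busy srv_sess_rate(pool/srv2) gt 100
        # When every server is over its rate, deny instead of queueing.
        http-request deny if srv1_busy srv2_busy
        default_backend pool

    backend pool
        mode http
        balance leastconn
        server srv1 192.168.0.1:80 check
        server srv2 192.168.0.2:80 check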
Re: Backend per-server rate limiting
Hi Andrew,

On Tue, Aug 07, 2012 at 11:44:53PM -0600, Andrew Davidoff wrote:
> Hi,
>
> I'm trying to determine if haproxy can be configured to solve a rate
> limiting based problem I have. I believe that it can, but I am not
> seeing how to put the configuration together to get it done.
>
> Here's what I'm trying to do: I have a set of servers (backends) that
> can each handle a specific number of requests per second (the same rate
> for each backend). I'd like haproxy to accept requests and farm them out
> to these backends so that each request is sent to the first backend that
> isn't over its rate limit. If all backends are over their rate limits,
> ideally the client connection would just block and wait, but if haproxy
> has to return a rejection, I think I can deal with that.
>
> My first thought was to use the frontend's "rate-limit sessions",
> setting it to n * rate-limit, where n is the number of backends I have
> to serve these requests, with those backends balanced round-robin. The
> problem with this is that if a backend falls out, the frontend rate
> limit is then too high, since fewer backends are available than when it
> was originally configured. The only way I can see to dynamically change
> the frontend rate limit as backends rise and fall is to write something
> that watches the logs for rise/fall messages and adjusts the global rate
> limit via the haproxy socket. This might work, but the biggest drawback
> is that one instance of haproxy could then only handle requests of a
> single rate limit, since modifications after starting would have to be
> global (not per frontend).
>
> In other words, I am trying to apply rate limits to individual backend
> servers, and to have a frontend cycle through all available backend
> servers until it either finds one that can handle the request or
> exhausts them all, at which point it would ideally just block and keep
> trying, or, less ideally, send some sort of failure/rejection to the
> client. I feel like there's a simple solution here that I'm not seeing.
> Any help is appreciated.

What you're asking for is in the 1.6 roadmap, and the road will be long before we reach that point. Maybe in the meantime we could develop a new LB algorithm which considers each server's request rate and forwards the traffic to the least used one. In parallel, having an ACL which computes the average per-server request rate would allow requests to be rejected when there's a risk of overloading the servers. But that doesn't seem trivial, and I have doubts about its real usefulness: what is really needed is to convert a rate into a concurrency, in order to queue excess requests.

What you can do at the moment, if you don't have too many servers, is to have one proxy per server, each with its own rate limit. This way you will be able to smooth the load in the first stage between all servers, and even reject requests when the load is too high. You have to health-check the real servers, though; otherwise the checks would cause flapping when the second-level proxies are saturated.
This would basically look like this:

    listen front
        bind :80
        balance leastconn
        server srv1 127.0.0.1:8000 maxconn 100 track back1/srv
        server srv2 127.0.0.2:8000 maxconn 100 track back2/srv
        server srv3 127.0.0.3:8000 maxconn 100 track back3/srv

    listen back1
        bind 127.0.0.1:8000
        rate-limit sessions 10
        server srv 192.168.0.1:80 check

    listen back2
        bind 127.0.0.2:8000
        rate-limit sessions 10
        server srv 192.168.0.2:80 check

    listen back3
        bind 127.0.0.3:8000
        rate-limit sessions 10
        server srv 192.168.0.3:80 check

Then you have to play with maxconn, maxqueue and "timeout queue" in order to evict requests that have been queued for too long, but you get the idea.

Could I ask what use case makes your servers sensitive to the request rate? This is quite unusual, since a rate should normally translate into some number of concurrent connections somewhere in the server; if the server responds quickly, there should be no reason it cannot accept high request rates. It's important to understand the complete model in order to build a rock-solid configuration that is not just a workaround for a symptom.

Regards,
Willy
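A hedged sketch of the queue tuning Willy mentions, with illustrative values: "timeout queue" bounds how long a request may wait for a free connection slot (a 503 is returned when it expires), while "maxqueue" caps how many requests may wait on one specific server before being redispatched to another.

    listen front
        bind :80
        balance leastconn
        timeout queue 5s    # give up (503) on requests queued longer than 5s
        # maxconn caps in-flight requests per server; maxqueue caps how many
        # requests may wait on that server before being redispatched.
        server srv1 127.0.0.1:8000 maxconn 100 maxqueue 50 track back1/srv
        server srv2 127.0.0.2:8000 maxconn 100 maxqueue 50 track back2/srv

Tightening "timeout queue" favors fast failures; raising maxconn/maxqueue favors absorbing bursts at the cost of latency.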
Re: Backend per-server rate limiting
Willy,

Thanks for the quick response. I haven't fully digested your example suggestion yet, but I will sit down with it and the haproxy configuration documentation and sort it out in my brain.

Here's the basic idea of the use case. Let me go ahead and state that maybe haproxy just isn't the right solution here. There are many ways to solve this; it just seemed to me like haproxy might be a magic answer.

We make a bunch of requests to an API that rate limits based on source IP. To maximize our overall request rate, we utilize proxies to afford us more source IPs. Even if those proxies can handle a ton of work themselves, if we push any of them individually over the API's rate limits, they can be temporarily or permanently disallowed from accessing the API.

Right now our API clients (scripts) handle rate limiting themselves. The way they currently do this involves knowledge of the per-source-IP rate limit of the API they're talking to, and of how many proxies live behind the squid instance that all their requests go through. That squid instance hands out proxies round-robin, which is what makes the request rate work out.

Given how the scripts currently handle rate limiting, we start running into problems when we want multiple scripts accessing the same API to run at the same time. Each running script must know about any other scripts talking to the same API so it can adjust its request rate accordingly, and anything already running must be informed when more scripts accessing the same API start up, so it can do the same. Additionally, we run into the problem of proxies failing: if a proxy fails and the scripts don't learn of it and adjust their rate limits, the effective per-proxy rate limit has inadvertently increased across all remaining proxies.

So, again, there are many ways to solve this and maybe haproxy just isn't the answer, but I thought it might be. At the moment I'm very much in "don't reinvent the wheel" mode, and I thought maybe haproxy had solved this.

Thanks again for your help.
Andy

On Wed, Aug 8, 2012 at 12:11 AM, Willy Tarreau w...@1wt.eu wrote:
> [...]
Re: Backend per-server rate limiting
On Wed, Aug 08, 2012 at 12:51:23AM -0600, Andrew Davidoff wrote:
> We make a bunch of requests to an API that rate limits based on source
> IP. To maximize our overall request rate, we utilize proxies to afford
> us more source IPs. Even if those proxies can handle a ton of work
> themselves, if we push any of them individually over the API's rate
> limits, they can be temporarily or permanently disallowed from accessing
> the API.

OK, I see now. It's not a performance limit but a contractual limit you must not exceed if you don't want to be banned. Why don't you use a larger number of IP addresses to access your API, then?

> Given how the scripts currently handle rate limiting, we start running
> into problems when we want multiple scripts accessing the same API to
> run at the same time. [...] Additionally, we run into the problem of
> proxies failing: if a proxy fails and the scripts don't learn of it and
> adjust their rate limits, the effective per-proxy rate limit has
> inadvertently increased across all remaining proxies.

These are precisely the types of issues that you won't have with the two-stage LB I gave in the example, because the load sent to the servers is smoothed as much as possible and never goes beyond the configured limit. In my tests I have always observed 3-4 digit stability in the achieved rates. The first stage ensures a global distribution, and the second stage ensures you never go over the limit for any server.

You can even use this to increase the number of source addresses, with all second-stage proxies going to the same server address (some people use this to browse via multiple outgoing ADSL lines). For instance, let's say your API is on 192.168.0.1:80 and you want to use 3 different source IP addresses (192.168.0.11..13), each limited to exactly 10 requests per second, and you want to distribute the load across them so they are always used at the maximum possible rate:

    listen front
        bind :80
        balance leastconn
        server srv1 127.0.0.1:8000 maxconn 100 track back1/srv
        server srv2 127.0.0.2:8000 maxconn 100 track back2/srv
        server srv3 127.0.0.3:8000 maxconn 100 track back3/srv

    listen back1
        bind 127.0.0.1:8000
        rate-limit sessions 10
        server srv 192.168.0.1:80 source 192.168.0.11 check

    listen back2
        bind 127.0.0.2:8000
        rate-limit sessions 10
        server srv 192.168.0.1:80 source 192.168.0.12 check

    listen back3
        bind 127.0.0.3:8000
        rate-limit sessions 10
        server srv 192.168.0.1:80 source 192.168.0.13 check

Regards,
Willy
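To verify that each second-stage proxy actually holds at its limit, the built-in stats page can help (a hedged sketch; the port and URI below are illustrative). Under sustained load, the per-proxy session-rate columns for back1..back3 should hover around 10/s.

    listen stats
        bind :8404
        mode http
        stats enable
        stats uri /stats
        stats refresh 5s    # auto-refresh so rate columns can be watched live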
Re: Backend per-server rate limiting
On Tue, Aug 7, 2012 at 11:51 PM, Andrew Davidoff david...@qedmf.net wrote:
> We make a bunch of requests to an API that rate limits based on source
> IP. To maximize our overall request rate, we utilize proxies to afford
> us more source IPs. Even if those proxies can handle a ton of work
> themselves, if we push any of them individually over the API's rate
> limits, they can be temporarily or permanently disallowed from accessing
> the API.

You could also consider solving this socially instead. Have you tried reaching out to this service to ask for a higher rate limit? Your work would be wasted if they improved their scraper detection beyond source IP addresses. What service is it?

> [...]
Re: Backend per-server rate limiting
On Wed, Aug 8, 2012 at 1:12 AM, Willy Tarreau w...@1wt.eu wrote:
> These are precisely the types of issues that you won't have with the
> two-stage LB I gave in the example, because the load sent to the servers
> is smoothed as much as possible and never goes beyond the configured
> limit. In my tests I have always observed 3-4 digit stability in the
> achieved rates.

Thanks Willy. I have taken a longer look at your example and the configuration guide, and I think I understand how it works and how it would solve my problems. I will give it a try soon.

Thanks again for your quick and thorough responses.
Andy
Backend per-server rate limiting
Hi,

I'm trying to determine if haproxy can be configured to solve a rate-limiting problem I have. I believe that it can, but I am not seeing how to put the configuration together to get it done.

Here's what I'm trying to do: I have a set of servers (backends) that can each handle a specific number of requests per second (the same rate for each backend). I'd like haproxy to accept requests and farm them out to these backends so that each request is sent to the first backend that isn't over its rate limit. If all backends are over their rate limits, ideally the client connection would just block and wait; but if haproxy has to return a rejection, I think I can deal with that.

My first thought was to use the frontend's "rate-limit sessions", setting it to n * rate-limit, where n is the number of backends I have to serve these requests, with those backends balanced round-robin. The problem with this is that if a backend falls out, the frontend rate limit is then too high, since fewer backends are available than when it was originally configured. The only way I can see to dynamically change the frontend rate limit as backends rise and fall is to write something that watches the logs for rise/fall messages and adjusts the global rate limit via the haproxy socket. This might work, but the biggest drawback is that one instance of haproxy could then only handle requests of a single rate limit, since modifications after starting would have to be global (not per frontend).

In other words, I am trying to apply rate limits to individual backend servers, and to have a frontend cycle through all available backend servers until it either finds one that can handle the request or exhausts them all, at which point it would ideally just block and keep trying, or, less ideally, send some sort of failure/rejection to the client. I feel like there's a simple solution here that I'm not seeing. Any help is appreciated.

Thanks!
Andy
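A minimal sketch of the first approach Andy describes, assuming three backend servers at 10 req/s each (addresses illustrative), which makes the flaw he identifies concrete: the frontend limit is static, so when a server drops out the survivors absorb its share.

    frontend front
        bind :80
        # 3 servers x 10 req/s = 30 sessions/s total. This number does NOT
        # shrink when a server fails, so 2 survivors would see ~15/s each,
        # exceeding their individual 10/s budget.
        rate-limit sessions 30
        default_backend pool

    backend pool
        balance roundrobin
        server srv1 192.168.0.1:80 check
        server srv2 192.168.0.2:80 check
        server srv3 192.168.0.3:80 check

The two-tier configuration proposed later in the thread avoids this because each per-server limit is enforced in its own proxy, independently of how many servers remain up.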