This isn't tested, just a sample idea... obviously parts missing...
Create your ACLs, something like:

frontend web
    acl is_bot hdr_sub(User-Agent) -i bot
    ...
    use_backend botq if is_bot
    default_backend normalq

backend normalq
    server be1 10.1.2.3 check minconn 20 maxconn 30
    server be2 10.1.2.4 check minconn 20 maxconn 30

backend botq
    server be1 10.1.2.3 track normalq/be1 minconn 1 maxconn 1
    server be2 10.1.2.4 track normalq/be2 minconn 1 maxconn 1

So that doesn't really rate limit it; it just makes it so that at most one
concurrent request (per backend server) is serviced at a time, shared across
all identified bots. Personally I think that would be better than rate
limiting, but... that being said, if you really want to rate limit, nothing
says you couldn't have botq connect to a different port on 127.0.0.1 and rate
limit that internal port as a different frontend in haproxy. Preserving the
original IP is possible through the loopback, so it can be used as a way to
do complicated setups such as rate limiting the backend, at the cost of the
cpu overhead of having haproxy talk to itself... As long as that is not the
path for most traffic, it shouldn't be a big deal.
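
Equally untested, but the loopback variant might look roughly like the sketch
below. The port 8081, the names "loop" and "botlimit", and the limit of 10
sessions per second are all made up for illustration. It also assumes a
haproxy build with PROXY protocol support (send-proxy / accept-proxy), which
is one way to carry the original client IP across the loopback hop; an
X-Forwarded-For header would be another.

```haproxy
backend botq
    # Hypothetical: instead of going straight to the real servers,
    # hand bot traffic to an internal frontend on the loopback.
    # send-proxy passes the original client address along.
    server loop 127.0.0.1:8081 send-proxy

frontend botlimit
    # accept-proxy restores the client address sent by send-proxy.
    bind 127.0.0.1:8081 accept-proxy
    # Accept at most 10 new sessions per second (arbitrary value).
    rate-limit sessions 10
    default_backend normalq
```

Note that "rate-limit sessions" counts new sessions on the frontend, not
individual HTTP requests, so with keep-alive between the two hops the
effective request rate could be higher than the number suggests.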
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Wednesday, February 15, 2012 9:39 PM
> To: [email protected]
> Subject: Re: Rate limit spider / bots ?
>
> John how would i go about using acl ? I thought rate-limit option didn't
> support backends
> http://code.google.com/p/haproxy-docs/wiki/rate_limit_sessions ?
>
> ---
> posted at http://www.serverphorums.com
> http://www.serverphorums.com/read.php?10,445347,446647#msg-446647