Preventing bots from starving other users?

2009-11-15 Thread Wout Mertens
Hi there,

I was wondering if HAProxy helps in the following situation:

- We have a wiki site which is quite slow
- Regular users don't have many problems
- We also get crawled by a search bot, which creates many concurrent 
connections, more than the hardware can handle
- Therefore, service is degraded and users usually have their browsers time out 
on them

Given that we can't make the wiki faster, I was thinking that we could solve 
this by having a per-source-IP queue, which made sure that a given source IP 
cannot have more than e.g. 3 requests active at the same time. Requests beyond 
that would get queued.

Is this possible?

Thanks,

Wout.


RE: Preventing bots from starving other users?

2009-11-15 Thread John Lauro
I would probably do that sort of throttling at the OS level with iptables,
etc...

That said, before that I would investigate why the wiki is so slow...
Something probably isn't configured right if it chokes with only a few
simultaneous accesses.  I mean, unless it's an embedded server with under 32MB
of RAM, the hardware should be able to handle that...


 




Re: Preventing bots from starving other users?

2009-11-15 Thread Aleksandar Lazic

On Sun 15.11.2009 15:57, Wout Mertens wrote:

Hi there,

I was wondering if HAProxy helps in the following situation:

- We have a wiki site which is quite slow
- Regular users don't have many problems
- We also get crawled by a search bot, which creates many concurrent
 connections, more than the hardware can handle
- Therefore, service is degraded and users usually have their browsers
 time out on them

Given that we can't make the wiki faster, I was thinking that we could
solve this by having a per-source-IP queue, which made sure that a
given source IP cannot have more than e.g. 3 requests active at the
same time. Requests beyond that would get queued.

Is this possible?


Maybe with these two ACL criteria from
http://haproxy.1wt.eu/download/1.3/doc/configuration.txt

src ip_address

fe_sess_rate

Both are described in the ACL section.

You may also get some ideas from section "5) Access lists" of
http://haproxy.1wt.eu/download/1.3/doc/haproxy-en.txt


HTH

Aleks
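
As a rough, untested sketch of how the two ACL criteria named above might be
combined: the frontend below matches the crawler by source address and checks
the frontend-wide session rate. The crawler range 198.51.100.0/24, the
threshold of 10 sessions per second, the server address, and all section and
server names are assumptions for illustration; timeouts and other defaults are
omitted. Note that fe_sess_rate measures the whole frontend, not a single
source IP, so this only approximates the per-IP limit asked about.

    frontend wiki_in
        bind :80
        mode http
        # Assumed crawler address range (documentation prefix, replace it)
        acl from_crawler src 198.51.100.0/24
        # True while the frontend accepts 10 or more new sessions per second
        acl too_fast fe_sess_rate ge 10
        # Crawler requests arriving during a burst go to a restricted backend
        use_backend wiki_restricted if from_crawler too_fast
        default_backend wiki

    backend wiki
        mode http
        server wiki1 192.0.2.10:8080

    backend wiki_restricted
        mode http
        # At most 3 requests in flight; further ones wait in the queue
        server wiki1 192.0.2.10:8080 maxconn 3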



Re: Small patch for the appsession feature

2009-11-15 Thread Aleksandar Lazic

Hi Cyril,

On Fri 13.11.2009 22:50, Cyril Bonté wrote:

Hello Willy,
sorry, I didn't have time to work on the patch as I wanted.

On Thursday 5 November 2009 06:19:41, Willy Tarreau wrote:

 Sorry but I can't see in the haproxy sources how the cookie prefix can
 be used for appsession.
 "capture cookie" allows finding this cookie prefix, but it seems there's
 no code to then use it in appsession. If that's the case, I can work on
 it in the next few days, if you want.

OK you're indeed right. So since this feature has never been
usable nor used, let's fix it as you initially proposed.


It works well but I have a question before considering the patch
finalized :)
As using the captured cookie may add unwanted behaviour in some
existing configurations, I made it an option.
At first I added it the same way as request-learn, but wouldn't it be
better to define these options in a more haproxy-like way (using
"option" and "no option")?

Which would give:
(no) option appsession-request-learn
(no) option appsession-capture-cookie


+1

Aleks



Re: Preventing bots from starving other users?

2009-11-15 Thread Łukasz Jagiełło
2009/11/15 Wout Mertens wout.mert...@gmail.com:
 I was wondering if HAProxy helps in the following situation:

 - We have a wiki site which is quite slow
 - Regular users don't have many problems
 - We also get crawled by a search bot, which creates many concurrent 
 connections, more than the hardware can handle
 - Therefore, service is degraded and users usually have their browsers time 
 out on them

 Given that we can't make the wiki faster, I was thinking that we could solve 
 this by having a per-source-IP queue, which made sure that a given source IP 
 cannot have more than e.g. 3 requests active at the same time. Requests 
 beyond that would get queued.

 Is this possible?

I guess so. I route crawler traffic to a dedicated web backend, because
crawlers mostly harvest during my backup window and slow everything down
even more. Adding a request limit should also be easy; just check the
documentation.

-- 
Łukasz Jagiełło
System Administrator
G-Forces Web Management Polska sp. z o.o. (www.gforces.pl)

Ul. Kruczkowskiego 12, 80-288 Gdańsk
Spółka wpisana do KRS pod nr 246596 decyzją Sądu Rejonowego Gdańsk-Północ
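
A minimal sketch of the dedicated-backend approach described above, assuming
the crawler can be recognised by its User-Agent header. The match strings, the
server address, the connection limits, and all section and server names are
assumptions for illustration, not taken from the thread; timeouts and other
defaults are omitted.

    frontend wiki_in
        bind :80
        mode http
        # Case-insensitive substring match on the User-Agent header
        acl is_crawler hdr_sub(User-Agent) -i bot crawler spider
        use_backend wiki_crawlers if is_crawler
        default_backend wiki_users

    backend wiki_users
        mode http
        server wiki1 192.0.2.10:8080 maxconn 20

    backend wiki_crawlers
        mode http
        # Queued requests get a 503 after waiting this long
        timeout queue 30s
        # Same wiki server, but crawlers get at most 3 concurrent requests
        server wiki1 192.0.2.10:8080 maxconn 3

The limit here applies to crawler traffic as a whole rather than to each
source IP. Requests beyond the server's maxconn wait in the backend queue
instead of being refused, which is close to the queueing behaviour asked for
in the original question.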



Re: Small patch for the appsession feature

2009-11-15 Thread Willy Tarreau
Hi,

On Sun, Nov 15, 2009 at 10:28:21PM +0100, Aleksandar Lazic wrote:
 Hi Cyril,
 
 On Fri 13.11.2009 22:50, Cyril Bonté wrote:
 Hello Willy,
 sorry, I didn't have time to work on the patch as I wanted.
 
 On Thursday 5 November 2009 06:19:41, Willy Tarreau wrote:
  Sorry but I can't see in the haproxy sources how the cookie prefix can
  be used for appsession.
  "capture cookie" allows finding this cookie prefix, but it seems there's
  no code to then use it in appsession. If that's the case, I can work on
  it in the next few days, if you want.
 
 OK you're indeed right. So since this feature has never been
 usable nor used, let's fix it as you initially proposed.
 
 It works well but I have a question before considering the patch
 finalized :)
 As using the captured cookie may add unwanted behaviour in some
 existing configurations, I made it an option.
 At first I added it the same way as request-learn, but wouldn't it be
 better to define these options in a more haproxy-like way (using
 "option" and "no option")?
 
 Which would give:
 (no) option appsession-request-learn
 (no) option appsession-capture-cookie
 
 +1

I'm not that much in favor of options for such uses, because
options are inherited from default sections and will cause
unexpected behaviours.

In my opinion, options should be used when they impact the
general behaviour and when it is desirable to inherit them
from default sections. Right now, appsessions are limited
to backends only and cannot be declared in defaults sections,
so I think it could become a bit awkward to have such a split
configuration.

What you can do however is to create a new prefix keyword like
we have for timeout or tcp-request and put the flags somewhere
else. appsession would have been fine but it's already used.
Maybe you can use appcookie? Something like this:

   appcookie request-learn
   appcookie capture

Just an idea.

Regards,
Willy