Hi,

On Fri, May 11, 2012 at 05:45:26PM +0200, Joeri Blokhuis | DongIT wrote:
> On 05/11/2012 04:50 PM, Baptiste wrote:
> > On Fri, May 11, 2012 at 4:36 PM, Joeri Blokhuis | DongIT
> > <[email protected]> wrote:
> >> Hello guys,
> >>
> >> I would like to benchmark and test the 'source' balance algorithm of
> >> HAProxy before any loadbalancers are put in
> >> production. The 'source' algorithm is based on the IP source address, so
> >> when nothing changes in the backend, a client
> >> should always connect to the same server. I would like to test this on a
> >> large scale, with many clients and different source addresses. Any
> >> suggestions, ideas, or tools for accomplishing this?
> >>
> >> Many thanks in advance,
> >>
> >> Joeri
> >>
> > Hi,
> >
> > For your information, you can enable "hash-type consistent" to avoid
> > moving clients when the number of servers in the farm changes.
> >
> > cheers
> Thank you, I will definitely look into that option. Any idea on how I
> can put enough load on my farm to test how traffic is divided among
> the backend servers? And also how to create some scenarios in which a
> server fails, to see if the clients are moved to another server?

You need to inject traffic from many IP addresses, collect server stats
to see how the load spreads, and have a way to detect when a client has
switched to another server (eg: have your load generator check that the
server never changes). For instance, you can have haproxy emit a cookie
that reports the server name to the load generator, or have each server
return a different page.
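
As a rough illustration of the check described above, here is a minimal
Python sketch. It assumes your load generator already records one
(client IP, server name) pair per response, however the server name is
reported. The function name and the sample data are made up for the
example.

```python
from collections import defaultdict

def find_switched_clients(observations):
    """Return {client_ip: [servers in first-seen order]} for every
    client that was answered by more than one server.

    observations: iterable of (client_ip, server_name) pairs, as
    recorded by a hypothetical load generator.
    """
    seen = defaultdict(list)
    for client_ip, server_name in observations:
        if server_name not in seen[client_ip]:
            seen[client_ip].append(server_name)
    return {ip: servers for ip, servers in seen.items() if len(servers) > 1}

# Made-up sample: the second client is served by srv2, then by srv3.
sample = [
    ("10.0.0.1", "srv1"),
    ("10.0.0.2", "srv2"),
    ("10.0.0.1", "srv1"),
    ("10.0.0.2", "srv3"),
]
print(find_switched_clients(sample))
```

An empty result means every client stayed on the same server for the
whole run.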

But quite frankly, such tests are really boring to set up because they
need many parameters and what you want to observe is that there is almost
nothing to see.

What I can already tell you is the following:
  - the default source hash is perfectly smooth because the IP address
    is divided by the number of available servers and the remainder
    selects the server;
  - the consistent hash is much less smooth, you'll get differences of
    around 20% between the least loaded and the most loaded servers;
  - in the case of the default hash, losing a server in an N-server
    farm means that on average (N-1)/N of the users will be redistributed;
  - in the consistent hash, only the users attached to the faulty server
    are redistributed;
  - in both hashes, when a server comes back up, the same number of users
    is redistributed again;
  - IP addresses on the internet are not stable at all. Depending on your
    site and visitors, you can have up to 5% of users browsing with a
    variable IP address (eg: dual-DSL links, mobile accesses, etc...).
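
To make the redistribution figures above concrete, here is a small
Python simulation. It is only a sketch under simplifying assumptions:
client addresses are random 32-bit integers, the default hash is
modelled as a plain modulo, and the consistent hash is a generic hash
ring with 100 points per server built on MD5. None of this is haproxy's
actual code, and all names are invented for the example.

```python
import bisect
import hashlib
import random
from collections import Counter

def modulo_pick(ip_int, servers):
    # Simplified default hash: remainder of the address by the server count.
    return servers[ip_int % len(servers)]

def _h(key):
    # 64-bit hash of a string, used only to place points on the ring.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class Ring:
    """Generic consistent-hash ring (illustrative, not haproxy's)."""
    def __init__(self, servers, points=100):
        self.ring = sorted((_h(f"{s}#{i}"), s)
                           for s in servers for i in range(points))
        self.keys = [k for k, _ in self.ring]

    def pick(self, ip_int):
        # First ring point after the client's hash, wrapping around.
        idx = bisect.bisect(self.keys, _h(str(ip_int))) % len(self.ring)
        return self.ring[idx][1]

def moved_fraction(before, after):
    return sum(1 for a, b in zip(before, after) if a != b) / len(before)

def spread(assignments, servers):
    # Relative gap between the most and the least loaded server.
    counts = Counter(assignments)
    loads = [counts[s] for s in servers]
    return (max(loads) - min(loads)) / max(loads)

random.seed(1)
clients = [random.getrandbits(32) for _ in range(20000)]
servers = ["s1", "s2", "s3", "s4", "s5"]
survivors = servers[:-1]  # simulate the loss of s5

mod_before = [modulo_pick(c, servers) for c in clients]
mod_after = [modulo_pick(c, survivors) for c in clients]

ring_before, ring_after = Ring(servers), Ring(survivors)
con_before = [ring_before.pick(c) for c in clients]
con_after = [ring_after.pick(c) for c in clients]

mod_moved = moved_fraction(mod_before, mod_after)
con_moved = moved_fraction(con_before, con_after)

print("modulo:     spread %.3f, moved %.3f" % (spread(mod_before, servers), mod_moved))
print("consistent: spread %.3f, moved %.3f" % (spread(con_before, servers), con_moved))
```

With 5 servers the modulo variant should move close to 4/5 of the
clients, while the ring only moves the clients that were on s5. The
exact spread of the ring depends on the number of points per server and
on the hash function, so do not read too much into that number.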

In general it's not a good idea to balance on the source if you absolutely
need stickiness. It's fine if the few percent of redistribution is not an
issue (eg: for SSL or when your application supports shared contexts). Also,
if a large number of your visitors come from large proxy farms, the
distribution can become uneven because many users share the same few
source addresses. But here again it depends on your audience. Distributing
ads to random users is not the same as selling ring tones to cell-phone
owners who all browse via the same mobile operator's proxy!
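
The proxy effect can be sketched the same way. The Python snippet below
assumes 10000 users, half of them with their own random address and
half hidden behind three proxy addresses (taken from a documentation
range), balanced by a plain modulo over four servers. All figures and
names are invented for the illustration.

```python
import ipaddress
import random
from collections import Counter

def pick_server(ip_int, servers):
    # Simplified source hash: remainder of the address by the server count.
    return servers[ip_int % len(servers)]

random.seed(2)
servers = ["s1", "s2", "s3", "s4"]

# Three hypothetical proxy addresses shared by half of the users.
proxy_ips = [int(ipaddress.ip_address(a))
             for a in ("198.51.100.10", "198.51.100.11", "198.51.100.12")]

direct_users = [random.getrandbits(32) for _ in range(5000)]
proxied_users = [random.choice(proxy_ips) for _ in range(5000)]

counts = Counter(pick_server(ip, servers)
                 for ip in direct_users + proxied_users)
loads = {s: counts[s] for s in servers}
print(loads)
```

Since three addresses can land on at most three servers, at least one
server only sees the direct users and ends up much less loaded than the
others.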

Regards,
Willy

