By the way, in the benchmark, if there is only 1 machine to emulate/send 
thousands of concurrent web sessions/requests, can HAproxy still distribute the 
requests? (All the web requests have the same source IP address though.) It 
seems to me that HAproxy distributes requests based on incoming IP addresses 
and port numbers.
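For what it's worth, haproxy's default "roundrobin" balancing picks a server per connection and ignores the source address entirely; only "source"-style hashing would pin one client IP to one backend. A toy sketch of that difference (server names and the client IP are made up, and the hash is purely illustrative, not haproxy's actual algorithm):

```python
# Sketch (not haproxy code): how the balancing key affects distribution
# when every request comes from a single benchmark client.
import hashlib

SERVERS = ["web1", "web2", "web3"]

def pick_by_source_ip(src_ip):
    # "balance source"-style hashing: one client IP always maps to one server
    h = int(hashlib.md5(src_ip.encode()).hexdigest(), 16)
    return SERVERS[h % len(SERVERS)]

def pick_round_robin(counter):
    # "balance roundrobin": the source address plays no role at all
    return SERVERS[counter % len(SERVERS)]

# 1000 requests, all from the same (hypothetical) client IP
src = "192.0.2.10"
by_ip = {pick_by_source_ip(src) for _ in range(1000)}
by_rr = {pick_round_robin(i) for i in range(1000)}

print(by_ip)          # one server only: source-hash collapses one IP to one backend
print(sorted(by_rr))  # all three servers: roundrobin spreads the connections
```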
Chih-fan Hsin
-----Original Message-----
From: Hsin, Chih-fan 
Sent: Monday, January 05, 2009 3:34 PM
To: 'Willy Tarreau'
Cc: [email protected]
Subject: RE: Using HAproxy in web server benchmark configuration (e.g. SPECweb)?


I saved the file as a JPG. It is just the architecture graph of the SPECweb 
setup.

Chih-fan Hsin

-----Original Message-----
From: Willy Tarreau [mailto:[email protected]] 
Sent: Monday, January 05, 2009 3:20 PM
To: Hsin, Chih-fan
Cc: [email protected]
Subject: Re: Using HAproxy in web server benchmark configuration (e.g. SPECweb)?

On Mon, Jan 05, 2009 at 02:37:18PM -0800, Hsin, Chih-fan wrote:
> 
> I am not trying to benchmark anything. I want to study the performance and 
> some network features in a cluster of servers. I need something to 
> generate the workloads. When I present my findings, the workloads need to be 
> widely accepted as reasonable workloads. This is why I want to use a web 
> server benchmark to generate them. 

OK I see now, it makes complete sense. However, you may have noticed that
SpecWeb2005 is completely CPU-bound, while you're clearly more interested in
something network-bound.

You may be interested in using small, fast HTTP servers like nginx or lighttpd
to deliver static content from RAM. They're both capable of saturating a gig
interface on a small-sized server. I can also send you "httpterm", which is a
server returning the amount of data you ask for in the request. This is very
convenient for network testing. It can also wait some time before responding.
It scales quite well; it's in fact a modified version of haproxy 1.2, not as
efficient as 1.3 but still quite good.

For the client, you may use "ab" from the apache package ("apache bench"),
which is in part a derivative of the ancient "zb" ("Zeus bench"). It has some
downsides, but is widely known and accepted as a web performance tester for
simple content such as static files.
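A typical invocation looks like the sketch below (the host and path are placeholders for your own test target; -n is the total request count and -c the concurrency level):

```shell
# 10,000 requests, 100 concurrent, against one static file.
# Replace the URL with your own test server.
ab -n 10000 -c 100 http://192.0.2.1/index.html
```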

Using haproxy to distribute the load also makes sense with some system tuning.
Among the useful things you'll be able to do are:
  - ability to bind to a specific source IP for a given server, which allows
    you to use an arbitrary number of interfaces

  - L7 sticky/switching/hashing, which will help you reproduce a more
    real-life-looking architecture where a given client sticks to the same
    server and where you can address different server pools depending on the
    request.

  - rich logs with many timers (useful to monitor network activity)

  - connection regulation in order to limit the number of per-server connections
    to a reasonable level (e.g. for apache).
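As a sketch, a minimal configuration touching all four points (addresses, server names and limits are illustrative, and the syntax aims at the 1.3-era config language, so double-check it against your version's documentation):

```
listen web_farm 192.168.0.1:80
    mode http
    balance roundrobin
    option httplog                     # rich logs with per-phase timers
    cookie SRV insert indirect         # L7 stickiness: a client keeps its server
    server web1 192.168.0.11:80 cookie w1 maxconn 100 source 192.168.1.1
    server web2 192.168.0.12:80 cookie w2 maxconn 100 source 192.168.2.1
```

Here each server line binds its own source IP (one per interface) and caps its concurrent connections with maxconn.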

Also, the current development version supports the Linux 2.6 splice() syscall
to move data without copying. Unfortunately, implementing this has uncovered a
deep corruption bug in the current kernel implementation, which needs to be
fixed before the feature is really usable.

Last, whether you use haproxy or anything else for the load balancing, please
use a proxy-based LB. You won't regret it. It offers some nice features, such
as the ability to have different TCP configurations on the client side and the
server side. This is useful, for instance, to experiment with small packets on
either side using a small MTU. Also, having the ability to monitor TCP
statistics from one central place is awesome (e.g. drops, retransmits, etc.).
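A rough sketch of that kind of experiment on the proxy box (the interface name is a placeholder, both commands need root, and the exact counters printed vary by kernel):

```shell
# Use a small MTU on the server-facing interface only; clients keep 1500.
ifconfig eth1 mtu 576
# Then watch TCP counters (retransmits, drops) from this one central place.
netstat -s | grep -iE 'retrans|drop'
```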

> I attached an architecture graph based on my understanding.

I'm sorry, I have nothing here to read a PPT, and don't have enough space right
now to install openoffice. Could you please resend it as a more portable PDF?

Regards,
Willy

