By the way, in the benchmark, if only one machine emulates and sends thousands of concurrent web sessions/requests, can HAproxy still distribute the requests? (All the web requests would have the same source IP address, though.) It seems to me that HAproxy distributes requests based on incoming IP addresses and port numbers.

Chih-fan Hsin

-----Original Message-----
From: Hsin, Chih-fan
Sent: Monday, January 05, 2009 3:34 PM
To: 'Willy Tarreau'
Cc: [email protected]
Subject: RE: Using HAproxy in web server benchmark configuration (e.g. SPECweb)?

I saved the file as a jpg graph file. It is just the arch graph of the SPECweb setup.

Chih-fan Hsin

-----Original Message-----
From: Willy Tarreau [mailto:[email protected]]
Sent: Monday, January 05, 2009 3:20 PM
To: Hsin, Chih-fan
Cc: [email protected]
Subject: Re: Using HAproxy in web server benchmark configuration (e.g. SPECweb)?

On Mon, Jan 05, 2009 at 02:37:18PM -0800, Hsin, Chih-fan wrote:
> I am not trying to benchmark anything. I want to study the performance and
> some network features in a cluster of servers. I need something to
> generate the workloads. When I present my findings, the workloads need to be
> widely accepted as reasonable workloads. This is why I want to use a web server
> benchmark to generate the workloads.

OK I see now, it completely makes sense. However, you may have noticed that SPECweb2005 is completely CPU-bound while you're clearly more interested in something network-bound. You may be interested in using small fast http servers like nginx or lighttpd to deliver static contents from RAM. They're both capable of saturating a gig interface on a small-sized server.

I can also send you "httpterm", which is a server returning the amount of data you ask for in the request. This is very convenient for network testing. It can also wait some time before responding. It scales quite correctly; it's in fact a modified version of haproxy 1.2. It is not as efficient as 1.3 but still quite good.

For the client, you may use "ab" from the apache package ("apache bench"), which is in part a derivative of the ancient "zb" ("Zeus bench"). It has some downsides, but is widely known and accepted as a web performance tester for simple contents such as static files.

Using haproxy to distribute the load also makes sense with some system tuning.
Among the useful things you'll be able to do are:

  - ability to bind to a specific source IP for a given server, which allows you to use an arbitrary number of interfaces
  - L7 sticky/switching/hashing, which will help you reproduce a more real-life looking architecture where a given client sticks to the same server and where you can address different server pools depending on the request
  - rich logs with many timers (useful to monitor network activity)
  - connection regulation in order to limit the number of per-server connections to a reasonable level (e.g. for apache)

Also, the current development version supports the linux 2.6 splice() syscall to move data without copying. Unfortunately, implementing this has uncovered some deep corruption bug in the current kernel implementation, which needs to be fixed before the feature is really usable.

Last, whether you use haproxy or anything else for the load balancing, please use a proxy-based LB. You'll not regret it. It offers some nice features such as the ability to have different TCP configurations on the client side and the server side. This is useful to experiment with small packets on either side, for instance, using a small MTU. Also, having the ability to monitor TCP statistics from one central place is awesome (e.g. drops, retransmits, etc.).

> I attached an architecture graph based on my understanding.

I'm sorry, I have nothing here to read a PPT, and don't have enough space right now to install openoffice. Could you please resend it as a more portable PDF?

Regards,
Willy
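
For illustration only, here is a minimal haproxy configuration sketch of the features listed above (per-server source binding, cookie-based stickiness, HTTP logs with timers, and per-server connection limits). It is not taken from the thread: every address, name, and limit is a made-up placeholder, and the exact keywords may differ between haproxy versions. With "balance roundrobin", requests are spread across servers in turn rather than by client address, so a single load-generating machine should still reach all servers; "balance source" would instead hash the client IP and send one client to one server.

```
# Hypothetical sketch: all IPs, names, and limits below are placeholders.
global
    log 127.0.0.1 local0          # send logs to a local syslog daemon
    maxconn 20000                 # global connection ceiling (assumed value)

defaults
    mode http
    log global
    option httplog                # detailed HTTP logs, including timers
    timeout connect 5s
    timeout client  30s
    timeout server  30s

listen web_bench
    bind 192.168.0.1:80           # client-facing address (placeholder)
    balance roundrobin            # not based on the client source IP
    cookie SERVERID insert indirect   # stick a client to its first server
    # "source" selects the local IP (and thus interface) used per server;
    # "maxconn" caps concurrent connections sent to each server.
    server web1 10.0.1.11:80 source 10.0.1.1 cookie w1 maxconn 200 check
    server web2 10.0.2.11:80 source 10.0.2.1 cookie w2 maxconn 200 check
```

Note that a benchmark client which does not return cookies will not stick to a server, so with such a client the cookie line has no effect on distribution.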

