Simon Horman wrote:
> On Mon, May 04, 2009 at 10:31:59AM +0200, Christian Frost wrote:
>
>> Hi,
>>
>> We have a setup including two real servers, each of which runs an
>> instance of MySQL with the max_connections option set to 1000. In this
>> setup we have run some performance tests with mysqlslap to determine
>> the throughput of the setup. These tests involve simulating many
>> simultaneous users querying the database. Under these conditions we
>> have encountered some problems with the load balancer. Specifically,
>> when using ipvsadm -L -n to monitor the connections during the
>> performance test, many connections are initially represented as
>> inactive. After a few seconds the inactive connections are represented
>> as active on the respective real server. This causes a problem when
>> the Least-Connection scheduling algorithm is used, because the
>> connections are not distributed equally between the two real hosts.
>> The two real hosts are almost equal in terms of processing capacity.
>>
>> The following output of ipvsadm -L -n probably explains the problem
>> better.
>>
>> ipvsadm -L -n a few seconds into the test simulating 200 MySQL
>> clients connecting simultaneously:
>>
>> IP Virtual Server version 1.2.1 (size=4096)
>> Prot LocalAddress:Port Scheduler Flags
>>   -> RemoteAddress:Port   Forward Weight ActiveConn InActConn
>> TCP  10.0.1.5:3306 lc
>>   -> 10.0.1.2:3306        Route   1      71         0
>>   -> 10.0.1.4:3306        Route   1      70         60
>>
>> ipvsadm -L -n after 30 seconds of the test simulating 200 MySQL
>> clients connecting simultaneously. Note that the load balancer uses
>> the Least-Connection scheduling algorithm.
>>
>> IP Virtual Server version 1.2.1 (size=4096)
>> Prot LocalAddress:Port Scheduler Flags
>>   -> RemoteAddress:Port   Forward Weight ActiveConn InActConn
>> TCP  10.0.1.5:3306 lc
>>   -> 10.0.1.2:3306        Route   1      71         0
>>   -> 10.0.1.4:3306        Route   1      130        0
>>
>> The problem does not occur if the connections are made sequentially
>> and the total number of connections is below about 100.
>>
>> Is there anything we can do to avoid these problems?
>
> Hi Christian,
>
> I'm taking a bit of a stab in the dark, but I think that the problem
> you are seeing is the interaction of the lc (and wlc) algorithms with
> bursts of connections.
>
> I think that the core of the problem is the way that lc calculates the
> overhead of a server. This is relevant because an incoming connection
> is allocated to whichever real server is deemed to have the lowest
> overhead at that time.
>
> In net/netfilter/ipvs/ip_vs_lc.c:ip_vs_lc_dest_overhead(), overhead
> is calculated as:
>
>     active_connections * 256 + inactive_connections
>
> So suppose that things are in a more or less balanced state:
> real-server A has 71 connections and real-server B has 70.
>
> Then a big burst of 60 new connections comes in. The first of these
> new connections will go to real-server B, as expected. This connection
> will be in the inactive state until the 3-way handshake is complete.
> So far so good.
>
> Unfortunately, if the other 59 new connections come in before any of
> the new connections complete the handshake and move into the active
> state, they will all be allocated to real-server B, because:
>
>     71 * 256 + 0 > 70 * 256 + n, where n < 256
>
> Assuming that I am correct, I can think of two methods of addressing
> this problem:
>
> 1) Simply change 256 to a smaller value. In this case 256 basically
>    ends up being the granularity of balancing for bursts of
>    connections, and in the case at hand 256 is clearly too coarse.
>    Perhaps 8, 2 or even 1 would be a better value.
>
>    This should be a trivial change to the code, and if lc is built as
>    a module you wouldn't even need to recompile the entire kernel,
>    though you would need to track down the original kernel source and
>    config.
>
>    The main drawback is that if you have a lot of old, actually dead,
>    connections in the inactive state, then it might cause imbalance.
>
>    If that does help, it might be good to consider making this
>    parameter configurable at run time, at least globally.
>
> 2) A more complex though arguably better approach would be to
>    implement some kind of slow-start feature, that is, to assign some
>    kind of weight to new connections. I had a stab at this in the past
>    (it should be in the archives), though I think my solution only
>    addressed the problem for active connections. The idea seems
>    reasonable to extend to this problem.

Hi,
We tried method 1, which turned out to balance the connections
perfectly. We changed the multiplier from 256 to 1.

Thank you.

/Christian
