My company runs a large number of secure web applications on an LVS cluster. There are about 200 real IPs (for 200 different domains / SSL certificates) and 5 realservers in the mix. Since we serve both http and https (ports 80 and 443), this works out to 2,000 realserver entries that ldirectord has to work through. Obviously this takes some time: we have seen it take up to 15 minutes to expire a downed node, or to reinstate a realserver once we bring it back up, depending on how far along the list ldirectord is at that moment. Using the forking option is not possible, since spawning that many processes simultaneously brings the load balancer to its knees.
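For reference, the arithmetic behind those figures can be sketched roughly as below. This is only an illustration: it assumes ldirectord performs the checks strictly one after another and that traffic is spread evenly across the realservers, and the constant and function names are made up for the example.

```python
# Back-of-the-envelope numbers taken from the figures in this post.
# Assumption: checks run sequentially, so one full pass over the list
# costs the sum of the individual check times.

SERVICE_IPS = 200              # one per domain / SSL certificate
REALSERVERS = 5
PORTS = 2                      # 80 and 443
WORST_CASE_PASS_S = 15 * 60    # observed worst case: 15 minutes


def total_checks(ips: int, servers: int, ports: int) -> int:
    """Number of realserver entries ldirectord has to test per pass."""
    return ips * servers * ports


def avg_seconds_per_check(pass_seconds: float, checks: int) -> float:
    """Average time spent per entry if a full pass takes pass_seconds."""
    return pass_seconds / checks


def affected_fraction(servers: int) -> float:
    """Share of clients hitting one dead realserver, assuming even balancing."""
    return 1 / servers


checks = total_checks(SERVICE_IPS, REALSERVERS, PORTS)
print(checks)                                            # 2000
print(avg_seconds_per_check(WORST_CASE_PASS_S, checks))  # 0.45
print(affected_fraction(REALSERVERS))                    # 0.2
```

In other words, a 15-minute pass corresponds to a little under half a second per entry on average, so under these assumptions the delay comes from the sheer number of entries rather than from any single slow check.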
The top of our ldirectord.cf looks like this:

    autoreload=no
    logfile="local0"
    quiescent=no
    checktimeout = 2
    negotiatetimeout = 2
    checkinterval = 10
    checkcount = 2

Does anyone have any suggestions on how we could improve the very poor response time to expiring downed servers? Throughput performance is very good. However, potentially having 20% of our clients wait up to 15 minutes in the event of a realserver failure is not something management wants to accept.

Thank you very much,
-Anthony Sturchio
