Re[2]: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-)
That's more of an issue with the site than a (proxy based) load balancer - the LB would be doing the exact same thing as the client.

WT Precisely not, and that's the problem. The proxy cannot ask the user
WT if he wants to retry on sensitive requests, and it cannot precisely
WT know what is at risk and what is not. The client knows a lot more about
WT that. For instance, I think that the client will not necessarily repost
WT a GET form without asking the user, but it might automatically repost a
WT request for an image.

I can see a small confusion here because I've used the wrong terminology. Proxy is not the correct term, as there are actual proxy devices out there (eg: Squid) which are generally visible to the client/server and shouldn't intentionally resend requests upon failure. What I mean is that the load balancer would silently keep a copy of the client's request until a server gave a valid response. So should the connection with server A drop unexpectedly after the request, the load balancer would assume something went wrong with that server and resend the request to server B. Throughout this, the end client would have sent only one request to the load balancer (as it sees the LB as the end server). Obviously this also allowed the load balancer to manipulate the headers and route requests as required.

WT So probably a reasonable balance can be found, but it is
WT clear that from time to time a user will get an error.

That sounds like the mantra of the internet in general. :-)

WT I don't 100% agree.

Sorry, I meant it outside the context of this conversation - that statement has come up many times in other discussions about internet connectivity in general and admins' views on it (usually ending with "it's good enough", especially with DPI-manipulated telco connectivity).

WT Oh I see precisely how it works; it has several names from vendor to vendor,
WT often they call it connection pooling.
WT Doing a naive implementation is not
WT hard at all if you don't want to care about errors. The problems start when
WT you want to add the expected reliability in the process...

I will mention that the vendor's software we used has since been completely re-written from the ground up, probably to cover some of those issues and get much better performance at higher speeds.

WT In practice, instead of doing an expensive copy, I think that 1) configuring
WT a maximum number of times a connection can be used, 2) configuring the maximum
WT duration of a connection, and 3) configuring a small idle timeout on a connection
WT can prevent most of the issues. Then we could also tag some requests at risk
WT and other ones riskless, and have an option for always renewing a connection
WT on risky requests. In practice on a web site, most of the requests are images
WT and a few are transactions. You can already lower the load by keeping 95%
WT of the requests on keep-alive connections.

That does sound very logical.

WT I believe you that it worked fine. But my concern is not to confirm
WT after some tests that it finally works fine, but rather to design it
WT so that it works fine. Unfortunately HTTP doesn't permit it, so there
WT are tradeoffs to make, and that causes me a real problem, you see.

Yes, the more I re-read the RFC, the more I feel your pain when it specifies SHOULD/MAY rather than MUST/MUST NOT, allowing those corner cases to occur in the first place.

WT Indeed. Not to mention that applications today use more and more resources
WT because they're written by stacking up piles of crap, and sometimes the
WT network has absolutely no impact at all due to the amount of crap being
WT executed during a request.

I don't want to get started on the [non-]quality of the ASP programmers' code on that project. I still have nightmares.

Cheers,
Ross.
--
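WT's three suggested limits, plus the "at risk"/"riskless" tagging, can be sketched as a simple reuse-eligibility check. This is a minimal illustration only: the names, the numbers, and the choice of treating non-idempotent methods as "at risk" are my assumptions, not haproxy configuration.

```python
# Illustrative limits (assumed values, not haproxy directives):
MAX_REQUESTS_PER_CONN = 100   # 1) max number of times a connection is used
MAX_CONN_AGE = 300.0          # 2) max duration of a connection, in seconds
IDLE_TIMEOUT = 5.0            # 3) small idle timeout, in seconds

# One way to tag requests "riskless": safe, idempotent fetches (e.g. images).
RISKLESS_METHODS = {"GET", "HEAD", "OPTIONS"}

def may_reuse(conn, method, now):
    """Return True if an idle server connection may carry this request."""
    if conn["requests"] >= MAX_REQUESTS_PER_CONN:
        return False                      # connection used too many times
    if now - conn["opened"] > MAX_CONN_AGE:
        return False                      # connection too old
    if now - conn["last_used"] > IDLE_TIMEOUT:
        return False                      # idle too long, server may close it
    # Always renew the connection on risky requests, so a mid-stream
    # close can never force an intermediary to replay them.
    return method in RISKLESS_METHODS

conn = {"requests": 3, "opened": 100.0, "last_used": 118.0}
may_reuse(conn, "GET", 120.0)   # True: within all three limits, riskless
may_reuse(conn, "POST", 120.0)  # False: risky request gets a fresh connection
```

Under this scheme the bulk of the traffic (images) stays on keep-alive connections, matching WT's "95%" point, while transactions always see a fresh connection.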
Re[2]: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-)
I'll enter this conversation as I've (successfully) used a load balancer which did server-side keep-alive a while ago.

WT Hmmm, that's different. There are issues with the HTTP protocol
WT itself making this extremely difficult. When you're keeping a
WT connection alive in order to send a second request, you never
WT know if the server will suddenly close or not. If it does, then
WT the client must retransmit the request, because only the client
WT knows if it takes a risk to resend or not. An intermediate
WT equipment is not allowed to do so because it might send two
WT orders for one request.

This might be an architecture-based issue, and probably depends on the amount of caching/proxying of the request that the load balancer does (ie: it holds the full request until the server side completes successfully).

WT So by doing what you describe, your clients would regularly get some
WT random server errors when a server closes a connection it does not
WT want to sustain anymore before haproxy has a chance to detect it.

We never had any complaints of random server issues that could be attributed to connection issues. But that's probably down to the architectural point above.

WT Another issue is that there are (still) some buggy applications which
WT believe that all the requests from the same session were initiated by
WT the same client. So such a feature must be used with extreme care.

We found the biggest culprit to be Microsoft's NTLM authentication system. It actually breaks the HTTP spec by authenticating the TCP session, not the individual HTTP requests (except the first one in the TCP session). Last time I looked into it, the Squid people had made some progress, but hadn't gotten it to proxy successfully.

WT Last, I'd say there is, in my opinion, little benefit in doing that. Where
WT the most time is spent is between the client and haproxy.
WT Haproxy
WT and the server are on the same LAN, so a connection setup/teardown
WT here is extremely cheap; it's where we manage to run at more than
WT 40000 connections per second (including connection setup, send request,
WT receive response and close). That means only 25 microseconds for the
WT whole process, which isn't measurable at all by the client and is
WT extremely cheap for the server.

When we placed the load balancer in front of our IIS-based cluster, we got around an 80-100% (!!) performance improvement immediately. We were estimating only around a 25% increase, based on our experience with Microsoft's TCP stack. Running against a Unix-based stack (Solaris/BSD) got us a much more realistic 5-10% improvement.

nb: Improvement mainly being defined as a reduction in server-side processing/load. Actual request speed was about the same.

Obviously over the years OS vendors have improved their systems' stacks greatly, but server-side keep-alives did work quite well for us in saving server resources, as has the better integration of network stacks with the hardware (chipsets) they run on. I doubt that you'd get the same kind of performance improvements these days that we did.

Cheers,
Ross.
--
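The NTLM problem mentioned above is easy to see in miniature: because NTLM authenticates the TCP connection rather than each request, a server-side pool that shares connections between clients can serve one client under another's identity. A toy sketch, with the multi-leg handshake elided and all names hypothetical:

```python
class NtlmServerConnection:
    """Toy model of a server that authenticates the TCP session."""

    def __init__(self):
        self.authenticated_as = None

    def handle(self, client, request):
        # Only the first request on this connection is authenticated
        # (the NTLM handshake itself is elided here); every later
        # request on the same connection inherits that identity.
        if self.authenticated_as is None:
            self.authenticated_as = client
        return self.authenticated_as

conn = NtlmServerConnection()
conn.handle("alice", "GET /a")      # alice authenticates this connection
who = conn.handle("bob", "GET /b")  # bob's request, pooled onto the same one
# who == "alice": bob is served under alice's identity
```

This is exactly why a shared server-side pool "must be used with extreme care" with applications that tie state to the connection rather than the request.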
Re[2]: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-)
WT It's not only a matter of caching the request to replay it, it is that
WT you're simply not allowed to. I know a guy who ordered a book at a
WT large well-known site. His order was processed twice. Maybe there is
WT something on this site which grants itself the right to replay a user's
WT request when a server connection suddenly closes on keep-alive timeout
WT or count.

That's more of an issue with the site than a (proxy based) load balancer - the LB would be doing the exact same thing as the client. According to the RFC, if a connection is closed prematurely, the client would (silently) retry the request. In our case the LB just emulated the client's behaviour towards the servers. Unfortunately for your friend, it probably means the code on the site didn't do any duplicate-order checking. A corner case taken care of by their support department, I guess.

WT So probably a reasonable balance can be found, but it is
WT clear that from time to time a user will get an error.

That sounds like the mantra of the internet in general. :-)

WT Maybe your LB was regularly sending dummy requests on the connections
WT to keep them alive, but since there is no NOP instruction in HTTP, you
WT have to send real work anyway.

Well, the site was busy enough that it didn't need the equivalent of a NOP to keep connections open. :-) But the need for NOPs can also be mitigated by adjusting timeouts on stale connections. My understanding was that the load balancer actually just used a pool of open TCP sessions, and would send the next request (from any of its clients) down the next open TCP connection that wasn't busy. If none were free, a new connection was established, which would eventually time out and close naturally. I don't believe it was pipelining the requests. This would mean that multiple requests from clients A, B, C may go down TCP connections X, Y, Z in a 'random' order.
(eg: TCP connection X may have requests from A, B, A, A, C, B). Sounds rather chaotic, but it actually worked fine.

Last time I looked into it, the Squid people had made some progress, but hadn't gotten it to proxy successfully.

After checking, I stand corrected - it looks like Squid has a working proxy helper application to make NTLM authentication work.

WT Was it really just an issue with the TCP stack? Maybe there was a firewall
WT loaded on the machine? Maybe IIS was logging connections and not requests,
WT so that it almost stopped logging?

There were additional security measures on the machines, so yes, I should say the stack wasn't fully the issue, but once they were disabled in testing we definitely still had better performance than before.

WT It depends a lot on what the server does behind. File serving will not
WT change, it's generally I/O-bound. However if the server was CPU-bound,
WT you might have won something, especially if there was a firewall on
WT the server.

CPU was our main issue - as this was quite a while ago, things have since dramatically improved with better offload support in drivers and on network cards, plus much profiling done by OS vendors in their kernels with regard to network performance. So I doubt people would get the same level of performance increase these days that we saw back then.

Cheers,
Ross.
--
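The pool behaviour described above (any client's request goes down whichever server connection happens to be free, else a new one is opened) can be sketched as follows. This is a minimal illustration under those stated assumptions, not the vendor's implementation; all names are made up.

```python
class ServerPool:
    """Toy pool of server-side TCP connections shared by all clients."""

    def __init__(self):
        self.idle = []      # connections with no request in flight
        self.next_id = 0

    def checkout(self):
        # Reuse any free connection; otherwise open a new one.
        if self.idle:
            return self.idle.pop()
        conn = "tcp-%d" % self.next_id
        self.next_id += 1
        return conn

    def checkin(self, conn):
        # Response finished: the connection is free for the next
        # request, from whichever client it happens to come.
        self.idle.append(conn)

pool = ServerPool()
x = pool.checkout()   # client A's request opens tcp-0
y = pool.checkout()   # client B's overlapping request opens tcp-1
pool.checkin(x)       # A's response completes
z = pool.checkout()   # A's next request reuses tcp-0
```

Because check-in returns the connection to a shared free list, consecutive requests on one TCP connection can belong to different clients, giving exactly the A, B, A, A, C, B interleaving described above.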