Re[2]: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-)

2010-01-13 Thread Ross West

 That's more of an issue with the site than a (proxy based) load
 balancer - the LB would be doing the exact same thing as the client.

WT Precisely not and that's the problem. The proxy cannot ask the user
WT if he wants to retry on sensitive requests, and it cannot precisely
WT know what is at risk and what is not. The client knows a lot more about
WT that. For instance, I think that the client will not necessarily repost
WT a GET form without asking the user, but it might automatically repost a
WT request for an image.

I can see a small confusion here because I've used the wrong
terminology. Proxy is not the correct term, as there are actual proxy
devices out there (eg: Squid) which are generally visible to the
client/server and shouldn't be intentionally resending requests upon
failure.

What I mean is that the loadbalancer would keep a copy
(silently) of the client's request until a server gave a valid
response. So should the connection to server A drop unexpectedly
after the request, the loadbalancer would assume something went wrong
with that server, and then resend the request to server B.
Throughout this, the end client would have only sent one request to the
loadbalancer (as it sees the LB as the end server).

Obviously this also allowed the loadbalancer to manipulate the headers
and route requests as required.
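
In very rough code terms (a naive Python sketch for illustration only,
nothing like the vendor's actual implementation, and it glosses over
checking that a response is actually complete):

  import socket

  def forward_with_replay(raw_request, backends):
      """Keep a private copy of the client's request and silently replay it
      to the next backend if the current one drops the connection before
      giving an answer.  The client only ever sees one request/response."""
      last_error = None
      for host, port in backends:              # e.g. [("server-a", 80), ("server-b", 80)]
          try:
              with socket.create_connection((host, port), timeout=5) as conn:
                  conn.sendall(raw_request)    # the buffered copy, byte for byte
                  chunks = []
                  while True:
                      data = conn.recv(4096)
                      if not data:             # peer closed the connection
                          break
                      chunks.append(data)
              if chunks:                       # got an answer: hand it back to the client
                  return b"".join(chunks)
          except OSError as exc:               # connect failure or reset: try the next server
              last_error = exc
      raise RuntimeError("no server gave a valid response: %s" % last_error)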

 WT So probably that a reasonable balance can be found but it is
 WT clear that from time to time a user will get an error.
 
 That sounds like the mantra of the internet in general.  :-)

WT I don't 100% agree.

Sorry, I meant outside the context of this conversation - there have
been many, many times that your statement has come up in other
conversations about internet connectivity in general and admins' views
on it (usually ending with "it's good enough" - especially with
DPI-manipulated telco connectivity).

WT Oh I precisely see how it works, it has several names from vendor to vendor,
WT often they call it connection pooling. Doing a naive implementation is not
WT hard at all if you don't want to care about errors. The problems start when
WT you want to add the expected reliability in the process...

I will mention that the vendor's software we used has since been
completely re-written from the ground up, probably to cover some of
those issues and to get much better performance at higher speeds.

WT In practice, instead of doing an expensive copy, I think that 1) configuring
WT a maximum number of times a connection can be used, 2) configuring the maximum
WT duration of a connection and 3) configuring a small idle timeout on a connection
WT can prevent most of the issues. Then we could also tag some requests at risk
WT and other ones riskless and have an option for always renewing a connection
WT on risked requests. In practice on a web site, most of the requests are images
WT and a few ones are transactions. You can already lower the load by keeping 95%
WT of the requests on keep-alive connections.

That does sound very logical.
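
Just to make sure I follow, here's how I'd sketch that policy
(hypothetical names and numbers, not haproxy's actual internals):

  import time

  # Cap how many times and for how long a server-side connection may be
  # reused, and never reuse one for a "risked" request.
  MAX_USES = 100          # 1) maximum number of requests per connection
  MAX_AGE = 30.0          # 2) maximum lifetime of a connection, in seconds
  IDLE_TIMEOUT = 1.0      # 3) small idle timeout, in seconds
  RISKLESS_METHODS = {"GET", "HEAD", "OPTIONS"}

  def may_reuse(uses, created_at, last_used_at, method, now=None):
      now = time.monotonic() if now is None else now
      if method not in RISKLESS_METHODS:      # risked request: always renew the connection
          return False
      if uses >= MAX_USES:
          return False
      if now - created_at > MAX_AGE:
          return False
      if now - last_used_at > IDLE_TIMEOUT:   # the server may already have dropped it
          return False
      return True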

WT I believe you that it worked fine. But my concern is not to confirm
WT after some tests that finally it works fine, but rather to design it
WT so that it works fine. Unfortunately HTTP doesn't permit it, so there
WT are tradeoffs to make, and that causes me a real problem you see.

Yes, the more I re-read the rfc, the more I feel your pain when they
specify SHOULD/MAY rather than MUST/MUST NOT, allowing those
corner cases to occur in the first place.

WT Indeed. Not to mention that applications today use more and more resources
WT because they're written by stacking up piles of crap and sometimes the
WT network has absolutely no impact at all due to the amount of crap being
WT executed during a request.

I don't want to get started on the [non-]quality of the asp programmer's
code on that project.  I still have nightmares.



Cheers,
  Ross.

-- 




Re[2]: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-)

2010-01-12 Thread Ross West

I'll join this conversation, as I've (successfully) used a load
balancer which did server-side keep-alive a while ago.

WT Hmmm that's different. There are issues with the HTTP protocol
WT itself making this extremely difficult. When you're keeping a
WT connection alive in order to send a second request, you never
WT know if the server will suddenly close or not. If it does, then
WT the client must retransmit the request because only the client
WT knows if it takes a risk to resend or not. An intermediate
WT equipment is not allowed to do so because it might send two
WT orders for one request.

This might be an architecture based issue and probably depends on the
amount of caching/proxying of the request that the load balancer does
(ie: holds the full request until server side completes successfully).

WT So by doing what you describe, your clients would regularly get some
WT random server errors when a server closes a connection it does not
WT want to sustain anymore before haproxy has a chance to detect it.

We never had any complaints of random server errors that could be
attributed to connection issues, but that's probably explained by
the architecture described above.

WT Another issue is that there are (still) some buggy applications which
WT believe that all the requests from a same session were initiated by
WT the same client. So such a feature must be used with extreme care.

We found the biggest culprit to be Microsoft's NTLM authentication
system. It actually breaks the http spec by authenticating the tcp
session, not the individual http requests (except the first one in the
tcp session). Last time I looked into it, the squid people had made
some progress on it, but hadn't gotten it to proxy successfully.
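
To illustrate why that breaks pooling (a hypothetical sketch - the pool
API here is made up, it's not squid's or the vendor's code): because
NTLM authenticates the tcp connection itself, a pooling proxy has to
pin each NTLM client to its own dedicated server connection instead of
multiplexing requests freely.

  pinned = {}   # client connection id -> dedicated server connection

  def pick_server_connection(client_id, pool, ntlm_in_progress):
      if ntlm_in_progress:
          # reusing another client's connection would carry *their* credentials
          if client_id not in pinned:
              pinned[client_id] = pool.open_new()     # hypothetical pool API
          return pinned[client_id]
      return pool.get_idle_or_open()                  # plain http: any free one will do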

WT Last, I'd say there is in my opinion little benefit to do that. Where
WT the most time is elapsed is between the client and haproxy. Haproxy
WT and the server are on the same LAN, so a connection setup/teardown
WT here is extremely cheap, as it's where we manage to run at more than
WT 40000 connections per second (including connection setup, send request,
WT receive response and close). That means only 25 microseconds for the
WT whole process which isn't measurable at all by the client and is
WT extremely cheap for the server.

When we placed the load balancer in front of our IIS based cluster, we
got around an 80-100% (!!) performance improvement immediately.  Based
on our experience with Microsoft's tcp stack, we had been estimating
only around a 25% increase.

Running against a unix based stack (Solaris & BSD) got us a much more
realistic 5-10% improvement.

nb: Improvement mainly being defined as a reduction in server side
processing/load.  Actual request speed was about the same.

Obviously over the years OS vendors have greatly improved their systems'
stacks, as has the integration of network stacks with the hardware
(chipsets) they run on, but server side keep-alives did work quite well
for us in saving server resources.  I doubt that you'd get the same
kind of performance improvement today that we did back then.

Cheers,
  Ross.

-- 




Re[2]: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-)

2010-01-12 Thread Ross West

WT It's not only a matter of caching the request to replay it, it is that
WT you're simply not allowed to. I know a guy who ordered a book at a
WT large well-known site. His order was processed twice. Maybe there is
WT something on this site which grants itself the right to replay a user's
WT request when a server connection suddenly closes on keep-alive timeout
WT or count.

That's more of an issue with the site than a (proxy based) load
balancer - the LB would be doing the exact same thing as the client.

According to the rfc, if a connection is prematurely closed, the
client may (silently) retry the request. In our case the LB just
emulated that client behavior towards the servers.
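
(Sketched in Python just to make the rule explicit - the idempotent
method list is the rfc's, the helper is made up for illustration:)

  # A client may silently retransmit an aborted request sequence only if
  # that sequence is idempotent (RFC 2616, sections 8.1.4 and 9.1.2).
  IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}

  def may_silently_retry(method):
      """True if a client (or an LB emulating one) may retry without asking
      the user after the server closed the connection prematurely."""
      return method.upper() in IDEMPOTENT_METHODS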

Unfortunately for your friend, it could mean the code on the site
didn't do any duplicate order checking.  A corner case taken care of
by their support department I guess.

WT So probably that a reasonable balance can be found but it is
WT clear that from time to time a user will get an error.

That sounds like the mantra of the internet in general.  :-)

WT Maybe your LB was regularly sending dummy requests on the connections
WT to keep them alive, but since there is no NOP instruction in HTTP, you
WT have to send real work anyway.

Well, the site was busy enough that it didn't need to do the
equivalent of a NOP to keep connections open. :-) But the need for NOPs
can be mitigated by adjusting timeouts on stale connections.

My understanding was that the loadbalancer actually just used a pool
of open tcp sessions, and would send the next request (from any of
its clients) down the next open tcp connection that wasn't busy. If
none were free, a new connection was established, which would
eventually time out and close naturally. I don't believe it was
pipelining the requests.

This would mean that multiple requests from clients A, B, C may go
down tcp connections X, Y, Z in a 'random' order. (eg: tcp connection
X may have requests from A, B, A, A, C, B)

Sounds rather chaotic, but actually worked fine.
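
From memory it behaved roughly like this (a toy Python model only,
certainly not the vendor's code, and the response handling is
oversimplified):

  import time

  class ServerPool:
      """Any client's next request goes down the next idle server-side
      connection; if none is free a new one is opened, and idle ones
      eventually time out and get closed."""

      def __init__(self, open_connection, idle_timeout=5.0):
          self._open = open_connection    # e.g. lambda: socket.create_connection((host, 80))
          self._idle = []                 # list of (connection, last_used) pairs
          self._timeout = idle_timeout

      def send(self, raw_request):
          now = time.monotonic()
          fresh = []
          for conn, last_used in self._idle:
              if now - last_used < self._timeout:
                  fresh.append((conn, last_used))
              else:
                  conn.close()            # stale: the server has probably half-closed it
          self._idle = fresh

          conn = self._idle.pop()[0] if self._idle else self._open()
          conn.sendall(raw_request)
          response = conn.recv(65536)     # oversimplified: real code parses the http framing
          self._idle.append((conn, time.monotonic()))
          return response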

 Last time I looked into it, the squid people had made some progress on
 it, but hadn't gotten it to proxy successfully.

After checking, I stand corrected - it looks like Squid has a
working proxy helper application to make ntlm authentication work.

WT Was it really just an issue with the TCP stack ? maybe there was a firewall
WT loaded on the machine ? Maybe IIS was logging connections and not requests,
WT so that it almost stopped logging ?

There were additional security measures on the machines, so yes, I
should say the stack wasn't the whole issue, but once they were
disabled in testing we definitely still had better performance than
before.

WT It depends a lot on what the server does behind. File serving will not
WT change, it's generally I/O bound. However if the server was CPU-bound,
WT you might have won something, especially if there was a firewall on
WT the server.

CPU was our main issue - as this was quite a while ago, things have
since dramatically improved with better offload support in drivers and
on network cards, plus much profiling has been done by OS vendors in
their kernels with regard to network performance.  So I doubt people
would get the same level of performance increase these days that we
saw back then.

Cheers,
  Ross.




--