Re: load balancer and HA
On Wed, Mar 04, 2009 at 12:12:21AM +0100, Alexander Staubo wrote:
> On Tue, Mar 3, 2009 at 11:44 PM, Martin Karbon martin.kar...@asbz.it wrote:
> > Just wanted to know if anyone knows an open-source solution for a
> > so-called transparent failover. What I mean by that is: I installed two
> > machines with haproxy on them which communicate with each other via
> > heartbeat. If one fails, the other one goes from passive to active, but
> > all sessions are lost and users have to reconnect.
>
> We use Heartbeat (http://www.keepalived.org/) for this. Heartbeat lets us
> set up virtual service IPs which are reassigned to another box if the box
> goes down. Works like a charm. Current connections are lost, but new ones
> go to the new IP.
>
> Note that there are two current versions of Heartbeat. There's the old 1.x
> series, which is simple and stable, but which has certain limitations such
> as only supporting two nodes, if I remember correctly. Then there's 2.x,
> which is much more complex and less stable. We run 2.0.7 today, and we
> have had some situations where the Heartbeat processes have run wild. It's
> been running quietly for over a year now, so recent patches may have fixed
> the issues. I would still recommend sticking with 1.x if at all possible.

I still don't understand why people stick to Heartbeat for things as simple
as moving an IP address. Heartbeat is more of a clustering solution, with
abilities to perform complex tasks. When it comes to just moving an IP
address between two machines and doing nothing else, the VRRP protocol is
really better. It's what is implemented in keepalived. Simple, efficient and
very reliable. I've been told that ucarp is good at that too, though I've
never tried it.

> While there are solutions out there that preserve connections on failover,
> my gut feeling is that they introduce a level of complexity and
> computational overhead that necessarily puts a restraint on performance.

In fact it's useless to synchronise TCP sessions between load balancers for
fast-moving connections (e.g. HTTP traffic). Some people require it for long
sessions (terminal server, ...), but this cannot be achieved in a standard
OS: you need to synchronise every minor progress of the TCP stack with the
peer. That also prevents true randomness from being used at the TCP and IP
levels, and it causes trouble when some packets are lost between the peers,
because they can quickly get out of sync. In practice, in order to
synchronise TCP between two hosts, you need more bandwidth than that of the
traffic you want to forward.

There are intermediate solutions which synchronise at layer 4 only, without
taking into account the data or the sequence numbers. Those present the
advantage of being able to take over a connection without too much overhead,
but no layer 7 processing can be done there, and those cannot be system
sockets. That's typically what you find in some firewalls or layer 4 load
balancers which just forward packets between the two sides and maintain a
vague context.

Regards,
Willy
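For reference, moving a service IP with VRRP as described above typically needs only a few lines of keepalived configuration. A minimal sketch — the interface name, router ID, priorities, and address below are invented placeholders, not values from this thread:

```conf
vrrp_instance VI_1 {
    state MASTER            # the peer machine runs the same block as BACKUP
    interface eth0          # NIC carrying the service traffic (assumed name)
    virtual_router_id 51    # must match on both machines
    priority 100            # the backup uses a lower value, e.g. 90
    advert_int 1            # advertisement interval in seconds
    virtual_ipaddress {
        192.0.2.10/24       # the service IP that moves on failover
    }
}
```

When VRRP advertisements from the master stop arriving, the backup claims the virtual address and new connections land on it — exactly the "new ones go to the new IP" behaviour discussed above.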
RE: load balancer and HA
> I still don't understand why people stick to heartbeat for things as
> simple as moving an IP address. Heartbeat is more of a clustering
> solution, with abilities to perform complex tasks. When it comes to just
> moving an IP address between two machines and doing nothing else, the VRRP
> protocol is really better. It's what is implemented in keepalived. Simple,
> efficient and very reliable.

One reason: heartbeat is standard in many distributions (e.g. RHEL, CentOS)
and VRRP/keepalived are not. It might be overkill for just IP addresses, but
being supported in the base OS is a plus that shouldn't be discounted. If
you have to support heartbeat on other servers anyway, using heartbeat
everywhere you have to share resources is easier than using VRRP for some
and heartbeat for others.
Re: load balancer and HA
On Fri, Mar 6, 2009 at 7:48 PM, Willy Tarreau w...@1wt.eu wrote:
> When it comes to just moving an IP address between two machines and doing
> nothing else, the VRRP protocol is really better. It's what is implemented
> in keepalived. Simple, efficient and very reliable.

Actually, it seems that my information is out of date, and we (that is, the
IT management company that we outsource our system administration to) are in
fact using Keepalived these days. I was confused by the presence of ha_logd
on our boxes, which is part of the Heartbeat package; I don't know what that
one is doing there. So, yeah, you're right. Stick with Keepalived. :-)

> In fact it's useless to synchronise TCP sessions between load balancers
> for fast-moving connections (e.g. HTTP traffic). Some people require it
> for long sessions (terminal server, ...), but this cannot be achieved in a
> standard OS: you need to synchronise every minor progress of the TCP stack
> with the peer.

A less ambitious scheme would have the new proxy take over the client
connection and retry the request with the next available backend. This
depends on a couple of factors: for one, it only works if nothing has yet
been sent back to the client. Secondly, it assumes the request itself is
repeatable without side effects. The latter, of course, is
application-dependent; but following the REST principle, in a well-designed
app GET requests are supposed to have no side effects, so they can be
retried, whereas POST, PUT etc. cannot. Still expensive and error-prone, of
course, but much more pragmatic and limited in scope.

Alexander.
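The retry scheme sketched above can be expressed in a few lines of Python. This is an illustrative sketch, not anything haproxy actually does; the helper names are hypothetical, and a connection error raised before any reply stands in for "nothing has yet been sent back to the client":

```python
# Methods this sketch treats as side-effect-free and therefore retryable.
RETRYABLE = {"GET", "HEAD"}

def try_backends(method, send_request, backends):
    """Attempt a request against each backend in turn.

    Non-retryable methods get exactly one attempt, since replaying them
    could repeat a side effect on the server.
    """
    if method.upper() not in RETRYABLE:
        backends = backends[:1]
    last_error = None
    for backend in backends:
        try:
            # Success means the backend produced a reply we can relay.
            return send_request(backend)
        except ConnectionError as exc:
            # Backend died before replying; nothing reached the client,
            # so moving on to the next backend is safe for RETRYABLE methods.
            last_error = exc
    raise last_error
```

The limiting condition Alexander raises — that the first backend may have already streamed part of a response — is exactly why the except branch may only fire before any bytes go out.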
Re: load balancer and HA
On Fri, Mar 06, 2009 at 11:47:14PM +0100, Alexander Staubo wrote:
> Actually, it seems that my information is out of date, and we (that is,
> the IT management company that we outsource our system administration to)
> are in fact using Keepalived these days. I was confused by the presence of
> ha_logd on our boxes, which is part of the Heartbeat package. So, yeah,
> you're right. Stick with Keepalived. :-)

Ah nice! The author will be pleased to read this, he's subscribed to the
list :-)

> A less ambitious scheme would have the new proxy take over the client
> connection and retry the request with the next available backend.

That will not work, because the connection from the client to the proxy will
have been broken during the take-over. The second proxy cannot inherit the
primary one's sockets.

> This depends on a couple of factors: for one, it only works if nothing has
> yet been sent back to the client. Secondly, it assumes the request itself
> is repeatable without side effects. The latter, of course, is
> application-dependent; but following the REST principle, in a well-designed
> app GET requests are supposed to have no side effects, so they can be
> retried, whereas POST, PUT etc. cannot. Still expensive and error-prone, of
> course, but much more pragmatic and limited in scope.

What you're talking about are idempotent HTTP requests, which are quite well
documented in RFC 2616. Those are important to consider because idempotent
requests are the only ones a proxy may retry upon a connection error when
sending a request on a keep-alive session. IIRC, HEAD, PUT, GET and DELETE
are supposed to be idempotent methods. But we all know that GET is often not
so idempotent in practice when used with CGIs.

Willy
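Willy's list matches RFC 2616, which distinguishes safe methods (no side effects expected) from the wider set of idempotent ones (repeatable with the same result). A small Python sketch of that classification, the kind of gate any retry logic would sit behind:

```python
# Per RFC 2616 sections 9.1.1 and 9.1.2:
# safe methods should have no side effects at all;
# idempotent methods may have effects, but repeating them changes nothing.
SAFE = {"GET", "HEAD"}
IDEMPOTENT = SAFE | {"PUT", "DELETE", "OPTIONS", "TRACE"}

def may_retry(method: str) -> bool:
    """True if a proxy may replay this request on a fresh connection
    without risking a duplicated side effect on the server."""
    return method.upper() in IDEMPOTENT
```

As Willy notes, this is the spec's promise, not a guarantee: a GET that drives a CGI script with side effects breaks the contract, and no proxy-side check can detect that.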
Re: load balancer and HA
On Sat, Mar 7, 2009 at 12:07 AM, Willy Tarreau w...@1wt.eu wrote:
> > A less ambitious scheme would have the new proxy take over the client
> > connection and retry the request with the next available backend.
>
> That will not work, because the connection from the client to the proxy
> will have been broken during the take-over. The second proxy cannot
> inherit the primary one's sockets.

Unless you have some kind of shared-memory L4 magic like the original poster
talked about, that allows taking over an existing TCP connection.

> What you're talking about are idempotent HTTP requests, which are quite
> well documented in RFC 2616.

That was the exact word I was looking for. I didn't know that PUT was
idempotent, but the others make sense.

Alexander.
Re: load balancer and HA
On Sat, Mar 07, 2009 at 12:14:44AM +0100, Alexander Staubo wrote:
> Unless you have some kind of shared-memory L4 magic like the original
> poster talked about, that allows taking over an existing TCP connection.

In this case, of course, I agree. But that means kernel-level changes.

> That was the exact word I was looking for. I didn't know that PUT was
> idempotent, but the others make sense.

In fact it also makes sense for PUT, because you're supposed to use this
method to send a file. Normally, you can send it as many times as you want
and the result will not change.

Willy