http://www.howtoforge.com/high_availability_heartbeat_centos
Clustering is when two or more servers are linked together, for example with Network Load Balancing in Windows, to allow for faster response time and reliability. This way, if one server goes down, another can pick up the slack without any interruption in network performance - in theory anyway! :-)

In a computer system, a cluster is a group of servers and other resources that act like a single system and enable high availability and, in some cases, load balancing and parallel processing. Clustering is the use of multiple computers, typically PCs or UNIX workstations, multiple storage devices, and redundant interconnections, to form what appears to users as a single highly available system. Cluster computing can be used for load balancing as well as for high availability.

A common use of cluster computing is to load balance traffic on high-traffic Web sites. A Web page request is sent to a "manager" server, which then determines which of several identical or very similar Web servers to forward the request to for handling. Having a Web farm (as such a configuration is sometimes called) allows traffic to be handled more quickly. Clustering has been available since the 1980s, when it was used by DEC.

High Availability

In information technology, high availability refers to a system or component that is continuously operational for a desirably long length of time. Availability can be measured relative to "100% operational" or "never failing".

High-availability (HA) clusters

High-availability clusters (also known as failover clusters) are implemented primarily to improve the availability of the services that the cluster provides. They operate by having redundant nodes, which are then used to provide service when system components fail. The most common size for an HA cluster is two nodes, which is the minimum required to provide redundancy.
HA cluster implementations attempt to use redundancy of cluster components to eliminate single points of failure.

Load Balancing

Load balancing is dividing the amount of work that a computer has to do between two or more computers so that more work gets done in the same amount of time and, in general, all users get served faster. Load balancing can be implemented with hardware, software, or a combination of both. Typically, load balancing is the main reason for computer server clustering.

Load balancing is when multiple computers are linked together to share computational workload or function as a single virtual computer. Physically they are multiple machines, but from the user side they function as a single virtual machine. Requests initiated by the user are managed by, and distributed among, all the standalone computers that form the cluster.

Load balancing can also be considered as distributing items into buckets:
- data to memory locations
- files to disks
- tasks to processors
- packets to network interfaces
- requests to servers

Types of load balancing:
- Layer-2 Load Balancing
- Layer-4 Load Balancing
- Layer-7 Load Balancing
- MPLS Load Balancing
- DNS Load Balancing
- Link Load Balancing
- Database Load Balancing
- Computing Load Balancing

Layer-2 load balancing

Layer-2 load balancing, aka link aggregation, port aggregation, etherchannel, or gigabit etherchannel port bundling, bonds two or more links into a single, higher-bandwidth logical link.

Layer-4 load balancing

Layer-4 load balancing distributes requests to the servers at the transport layer, for protocols such as TCP, UDP and SCTP. The load balancer distributes network connections from clients, who know a single IP address for a service, to a set of servers that actually perform the work. Since in a connection-oriented transport the connection must be established between client and server before the request content is sent, the load balancer usually selects a server without looking at the content of the request.
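As a rough illustration of the layer-4 behaviour described above, the Python sketch below picks a server for each new connection using only the client's address and port, never the request payload. The class name, the round-robin policy and the example addresses are assumptions made for illustration; this is not how IPVS or any particular product is implemented.

```python
from itertools import cycle


class Layer4Balancer:
    """Toy layer-4 balancer: schedules per connection, blind to request content."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # simple round-robin order (assumed policy)
        self._connections = {}              # (client_ip, client_port) -> chosen server

    def route(self, client_ip, client_port):
        """Return the server for this connection, choosing one if it is new."""
        key = (client_ip, client_port)
        if key not in self._connections:
            # The decision is made at connection setup, before any payload exists.
            self._connections[key] = next(self._next_server)
        return self._connections[key]

    def close(self, client_ip, client_port):
        """Forget a finished connection."""
        self._connections.pop((client_ip, client_port), None)


# Hypothetical server and client addresses.
balancer = Layer4Balancer(["10.0.0.11", "10.0.0.12", "10.0.0.13"])
first = balancer.route("203.0.113.5", 40001)
second = balancer.route("203.0.113.6", 40002)
again = balancer.route("203.0.113.5", 40001)  # same connection, same server
```

Packets of an existing connection keep going to the server chosen at setup, which is why the table is keyed on the connection rather than on anything inside the request.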
IPVS is an implementation of layer-4 load balancing for the Linux kernel. Its IP load balancing technologies are LVS/NAT, LVS/TUN and LVS/DR.

Layer-7 load balancing

Layer-7 load balancing, also known as application-level load balancing, parses requests at the application layer and distributes them to servers based on the type of request content, so that it can meet quality of service requirements for different types of content and improve overall cluster performance.

MPLS load balancing

MPLS load balancing balances network services based on the Multiprotocol Label Switching (MPLS) label information.

DNS load balancing

DNS load balancing distributes requests to different servers by resolving the domain name to different server IP addresses. When a request to resolve the domain name reaches the DNS server, it gives out one of the server IP addresses based on a scheduling strategy, such as simple round-robin scheduling or geographical scheduling.

Link load balancing

Link load balancing balances traffic among multiple links, from different ISPs or from one ISP, for better scalability and availability of Internet connectivity, and also for cost saving.

http://www.loadbalancer.org/load_balancing_methods.php#nat
http://lcic.org/
http://lcic.org/documentation.html

Grid computing

Grids are usually computer clusters, but more focused on throughput, like a computing utility, rather than on running fewer, tightly-coupled jobs. Often, grids will incorporate heterogeneous collections of computers, possibly distributed geographically, sometimes administered by unrelated organizations. Grid computing (or the use of computational grids) is the combination of computer resources from multiple administrative domains applied to a common task, usually a scientific, technical or business problem that requires a great number of computer processing cycles or the processing of large amounts of data.
One of the main strategies of grid computing is using software to divide and apportion pieces of a program among several computers, sometimes up to many thousands.

IP traffic

Iproute2 is a collection of utilities for controlling TCP/IP networking and traffic control in Linux. It is currently maintained by Stephen Hemminger <[email protected]>. The original author, Alexey Kuznetsov, is well known for the QoS implementation in the Linux kernel.

http://freshmeat.net/articles/linux-clustering-software

Software for building and using clusters:
- High Performance Computing Software (Beowulf/Scyld, OSCAR, OpenMosix...)
- High Availability Software (Kimberlite, Heartbeat...)
- Load Balancing Software (Linux Virtual Server, Ultra Monkey...)

Software used on, and for using, clusters:
- File Systems (Intermezzo, ClusterNFS, DRBD...)

Beowulf

The Beowulf Project is also known these days as Scyld. Scyld contains an enhanced kernel and some tools and libraries that are used to present the cluster as a "Single System Image". A single system image means that processes running on slave nodes in the cluster are visible and manageable from the master node, giving the impression that the cluster is just a single system.

http://freshmeat.net/projects/beowulf/

openMosix Cluster for Linux

openMosix is a set of extensions to the standard Linux kernel allowing you to build a cluster out of off-the-shelf PC hardware. openMosix is claimed to scale up to thousands of nodes. You do not need to modify your applications to benefit from your cluster (unlike PVM, MPI, Linda, etc.). Processes in openMosix migrate transparently between nodes, and the cluster balances its load automatically.

HPC

There are other HPC clustering solutions that do not change the way the kernel functions. These use other means to run jobs and to show information about them. Cplant, the Ka Clustering Toolkit, and OSCAR all allow you to build, use, and manage your cluster in this manner.
Filesystem Used

The Global File System (GFS)

The Global File System (GFS) is a 64-bit shared disk cluster file system for Linux. GFS cluster nodes physically share the same storage by means of Fibre Channel or shared SCSI devices. The file system appears to be local on each node, and GFS synchronizes file access across the cluster. GFS is fully symmetric, meaning that all nodes are equal and there is no server which may be a bottleneck or single point of failure. GFS uses read and write caching while maintaining full UNIX file system semantics. GFS supports journaling, recovery from client failures, and many other features.

OpenAFS

AFS is a distributed filesystem which offers a client-server architecture, transparent data migration abilities, scalability, a single namespace, and integrated ancillary subsystems.

High Availability Software

Kimberlite specializes in shared data storage and maintaining data integrity.

Piranha (a.k.a. the Red Hat High Availability Server Project) can serve in one of two ways: it can be a two-node high availability failover solution or a multi-node load balancing solution.

HeartBeat

One of the better-known projects in this space is probably the High Availability Linux Project, also known as Linux-HA. The heart of Linux-HA is Heartbeat, which provides heartbeat, monitoring, and IP takeover functionality. It can run heartbeats over serial ports or UDP broadcast or multicast, and can re-allocate IP addresses and other resources to various members of the cluster when a node goes down, and restore them when the node comes back up.

Linux-HA Heartbeat is a full-function high-availability system for Linux and other POSIX-like OSes. It monitors services and restarts them on errors. When managing a cluster (more than 1 machine), it also monitors the members of the cluster and begins recovery of lost services in less than a second. It runs over serial ports and UDP broadcast/multicast, as well as OpenAIS multicast.
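The monitoring idea behind a heartbeat can be sketched in a few lines of Python: the time of each node's last heartbeat is recorded, and a node is treated as dead once no heartbeat has arrived within a timeout. The class, the node names and the 2-second timeout are hypothetical, and the sketch takes timestamps as arguments instead of reading serial ports or UDP sockets; it is not Heartbeat's actual implementation.

```python
class HeartbeatMonitor:
    """Toy failure detector: a node is dead if it stays silent longer than timeout."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence tolerated (assumed value below)
        self.last_seen = {}      # node name -> time of its most recent heartbeat

    def beat(self, node, now):
        """Record a heartbeat received from node at time now."""
        self.last_seen[node] = now

    def dead_nodes(self, now):
        """Return the nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=2.0)
monitor.beat("node1", now=0.0)
monitor.beat("node2", now=0.0)
monitor.beat("node1", now=1.5)        # node1 keeps sending, node2 goes silent
failed = monitor.dead_nodes(now=3.0)  # node2 has been silent for 3.0 s
```

In a real cluster, the surviving node would react to an entry in `failed` by taking over the dead node's IP address and services, which is the IP takeover step described above.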
Heartbeat is easily adapted to different interconnect media and protocols. When used in a cluster, it can operate using shared disks, data replication, or no data sharing.

Load Balancing Software

One of the best known projects in this area is the Linux Virtual Server Project. It uses load balancers to pass requests along to the servers, and can "virtualize" almost any TCP or UDP service, such as HTTP(S), DNS, ssh, POP, IMAP, SMTP. The following load balancing projects are based on LVS:
- Ultra Monkey incorporates LVS, a heartbeat, and service monitoring to provide highly available and load balanced services.
- Piranha has a load balancing mode, which it refers to in its documentation as LVS mode.
- Keepalived adds a strong and robust keepalive facility to LVS. It monitors the server pools, and when one of the servers goes down, it tells the kernel and has the server removed from the LVS topology.

Not based on LVS:
- The Zeus Load Balancer offers similar functionality. It combines content-aware traffic management, site health monitoring, and failover services in its Web site load balancing.
- Pen is a simple load balancer for TCP-based protocols like HTTP or SMTP.
- Turbolinux Cluster Server comes from the folks at Turbolinux. Its load balancing and monitoring software allows detection of and recovery from hardware and software failures (if recovery is possible).

LVS

An LVS is a group of servers with a director that appear to the outside world (a client on the internet) as one server. The LVS can offer more services, or services of higher capacity/throughput, or redundant services (where individual servers can be brought down for maintenance) than are available from a single server. A service is defined here as a connection to a single port, e.g. telnet, http, https, nntp, ntp, nfs, ssh, smtp, pop, databases.

In the computer bestiary, an LVS is a layer-4 switch. Standard client-server semantics are preserved.
Each client thinks that it has connected directly with the realserver. Each realserver thinks that it has connected directly to the client. Neither the client nor the realservers have any way of telling that a director has intervened in the connection.

An LVS is not a beowulf - a beowulf is a group of machines, each of which is cooperatively calculating a small part of a larger problem. It is not a cluster - a cluster of machines is a set of machines which cooperatively distribute processing. The realservers in an LVS do not cooperate - they have no knowledge of any other realservers in the LVS. All a realserver knows is that it gets connections from a client.

http://www.linuxtopia.org/online_books/linux_system_administration/redhat_cluster_configuration_and_management/s1-lvs-block-diagram.html

Software for LVS:
- Piranha
- Keepalived
- Ultra Monkey
- surealived
- Linux-HA heartbeat package
- Mon
- ipvsman
- Net-SNMP-LVS-Module
- LVSM
- lvs-kiss
- SCOP
- LVS webmin module
- iptoip
- lvs-snmp

http://www.linuxvirtualserver.org/software/index.html

Here's a typical LVS-NAT setup.

                        ________
                       |        |
                       | client | (local or on internet)
                       |________|
                           |
                        (router)
                       DIRECTOR_GW
                           |
--                         |
L                      Virtual IP
i                      ____|_____
n                     |          | (director can have 1 or 2 NICs)
u                     | director |
x                     |__________|
                          DIP
V                          |
i                          |
r         -----------------+----------------
t         |                |               |
u         |                |               |
a        RIP1             RIP2            RIP3
l   ____________     ____________     ____________
   |            |   |            |   |            |
S  | realserver |   | realserver |   | realserver |
e  |____________|   |____________|   |____________|
r
v
e
r

http://www.austintek.com/LVS/LVS-HOWTO/mini-HOWTO/LVS-mini-HOWTO.html#what

There are some tools to help you configure an LVS. Tools which include director failover:
- Ultra Monkey by Horms, which handles director failover but has to be set up by hand. In Apr 2005, Horms released UltraMonkey v3 (http://www.ultramonkey.org/download/3).
- keepalived by Alexandre Cassen, which sets everything up for you and is at the keepalived site. It handles director and realserver failure.
- php based web interface to lvs/ldirectord
- CLI-controlled LVS daemon
- lvs-kiss, which uses ipvsadm for load balancing and fail-over

UltraMonkey

Ultra Monkey is a project to create load balanced and highly available network services, for example a cluster of web servers that appears as a single web server to end-users. The service may be for end-users across the world connected via the internet, or for enterprise users connected via an intranet. Ultra Monkey makes use of the Linux operating system to provide a flexible solution that can be tailored to a wide range of needs, from small clusters of only two nodes to large systems serving thousands of connections per second.

UltraMonkey - heartbeat

http://www.ultramonkey.org/
http://www.ultramonkey.org/3/topologies/ha-lb-eg.html
http://www.ultramonkey.org/about.shtml
http://www.linuxvirtualserver.org/docs/ha/ultramonkey.html

High Availability

- Using Piranha to build highly available LVS systems
- Using Keepalived to build highly available LVS systems
- Using UltraMonkey to build highly available LVS systems
- Using heartbeat+mon+coda to build highly available LVS systems
- Using heartbeat+ldirectord to build highly available LVS systems

Ref - http://www.linuxvirtualserver.org/HighAvailability.html

HAProxy + Heartbeat

http://www.howtoforge.com/setting-up-a-high-availability-load-balancer-with-haproxy-heartbeat-on-debian-lenny
http://www.webhostingtalk.com/showthread.php?t=627783
http://haproxy.1wt.eu/download/1.2/doc/haproxy-en.txt

--- On Wed, 10/3/10, ashraf mohammed <[email protected]> wrote:

From: ashraf mohammed <[email protected]>
Subject: [LinuxVadaPav] help on cluster server
To: [email protected], [email protected]
Date: Wednesday, 10 March, 2010, 3:51 PM

HI guys i want to know wat is cluster server on linux .And how to configure it. and can i do clustering of server on my home pc.i am using vmware 6.0.5. i can add as many server as a want in vmware. So can i learn clustering at home. i want to learn basic clustering first.
so guys can u help out with some ideas. how to achieve this... or any documentation which will help me in better way

MOHAMMED ASHRAF
MOB: 9870161983
