Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
Indeed, SAN replication could be another way to partially address this. To make it work, one would need to add some sort of external resource to the cluster that monitors the synchronization status between the source LUNs and the target ones, and automatically reverses the synchronization direction when a resource or service fails over to a node on the other site. This can be tricky, and your SAN arrays must allow you to do it (HDS/HP command devices, etc.). IMHO, LVM mirroring is the simplest way to achieve this, if the latency constraints are acceptable.

When I say "partially", there is always the quorum issue: on a 4-node cluster spread equally across 2 sites, the 2 nodes remaining after a site failure are not quorate.

Brem

2009/6/10 Tom Lanyon t...@netspot.com.au:
> On 05/06/2009, at 6:52 PM, brem belguebli wrote:
>> Hello,
>>
>> That sounds very much like the question I asked on this mailing list
>> last May (https://www.redhat.com/archives/linux-cluster/2009-May/msg00093.html).
>>
>> We are in the same situation, already running geo-clusters with other
>> technologies, and we are looking at RHCS to provide the same service level.
>>
>> Latency could indeed be a problem if too high, but in many cases (at many
>> companies I have worked for) the datacenters are a few tens of kilometers
>> apart, with a maximum latency close to 1 ms, which is not a problem.
>>
>> Let's consider this kind of setup: 2 datacenters separated by a 1 ms delay,
>> each hosting a SAN array, each array connected to 2 SAN fabrics extended
>> between the 2 sites. What would prevent us from building geo-clusters
>> without having to rely on a database replication mechanism? The setup I
>> would like to implement would also be used to provide NFS services that
>> are disaster-recovery proof. Obviously, such a setup should rely on LVM
>> mirroring so that the node hosting a service can write to both the local
>> and the distant SAN LUNs.
>> Brem

I have been wondering whether the same could be done (cross-site RHCS) using SAN replication and multipath, avoiding LVM mirroring.

This is going to depend strongly on the storage replication failover time; if the I/O to the shared storage devices is queued for too long, the cluster will stop. Does anyone have any experience with how quickly this would need to happen for RHCS to tolerate it? I have been meaning to test this but have not had a chance...

Tom

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
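[Editorial note: the quorum arithmetic behind Brem's site-failure remark can be sketched as below. This is a minimal illustration assuming the default one vote per node; it is not tied to any cman API.]

```shell
#!/bin/sh
# One-vote-per-node quorum arithmetic: a partition is quorate only if
# it holds a strict majority (more than half) of the total votes.
quorate() {  # usage: quorate PARTITION_VOTES TOTAL_VOTES
    if [ "$1" -gt $(( $2 / 2 )) ]; then echo quorate; else echo inquorate; fi
}

# 4-node cluster split equally across 2 sites: after a site failure,
# the 2 surviving nodes hold only 2 of the 4 votes.
quorate 2 4   # -> inquorate
quorate 3 4   # -> quorate (3 of 4 is a strict majority)
```

This is why two-site clusters with even node counts generally need a tiebreaker (a quorum disk or a third-site voter) to survive the loss of a whole site.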
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
On 05/06/2009, at 6:52 PM, brem belguebli wrote:
> Hello,
>
> That sounds very much like the question I asked on this mailing list
> last May (https://www.redhat.com/archives/linux-cluster/2009-May/msg00093.html).
>
> We are in the same situation, already running geo-clusters with other
> technologies, and we are looking at RHCS to provide the same service level.
>
> Latency could indeed be a problem if too high, but in many cases (at many
> companies I have worked for) the datacenters are a few tens of kilometers
> apart, with a maximum latency close to 1 ms, which is not a problem.
>
> Let's consider this kind of setup: 2 datacenters separated by a 1 ms delay,
> each hosting a SAN array, each array connected to 2 SAN fabrics extended
> between the 2 sites. What would prevent us from building geo-clusters
> without having to rely on a database replication mechanism? The setup I
> would like to implement would also be used to provide NFS services that
> are disaster-recovery proof. Obviously, such a setup should rely on LVM
> mirroring so that the node hosting a service can write to both the local
> and the distant SAN LUNs.
>
> Brem

I have been wondering whether the same could be done (cross-site RHCS) using SAN replication and multipath, avoiding LVM mirroring.

This is going to depend strongly on the storage replication failover time; if the I/O to the shared storage devices is queued for too long, the cluster will stop. Does anyone have any experience with how quickly this would need to happen for RHCS to tolerate it? I have been meaning to test this but have not had a chance...

Tom
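[Editorial note: how long queued I/O is tolerated is largely governed by the dm-multipath layer. The fragment below is an illustrative sketch, not a tested configuration; the values are assumptions to be tuned per environment.]

```
# /etc/multipath.conf -- illustrative fragment only.
# no_path_retry bounds how long I/O queues when all paths are down:
# roughly no_path_retry * polling_interval seconds (here ~60 s),
# after which queued I/O is failed up to the filesystem/cluster layers.
defaults {
    polling_interval   5
    no_path_retry      12   # "queue" would queue forever; risky for clusters
}
```

The cluster's own tolerance then depends on the resource agents' status-check timeouts, so the two windows have to be tuned together.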
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
Hello,

Here's a link illustrating the kind of setup I'm trying to build with RHCS:

http://brehak.blogspot.com/2009/06/disaster-recovery-setup.html

Regards

2009/6/5, Steven Dake sd...@redhat.com:
> On Fri, 2009-06-05 at 19:40 +0200, brem belguebli wrote:
>> 2009/6/5, Jon Schulz jsch...@soapstonenetworks.com:
>>> Yes, I would be interested to see what products you are currently using
>>> to achieve this. In my proposed setup we are actually completely database
>>> transaction driven. The problem is that the people higher up want active
>>> database-to-database replication, which will be problematic, I know.
>>
>> Still, we also use DB (Oracle, Sybase) replication mechanisms to address
>> accidental data corruption: since mirroring is synchronous, if something
>> happens (someone intentionally alters the DB, or filesystem corruption),
>> it will be on both legs of the mirror.
>>
>>> Outside of the data side of the equation, how tolerant is the cluster
>>> network/heartbeat to latency, assuming no packet loss? Or, more to the
>>> point, at what point does everyone, in their past experience, see the
>>> heartbeat network become unreliable, latency-wise? E.g. anything over 30 ms?
>
> The default configured timers for failure detection are quite high, and
> failed packets are retransmitted many times (for lossy networks). 30 ms
> latency would pose no major problem, except for performance. If you use
> POSIX locking and your machine-to-machine latency is 30 ms, each POSIX
> lock would take 30.03 ms or more to grant, which may not meet your
> performance requirements.
>
> I can't recommend WAN connections with totem (the protocol used in RHCS)
> because of the performance characteristics. If the performance of POSIX
> locks is not a hard requirement, it should be functional.
>
> Regards
> -steve
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
Hello,

That sounds very much like the question I asked on this mailing list last May (https://www.redhat.com/archives/linux-cluster/2009-May/msg00093.html).

We are in the same situation, already running geo-clusters with other technologies, and we are looking at RHCS to provide the same service level.

Latency could indeed be a problem if too high, but in many cases (at many companies I have worked for) the datacenters are a few tens of kilometers apart, with a maximum latency close to 1 ms, which is not a problem.

Let's consider this kind of setup: 2 datacenters separated by a 1 ms delay, each hosting a SAN array, each array connected to 2 SAN fabrics extended between the 2 sites. What would prevent us from building geo-clusters without having to rely on a database replication mechanism? The setup I would like to implement would also be used to provide NFS services that are disaster-recovery proof. Obviously, such a setup should rely on LVM mirroring so that the node hosting a service can write to both the local and the distant SAN LUNs.

Brem

2009/6/3, Fajar A. Nugraha fa...@fajar.net:
> On Wed, Jun 3, 2009 at 6:36 AM, Jon Schulz jsch...@soapstonenetworks.com wrote:
>> I'm in the process of doing a concept review with the Red Hat Cluster
>> Suite. I've been given a requirement that cluster nodes be able to be
>> located in geographically separated data centers. I realize that this
>> is not an ideal scenario due to latency issues.
>
> For most purposes, RHCS requires that all nodes have access to the same
> storage/disk. That pretty much rules out the DR feature one might expect
> to get from having nodes in geographically separated data centers.
>
> I'd suggest you refine your requirements. Perhaps what you need is
> something like MySQL cluster replication, where there are two
> geographically separated data centers, each having its own cluster, and
> the two clusters replicate each other's data asynchronously.
> --
> Fajar
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
On Fri, Jun 5, 2009 at 4:22 PM, brem belguebli brem.belgue...@gmail.com wrote:
> We are in the same situation, already running geo-clusters with other
> technologies, and we are looking at RHCS to provide the same service level.

Usually the concepts are the same. What solution are you using? How does it work: replication, or a real cluster?

> Let's consider this kind of setup: 2 datacenters separated by a 1 ms delay,
> each hosting a SAN array, each array connected to 2 SAN fabrics extended
> between the 2 sites. What would prevent us from building geo-clusters
> without having to rely on a database replication mechanism? The setup I
> would like to implement would also be used to provide NFS services that
> are disaster-recovery proof. Obviously, such a setup should rely on LVM
> mirroring so that the node hosting a service can write to both the local
> and the distant SAN LUNs.

Does LVM mirroring work with clustered LVM?

--
Fajar
RE: [Linux-cluster] Networking guidelines for RHCS across datacenters
I have no relation to this company, but I have heard good stories from people who have worked with their products: if your database is Oracle, MySQL, or Postgres, check out the products at www.continuent.com

Best Regards,

Jeremy Eder, RHCE, VCP

-----Original Message-----
From: linux-cluster-boun...@redhat.com [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Jon Schulz
Sent: Friday, June 05, 2009 10:38 AM
To: linux clustering
Subject: RE: [Linux-cluster] Networking guidelines for RHCS across datacenters

Yes, I would be interested to see what products you are currently using to achieve this. In my proposed setup we are actually completely database transaction driven. The problem is that the people higher up want active database-to-database replication, which will be problematic, I know.

Outside of the data side of the equation, how tolerant is the cluster network/heartbeat to latency, assuming no packet loss? Or, more to the point, at what point does everyone, in their past experience, see the heartbeat network become unreliable, latency-wise? E.g. anything over 30 ms? Most of my experience with RHCS and linux-ha has been with the cluster network within the same LAN :(

-----Original Message-----
From: linux-cluster-boun...@redhat.com [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Fajar A. Nugraha
Sent: Friday, June 05, 2009 5:47 AM
To: linux clustering
Subject: Re: [Linux-cluster] Networking guidelines for RHCS across datacenters

On Fri, Jun 5, 2009 at 4:22 PM, brem belguebli brem.belgue...@gmail.com wrote:
> We are in the same situation, already running geo-clusters with other
> technologies, and we are looking at RHCS to provide the same service level.

Usually the concepts are the same. What solution are you using? How does it work: replication, or a real cluster?

> Let's consider this kind of setup: 2 datacenters separated by a 1 ms delay,
> each hosting a SAN array, each array connected to 2 SAN fabrics extended
> between the 2 sites.
> What would prevent us from building geo-clusters without having to rely
> on a database replication mechanism? The setup I would like to implement
> would also be used to provide NFS services that are disaster-recovery
> proof. Obviously, such a setup should rely on LVM mirroring so that the
> node hosting a service can write to both the local and the distant SAN LUNs.

Does LVM mirroring work with clustered LVM?

--
Fajar
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
Fajar A. Nugraha wrote:
> Does LVM mirroring work with clustered LVM?

Since 5.3 it works; install lvm2-cluster. If you'd like to use mirrored volumes before 5.3, you can do so using LVM tags (see the filters in lvm.conf), but the mirror is then available to only one system at a time.

Cheers,
Andreas
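[Editorial note: the tag-based approach Andreas mentions is usually called HA-LVM. The sketch below shows the general idea only; the VG name havg and the hostnames are hypothetical, and the exact recipe should be verified against Red Hat's HA-LVM documentation.]

```shell
# /etc/lvm/lvm.conf (per node): only VGs tagged with this host's name,
# plus the root VG, may be activated locally.
#
#   activation {
#       volume_list = [ "rootvg", "@node1.example.com" ]
#   }
#
# On failover, the resource script moves the tag before activating the VG
# on the new owner (names are hypothetical):
vgchange --deltag node1.example.com havg
vgchange --addtag node2.example.com havg
vgchange -a y havg
```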
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
Hello,

We are long-term HP ServiceGuard on HP-UX users and, for a few months now, HP ServiceGuard on Linux (aka SGLX) users.

The first one (HP-UX) works with their cluster LVM (a clvmd-like daemon named cmlvmd on each node), allowing one node of the cluster to activate a volume group exclusively (vgchange -a e VGXX) and use a non-clustered FS (VxFS) on top of the LVs. The LVs are mirrored (a leg on each SAN array, one local and the other distant).

On Linux (SGLX) it is a bit more tricky, but once mastered it works well. It relies on non-clustered LVM with the LVM2 host-tags feature (the HA-LVM setup described by Red Hat), built on top of MD RAID1 devices with a cluster module that guarantees the RAID device is consistent on one node at a time.

Unfortunately, HP has just announced the discontinuation of SGLX; that's why we are looking towards RHCS to see if it can provide the same service, which doesn't seem to be obvious.

Concerning LVM mirroring with clustered LVM, I hope it does or will work. The only thing I know about LVM mirrors is that soon (maybe around RHEL 5u5) they will support online resizing without having to break the mirror.

Brem

2009/6/5, Fajar A. Nugraha fa...@fajar.net:
> On Fri, Jun 5, 2009 at 4:22 PM, brem belguebli brem.belgue...@gmail.com wrote:
>> We are in the same situation, already running geo-clusters with other
>> technologies, and we are looking at RHCS to provide the same service level.
>
> Usually the concepts are the same. What solution are you using? How does
> it work: replication, or a real cluster?
>
>> Let's consider this kind of setup: 2 datacenters separated by a 1 ms delay,
>> each hosting a SAN array, each array connected to 2 SAN fabrics extended
>> between the 2 sites. What would prevent us from building geo-clusters
>> without having to rely on a database replication mechanism? The setup I
>> would like to implement would also be used to provide NFS services that
>> are disaster-recovery proof.
>> Obviously, such a setup should rely on LVM mirroring so that the node
>> hosting a service can write to both the local and the distant SAN LUNs.
>
> Does LVM mirroring work with clustered LVM?
>
> --
> Fajar
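[Editorial note: the mirrored-LV layout described above (one leg per SAN array, activated exclusively on one node) can be sketched roughly as below. Illustrative only: the device paths, VG/LV names, and sizes are made up.]

```shell
# Create a mirrored LV with one leg on each array's LUN, so losing a
# whole site/array leaves one intact copy. "--mirrorlog core" keeps
# the mirror log in memory (avoids needing a third log device, at the
# cost of a full resync after a reboot).
lvcreate -m 1 --mirrorlog core -L 100G -n datalv datavg \
    /dev/mapper/local_array_lun /dev/mapper/remote_array_lun

# Activate the VG exclusively on the node running the service
# (analogous to ServiceGuard's "vgchange -a e"):
vgchange -a e datavg
```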
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
2009/6/5, Jon Schulz jsch...@soapstonenetworks.com:
> Yes, I would be interested to see what products you are currently using
> to achieve this. In my proposed setup we are actually completely database
> transaction driven. The problem is that the people higher up want active
> database-to-database replication, which will be problematic, I know.

Still, we also use DB (Oracle, Sybase) replication mechanisms to address accidental data corruption: since mirroring is synchronous, if something happens (someone intentionally alters the DB, or filesystem corruption), it will be on both legs of the mirror.

> Outside of the data side of the equation, how tolerant is the cluster
> network/heartbeat to latency, assuming no packet loss? Or, more to the
> point, at what point does everyone, in their past experience, see the
> heartbeat network become unreliable, latency-wise? E.g. anything over
> 30 ms? Most of my experience with RHCS and linux-ha has been with the
> cluster network within the same LAN :(

It is definitely the best solution in case you cannot rely on your network infrastructure. This is not completely my case :-)

> -----Original Message-----
> From: linux-cluster-boun...@redhat.com [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Fajar A. Nugraha
> Sent: Friday, June 05, 2009 5:47 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
>
> On Fri, Jun 5, 2009 at 4:22 PM, brem belguebli brem.belgue...@gmail.com wrote:
>> We are in the same situation, already running geo-clusters with other
>> technologies, and we are looking at RHCS to provide the same service level.
>
> Usually the concepts are the same. What solution are you using? How does
> it work: replication, or a real cluster?
>
>> Let's consider this kind of setup: 2 datacenters separated by a 1 ms delay,
>> each hosting a SAN array, each array connected to 2 SAN fabrics extended
>> between the 2 sites.
>> What would prevent us from building geo-clusters without having to rely
>> on a database replication mechanism? The setup I would like to implement
>> would also be used to provide NFS services that are disaster-recovery
>> proof. Obviously, such a setup should rely on LVM mirroring so that the
>> node hosting a service can write to both the local and the distant SAN LUNs.
>
> Does LVM mirroring work with clustered LVM?
>
> --
> Fajar
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
On Fri, 2009-06-05 at 19:40 +0200, brem belguebli wrote:
> 2009/6/5, Jon Schulz jsch...@soapstonenetworks.com:
>> Yes, I would be interested to see what products you are currently using
>> to achieve this. In my proposed setup we are actually completely database
>> transaction driven. The problem is that the people higher up want active
>> database-to-database replication, which will be problematic, I know.
>
> Still, we also use DB (Oracle, Sybase) replication mechanisms to address
> accidental data corruption: since mirroring is synchronous, if something
> happens (someone intentionally alters the DB, or filesystem corruption),
> it will be on both legs of the mirror.
>
>> Outside of the data side of the equation, how tolerant is the cluster
>> network/heartbeat to latency, assuming no packet loss? Or, more to the
>> point, at what point does everyone, in their past experience, see the
>> heartbeat network become unreliable, latency-wise? E.g. anything over 30 ms?

The default configured timers for failure detection are quite high, and failed packets are retransmitted many times (for lossy networks). 30 ms latency would pose no major problem, except for performance. If you use POSIX locking and your machine-to-machine latency is 30 ms, each POSIX lock would take 30.03 ms or more to grant, which may not meet your performance requirements.

I can't recommend WAN connections with totem (the protocol used in RHCS) because of the performance characteristics. If the performance of POSIX locks is not a hard requirement, it should be functional.

Regards,
-steve
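[Editorial note: Steve's 30.03 ms figure follows from one network traversal per lock grant plus a small local-processing cost. A back-of-the-envelope sketch; the 0.03 ms local overhead is an assumed constant taken from his numbers, not measured here.]

```shell
#!/bin/sh
# Estimate POSIX lock grant time over a WAN: link latency plus an
# assumed ~0.03 ms of local processing per grant.
lock_grant_ms() {  # usage: lock_grant_ms LINK_LATENCY_MS
    awk -v lat="$1" 'BEGIN { printf "%.2f", lat + 0.03 }'
}

echo "$(lock_grant_ms 30) ms per lock at 30 ms link latency"   # 30.03 ms
echo "$(lock_grant_ms 1) ms per lock at 1 ms link latency"     # 1.03 ms
```

At 30.03 ms per grant, a single lock holder serializes to roughly 33 lock operations per second, which is why POSIX-lock-heavy workloads are the main casualty of a WAN link.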
Re: [Linux-cluster] Networking guidelines for RHCS across datacenters
On Wed, Jun 3, 2009 at 6:36 AM, Jon Schulz jsch...@soapstonenetworks.com wrote:
> I'm in the process of doing a concept review with the Red Hat Cluster
> Suite. I've been given a requirement that cluster nodes be able to be
> located in geographically separated data centers. I realize that this
> is not an ideal scenario due to latency issues.

For most purposes, RHCS requires that all nodes have access to the same storage/disk. That pretty much rules out the DR feature one might expect to get from having nodes in geographically separated data centers.

I'd suggest you refine your requirements. Perhaps what you need is something like MySQL cluster replication, where there are two geographically separated data centers, each having its own cluster, and the two clusters replicate each other's data asynchronously.

--
Fajar