I'm trying to plan a production platform where we might want to try for at least four nines, five if anybody wants to pay the price we'll have to set...
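
To put numbers on those targets, here is a minimal sketch of the yearly downtime budget each one implies. The function name and the 365.25-day year are my own assumptions for illustration, and the figure counts all outages, planned or not.

```python
# Rough downtime budget per year for an availability target of N nines.
# Assumption: a year of 365.25 days; downtime_budget_seconds is a
# hypothetical helper name, not part of any tool.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def downtime_budget_seconds(nines: int) -> float:
    """Return the allowed downtime in seconds per year for `nines` nines."""
    unavailability = 10.0 ** (-nines)  # e.g. 4 nines -> 0.0001
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    for nines in (4, 5):
        minutes = downtime_budget_seconds(nines) / 60
        print(f"{nines} nines: about {minutes:.1f} minutes of downtime per year")
```

That works out to roughly 53 minutes per year for four nines and a bit over 5 minutes for five, which is why a single failed write to one mirror has to be handled without a cluster-wide stall.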

So I want to be able to have two independent 'disk boxes' with replicated content --- these could be SAN disk shelves or Linux hosts acting as iSCSI targets, with internal hardware RAID in either case --- and five to ten application hosts in a cluster using a shared file system to access the disks. It's not essential to have symmetrical read access to the two copies; one could be a passive standby, but active-active is neater.

First question: Am I silly for wanting this?

Now add to this that I might be asked to provide business continuity by replicating the file systems over a MAN...

When looking around for similar setups, the closest thing I found was the ability to run GFS/OCFS2 over DRBD. But I don't really want my disks replicated five to ten times; two should be enough.

What I really would like is:

1) Use LVM to make a mirrored VG out of the logical disks on the separate SAN boxes, activate that VG on all the application servers and run OCFS2 on that.

In theory, OCFS2 or GFS should be able to tolerate the write/read order inconsistency that mirroring introduces, because they use distributed locking to preserve order when it matters. But I haven't read up on exactly what shared FSes expect from the underlying device, so I could be wrong.

But if a write to one mirror fails, LVM would need to know how to disable it on all cluster hosts before it signals completion (of that write or the next write barrier) to the shared FS instance on the local host. Will CLVM in Red Hat do that? Will it ever appear in a mainline kernel?

(The next half isn't really a ha-dev question, but I'm putting it in for completeness.)

2) Use DRBD on raw RAID disk between two Linux boxes, export that mirrored disk as an iSCSI target, and let the application boxes use iSCSI multipathing under the shared FS.

A 2U Linux box with room for 6 hot-swap disks costs about the same as a single-controller iSCSI shelf with room for 14 disks, so it's not quite as convenient as 1). And the iSCSI target code is not the most mature, I've heard. And I can't see where LVM would fit in there at all. And I've got two different sorts of Linux host now. So this is clearly a fallback solution.

2a) Put the disks in two SAN boxes, let one Linux box own each disk and do the same as above.

This generates a lot of superfluous net traffic, and the 'disk master' resources probably cannot share hosts with anything else --- but they could still be 'regular' cluster hosts. LVM is still a problem, and running iSCSI on DRBD on iSCSI is silly, but is it viable?

And for completeness:

3) I can throw extra money at the thing and get SAN boxes that will do the mirroring for me.

Regards,
Lars Mathiesen
Sifira A/S
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
