I would second Stanley's suggestion - fail-over/load-balancing at the client hosts would be straightforward

For a Linux host you have several tools: software RAID (md), LVM, or dm-multipath
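As a minimal sketch of the LVM option, a mirrored logical volume can place one copy on each physical volume; the device names and sizes below are hypothetical (assume /dev/sda1 is a local disk partition and /dev/sdb1 is the SRP-attached disk as the client sees it):

```shell
# Register both disks as physical volumes and group them
pvcreate /dev/sda1 /dev/sdb1
vgcreate vg_ha /dev/sda1 /dev/sdb1

# Create a mirrored LV (-m 1 = one extra copy), with one leg on each
# disk, so either disk can fail without losing the data
lvcreate -m 1 -L 10G -n lv_mirror vg_ha /dev/sda1 /dev/sdb1
```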

  -vu

I had looked at this configuration as well and decided to use volume
management at the clients to mirror the data: Windows clients mirrored
with LDM across two SRPT servers, and Linux clients mirrored with md RAID 1.

This provides transparent failover and the SRP client/host will rebuild
the slices that went offline.
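A minimal sketch of the md RAID 1 setup described above - the device names are assumptions (say /dev/sdc1 and /dev/sdd1 are the disks exported over SRP by the two SRPT servers):

```shell
# Mirror across the two SRP-attached disks
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1

# If one SRP target goes offline and later returns, re-add its disk
# and md rebuilds the slice that was absent
mdadm --manage /dev/md0 --re-add /dev/sdc1
```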

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Daniel Pocock
Sent: Tuesday, March 11, 2008 4:26 PM
To: [email protected]
Subject: [ofa-general] SRP target/LVM HA configuration






I'm contemplating a HA configuration based on SRP and LVM (or maybe EVMS).

There are many good resources based on NFS and drbd (see http://www.linux-ha.org/HaNFS), but it would be more flexible to work at the block level (e.g. SRP) rather than the file level (NFS). Obviously, SRP/RDMA offers a major performance benefit compared with drbd (which uses IP).

Basically, I envisage the primary server having access to the secondary (passive) server's disk using SRP, and putting both the local (primary) disk and the SRP (secondary) disk into a RAID1 set. The RAID1 set would contain a volume group with multiple volumes - which would, in turn, be SRP targets (for VMware to use) or possibly NFS shares.

This leads me to a few issues:

- Read operations - would it be better for the primary to read from both disks, or just its own disk? With drbd, the secondary disk is not read unless the primary is down. However, given the performance of SRP, I suspect that reading from both the local and SRP disks would give a performance boost.
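For what it's worth, md lets you choose either policy at array creation time. A sketch, with hypothetical device names (/dev/sda1 local, /dev/sdc1 remote via SRP):

```shell
# Marking the SRP disk "write-mostly" makes md direct reads to the
# local disk only, mimicking drbd's behaviour; omit the flag to let
# md balance reads across both devices
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      /dev/sda1 --write-mostly /dev/sdc1
```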

- Does it make sense to use md or LVM to combine a local disk and an SRP disk into RAID1 (or potentially RAID5)? Are there technical challenges there, given that one target is slightly faster than the other?

- Fail-over - when the secondary detects that the primary is down, can it dynamically take the place of the failed SRP target? Will the end-user initiators (e.g. VMware, see diagram below) be confused when the changeover occurs? Is there the possibility of data inconsistency if some write operations had been acknowledged by the primary but not yet propagated to the secondary's disk at the moment the failure occurred?

- Recovery - when the old primary comes back online as a secondary, it will need to resync its disk - is a partial resync possible, or is a full rebuild mandatory?
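With md, a partial resync is possible if the array carries a write-intent bitmap, which records the regions dirtied while a member was absent. A sketch, assuming the array is /dev/md0:

```shell
# Add an internal write-intent bitmap to an existing array; after a
# member disappears and is re-added, md resynchronises only the
# regions written in the meantime instead of doing a full rebuild
mdadm --grow /dev/md0 --bitmap=internal
```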


Diagram:


Disk--Primary Server-------------------SRP Initiator (e.g. VMware ESX)
        |       +------NFS client            .
       SRP                                   .
  (RAID1 of primary's                        .
  disk and secondary's                       .
       disk)                                 . (fail-over path to storage
        |                                    .  when primary is down)
Disk--Secondary Server . . . . . . . . . . . .



_______________________________________________
general mailing list
[email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
