Hi Laura,

Thanks for your reply. It seems the OSSs share disks provisioned from a shared SAN, so the OSS pairs can fail over in a predefined manner when one node goes down, coordinated by an HA manager.
This can certainly work on a limited scale. I'm curious whether this static scheme can scale to a large cluster with hundreds of OSS servers.

Regards,
Shawn

On Tue, Jul 18, 2023 at 1:25 PM Laura Hild <[email protected]> wrote:

> I'm not familiar with using FLR to tolerate OSS failures. My site does
> the HA pairs with shared storage method. It's sort of described in the
> manual
>
> https://doc.lustre.org/lustre_manual.xhtml#configuringfailover
>
> but in more, Pacemaker-specific detail at
>
> https://wiki.lustre.org/Creating_a_Framework_for_High_Availability_with_Pacemaker
>
> and
>
> https://wiki.lustre.org/Creating_Pacemaker_Resources_for_Lustre_Storage_Services
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
