Shawn,
Lustre handles the largest filesystems in the world, hundreds of PB in size, so
there are definitely Lustre filesystems with hundreds of servers.
In large storage clusters the servers fail over in pairs or quads, since the
storage is typically not on a single global SAN for all nodes to
Hi Laura, thanks for your reply.
It seems the OSSs share disks provisioned from a shared SAN, so the
OSS pairs can fail over in a predefined manner when one node goes down,
coordinated by an HA manager.
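
To make the pairing idea concrete, here is a rough sketch in Python of how a static failover map behaves. This is only a toy model, not Lustre or Pacemaker code: the node names (oss1..oss4), the OST labels, and the helper functions are all hypothetical, and a real HA manager also handles fencing and mounting, which this ignores.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical static layout: each OST lists its primary OSS first,
# then the failover partner that shares the same storage.
FAILOVER_PAIRS: Dict[str, Tuple[str, str]] = {
    "OST0000": ("oss1", "oss2"),
    "OST0001": ("oss2", "oss1"),
    "OST0002": ("oss3", "oss4"),
    "OST0003": ("oss4", "oss3"),
}


def serving_node(ost: str, down: Set[str]) -> Optional[str]:
    """Return the first healthy node allowed to serve the OST, or None."""
    for node in FAILOVER_PAIRS[ost]:
        if node not in down:
            return node
    return None  # both nodes of the pair are down, so the OST is unavailable


def placement(down: Set[str]) -> Dict[str, Optional[str]]:
    """Map every OST to the node that would serve it given the failed nodes."""
    return {ost: serving_node(ost, down) for ost in FAILOVER_PAIRS}


if __name__ == "__main__":
    print(placement(set()))             # every OST on its primary
    print(placement({"oss1"}))          # oss2 takes over OST0000
    print(placement({"oss1", "oss2"}))  # the whole pair is lost
```

In this model the layout tolerates one failure per pair regardless of how many pairs exist, since each pair is independent of the others. An OST only becomes unavailable when both nodes of its own pair are down.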
This can certainly work on a limited scale. I'm curious whether this static
scheme can scale to
I'm not familiar with using FLR to tolerate OSS failures. My site uses the
HA-pairs-with-shared-storage method. It's described in general terms in the manual
https://doc.lustre.org/lustre_manual.xhtml#configuringfailover
but in more, Pacemaker-specific detail at
If I want to tolerate an OSS node failure (power cut, etc.), what config is
needed in Lustre? Multiple replicas, two nodes in HA mode, or some
other mechanism? Thanks.
Shawn
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org