Hello Jan,

You can use the Pacemaker / Corosync high-availability software stack for this.
Specifically, ordering constraints [1] let you require that one resource (for
example the MGT) is started before another (for example an OST).
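
As a rough illustration of what such ordering constraints look like, here is a
minimal Python sketch that generates the rsc_order XML elements described in
[1]. The resource IDs (lustre-mgt, lustre-mdt0, lustre-ost0) and the
order_constraint helper are hypothetical names chosen for this example; a real
cluster would use the IDs of its own Lustre resources, and the output would
still need to be loaded into the cluster configuration with your usual
Pacemaker tooling.

```python
import xml.etree.ElementTree as ET


def order_constraint(first, then):
    """Build one rsc_order element: `first` must start before `then`."""
    return ET.Element("rsc_order", {
        "id": f"order-{first}-then-{then}",  # constraint ID, made up here
        "first": first,
        "then": then,
        "kind": "Mandatory",
    })


# Hypothetical resource IDs, listed in the desired start order.
chain = ["lustre-mgt", "lustre-mdt0", "lustre-ost0"]

# Pair each resource with the one that should start after it.
constraints = ET.Element("constraints")
for first, then in zip(chain, chain[1:]):
    constraints.append(order_constraint(first, then))

# Print one constraint per line.
for element in constraints:
    print(ET.tostring(element, encoding="unicode"))
```

As far as I understand the documentation in [1], mandatory ordering constraints
are symmetrical by default, so the same constraints should also stop the
resources in the reverse order on shutdown.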

Unfortunately, Pacemaker is probably overkill if you don't need HA: its
configuration is complex and difficult to get right, and it significantly
complicates system administration. Another downside is that it is not easy to
decouple the Pacemaker service from the Lustre services: if you stop the
Pacemaker service, it will try to stop all of the Lustre services. This might
make it inappropriate for use cases that don't involve HA.

Given those downsides, if others in the community know of simpler ways to
accomplish this, I'd love to hear about other tools that can be used here
(especially officially supported ones, if they exist).

[1] https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/constraints.html#specifying-the-order-in-which-resources-should-start-stop

- Thomas Bertschinger

________________________________________
From: lustre-discuss <[email protected]> on behalf of Jan 
Andersen <[email protected]>
Sent: Wednesday, December 6, 2023 3:27 AM
To: lustre
Subject: [EXTERNAL] [lustre-discuss] Coordinating cluster start and shutdown?

Are there any tools for coordinating the startup and shutdown of a Lustre
filesystem, so that the OSS systems don't attempt to mount disks before the MGT
and MDT are online?
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
