14.04.2025 19:43, Artem wrote:
Dear gurus, I need your advice.
We want to build a pacemaker cluster with the following resources.
Could you please evaluate the idea and give feedback?
Pairs of nodes with NVMe disks. Disks are shared from one node to
another via nvmet. Persistent udev names and partition ids.
MD raid1 is made on top of pairs of disks from different nodes. I
suspect it must be clustered MD, and it'll require dlm?
2 or 4 clustered LVM volume groups are made on top of MD devices.
Pacemaker location preference rules for half of VGs to one node and
another half to another node.
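
A minimal sketch of how the location preferences above could be generated, assuming each VG is managed as a Pacemaker resource. The resource names (vg0..vg3), node names (node1, node2), and the score of 100 are hypothetical placeholders, not taken from a real configuration. The helper only builds command strings of the form `pcs constraint location <resource> prefers <node>=<score>` and does not run them.

```python
def location_constraints(vg_resources, nodes, score=100):
    """Build pcs commands preferring the first half of the VG
    resources on nodes[0] and the remaining ones on nodes[1]."""
    if len(nodes) != 2:
        raise ValueError("expected exactly two main nodes")
    half = len(vg_resources) // 2
    commands = []
    for index, resource in enumerate(vg_resources):
        # First half goes to the first node, the rest to the second.
        node = nodes[0] if index < half else nodes[1]
        commands.append(
            f"pcs constraint location {resource} prefers {node}={score}"
        )
    return commands


if __name__ == "__main__":
    # Hypothetical resource and node names, for illustration only.
    for command in location_constraints(
        ["vg0", "vg1", "vg2", "vg3"], ["node1", "node2"]
    ):
        print(command)
```

A finite score keeps this a preference rather than a hard rule, so a VG can still fail over to the other node when its preferred node is down.
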
Striped LVs on top of VG with FS for Lustre MDT and OST. 2 main nodes
in Corosync, other OST nodes are configured as remote resources.
OS network is separate from iBMC, and firewall rules deny this
traffic, so I decided to use SBD for fencing.
SBD requires a shared independent device. Using disks local to each
cluster node for SBD defeats its purpose.
I have only found pieces of such a stack documented, for different
operating systems and from different years. Now I'm trying to make
it all work together. At the
moment the clustered MD cannot be created as it fails to create a
lockspace (due to dlm error?). And dlm-clone doesn't want to start
either on main nodes or (as it should) on remote nodes. OS = RHEL9.
Maybe such a setup is too complicated? I am trying to avoid split-brain
situations and uncoordinated writes by 2 mdadm processes on different
nodes in all failure scenarios.
I know that a common approach is to use JBODs or SAN arrays, but we
don't have them for this project.
Thanks in advance.
Kindest regards,
Artem
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/