On 2020-02-10 00:06, Strahil Nikolov wrote:
On February 10, 2020 2:07:01 AM GMT+02:00, Dan Swartzendruber
<[email protected]> wrote:
I have a 2-node CentOS 7 cluster running ZFS. The two nodes (vSphere
appliances on different hosts) access 2 SAS SSDs in a Supermicro JBOD
with 2 mini-SAS connectors. It all works fine - failover and all. My
quandary was how to implement fencing. I was able to get both of the
VMware SOAP and REST fencing agents to work, but they just aren't
reliable enough. If the vCenter Server Appliance is busy, fencing
requests time out. I know I can increase the timeouts, but in at least
one test run even a minute wasn't enough, and my concern is that if
switching over takes too long, VMware will put the datastore in APD,
hosing guests.
I confirmed that both SSDs work properly with the fence_scsi agent.
Fencing the host that actively owns the ZFS pool also works perfectly
(ZFS flushes data to the datastore every 5 seconds or so, so
withdrawing the SCSI-3 persistent reservations causes a fatal write
error to the pool, and setting the pool to failmode=panic will cause
the fenced cluster node to reboot automatically). The problem (maybe it
isn't really one?) is that fencing the node that does *not* own the
pool has no effect, since it holds no reservations on the devices in
the pool.
I'd love to be sure this isn't an issue at all.
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
Hi Dan,
You can configure multiple fencing mechanisms in your cluster.
For example, you can set the first fencing mechanism to be via VMware,
and if it fails (being busy or currently unavailable), then the SCSI
fencing can kick in to ensure a failover can be done.
What you observe is normal - no scsi reservations -> no fencing.
That's why major vendors require, when using
fence_multipath/fence_scsi, the shared storage to be a dependency (a
file system in use by the application) and not just an add-on.
I personally don't like SCSI reservations, as there is no guarantee
that other resources (services, IPs, etc.) are actually down, but the
risk is low.
In your case fence_scsi stonith can be a second layer of protection.
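
The escalation described above can be illustrated with a small Python
sketch. It is not Pacemaker code, only a model of how a fencing
topology behaves: levels are tried in ascending order, and a level
counts as successful only if every device in it succeeds. The function
and device names (fence_with_levels, vmware_busy, scsi_ok) are
hypothetical stand-ins, and the VMware failure is simulated.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a fence agent: returns True if the target was fenced.
FenceDevice = Callable[[str], bool]


def fence_with_levels(target: str, levels: Dict[int, List[FenceDevice]]) -> int:
    """Try each fencing level in ascending order.

    A level succeeds only if all of its devices succeed. Returns the
    number of the level that fenced the target, or 0 if every level failed.
    """
    for level in sorted(levels):
        if all(device(target) for device in levels[level]):
            return level
    return 0


def vmware_busy(target: str) -> bool:
    # Simulates the VMware agent timing out because vCenter is busy.
    return False


def scsi_ok(target: str) -> bool:
    # Simulates fence_scsi successfully removing the target's access.
    return True


# Level 1: VMware fencing; level 2: SCSI fencing as the fallback.
topology = {1: [vmware_busy], 2: [scsi_ok]}
print(fence_with_levels("node1", topology))
```

In this model the VMware level fails and the SCSI level takes over, so
the failover is not blocked by a busy vCenter.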
Best Regards,
Strahil Nikolov
Okay, thanks. I'll look into multi-level then.