SBD can use an iSCSI LUN (where, for example, the target is hosted on the quorum node), a disk partition or an LVM LV, so I guess it could also use a ZFS volume dedicated to SBD (10MB is enough). In your case IPMI is quite suitable.
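For the poison-pill device, something along these lines might work. It's only a sketch: 'tank' is a placeholder pool name, and it assumes the sbd tooling is actually available on the node, which today is only a given on Linux.

    # Carve a small zvol out of the shared pool for the SBD metadata
    zfs create -V 10M tank/sbd

    # Initialize the SBD header on it and verify the node slots
    # (the device path is /dev/zvol/tank/sbd on Linux,
    #  /dev/zvol/rdsk/tank/sbd on illumos)
    sbd -d /dev/zvol/tank/sbd create
    sbd -d /dev/zvol/tank/sbd list

Both nodes would then point SBD_DEVICE at that zvol (for example in /etc/sysconfig/sbd) and run the sbd daemon against a working hardware watchdog.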
About power fencing when the persistent reservations are removed -> it's just a script started by the watchdog.service on the node itself. It should be usable on all Linuxes and many UNIX-like OSes.

Best Regards,
Strahil Nikolov

On 30 July 2020 at 12:05:39 GMT+03:00, Gabriele Bulfon <[email protected]> wrote:
>Reading about sbd from SuSE, I saw that it requires a special block device to write its information to; I don't think this is possible here.
>
>It's a dual-node ZFS storage system running our own XStreamOS/illumos distribution, and here we're trying to add HA capabilities.
>We can move IPs, ZFS pools and COMSTAR/iSCSI/FC, and we're now looking for a stable way to manage stonith.
>
>The hardware system is this:
>
>https://www.supermicro.com/products/system/1u/1029/SYS-1029TP-DC0R.cfm
>
>and it features a shared SAS3 backplane, so both nodes can see all the disks concurrently.
>
>Gabriele
>
>Sonicle S.r.l. : http://www.sonicle.com
>Music : http://www.gabrielebulfon.com
>Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon
>
>From: Reid Wahl
>To: Cluster Labs - All topics related to open-source clustering welcomed
>Date: 30 July 2020 6:38:58 CEST
>Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
>
>I don't know of a stonith method that acts upon a filesystem directly.
>You'd generally want to act upon the power state of the node or upon the underlying shared storage.
>
>What kind of hardware or virtualization platform are these systems running on? If there is a hardware watchdog timer, then sbd is possible. The fence_sbd agent (poison-pill fencing via a block device) requires shared block storage, but sbd itself only requires a hardware watchdog timer.
>
>Additionally, there may be an existing fence agent that can connect to the controller you mentioned. What kind of controller is it?
>
>On Wed, Jul 29, 2020 at 5:24 AM Gabriele Bulfon <[email protected]> wrote:
>Thanks a lot for the extensive explanation!
>Any idea about a ZFS stonith?
>
>Gabriele
>
>Sonicle S.r.l. : http://www.sonicle.com
>Music : http://www.gabrielebulfon.com
>Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon
>
>From: Reid Wahl <[email protected]>
>To: Cluster Labs - All topics related to open-source clustering welcomed <[email protected]>
>Date: 29 July 2020 11:39:35 CEST
>Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
>
>"As stated in the comments, we don't want to halt or boot via ssh, only reboot."
>
>Generally speaking, a stonith reboot action consists of the following basic sequence of events:
>1. Execute the fence agent with the "off" action.
>2. Poll the power status of the fenced node until it is powered off.
>3. Execute the fence agent with the "on" action.
>4. Poll the power status of the fenced node until it is powered on.
>So a custom fence agent that supports reboots actually needs to support off and on actions.
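(For illustration of that point only: a minimal skeleton of such a custom agent might look like the sketch below. The fence_example name and the placeholder power_off/power_on bodies are hypothetical; a real agent also needs a metadata action and proper error handling.)

    #!/bin/sh
    # Illustrative skeleton, not a tested agent. Pacemaker passes the request
    # to the agent as key=value lines on stdin (action=..., nodename=...).
    action=""
    node=""
    while read -r line; do
        case "$line" in
            action=*)   action=${line#action=} ;;
            nodename=*) node=${line#nodename=} ;;
        esac
    done

    power_off() { echo "TODO: power off $node via your controller"; }
    power_on()  { echo "TODO: power on $node via your controller"; }

    case "$action" in
        off)            power_off ;;
        on)             power_on ;;
        reboot)         power_off && power_on ;;   # reboot = off, then on
        monitor|status) exit 0 ;;                  # "is the fence device reachable?"
        *)              exit 1 ;;
    esac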
>As Andrei noted, ssh is **not** a reliable method by which to ensure that a node gets rebooted or stops using cluster-managed resources. You can't depend on the ability to SSH to an unhealthy node that needs to be fenced.
>
>The only way to guarantee that an unhealthy or unresponsive node stops all access to shared resources is to power off or reboot the node. (In the case of resources that rely on shared storage, I/O fencing instead of power fencing can also work, but that's not ideal.)
>
>As others have said, SBD is a great option. Use it if you can. There are also power fencing methods (one example is fence_ipmilan, but the options available depend on your hardware or virt platform) that are reliable under most circumstances.
>
>You said that when you stop corosync on node 2, Pacemaker tries to fence node 2. There are a couple of possible reasons for that. One possibility is that you stopped or killed corosync without stopping Pacemaker first. (If you use pcs, then try `pcs cluster stop`.) Another possibility is that resources failed to stop during cluster shutdown on node 2, causing node 2 to be fenced.
>
>On Wed, Jul 29, 2020 at 12:47 AM Andrei Borzenkov <[email protected]> wrote:
>
>On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon <[email protected]> wrote:
>That one was taken from a specific implementation on Solaris 11.
>The situation is a dual-node server with a shared storage controller: both nodes see the same disks concurrently.
>Here we must be sure that the two nodes never import/mount the same zpool at the same time, or we will encounter data corruption: an ssh-based "stonith" cannot guarantee that.
>
>Node 1 will be preferred for pool 1 and node 2 for pool 2; only when one of the nodes goes down or is taken offline should the resources first be freed by the leaving node and then taken over by the other node.
>
>Would you suggest one of the available stonith agents in this case?
>
>IPMI, managed PDU, SBD ...
>In practice, the only stonith method that works in case of a complete node outage, including loss of any power supply, is SBD.
>
>--
>Regards,
>Reid Wahl, RHCA
>Software Maintenance Engineer, Red Hat
>CEE - Platform Support Delivery - ClusterHA
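Since both Reid and Andrei point at IPMI: assuming each node in that Supermicro chassis has its own BMC, a pair of fence_ipmilan devices is the obvious fit. A rough sketch of what that could look like with pcs follows; BMC addresses, credentials and node names are placeholders, and option names vary between fence-agents versions (older releases use ipaddr/login/passwd instead of ip/username/password).

    # Rough sketch, assuming pcs and fence-agents-ipmilan are installed;
    # addresses, credentials and node names below are placeholders.
    pcs stonith create fence-node1 fence_ipmilan \
        ip=192.168.100.11 username=ADMIN password=secret lanplus=1 \
        pcmk_host_list=node1 op monitor interval=60s
    pcs stonith create fence-node2 fence_ipmilan \
        ip=192.168.100.12 username=ADMIN password=secret lanplus=1 \
        pcmk_host_list=node2 op monitor interval=60s

    # Keep each fence device off the node it is supposed to kill
    pcs constraint location fence-node1 avoids node1
    pcs constraint location fence-node2 avoids node2

A manual test such as `pcs stonith fence node2` would then confirm the BMC credentials and the reboot path before relying on it in production.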
