For Groovy:
# fence_mpath
node 1: clusterg01
node 2: clusterg02
node 3: clusterg03
primitive fence-mpath-clusterg01 stonith:fence_mpath \
        params pcmk_on_timeout=70 pcmk_off_timeout=70 pcmk_host_list=clusterg01 pcmk_monitor_action=metadata pcmk_ \
        meta provides=unfencing target-role=Started
primitive fence-mpath-clusterg02 stonith:fence_mpath \
        params pcmk_on_timeout=70 pcmk_off_timeout=70 pcmk_host_list=clusterg02 pcmk_monitor_action=metadata pcmk_ \
        meta provides=unfencing target-role=Started
primitive fence-mpath-clusterg03 stonith:fence_mpath \
        params pcmk_on_timeout=70 pcmk_off_timeout=70 pcmk_host_list=clusterg03 pcmk_monitor_action=metadata pcmk_ \
        meta provides=unfencing target-role=Started
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=2.0.3-4b1f869f0f \
cluster-infrastructure=corosync \
cluster-name=clusterg \
stonith-enabled=true \
no-quorum-policy=stop \
last-lrm-refresh=1590773755
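As a side note, a quick way to confirm that the three stonith devices were
registered with the fencer after loading a configuration like the one above
(a sketch; the crmsh file name fence-mpath.crm is only an assumption):
# load the configuration from a file (hypothetical name) and then list the
# fencing devices pacemaker has registered
$ sudo crm configure load update fence-mpath.crm
$ sudo stonith_admin --list-registered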
--
$ crm status
Cluster Summary:
* Stack: corosync
* Current DC: clusterg01 (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Mon Jun 1 04:17:28 2020
* Last change: Mon Jun 1 04:07:10 2020 by root via cibadmin on clusterg03
* 3 nodes configured
* 3 resource instances configured
Node List:
* Online: [ clusterg01 clusterg02 clusterg03 ]
Full List of Resources:
* fence-mpath-clusterg01 (stonith:fence_mpath): Started clusterg01
* fence-mpath-clusterg02 (stonith:fence_mpath): Started clusterg02
* fence-mpath-clusterg03 (stonith:fence_mpath): Started clusterg03
--
(k)rafaeldtinoco@clusterg02:~$ sudo mpathpersist --in -r /dev/mapper/volume01
PR generation=0x11, Reservation follows:
Key = 0x59450001
scope = LU_SCOPE, type = Write Exclusive, registrants only
(k)rafaeldtinoco@clusterg02:~$ sudo mpathpersist --in -k /dev/mapper/volume01
PR generation=0x11, 12 registered reservation keys follow:
0x59450001
0x59450001
0x59450001
0x59450001
0x59450000
0x59450000
0x59450000
0x59450000
0x59450002
0x59450002
0x59450002
0x59450002
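For context, each node registers its key on every path of the multipath map,
so the 12 registrations above are consistent with 3 nodes and 4 paths each
(0x59450000 being clusterg01's key, as the later outputs show). The path
count can be cross-checked with something like:
# list the paths backing the map; expect one registration per node per path
$ sudo multipath -ll volume01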
-- after cutting communication between clusterg01 and all the other nodes:
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k /dev/mapper/volume01
PR generation=0x12, 8 registered reservation keys follow:
0x59450001
0x59450001
0x59450001
0x59450001
0x59450002
0x59450002
0x59450002
0x59450002
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -r /dev/mapper/volume01
PR generation=0x12, Reservation follows:
Key = 0x59450001
scope = LU_SCOPE, type = Write Exclusive, registrants only
and the cluster status now shows:
Node List:
* Node clusterg01: UNCLEAN (offline)
* Online: [ clusterg02 clusterg03 ]
Full List of Resources:
* fence-mpath-clusterg01 (stonith:fence_mpath): Started [ clusterg01 clusterg02 ]
* fence-mpath-clusterg02 (stonith:fence_mpath): Started clusterg03
* fence-mpath-clusterg03 (stonith:fence_mpath): Started clusterg03
Pending Fencing Actions:
* reboot of clusterg01 pending: client=pacemaker-controld.906, origin=clusterg02
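As the key listing above shows, clusterg01's registrations (0x59450000) had
already been removed from the device at this point, which is what the
fence_mpath "off" action does. The manual equivalent, and a way to inspect
the fencing history from a surviving node, would be roughly (a sketch;
normally the agent, not the admin, removes the key):
# what the agent's "off" action amounts to for clusterg01's key 0x59450000
$ sudo fence_mpath -v -d /dev/mapper/volume01 -n 59450000 -o off
# fencing operations recorded for clusterg01 (should match the pending reboot above)
$ sudo stonith_admin --history clusterg01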
The watchdog on host clusterg01 then rebooted it. After the reboot, only a
single path had its key registered again, not all of them:
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k /dev/mapper/volume01
PR generation=0x13, 9 registered reservation keys follow:
0x59450001
0x59450001
0x59450001
0x59450001
0x59450002
0x59450002
0x59450002
0x59450002
0x59450000
I had to stop the "fence-mpath-clusterg01" fence agent and restore all
registrations manually with:
(k)rafaeldtinoco@clusterg01:~$ sudo fence_mpath -v -d /dev/mapper/volume01 -n 59450000 -o on
and then start the "fence-mpath-clusterg01" resource again. This is the
problem with automatic recovery on multipathed devices: sometimes it is
better to require manual intervention only and keep the faulty node fenced
until you reboot it and fix its reservations by hand.
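To make that recovery less manual, my understanding (an assumption, not
something verified in this bug) is that multipath-tools can track the
reservation key itself when "reservation_key file" is set in
/etc/multipath.conf, so that multipathd re-registers the key on paths that
come back. Combined with restarting the fence device, the recovery would
look roughly like:
# /etc/multipath.conf excerpt (assumption: supported by the multipath-tools
# version in use; requires a multipathd reconfigure)
defaults {
        reservation_key file
}
# after re-registering the key on clusterg01, verify and re-enable the device
$ sudo mpathpersist --in -k /dev/mapper/volume01
$ sudo crm resource start fence-mpath-clusterg01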
** Description changed:
- Whenever trying to configure fence_scsi using Ubuntu Bionic the
- following error happens:
+ This bug's intent is to check that the fence_scsi and fence_mpath agents
+ work in all supported Ubuntu versions. This is needed because both
+ agents are error-prone and, depending on how they are configured, a wide
+ range of failures can occur.
+
+ # fence-agents:
+
+ Both agents, fence_scsi and fence_mpath, are prone to errors.
+
+ ## fence_scsi:
+
+ You may find the following cluster resource manager errors:
Failed Actions:
- * fence_clubionicpriv01_start_0 on clubionic01 'unknown error' (1): call=8, status=Error, exitreason='',
- last-rc-change='Mon Feb 24 03:20:28 2020', queued=0ms, exec=1132ms
+ * fence_bionic_start_0 on clubionic01 'unknown error' (1): call=8, status=Error, exitreason='', last-rc-change='Mon Feb 24 03:20:28 2020', queued=0ms, exec=1132ms
And the logs show:
Feb 24 03:20:31 clubionic02 fence_scsi[14072]: Failed: Cannot open file "/var/run/cluster/fence_scsi.key"
Feb 24 03:20:31 clubionic02 fence_scsi[14072]: Please use '-h' for usage
- That happens because the key to be used by fence_scsi agent does not
- exist.
+ The fence_scsi agent is responsible for creating those files on the fly
+ and this error might be related to how the fence agent was configured in
+ pacemaker.
- The fence agent is responsible for creating those files on the fly and
- this error might be related to how the fence agent was configured in
- pacemaker.
+ ## fence_mpath:
+
+ You may find it very difficult to configure fence_mpath to work
+ flawlessly; try following the comments in this bug.
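For comparison with the fence_mpath primitives earlier in this comment, a
fence_scsi primitive that lets the agent create
/var/run/cluster/fence_scsi.key on the fly might look roughly like this (a
sketch only; the device path and host names are illustrative and the exact
parameter set has not been validated here):
primitive fence-scsi stonith:fence_scsi \
        params devices=/dev/mapper/volume01 pcmk_host_list="clubionic01 clubionic02 clubionic03" \
        pcmk_monitor_action=metadata pcmk_reboot_action=off \
        meta provides=unfencing target-role=Started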
** Summary changed:
- fence_scsi cannot open /var/run/cluster/fence_scsi.key (does not exist) after nodes are rebooted
+ fence_scsi and fence_mpath configuration issues (e.g. /var/run/cluster/fence_scsi.key)