Hi all,

Pardon the slew of questions; I'm well into testing now and finding lots of issues. This one is two questions... :)

I set a server to be unmanaged in pacemaker while the server was running. Then I tried to remove the resource, and it refused, saying it couldn't stop it and to use '--force'. So I did, and the node got fenced. Now, the resource was set up with:

pcs resource create srv07-el6 ocf:alteeve:server name="srv07-el6" \
 meta allow-migrate="true" target-role="started" \
 op monitor interval="60" start timeout="INFINITY" \
 on-fail="block" stop timeout="INFINITY" on-fail="block" \
 migrate_to timeout="INFINITY"

I would have expected the 'stop timeout="INFINITY" on-fail="block"' to prevent fencing if the server failed to stop (question 1), and that if a resource was unmanaged, no stop would even be attempted (question 2).

Can someone help me understand what happened here?

digimer

More below;

====
[root@el8-a01n01 ~]# pcs resource remove srv01-test
Attempting to stop: srv01-test... Warning: 'srv01-test' is unmanaged
Error: Unable to stop: srv01-test before deleting (re-run with --force to force deletion)
[root@el8-a01n01 ~]# pcs resource remove srv01-test --force
Deleting Resource - srv01-test
[root@el8-a01n01 ~]# client_loop: send disconnect: Broken pipe
====

As you can see, the node was fenced. The logs on that node were;

====
Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-execd[1872]: warning: srv01-test_stop_0 process (PID 113779) timed out
Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-execd[1872]: warning: srv01-test_stop_0[113779] timed out after 20000ms
Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-controld[1875]: error: Result of stop operation for srv01-test on el8-a01n01: Timed Out
Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-controld[1875]: notice: el8-a01n01-srv01-test_stop_0:37 [ The server: [srv01-test] is indeed running.
It will be shut down now.\n ]
Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-attrd[1873]: notice: Setting fail-count-srv01-test#stop_0[el8-a01n01]: (unset) -> INFINITY
Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-attrd[1873]: notice: Setting last-failure-srv01-test#stop_0[el8-a01n01]: (unset) -> 1610935435
Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-attrd[1873]: notice: Setting fail-count-srv01-test#stop_0[el8-a01n01]: INFINITY -> (unset)
Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-attrd[1873]: notice: Setting last-failure-srv01-test#stop_0[el8-a01n01]: 1610935435 -> (unset)
client_loop: send disconnect: Broken pipe
====

On the peer node, the logs showed;

====
Jan 18 02:03:13 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Jan 18 02:03:13 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Calculated transition 58, saving inputs in /var/lib/pacemaker/pengine/pe-input-100.bz2
Jan 18 02:03:13 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 58 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-100.bz2): Complete
Jan 18 02:03:13 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
Jan 18 02:03:18 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Jan 18 02:03:18 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Calculated transition 59, saving inputs in /var/lib/pacemaker/pengine/pe-input-101.bz2
Jan 18 02:03:18 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 59 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-101.bz2): Complete
Jan 18 02:03:18 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Detected active orphan srv01-test running on el8-a01n01
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Clearing failure of srv01-test on el8-a01n02 because resource parameters have changed
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Removing srv01-test from el8-a01n01
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Removing srv01-test from el8-a01n02
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: * Stop srv01-test ( el8-a01n01 ) due to node availability
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Calculated transition 60, saving inputs in /var/lib/pacemaker/pengine/pe-input-102.bz2
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Initiating stop operation srv01-test_stop_0 on el8-a01n01
Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 60 aborted by deletion of lrm_rsc_op[@id='srv01-test_last_failure_0']: Resource operation removal
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 60 action 11 (srv01-test_stop_0 on el8-a01n01): expected 'ok' but got 'error'
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 60 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-102.bz2): Complete
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-attrd[490048]: notice: Setting fail-count-srv01-test#stop_0[el8-a01n01]: (unset) -> INFINITY
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-attrd[490048]: notice: Setting last-failure-srv01-test#stop_0[el8-a01n01]: (unset) -> 1610935435
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Unexpected result (error) was recorded for stop of srv01-test on el8-a01n01 at Jan 18 02:03:35 2021
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Unexpected result (error) was recorded for stop of srv01-test on el8-a01n01 at Jan 18 02:03:35 2021
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Cluster node el8-a01n01 will be fenced: srv01-test failed there
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Detected active orphan srv01-test running on el8-a01n01
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Scheduling Node el8-a01n01 for STONITH
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Stop of failed resource srv01-test is implicit after el8-a01n01 is fenced
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: * Fence (reboot) el8-a01n01 'srv01-test failed there'
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: * Move virsh_node2_pulsar ( el8-a01n01 -> el8-a01n02 )
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: * Stop srv01-test ( el8-a01n01 ) due to node availability
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Calculated transition 61 (with warnings), saving inputs in /var/lib/pacemaker/pengine/pe-warn-1.bz2
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Unexpected result (error) was recorded for stop of srv01-test on el8-a01n01 at Jan 18 02:03:35 2021
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Unexpected result (error) was recorded for stop of srv01-test on el8-a01n01 at Jan 18 02:03:35 2021
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Cluster node el8-a01n01 will be fenced: srv01-test failed there
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Detected active orphan srv01-test running on el8-a01n01
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Forcing srv01-test away from el8-a01n01 after 1000000 failures (max=1000000)
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Clearing failure of srv01-test on el8-a01n01 because it is orphaned
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Scheduling Node el8-a01n01 for STONITH
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Stop of failed resource srv01-test is implicit after el8-a01n01 is fenced
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: * Fence (reboot) el8-a01n01 'srv01-test failed there'
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: * Move virsh_node2_pulsar ( el8-a01n01 -> el8-a01n02 )
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: * Stop srv01-test ( el8-a01n01 ) due to node availability
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Calculated transition 62 (with warnings), saving inputs in /var/lib/pacemaker/pengine/pe-warn-2.bz2
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Requesting fencing (reboot) of node el8-a01n01
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Initiating start operation virsh_node2_pulsar_start_0 locally on el8-a01n02
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: Client pacemaker-controld.490050.72911c98 wants to fence (reboot) 'el8-a01n01' with device '(any)'
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: Requesting peer fencing (reboot) targeting el8-a01n01
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-attrd[490048]: notice: Setting fail-count-srv01-test#stop_0[el8-a01n01]: INFINITY -> (unset)
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-attrd[490048]: notice: Setting last-failure-srv01-test#stop_0[el8-a01n01]: 1610935435 -> (unset)
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: virsh_node2_pulsar is not eligible to fence (reboot) el8-a01n01: static-list
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: virsh_node1_pulsar is eligible to fence (reboot) el8-a01n01: static-list
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 62 aborted by deletion of lrm_rsc_op[@id='srv01-test_last_failure_0']: Resource operation removal
Jan 18 02:03:55 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: Requesting that el8-a01n02 perform 'reboot' action targeting el8-a01n01 using 'virsh_node1_pulsar'
Jan 18 02:03:56 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Result of start operation for virsh_node2_pulsar on el8-a01n02: ok
Jan 18 02:03:57 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: Operation 'reboot' [646769] (call 4 from pacemaker-controld.490050) for host 'el8-a01n01' with device 'virsh_node1_pulsar' returned: 0 (OK)
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-attrd[490048]: notice: Node el8-a01n01 state is now lost
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-attrd[490048]: notice: Removing all el8-a01n01 attributes for peer loss
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Node el8-a01n01 state is now lost
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-based[490045]: notice: Node el8-a01n01 state is now lost
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-based[490045]: notice: Purged 1 peer with id=1 and/or uname=el8-a01n01 from the membership cache
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: Node el8-a01n01 state is now lost
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: Purged 1 peer with id=1 and/or uname=el8-a01n01 from the membership cache
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-attrd[490048]: notice: Purged 1 peer with id=1 and/or uname=el8-a01n01 from the membership cache
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: Action 'reboot' targeting el8-a01n01 using virsh_node1_pulsar on behalf of pacemaker-controld.490050@el8-a01n02: OK
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-fenced[490046]: notice: Operation 'reboot' targeting el8-a01n01 on el8-a01n02 for [email protected]: OK
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Stonith operation 4/2:62:0:e827eea0-dedc-4200-a207-c4095621b3c6: OK (0)
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Peer el8-a01n01 was terminated (reboot) by el8-a01n02 on behalf of pacemaker-controld.490050: OK
Jan 18 02:03:58 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 62 (Complete=5, Pending=0, Fired=0, Skipped=1, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-warn-2.bz2): Stopped
Jan 18 02:03:59 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Removing srv01-test from el8-a01n02
Jan 18 02:03:59 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Calculated transition 63, saving inputs in /var/lib/pacemaker/pengine/pe-input-103.bz2
Jan 18 02:03:59 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Initiating monitor operation virsh_node2_pulsar_monitor_60000 locally on el8-a01n02
Jan 18 02:03:59 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Initiating delete operation srv01-test_delete_0 locally on el8-a01n02
Jan 18 02:03:59 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 63 aborted by deletion of lrm_resource[@id='srv01-test']: Resource state removal
Jan 18 02:04:00 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Result of monitor operation for virsh_node2_pulsar on el8-a01n02: ok
Jan 18 02:04:00 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 63 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-103.bz2): Complete
Jan 18 02:04:00 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Calculated transition 64, saving inputs in /var/lib/pacemaker/pengine/pe-input-104.bz2
Jan 18 02:04:00 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 64 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-104.bz2): Complete
Jan 18 02:04:00 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
====

--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
