Re: [ClusterLabs] controlling cluster behavior on startup
On Tue, Jan 30, 2024 at 2:21 PM Walker, Chris wrote:
> >>> However, now it seems to wait that amount of time before it elects a
> >>> DC, even when quorum is acquired earlier. In my log snippet below,
> >>> with dc-deadtime 300s,
> >>
> >> The dc-deadtime is not waiting for quorum, but for another DC to show
> >> up. If all nodes show up, it can proceed, but otherwise it has to wait.
>
> > I believe all the nodes showed up by 14:17:04, but it still waited
> > until 14:19:26 to elect a DC:
>
> > Jan 29 14:14:25 gopher12 pacemaker-controld [123697]
> > (peer_update_callback) info: Cluster node gopher12 is now member
> > (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> > (peer_update_callback) info: Cluster node gopher11 is now member
> > (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> > (quorum_notification_cb) notice: Quorum acquired | membership=54
> > members=2
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info:
> > Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
>
> > This is a cluster with 2 nodes, gopher11 and gopher12.
>
> This is our experience with dc-deadtime too: even if both nodes in the
> cluster show up, dc-deadtime must elapse before the cluster starts. This
> was discussed on this list a while back (
> https://www.mail-archive.com/users@clusterlabs.org/msg03897.html) and an
> RFE came out of it (https://bugs.clusterlabs.org/show_bug.cgi?id=5310).
>
> I’ve worked around this by having an ExecStartPre directive for Corosync
> that does essentially:
>
> while ! systemctl -H ${peer} is-active corosync; do sleep 5; done
>
> With this in place, the nodes wait for each other before starting Corosync
> and Pacemaker. We can then use the default 20s dc-deadtime so that the DC
> election happens quickly once both nodes are up.

Actually, wait-for-all, which comes by default with two_node, should delay
quorum until both nodes have shown up.
And if we make the cluster not ignore quorum, it shouldn't start fencing
before it sees the peer, right? Running a 2-node cluster that ignores quorum,
or runs without wait-for-all, is a delicate thing anyway, and shouldn't be
expected to work in the generic case. Not saying that is the issue here;
there just isn't enough info about the cluster to say. So you shouldn't need
this raised dc-deadtime, and thus wouldn't experience large startup delays.

Regards,
Klaus

> Thanks,
> Chris
>
> *From: *Users on behalf of Faaland, Olaf P. via Users
> *Date: *Monday, January 29, 2024 at 7:46 PM
> *To: *Ken Gaillot , Cluster Labs - All topics
> related to open-source clustering welcomed
> *Cc: *Faaland, Olaf P.
> *Subject: *Re: [ClusterLabs] controlling cluster behavior on startup
>
> >> However, now it seems to wait that amount of time before it elects a
> >> DC, even when quorum is acquired earlier. In my log snippet below,
> >> with dc-deadtime 300s,
> >
> > The dc-deadtime is not waiting for quorum, but for another DC to show
> > up. If all nodes show up, it can proceed, but otherwise it has to wait.
>
> I believe all the nodes showed up by 14:17:04, but it still waited until
> 14:19:26 to elect a DC:
>
> Jan 29 14:14:25 gopher12 pacemaker-controld [123697]
> (peer_update_callback) info: Cluster node gopher12 is now member (was in
> unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> (peer_update_callback) info: Cluster node gopher11 is now member (was in
> unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> (quorum_notification_cb) notice: Quorum acquired | membership=54 members=2
> Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info:
> Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
>
> This is a cluster with 2 nodes, gopher11 and gopher12.
>
> Am I misreading that?
> thanks,
> Olaf
>
> From: Ken Gaillot
> Sent: Monday, January 29, 2024 3:49 PM
> To: Faaland, Olaf P.; Cluster Labs - All topics related to open-source
> clustering welcomed
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
>
> On Mon, 2024-01-29 at 22:48 +, Faaland, Olaf P. wrote:
> > Thank you, Ken.
> >
> > I changed my configuration management system to put an initial
> > cib.xml into /var/lib/pacemaker/cib/, which sets all the property
> > values I was setting via pcs commands, including dc-deadtime. I
> > removed those "pcs property set" commands from the ones that are run
> > at s
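For reference, the wait-for-all behavior Klaus mentions is controlled by corosync's votequorum configuration; a minimal sketch of the relevant section (when `two_node: 1` is set, `wait_for_all` defaults to 1 unless explicitly disabled):

```
quorum {
    provider: corosync_votequorum
    two_node: 1
    # Implied by two_node: 1; on a fresh start the cluster does not
    # gain quorum until all (here, both) nodes have been seen once.
    wait_for_all: 1
}
```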
Re: [ClusterLabs] controlling cluster behavior on startup
On Tue, 2024-01-30 at 13:20 +, Walker, Chris wrote:
> >>> However, now it seems to wait that amount of time before it elects a
> >>> DC, even when quorum is acquired earlier. In my log snippet below,
> >>> with dc-deadtime 300s,
> >>
> >> The dc-deadtime is not waiting for quorum, but for another DC to show
> >> up. If all nodes show up, it can proceed, but otherwise it has to wait.
>
> > I believe all the nodes showed up by 14:17:04, but it still waited
> > until 14:19:26 to elect a DC:
>
> > Jan 29 14:14:25 gopher12 pacemaker-controld [123697]
> > (peer_update_callback) info: Cluster node gopher12 is now member
> > (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> > (peer_update_callback) info: Cluster node gopher11 is now member
> > (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> > (quorum_notification_cb) notice: Quorum acquired | membership=54
> > members=2
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log)
> > info: Input I_ELECTION_DC received in state S_ELECTION from
> > election_win_cb
>
> > This is a cluster with 2 nodes, gopher11 and gopher12.
>
> This is our experience with dc-deadtime too: even if both nodes in
> the cluster show up, dc-deadtime must elapse before the cluster
> starts. This was discussed on this list a while back (
> https://www.mail-archive.com/users@clusterlabs.org/msg03897.html) and
> an RFE came out of it (
> https://bugs.clusterlabs.org/show_bug.cgi?id=5310).

Ah, I misremembered, I thought we had done that :(

> I’ve worked around this by having an ExecStartPre directive for
> Corosync that does essentially:
>
> while ! systemctl -H ${peer} is-active corosync; do sleep 5; done
>
> With this in place, the nodes wait for each other before starting
> Corosync and Pacemaker. We can then use the default 20s dc-deadtime
> so that the DC election happens quickly once both nodes are up.

That makes sense

> Thanks,
> Chris
>
> From: Users on behalf of Faaland, Olaf P.
via Users
> Date: Monday, January 29, 2024 at 7:46 PM
> To: Ken Gaillot , Cluster Labs - All topics
> related to open-source clustering welcomed
> Cc: Faaland, Olaf P.
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
>
> >> However, now it seems to wait that amount of time before it elects a
> >> DC, even when quorum is acquired earlier. In my log snippet below,
> >> with dc-deadtime 300s,
> >
> > The dc-deadtime is not waiting for quorum, but for another DC to show
> > up. If all nodes show up, it can proceed, but otherwise it has to wait.
>
> I believe all the nodes showed up by 14:17:04, but it still waited
> until 14:19:26 to elect a DC:
>
> Jan 29 14:14:25 gopher12 pacemaker-controld [123697]
> (peer_update_callback) info: Cluster node gopher12 is now member
> (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> (peer_update_callback) info: Cluster node gopher11 is now member
> (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> (quorum_notification_cb) notice: Quorum acquired | membership=54
> members=2
> Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info:
> Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
>
> This is a cluster with 2 nodes, gopher11 and gopher12.
>
> Am I misreading that?
>
> thanks,
> Olaf
>
> From: Ken Gaillot
> Sent: Monday, January 29, 2024 3:49 PM
> To: Faaland, Olaf P.; Cluster Labs - All topics related to open-source
> clustering welcomed
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
>
> On Mon, 2024-01-29 at 22:48 +, Faaland, Olaf P. wrote:
> > Thank you, Ken.
> >
> > I changed my configuration management system to put an initial
> > cib.xml into /var/lib/pacemaker/cib/, which sets all the property
> > values I was setting via pcs commands, including dc-deadtime. I
> > removed those "pcs property set" commands from the ones that are run
> > at startup time.
> > That worked in the sense that after Pacemaker start, the node waits
> > my newly specified dc-deadtime of 300s before giving up on the
> > partner node and fencing it, if the partner never appears as a
> > member.
> >
> > However, now it seems to wait that amount of time before it elects a
> > DC, even when quorum is acquired earlier. In my log snippet below,
> > with dc-deadtime 300s,
>
> The dc-deadtim
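Chris's ExecStartPre workaround could be wired in as a small helper script run before corosync starts; a sketch under assumptions (the script name, install path, and hard-coded peer are illustrative, not from the thread, and `systemctl -H` reaches the peer over SSH):

```shell
#!/bin/sh
# Hypothetical helper for a corosync.service drop-in, e.g.:
#   [Service]
#   ExecStartPre=/usr/local/sbin/wait-for-peer.sh gopher11

# wait_for CMD...: poll every 5 seconds until CMD succeeds
wait_for() {
    while ! "$@" >/dev/null 2>&1; do
        sleep 5
    done
}

# Block startup until corosync reports active on the peer.
if [ -n "${1-}" ]; then
    wait_for systemctl -H "$1" is-active corosync
fi
```

Note that time spent in ExecStartPre counts toward the unit's start timeout, so the service's TimeoutStartSec would need to be long enough to cover the wait.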
Re: [ClusterLabs] controlling cluster behavior on startup
>>> However, now it seems to wait that amount of time before it elects a
>>> DC, even when quorum is acquired earlier. In my log snippet below,
>>> with dc-deadtime 300s,
>>
>> The dc-deadtime is not waiting for quorum, but for another DC to show
>> up. If all nodes show up, it can proceed, but otherwise it has to wait.

> I believe all the nodes showed up by 14:17:04, but it still waited until
> 14:19:26 to elect a DC:

> Jan 29 14:14:25 gopher12 pacemaker-controld [123697] (peer_update_callback)
> info: Cluster node gopher12 is now member (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (peer_update_callback)
> info: Cluster node gopher11 is now member (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld [123697]
> (quorum_notification_cb) notice: Quorum acquired | membership=54 members=2
> Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info: Input
> I_ELECTION_DC received in state S_ELECTION from election_win_cb

> This is a cluster with 2 nodes, gopher11 and gopher12.

This is our experience with dc-deadtime too: even if both nodes in the
cluster show up, dc-deadtime must elapse before the cluster starts. This
was discussed on this list a while back
(https://www.mail-archive.com/users@clusterlabs.org/msg03897.html) and an
RFE came out of it (https://bugs.clusterlabs.org/show_bug.cgi?id=5310).

I’ve worked around this by having an ExecStartPre directive for Corosync
that does essentially:

while ! systemctl -H ${peer} is-active corosync; do sleep 5; done

With this in place, the nodes wait for each other before starting Corosync
and Pacemaker. We can then use the default 20s dc-deadtime so that the DC
election happens quickly once both nodes are up.

Thanks,
Chris

From: Users on behalf of Faaland, Olaf P. via Users
Date: Monday, January 29, 2024 at 7:46 PM
To: Ken Gaillot , Cluster Labs - All topics related to open-source
clustering welcomed
Cc: Faaland, Olaf P.
Subject: Re: [ClusterLabs] controlling cluster behavior on startup

>> However, now it seems to wait that amount of time before it elects a
>> DC, even when quorum is acquired earlier. In my log snippet below,
>> with dc-deadtime 300s,
>
> The dc-deadtime is not waiting for quorum, but for another DC to show
> up. If all nodes show up, it can proceed, but otherwise it has to wait.

I believe all the nodes showed up by 14:17:04, but it still waited until
14:19:26 to elect a DC:

Jan 29 14:14:25 gopher12 pacemaker-controld [123697] (peer_update_callback)
info: Cluster node gopher12 is now member (was in unknown state)
Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (peer_update_callback)
info: Cluster node gopher11 is now member (was in unknown state)
Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (quorum_notification_cb)
notice: Quorum acquired | membership=54 members=2
Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info: Input
I_ELECTION_DC received in state S_ELECTION from election_win_cb

This is a cluster with 2 nodes, gopher11 and gopher12.

Am I misreading that?

thanks,
Olaf

From: Ken Gaillot
Sent: Monday, January 29, 2024 3:49 PM
To: Faaland, Olaf P.; Cluster Labs - All topics related to open-source
clustering welcomed
Subject: Re: [ClusterLabs] controlling cluster behavior on startup

On Mon, 2024-01-29 at 22:48 +, Faaland, Olaf P. wrote:
> Thank you, Ken.
>
> I changed my configuration management system to put an initial
> cib.xml into /var/lib/pacemaker/cib/, which sets all the property
> values I was setting via pcs commands, including dc-deadtime. I
> removed those "pcs property set" commands from the ones that are run
> at startup time.
>
> That worked in the sense that after Pacemaker start, the node waits
> my newly specified dc-deadtime of 300s before giving up on the
> partner node and fencing it, if the partner never appears as a
> member.
> > However, now it seems to wait that amount of time before it elects a > DC, even when quorum is acquired earlier. In my log snippet below, > with dc-deadtime 300s, The dc-deadtime is not waiting for quorum, but for another DC to show up. If all nodes show up, it can proceed, but otherwise it has to wait. > > 14:14:24 Pacemaker starts on gopher12 > 14:17:04 quorum is acquired > 14:19:26 Election Trigger just popped (start time + dc-deadtime > seconds) > 14:19:26 gopher12 wins the election > > Is there other configuration that needs to be present in the cib at > startup time? > > thanks, > Olaf > > === log extract using new system of installing partial cib.xml before > startup > Jan 29 14:14:24 gopher12 pacemakerd [123690] > (main)notice: Starting Pacemaker 2.1.7-1.t4 | build
Re: [ClusterLabs] controlling cluster behavior on startup
>> However, now it seems to wait that amount of time before it elects a
>> DC, even when quorum is acquired earlier. In my log snippet below,
>> with dc-deadtime 300s,
>
> The dc-deadtime is not waiting for quorum, but for another DC to show
> up. If all nodes show up, it can proceed, but otherwise it has to wait.

I believe all the nodes showed up by 14:17:04, but it still waited until
14:19:26 to elect a DC:

Jan 29 14:14:25 gopher12 pacemaker-controld [123697] (peer_update_callback)
info: Cluster node gopher12 is now member (was in unknown state)
Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (peer_update_callback)
info: Cluster node gopher11 is now member (was in unknown state)
Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (quorum_notification_cb)
notice: Quorum acquired | membership=54 members=2
Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info: Input
I_ELECTION_DC received in state S_ELECTION from election_win_cb

This is a cluster with 2 nodes, gopher11 and gopher12.

Am I misreading that?

thanks,
Olaf

From: Ken Gaillot
Sent: Monday, January 29, 2024 3:49 PM
To: Faaland, Olaf P.; Cluster Labs - All topics related to open-source
clustering welcomed
Subject: Re: [ClusterLabs] controlling cluster behavior on startup

On Mon, 2024-01-29 at 22:48 +, Faaland, Olaf P. wrote:
> Thank you, Ken.
>
> I changed my configuration management system to put an initial
> cib.xml into /var/lib/pacemaker/cib/, which sets all the property
> values I was setting via pcs commands, including dc-deadtime. I
> removed those "pcs property set" commands from the ones that are run
> at startup time.
>
> That worked in the sense that after Pacemaker start, the node waits
> my newly specified dc-deadtime of 300s before giving up on the
> partner node and fencing it, if the partner never appears as a
> member.
>
> However, now it seems to wait that amount of time before it elects a
> DC, even when quorum is acquired earlier.
In my log snippet below, > with dc-deadtime 300s, The dc-deadtime is not waiting for quorum, but for another DC to show up. If all nodes show up, it can proceed, but otherwise it has to wait. > > 14:14:24 Pacemaker starts on gopher12 > 14:17:04 quorum is acquired > 14:19:26 Election Trigger just popped (start time + dc-deadtime > seconds) > 14:19:26 gopher12 wins the election > > Is there other configuration that needs to be present in the cib at > startup time? > > thanks, > Olaf > > === log extract using new system of installing partial cib.xml before > startup > Jan 29 14:14:24 gopher12 pacemakerd [123690] > (main)notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 > features:agent-manpages ascii-docs compat-2.0 corosync-ge-2 default- > concurrent-fencing generated-manpages monotonic nagios ncurses remote > systemd > Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] > (attrd_start_election_if_needed) info: Starting an election to > determine the writer > Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] > (election_check) info: election-attrd won by local node > Jan 29 14:14:25 gopher12 pacemaker-controld [123697] > (peer_update_callback)info: Cluster node gopher12 is now member > (was in unknown state) > Jan 29 14:17:04 gopher12 pacemaker-controld [123697] > (quorum_notification_cb) notice: Quorum acquired | membership=54 > members=2 > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (crm_timer_popped)info: Election Trigger just popped | > input=I_DC_TIMEOUT time=30ms > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (do_log) warning: Input I_DC_TIMEOUT received in state S_PENDING > from crm_timer_popped > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (do_state_transition) info: State transition S_PENDING -> > S_ELECTION | input=I_DC_TIMEOUT cause=C_TIMER_POPPED > origin=crm_timer_popped > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (election_check) info: election-DC won by local node > Jan 29 14:19:26 gopher12 
pacemaker-controld [123697] (do_log) info: > Input I_ELECTION_DC received in state S_ELECTION from election_win_cb > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (do_state_transition) notice: State transition S_ELECTION -> > S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL > origin=election_win_cb > Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696] > (recurring_op_for_active) info: Start 10s-interval monitor > for gopher11_zpool on gopher11 > Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696] > (recurring_op_for_active) info: Start 10s-interval monitor > for gopher12_zpool on gopher12 > > > === in
Re: [ClusterLabs] controlling cluster behavior on startup
On Mon, 2024-01-29 at 14:35 -0800, Reid Wahl wrote:
> On Monday, January 29, 2024, Ken Gaillot wrote:
> > On Mon, 2024-01-29 at 18:05 +, Faaland, Olaf P. via Users wrote:
> >> Hi,
> >>
> >> I have configured clusters of node pairs, so each cluster has 2
> >> nodes. The cluster members are statically defined in corosync.conf
> >> before corosync or pacemaker is started, and quorum {two_node: 1} is
> >> set.
> >>
> >> When both nodes are powered off and I power them on, they do not
> >> start pacemaker at exactly the same time. The time difference may be
> >> a few minutes depending on other factors outside the nodes.
> >>
> >> My goals are (I call the first node to start pacemaker "node1"):
> >> 1) I want to control how long pacemaker on node1 waits before fencing
> >> node2 if node2 does not start pacemaker.
> >> 2) If node1 is part-way through that waiting period, and node2 starts
> >> pacemaker so they detect each other, I would like them to proceed
> >> immediately to probing resource state and starting resources which
> >> are down, not wait until the end of that "grace period".
> >>
> >> It looks from the documentation like dc-deadtime is how #1 is
> >> controlled, and #2 is expected normal behavior. However, I'm seeing
> >> fence actions before dc-deadtime has passed.
> >>
> >> Am I misunderstanding Pacemaker's expected behavior and/or how dc-
> >> deadtime should be used?
> >
> > You have everything right. The problem is that you're starting with an
> > empty configuration every time, so the default dc-deadtime is being
> > used for the first election (before you can set the desired value).
>
> Why would there be fence actions before dc-deadtime expires though?

There isn't -- after the (default) dc-deadtime pops, the node elects
itself DC and runs the scheduler, which considers the other node unseen
and in need of startup fencing. The dc-deadtime has been raised in the
meantime, but that no longer matters.
> > > > > I can't think of anything you can do to get around that, since the > > controller starts the timer as soon as it starts up. Would it be > > possible to bake an initial configuration into the PXE image? > > > > When the timer value changes, we could stop the existing timer and > > restart it. There's a risk that some external automation could make > > repeated changes to the timeout, thus never letting it expire, but > that > > seems preferable to your problem. I've created an issue for that: > > > > https://projects.clusterlabs.org/T764 > > > > BTW there's also election-timeout. I'm not sure offhand how that > > interacts; it might be necessary to raise that one as well. > > > >> > >> One possibly unusual aspect of this cluster is that these two > nodes > >> are stateless - they PXE boot from an image on another server - > and I > >> build the cluster configuration at boot time with a series of pcs > >> commands, because the nodes have no local storage for this > >> purpose. The commands are: > >> > >> ['pcs', 'cluster', 'start'] > >> ['pcs', 'property', 'set', 'stonith-action=off'] > >> ['pcs', 'property', 'set', 'cluster-recheck-interval=60'] > >> ['pcs', 'property', 'set', 'start-failure-is-fatal=false'] > >> ['pcs', 'property', 'set', 'dc-deadtime=300'] > >> ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman', > >> 'ip=192.168.64.65', 'pcmk_host_check=static-list', > >> 'pcmk_host_list=gopher11,gopher12'] > >> ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman', > >> 'ip=192.168.64.65', 'pcmk_host_check=static-list', > >> 'pcmk_host_list=gopher11,gopher12'] > >> ['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool', > >> 'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11', > 'op', > >> 'start', 'timeout=805'] > >> ... > >> ['pcs', 'property', 'set', 'no-quorum-policy=ignore'] > > > > BTW you don't need to change no-quorum-policy when you're using > > two_node with Corosync. 
> > > >> > >> I could, instead, generate a CIB so that when Pacemaker is > started, > >> it has a full config. Is that better? > >> > >> thanks, > >> Olaf > >> > >> === corosync.conf: > >> totem { > >> version: 2 > >> cluster_name: gopher11 > >> secauth: off > >> transport: udpu > >> } > >> nodelist { > >> node { > >> ring0_addr: gopher11 > >> name: gopher11 > >> nodeid: 1 > >> } > >> node { > >> ring0_addr: gopher12 > >> name: gopher12 > >> nodeid: 2 > >> } > >> } > >> quorum { > >> provider: corosync_votequorum > >> two_node: 1 > >> } > >> > >> === Log excerpt > >> > >> Here's an except from Pacemaker logs that reflect what I'm > >> seeing. These are from gopher12, the node that came up first. > The > >> other node, which is not yet up, is gopher11. > >> > >> Jan 25 17:55:38 gopher12 pacemakerd [116033] > >> (main)notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 > >> features:agent-manpages ascii-docs compat-2.0 corosync-ge-2
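Ken's suggestion of baking an initial configuration into the PXE image can be done by building the CIB offline and shipping the file; a sketch under assumptions (the property list mirrors this thread, but the exact `cibadmin`/`pcs -f` workflow, ownership, and paths should be verified against your versions):

```shell
# Build a CIB file without a running cluster, then bake it into the image.
cib=/tmp/cib.xml
cibadmin --empty > "$cib"                     # empty CIB skeleton
pcs -f "$cib" property set stonith-action=off
pcs -f "$cib" property set cluster-recheck-interval=60
pcs -f "$cib" property set start-failure-is-fatal=false
pcs -f "$cib" property set dc-deadtime=300
# Place it where pacemaker-based reads it at startup:
install -m 0600 -o hacluster -g haclient "$cib" /var/lib/pacemaker/cib/cib.xml
```

With the file present before first start, the raised dc-deadtime is in effect for the very first election rather than only after the `pcs property set` commands run.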
Re: [ClusterLabs] controlling cluster behavior on startup
On Mon, 2024-01-29 at 22:48 +, Faaland, Olaf P. wrote: > Thank you, Ken. > > I changed my configuration management system to put an initial > cib.xml into /var/lib/pacemaker/cib/, which sets all the property > values I was setting via pcs commands, including dc-deadtime. I > removed those "pcs property set" commands from the ones that are run > at startup time. > > That worked in the sense that after Pacemaker start, the node waits > my newly specified dc-deadtime of 300s before giving up on the > partner node and fencing it, if the partner never appears as a > member. > > However, now it seems to wait that amount of time before it elects a > DC, even when quorum is acquired earlier. In my log snippet below, > with dc-deadtime 300s, The dc-deadtime is not waiting for quorum, but for another DC to show up. If all nodes show up, it can proceed, but otherwise it has to wait. > > 14:14:24 Pacemaker starts on gopher12 > 14:17:04 quorum is acquired > 14:19:26 Election Trigger just popped (start time + dc-deadtime > seconds) > 14:19:26 gopher12 wins the election > > Is there other configuration that needs to be present in the cib at > startup time? 
> > thanks, > Olaf > > === log extract using new system of installing partial cib.xml before > startup > Jan 29 14:14:24 gopher12 pacemakerd [123690] > (main)notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 > features:agent-manpages ascii-docs compat-2.0 corosync-ge-2 default- > concurrent-fencing generated-manpages monotonic nagios ncurses remote > systemd > Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] > (attrd_start_election_if_needed) info: Starting an election to > determine the writer > Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] > (election_check) info: election-attrd won by local node > Jan 29 14:14:25 gopher12 pacemaker-controld [123697] > (peer_update_callback)info: Cluster node gopher12 is now member > (was in unknown state) > Jan 29 14:17:04 gopher12 pacemaker-controld [123697] > (quorum_notification_cb) notice: Quorum acquired | membership=54 > members=2 > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (crm_timer_popped)info: Election Trigger just popped | > input=I_DC_TIMEOUT time=30ms > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (do_log) warning: Input I_DC_TIMEOUT received in state S_PENDING > from crm_timer_popped > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (do_state_transition) info: State transition S_PENDING -> > S_ELECTION | input=I_DC_TIMEOUT cause=C_TIMER_POPPED > origin=crm_timer_popped > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (election_check) info: election-DC won by local node > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info: > Input I_ELECTION_DC received in state S_ELECTION from election_win_cb > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] > (do_state_transition) notice: State transition S_ELECTION -> > S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL > origin=election_win_cb > Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696] > (recurring_op_for_active) info: Start 10s-interval monitor > for gopher11_zpool on gopher11 > Jan 29 14:19:26 
gopher12 pacemaker-schedulerd[123696]
> (recurring_op_for_active) info: Start 10s-interval monitor
> for gopher12_zpool on gopher12
>
> === initial cib.xml contents
> <cib num_updates="0" admin_epoch="0" cib-last-written="Mon Jan 29 11:07:06
> 2024" update-origin="gopher12" update-client="root"
> update-user="root" have-quorum="0" dc-uuid="2">
>   <configuration>
>     <crm_config>
>       <cluster_property_set>
>         <nvpair name="stonith-action" value="off"/>
>         <nvpair name="cluster-infrastructure" value="corosync"/>
>         <nvpair name="cluster-name" value="gopher11"/>
>         <nvpair name="cluster-recheck-interval" value="60"/>
>         <nvpair name="start-failure-is-fatal" value="false"/>
>       </cluster_property_set>
>     </crm_config>
>     <nodes/>
>     <resources/>
>     <constraints/>
>   </configuration>
>   <status/>
> </cib>
>
> From: Ken Gaillot
> Sent: Monday, January 29, 2024 10:51 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Cc: Faaland, Olaf P.
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
>
> On Mon, 2024-01-29 at 18:05 +, Faaland, Olaf P. via Users wrote:
> > Hi,
> >
> > I have configured clusters of node pairs, so each cluster has 2
> > nod
Re: [ClusterLabs] controlling cluster behavior on startup
Thank you, Ken. I changed my configuration management system to put an initial cib.xml into /var/lib/pacemaker/cib/, which sets all the property values I was setting via pcs commands, including dc-deadtime. I removed those "pcs property set" commands from the ones that are run at startup time. That worked in the sense that after Pacemaker start, the node waits my newly specified dc-deadtime of 300s before giving up on the partner node and fencing it, if the partner never appears as a member. However, now it seems to wait that amount of time before it elects a DC, even when quorum is acquired earlier. In my log snippet below, with dc-deadtime 300s, 14:14:24 Pacemaker starts on gopher12 14:17:04 quorum is acquired 14:19:26 Election Trigger just popped (start time + dc-deadtime seconds) 14:19:26 gopher12 wins the election Is there other configuration that needs to be present in the cib at startup time? thanks, Olaf === log extract using new system of installing partial cib.xml before startup Jan 29 14:14:24 gopher12 pacemakerd [123690] (main)notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features:agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages monotonic nagios ncurses remote systemd Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] (attrd_start_election_if_needed) info: Starting an election to determine the writer Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] (election_check) info: election-attrd won by local node Jan 29 14:14:25 gopher12 pacemaker-controld [123697] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state) Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (quorum_notification_cb) notice: Quorum acquired | membership=54 members=2 Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (crm_timer_popped) info: Election Trigger just popped | input=I_DC_TIMEOUT time=30ms Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) warning: Input I_DC_TIMEOUT received 
in state S_PENDING from crm_timer_popped
Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_state_transition)
info: State transition S_PENDING -> S_ELECTION | input=I_DC_TIMEOUT
cause=C_TIMER_POPPED origin=crm_timer_popped
Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (election_check)
info: election-DC won by local node
Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info: Input
I_ELECTION_DC received in state S_ELECTION from election_win_cb
Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_state_transition)
notice: State transition S_ELECTION -> S_INTEGRATION | input=I_ELECTION_DC
cause=C_FSA_INTERNAL origin=election_win_cb
Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696]
(recurring_op_for_active) info: Start 10s-interval monitor for
gopher11_zpool on gopher11
Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696]
(recurring_op_for_active) info: Start 10s-interval monitor for
gopher12_zpool on gopher12

=== initial cib.xml contents

From: Ken Gaillot
Sent: Monday, January 29, 2024 10:51 AM
To: Cluster Labs - All topics related to open-source clustering welcomed
Cc: Faaland, Olaf P.
Subject: Re: [ClusterLabs] controlling cluster behavior on startup

On Mon, 2024-01-29 at 18:05 +, Faaland, Olaf P. via Users wrote:
> Hi,
>
> I have configured clusters of node pairs, so each cluster has 2
> nodes. The cluster members are statically defined in corosync.conf
> before corosync or pacemaker is started, and quorum {two_node: 1} is
> set.
>
> When both nodes are powered off and I power them on, they do not
> start pacemaker at exactly the same time. The time difference may be
> a few minutes depending on other factors outside the nodes.
>
> My goals are (I call the first node to start pacemaker "node1"):
> 1) I want to control how long pacemaker on node1 waits before fencing
> node2 if node2 does not start pacemaker.
> 2) If node1 is part-way through that waiting period, and node2 starts > pacemaker so they detect each other, I would like them to proceed > immediately to probing resource state and starting resources which > are down, not wait until the end of that "grace period". > > It looks from the documentation like dc-deadtime is how #1 is > controlled, and #2 is expected normal behavior. However, I'm seeing > fence actions before dc-deadtime has passed. > > Am I misunderstanding Pacemaker's expected behavior and/or how dc- > deadtime should be used? You have everything right. The problem is that you're starting with an empty configuration every time, so the default dc-deadtime is being used for
Re: [ClusterLabs] controlling cluster behavior on startup
On Monday, January 29, 2024, Ken Gaillot wrote:
> On Mon, 2024-01-29 at 18:05 +, Faaland, Olaf P. via Users wrote:
>> Hi,
>>
>> I have configured clusters of node pairs, so each cluster has 2
>> nodes. The cluster members are statically defined in corosync.conf
>> before corosync or pacemaker is started, and quorum {two_node: 1} is
>> set.
>>
>> When both nodes are powered off and I power them on, they do not
>> start pacemaker at exactly the same time. The time difference may be
>> a few minutes depending on other factors outside the nodes.
>>
>> My goals are (I call the first node to start pacemaker "node1"):
>> 1) I want to control how long pacemaker on node1 waits before fencing
>> node2 if node2 does not start pacemaker.
>> 2) If node1 is part-way through that waiting period, and node2 starts
>> pacemaker so they detect each other, I would like them to proceed
>> immediately to probing resource state and starting resources which
>> are down, not wait until the end of that "grace period".
>>
>> It looks from the documentation like dc-deadtime is how #1 is
>> controlled, and #2 is expected normal behavior. However, I'm seeing
>> fence actions before dc-deadtime has passed.
>>
>> Am I misunderstanding Pacemaker's expected behavior and/or how dc-
>> deadtime should be used?
>
> You have everything right. The problem is that you're starting with an
> empty configuration every time, so the default dc-deadtime is being
> used for the first election (before you can set the desired value).

Why would there be fence actions before dc-deadtime expires though?

> I can't think of anything you can do to get around that, since the
> controller starts the timer as soon as it starts up. Would it be
> possible to bake an initial configuration into the PXE image?
>
> When the timer value changes, we could stop the existing timer and
> restart it.
> There's a risk that some external automation could make
> repeated changes to the timeout, thus never letting it expire, but that
> seems preferable to your problem. I've created an issue for that:
>
> https://projects.clusterlabs.org/T764
>
> BTW there's also election-timeout. I'm not sure offhand how that
> interacts; it might be necessary to raise that one as well.
>
>> One possibly unusual aspect of this cluster is that these two nodes
>> are stateless - they PXE boot from an image on another server - and I
>> build the cluster configuration at boot time with a series of pcs
>> commands, because the nodes have no local storage for this
>> purpose. The commands are:
>>
>> ['pcs', 'cluster', 'start']
>> ['pcs', 'property', 'set', 'stonith-action=off']
>> ['pcs', 'property', 'set', 'cluster-recheck-interval=60']
>> ['pcs', 'property', 'set', 'start-failure-is-fatal=false']
>> ['pcs', 'property', 'set', 'dc-deadtime=300']
>> ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman',
>> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
>> 'pcmk_host_list=gopher11,gopher12']
>> ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman',
>> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
>> 'pcmk_host_list=gopher11,gopher12']
>> ['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool',
>> 'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11', 'op',
>> 'start', 'timeout=805']
>> ...
>> ['pcs', 'property', 'set', 'no-quorum-policy=ignore']
>
> BTW you don't need to change no-quorum-policy when you're using
> two_node with Corosync.
>
>> I could, instead, generate a CIB so that when Pacemaker is started,
>> it has a full config. Is that better?
>>
>> thanks,
>> Olaf
>>
>> === corosync.conf:
>> totem {
>>   version: 2
>>   cluster_name: gopher11
>>   secauth: off
>>   transport: udpu
>> }
>> nodelist {
>>   node {
>>     ring0_addr: gopher11
>>     name: gopher11
>>     nodeid: 1
>>   }
>>   node {
>>     ring0_addr: gopher12
>>     name: gopher12
>>     nodeid: 2
>>   }
>> }
>> quorum {
>>   provider: corosync_votequorum
>>   two_node: 1
>> }
>>
>> === Log excerpt
>>
>> Here's an excerpt from Pacemaker logs that reflect what I'm
>> seeing. These are from gopher12, the node that came up first. The
>> other node, which is not yet up, is gopher11.
>>
>> Jan 25 17:55:38 gopher12 pacemakerd [116033] (main) notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features: agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages monotonic nagios ncurses remote systemd
>> Jan 25 17:55:39 gopher12 pacemaker-controld [116040] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state)
>> Jan 25 17:55:43 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']: <nvpair name="dc-deadtime" value="300"/>
>> Jan 25 17:56:00 gopher12 pacemaker-controld [116040] (crm_timer_popped) info: Election Trigger ju
Re: [ClusterLabs] controlling cluster behavior on startup
On Mon, 2024-01-29 at 18:05 +, Faaland, Olaf P. via Users wrote:
> Hi,
>
> I have configured clusters of node pairs, so each cluster has 2
> nodes. The cluster members are statically defined in corosync.conf
> before corosync or pacemaker is started, and quorum {two_node: 1} is
> set.
>
> When both nodes are powered off and I power them on, they do not
> start pacemaker at exactly the same time. The time difference may be
> a few minutes depending on other factors outside the nodes.
>
> My goals are (I call the first node to start pacemaker "node1"):
> 1) I want to control how long pacemaker on node1 waits before fencing
> node2 if node2 does not start pacemaker.
> 2) If node1 is part-way through that waiting period, and node2 starts
> pacemaker so they detect each other, I would like them to proceed
> immediately to probing resource state and starting resources which
> are down, not wait until the end of that "grace period".
>
> It looks from the documentation like dc-deadtime is how #1 is
> controlled, and #2 is expected normal behavior. However, I'm seeing
> fence actions before dc-deadtime has passed.
>
> Am I misunderstanding Pacemaker's expected behavior and/or how dc-
> deadtime should be used?

You have everything right. The problem is that you're starting with an
empty configuration every time, so the default dc-deadtime is being
used for the first election (before you can set the desired value).

I can't think of anything you can do to get around that, since the
controller starts the timer as soon as it starts up. Would it be
possible to bake an initial configuration into the PXE image?

When the timer value changes, we could stop the existing timer and
restart it. There's a risk that some external automation could make
repeated changes to the timeout, thus never letting it expire, but that
seems preferable to your problem. I've created an issue for that:

https://projects.clusterlabs.org/T764

BTW there's also election-timeout.
I'm not sure offhand how that interacts; it might be necessary to raise
that one as well.

> One possibly unusual aspect of this cluster is that these two nodes
> are stateless - they PXE boot from an image on another server - and I
> build the cluster configuration at boot time with a series of pcs
> commands, because the nodes have no local storage for this
> purpose. The commands are:
>
> ['pcs', 'cluster', 'start']
> ['pcs', 'property', 'set', 'stonith-action=off']
> ['pcs', 'property', 'set', 'cluster-recheck-interval=60']
> ['pcs', 'property', 'set', 'start-failure-is-fatal=false']
> ['pcs', 'property', 'set', 'dc-deadtime=300']
> ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman',
> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> 'pcmk_host_list=gopher11,gopher12']
> ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman',
> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> 'pcmk_host_list=gopher11,gopher12']
> ['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool',
> 'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11', 'op',
> 'start', 'timeout=805']
> ...
> ['pcs', 'property', 'set', 'no-quorum-policy=ignore']

BTW you don't need to change no-quorum-policy when you're using
two_node with Corosync.

> I could, instead, generate a CIB so that when Pacemaker is started,
> it has a full config. Is that better?
>
> thanks,
> Olaf
>
> === corosync.conf:
> totem {
>   version: 2
>   cluster_name: gopher11
>   secauth: off
>   transport: udpu
> }
> nodelist {
>   node {
>     ring0_addr: gopher11
>     name: gopher11
>     nodeid: 1
>   }
>   node {
>     ring0_addr: gopher12
>     name: gopher12
>     nodeid: 2
>   }
> }
> quorum {
>   provider: corosync_votequorum
>   two_node: 1
> }
>
> === Log excerpt
>
> Here's an excerpt from Pacemaker logs that reflect what I'm
> seeing. These are from gopher12, the node that came up first. The
> other node, which is not yet up, is gopher11.
>
> Jan 25 17:55:38 gopher12 pacemakerd [116033] (main) notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features: agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages monotonic nagios ncurses remote systemd
> Jan 25 17:55:39 gopher12 pacemaker-controld [116040] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state)
> Jan 25 17:55:43 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']: <nvpair name="dc-deadtime" value="300"/>
> Jan 25 17:56:00 gopher12 pacemaker-controld [116040] (crm_timer_popped) info: Election Trigger just popped | input=I_DC_TIMEOUT time=30ms
> Jan 25 17:56:01 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']:
> Jan 25 17:56:01 gopher12
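Ken's suggestion of baking an initial configuration into the PXE image could be sketched roughly as below. This is a hedged sketch, not a tested recipe: the hand-written CIB skeleton, its validate-with value, and the nvpair id are illustrative (running `cibadmin --empty` on a node with Pacemaker installed emits the correct skeleton for that version), and the install path and ownership should be checked against the distribution.

```shell
# Sketch: stage an initial CIB that already contains dc-deadtime, so the
# very first election on a freshly PXE-booted node uses 300s instead of
# the default. All ids and the validate-with value are illustrative.
cat > /tmp/cib.xml <<'EOF'
<cib admin_epoch="0" epoch="1" num_updates="0" validate-with="pacemaker-3.0">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-deadtime"
                name="dc-deadtime" value="300s"/>
      </cluster_property_set>
    </crm_config>
    <nodes/>
    <resources/>
    <constraints/>
  </configuration>
  <status/>
</cib>
EOF

# In the image build, install it where pacemaker-based reads the CIB at
# startup (ownership by the cluster user matters):
#   install -D -m 0600 -o hacluster -g haclient /tmp/cib.xml \
#       /var/lib/pacemaker/cib/cib.xml
grep -q 'dc-deadtime' /tmp/cib.xml && echo "cib staged"
```

With a staged CIB, the controller arms its election timer with the configured dc-deadtime from the first start, rather than the default that applies before the pcs commands have run.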
[ClusterLabs] controlling cluster behavior on startup
Hi,

I have configured clusters of node pairs, so each cluster has 2 nodes.
The cluster members are statically defined in corosync.conf before
corosync or pacemaker is started, and quorum {two_node: 1} is set.

When both nodes are powered off and I power them on, they do not start
pacemaker at exactly the same time. The time difference may be a few
minutes depending on other factors outside the nodes.

My goals are (I call the first node to start pacemaker "node1"):
1) I want to control how long pacemaker on node1 waits before fencing
node2 if node2 does not start pacemaker.
2) If node1 is part-way through that waiting period, and node2 starts
pacemaker so they detect each other, I would like them to proceed
immediately to probing resource state and starting resources which are
down, not wait until the end of that "grace period".

It looks from the documentation like dc-deadtime is how #1 is
controlled, and #2 is expected normal behavior. However, I'm seeing
fence actions before dc-deadtime has passed.

Am I misunderstanding Pacemaker's expected behavior and/or how
dc-deadtime should be used?

One possibly unusual aspect of this cluster is that these two nodes are
stateless - they PXE boot from an image on another server - and I build
the cluster configuration at boot time with a series of pcs commands,
because the nodes have no local storage for this purpose.
The commands are:

['pcs', 'cluster', 'start']
['pcs', 'property', 'set', 'stonith-action=off']
['pcs', 'property', 'set', 'cluster-recheck-interval=60']
['pcs', 'property', 'set', 'start-failure-is-fatal=false']
['pcs', 'property', 'set', 'dc-deadtime=300']
['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman',
'ip=192.168.64.65', 'pcmk_host_check=static-list',
'pcmk_host_list=gopher11,gopher12']
['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman',
'ip=192.168.64.65', 'pcmk_host_check=static-list',
'pcmk_host_list=gopher11,gopher12']
['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool',
'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11', 'op',
'start', 'timeout=805']
...
['pcs', 'property', 'set', 'no-quorum-policy=ignore']

I could, instead, generate a CIB so that when Pacemaker is started, it
has a full config. Is that better?

thanks,
Olaf

=== corosync.conf:
totem {
  version: 2
  cluster_name: gopher11
  secauth: off
  transport: udpu
}
nodelist {
  node {
    ring0_addr: gopher11
    name: gopher11
    nodeid: 1
  }
  node {
    ring0_addr: gopher12
    name: gopher12
    nodeid: 2
  }
}
quorum {
  provider: corosync_votequorum
  two_node: 1
}

=== Log excerpt

Here's an excerpt from Pacemaker logs that reflect what I'm seeing.
These are from gopher12, the node that came up first. The other node,
which is not yet up, is gopher11.
Jan 25 17:55:38 gopher12 pacemakerd [116033] (main) notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features: agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages monotonic nagios ncurses remote systemd
Jan 25 17:55:39 gopher12 pacemaker-controld [116040] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state)
Jan 25 17:55:43 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']:
Jan 25 17:56:00 gopher12 pacemaker-controld [116040] (crm_timer_popped) info: Election Trigger just popped | input=I_DC_TIMEOUT time=30ms
Jan 25 17:56:01 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']:
Jan 25 17:56:01 gopher12 pacemaker-controld [116040] (abort_transition_graph) info: Transition 0 aborted by cib-bootstrap-options-no-quorum-policy doing create no-quorum-policy=ignore: Configuration change | cib=0.26.0 source=te_update_diff_v2:464 path=/cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options'] complete=true
Jan 25 17:56:01 gopher12 pacemaker-controld [116040] (controld_execute_fence_action) notice: Requesting fencing (off) targeting node gopher11 | action=11 timeout=60

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
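The ExecStartPre workaround Chris described earlier in the thread (wait for the peer's corosync before starting our own) could be wired into the PXE image as a systemd drop-in along these lines. A sketch only: the drop-in path, the hard-coded peer name gopher11, and the ~5 minute cap are assumptions, and `systemctl -H` requires working ssh access to the peer.

```ini
# /etc/systemd/system/corosync.service.d/wait-for-peer.conf (illustrative)
# Wait up to ~5 minutes for the peer's corosync before starting ours, so
# both nodes come up together and the default 20s dc-deadtime suffices.
[Service]
ExecStartPre=/bin/sh -c 'for i in $(seq 1 60); do \
    systemctl -H gopher11 is-active corosync && break; sleep 5; done'
```

The cap makes the wait best-effort: if the peer never appears, the loop exits successfully after the last sleep and corosync starts anyway, so the normal dc-deadtime and fencing behavior still applies when one node is genuinely down.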