Re: [ClusterLabs] controlling cluster behavior on startup

2024-01-29 Thread Faaland, Olaf P. via Users
> === initial cib.xml contents
> [The XML markup was stripped by the list archive. The attribute values that
> survive are: on the cib element, num_updates="0" admin_epoch="0"
> cib-last-written="Mon Jan 29 11:07:06 2024" update-origin="gopher12"
> update-client="root" update-user="root" have-quorum="0" dc-uuid="2";
> in crm_config, the properties stonith-action=off,
> cluster-infrastructure=corosync, cluster-name=gopher11,
> cluster-recheck-interval=60, and start-failure-is-fatal=false.
> The remaining elements were stripped entirely.]
>
> 
> From: Ken Gaillot 
> Sent: Monday, January 29, 2024 10:51 AM
> To: Cluster Labs - All topics related to open-source clustering
> welcomed
> Cc: Faaland, Olaf P.
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
>
> On Mon, 2024-01-29 at 18:05 +, Faaland, Olaf P. via Users wrote:
> > Hi,
> >
> > I have configured clusters of node pairs, so each cluster has 2
> > nodes.  The cluster members are statically defined in corosync.conf
> > before corosync or pacemaker is started, and quorum {two_node: 1}
> > is
> > set.
> >
> > When both nodes are powered off and I power them on, they do not
> > start pacemaker at exactly the same time.  The time difference may
> > be
> > a few minutes depending on other factors outside the nodes.
> >
> > My goals are (I call the first node to start pacemaker "node1"):
> > 1) I want to control how long pacemaker on node1 waits before
> > fencing
> > node2 if node2 does not start pacemaker.
> > 2) If node1 is part-way through that waiting period, and node2
> > starts
> > pacemaker so they detect each other, I would like them to proceed
> > immediately to probing resource state and starting resources which
> > are down, not wait until the end of that "grace period".
> >
> > It looks from the documentation like dc-deadtime is how #1 is
> > controlled, and #2 is expected normal behavior.  However, I'm
> > seeing
> > fence actions before dc-deadtime has passed.
> >
> > Am I misunderstanding Pacemaker's expected behavior and/or how dc-
> > deadtime should be used?
>
> You have everything right. The problem is that you're starting with an
> empty configuration every time, so the default dc-deadtime is being
> used for the first election (before you can set the desired value).
>
> I can't think of anything you can do to get around that, since the
> controller starts the timer as soon as it starts up. Would it be
> possible to bake an initial configuration into the PXE image?
>
> When the timer value changes, we could stop the existing timer and
> restart it. There's a risk that some external automation could make
> repeated changes to the timeout, thus never letting it expire, but that
> seems preferable to your problem. I've created an issue for that:
>
>
> https://projects.clusterlabs.org/T764
>
> BTW there's also election-timeout. I'm not sure offhand how that
> interacts; it might be necessary to raise that one as well.
>
> > One possibly unusual aspect of this cluster is that these two nodes
> > are stateless - they PXE boot from an image on another server - and
> > I
> > build the cluster configuration at boot time with a series of pcs
> > commands, because the nodes have no local storage for this
> > purpose.  The commands are:
> >
> > ['pcs', 'cluster', 'start']
> > ['pcs', 'property', 'set', 'stonith-action=off']
> > ['pcs', 'property', 'set', 'cluster-recheck-interval=60']
> > ['pcs', 'property', 'set', 'start-failure-is-fatal=false']
> > ['pcs', 'property', 'set', 'dc-deadtime=300']
> > ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman',
> > 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> > 'pcmk_host_list=gopher11,gopher12']
> > ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence

Re: [ClusterLabs] controlling cluster behavior on startup

2024-01-29 Thread Faaland, Olaf P. via Users
Thank you, Ken.

I changed my configuration management system to put an initial cib.xml into 
/var/lib/pacemaker/cib/, which sets all the property values I was setting via 
pcs commands, including dc-deadtime.  I removed those "pcs property set" 
commands from the ones that are run at startup time.
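
For anyone doing the same thing, the file has to end up readable by the cluster 
user, since pacemaker-based runs as hacluster.  A minimal sketch of the 
provisioning step, assuming the stock hacluster/haclient accounts (the source 
path here is just an example):

# create the CIB directory with the ownership pacemaker expects
install -d -o hacluster -g haclient -m 0750 /var/lib/pacemaker/cib
# drop in the pre-built CIB before pacemaker is started
install -o hacluster -g haclient -m 0600 initial-cib.xml /var/lib/pacemaker/cib/cib.xml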

That worked, in the sense that after Pacemaker starts, the node waits for my 
newly specified dc-deadtime of 300s before giving up on the partner node and 
fencing it, if the partner never appears as a member.

However, now it seems to wait that amount of time before it elects a DC, even 
when quorum is acquired earlier.  In my log snippet below, with dc-deadtime 
300s,

14:14:24 Pacemaker starts on gopher12
14:17:04 quorum is acquired
14:19:26 Election Trigger just popped (start time + dc-deadtime seconds)
14:19:26 gopher12 wins the election

Is there other configuration that needs to be present in the cib at startup 
time?

thanks,
Olaf

=== log extract using new system of installing partial cib.xml before startup
Jan 29 14:14:24 gopher12 pacemakerd  [123690] (main)notice: 
Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features:agent-manpages ascii-docs 
compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages 
monotonic nagios ncurses remote systemd
Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] 
(attrd_start_election_if_needed)  info: Starting an election to determine the 
writer
Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] (election_check)  info: 
election-attrd won by local node
Jan 29 14:14:25 gopher12 pacemaker-controld  [123697] (peer_update_callback)
info: Cluster node gopher12 is now member (was in unknown state)
Jan 29 14:17:04 gopher12 pacemaker-controld  [123697] (quorum_notification_cb)  
notice: Quorum acquired | membership=54 members=2
Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (crm_timer_popped)
info: Election Trigger just popped | input=I_DC_TIMEOUT time=30ms
Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_log)  warning: Input 
I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped
Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_state_transition) 
info: State transition S_PENDING -> S_ELECTION | input=I_DC_TIMEOUT 
cause=C_TIMER_POPPED origin=crm_timer_popped
Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (election_check)  info: 
election-DC won by local node
Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_log)  info: Input 
I_ELECTION_DC received in state S_ELECTION from election_win_cb
Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_state_transition) 
notice: State transition S_ELECTION -> S_INTEGRATION | input=I_ELECTION_DC 
cause=C_FSA_INTERNAL origin=election_win_cb
Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696] (recurring_op_for_active) 
info: Start 10s-interval monitor for gopher11_zpool on gopher11
Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696] (recurring_op_for_active) 
info: Start 10s-interval monitor for gopher12_zpool on gopher12


=== initial cib.xml contents

[The XML here was stripped by the list archive; the same cib.xml is quoted, 
with some attribute values still visible, earlier on this page.]
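
For reference, a minimal initial CIB carrying the properties described above 
would look roughly like the following; the nvpair ids, epoch, and schema 
version are illustrative rather than copied from the stripped original:

<cib validate-with="pacemaker-3.9" epoch="1" num_updates="0" admin_epoch="0">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="opt-stonith-action" name="stonith-action" value="off"/>
        <nvpair id="opt-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
        <nvpair id="opt-cluster-name" name="cluster-name" value="gopher11"/>
        <nvpair id="opt-cluster-recheck-interval" name="cluster-recheck-interval" value="60"/>
        <nvpair id="opt-start-failure-is-fatal" name="start-failure-is-fatal" value="false"/>
        <nvpair id="opt-dc-deadtime" name="dc-deadtime" value="300"/>
      </cluster_property_set>
    </crm_config>
    <nodes/>
    <resources/>
    <constraints/>
  </configuration>
  <status/>
</cib>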


From: Ken Gaillot 
Sent: Monday, January 29, 2024 10:51 AM
To: Cluster Labs - All topics related to open-source clustering welcomed
Cc: Faaland, Olaf P.
Subject: Re: [ClusterLabs] controlling cluster behavior on startup

On Mon, 2024-01-29 at 18:05 +0000, Faaland, Olaf P. via Users wrote:
> Hi,
>
> I have configured clusters of node pairs, so each cluster has 2
> nodes.  The cluster members are statically defined in corosync.conf
> before corosync or pacemaker is started, and quorum {two_node: 1} is
> set.
>
> When both nodes are powered off and I power them on, they do not
> start pacemaker at exactly the same time.  The time difference may be
> a few minutes depending on other factors outside the nodes.
>
> My goals are (I call the first node to start pacemaker "node1"):
> 1) I want to control how long pacemaker on node1 waits before fencing
> node2 if node2 does not start pacemaker.
> 2) If node1 is part-way through that waiting period, and node2 starts
> pacemaker so they detect each other, I would like them to proceed
> immediately to probing resource state and starting resources which
> are down, not wait until the end of that "grace period".
>
> It looks from the documentation like dc-deadtime is how #1 is
> controlled, and #2 is expected normal behavior.  However, I'm seeing
> fence actions before dc-deadtime has passed.
>
> Am I misunderstanding Pacemaker's expected behavior and/or how dc-
> deadtime should be used?

You have everything right. The problem is that you're starting with an
empty configuration every time, so the default dc-deadtime is being
used for the first election (before you can set the desired value).

[ClusterLabs] controlling cluster behavior on startup

2024-01-29 Thread Faaland, Olaf P. via Users
Hi,

I have configured clusters of node pairs, so each cluster has 2 nodes.  The 
cluster members are statically defined in corosync.conf before corosync or 
pacemaker is started, and quorum {two_node: 1} is set.

When both nodes are powered off and I power them on, they do not start 
pacemaker at exactly the same time.  The time difference may be a few minutes 
depending on other factors outside the nodes.

My goals are (I call the first node to start pacemaker "node1"):
1) I want to control how long pacemaker on node1 waits before fencing node2 if 
node2 does not start pacemaker.
2) If node1 is part-way through that waiting period, and node2 starts pacemaker 
so they detect each other, I would like them to proceed immediately to probing 
resource state and starting resources which are down, not wait until the end of 
that "grace period".

It looks from the documentation like dc-deadtime is how #1 is controlled, and 
#2 is expected normal behavior.  However, I'm seeing fence actions before 
dc-deadtime has passed.

Am I misunderstanding Pacemaker's expected behavior and/or how dc-deadtime 
should be used?

One possibly unusual aspect of this cluster is that these two nodes are 
stateless - they PXE boot from an image on another server - and I build the 
cluster configuration at boot time with a series of pcs commands, because the 
nodes have no local storage for this purpose.  The commands are:

['pcs', 'cluster', 'start']
['pcs', 'property', 'set', 'stonith-action=off']
['pcs', 'property', 'set', 'cluster-recheck-interval=60']
['pcs', 'property', 'set', 'start-failure-is-fatal=false']
['pcs', 'property', 'set', 'dc-deadtime=300']
['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman', 
'ip=192.168.64.65', 'pcmk_host_check=static-list', 
'pcmk_host_list=gopher11,gopher12']
['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman', 
'ip=192.168.64.65', 'pcmk_host_check=static-list', 
'pcmk_host_list=gopher11,gopher12']
['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool', 
'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11', 'op', 'start', 
'timeout=805']
...
['pcs', 'property', 'set', 'no-quorum-policy=ignore']

I could, instead, generate a CIB so that when Pacemaker is started, it has a 
full config.  Is that better?
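
If it helps, here is a rough sketch of that approach: build the whole CIB 
offline, without a running cluster, and drop it in place before Pacemaker 
starts.  The scratch path and exact command sequence are illustrative, not 
something I have verified:

# build a complete CIB in a scratch file
cibadmin --empty > /tmp/cib.xml
pcs -f /tmp/cib.xml property set stonith-action=off
pcs -f /tmp/cib.xml property set cluster-recheck-interval=60
pcs -f /tmp/cib.xml property set start-failure-is-fatal=false
pcs -f /tmp/cib.xml property set dc-deadtime=300
pcs -f /tmp/cib.xml property set no-quorum-policy=ignore
pcs -f /tmp/cib.xml stonith create fence_gopher11 fence_powerman \
    ip=192.168.64.65 pcmk_host_check=static-list pcmk_host_list=gopher11,gopher12
# ... remaining stonith device and resources ...
# install it where pacemaker-based expects it, then start the cluster
install -o hacluster -g haclient -m 0600 /tmp/cib.xml /var/lib/pacemaker/cib/cib.xml
pcs cluster start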

thanks,
Olaf

=== corosync.conf:
totem {
    version: 2
    cluster_name: gopher11
    secauth: off
    transport: udpu
}

nodelist {
    node {
        ring0_addr: gopher11
        name: gopher11
        nodeid: 1
    }
    node {
        ring0_addr: gopher12
        name: gopher12
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

=== Log excerpt

Here's an excerpt from the Pacemaker logs that reflects what I'm seeing.  These 
are from gopher12, the node that came up first.  The other node, which is not 
yet up, is gopher11.

Jan 25 17:55:38 gopher12 pacemakerd  [116033] (main)notice: 
Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features:agent-manpages ascii-docs 
compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages 
monotonic nagios ncurses remote systemd
Jan 25 17:55:39 gopher12 pacemaker-controld  [116040] (peer_update_callback)
info: Cluster node gopher12 is now member (was in unknown state)
Jan 25 17:55:43 gopher12 pacemaker-based [116035] (cib_perform_op)  info: 
++ 
/cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']:
  
Jan 25 17:56:00 gopher12 pacemaker-controld  [116040] (crm_timer_popped)
info: Election Trigger just popped | input=I_DC_TIMEOUT time=30ms
Jan 25 17:56:01 gopher12 pacemaker-based [116035] (cib_perform_op)  info: 
++ 
/cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']:
  
Jan 25 17:56:01 gopher12 pacemaker-controld  [116040] (abort_transition_graph)  
info: Transition 0 aborted by cib-bootstrap-options-no-quorum-policy doing 
create no-quorum-policy=ignore: Configuration change | cib=0.26.0 
source=te_update_diff_v2:464 
path=/cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']
 complete=true
Jan 25 17:56:01 gopher12 pacemaker-controld  [116040] 
(controld_execute_fence_action)   notice: Requesting fencing (off) targeting 
node gopher11 | action=11 timeout=60


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/