Re: [ClusterLabs] trigger something at ?
On 31/01/2024 16:37, lejeczek via Users wrote: On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote: On Wed, 31 Jan 2024 16:02:12 +0100 lejeczek via Users wrote: On 29/01/2024 17:22, Ken Gaillot wrote: On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote: Hi guys. Is it possible to trigger some... action - I'm thinking specifically at shutdown/start. If not within the cluster then - if you do that - perhaps outside. I would like to create/remove constraints, when cluster starts & stops, respectively. many thanks, L. You could use node status alerts for that, but it's risky for alert agents to change the configuration (since that may result in more alerts and potentially some sort of infinite loop). Pacemaker has no concept of a full cluster start/stop, only node start/stop. You could approximate that by checking whether the node receiving the alert is the only active node. Another possibility would be to write a resource agent that does what you want and order everything else after it. However it's even more risky for a resource agent to modify the configuration. Finally you could write a systemd unit to do what you want and order it after pacemaker. What's wrong with leaving the constraints permanently configured? yes, that would be for a node start/stop I struggle with using constraints to move pgsql (PAF) master onto a given node - seems that co/locating paf's master results in troubles (replication brakes) at/after node shutdown/reboot (not always, but way too often) What? What's wrong with colocating PAF's masters exactly? How does it brake any replication? What's these constraints you are dealing with? Could you share your configuration? Constraints beyond/above of what is required by PAF agent itself, say... 
you have multiple pgSQL cluster with PAF - thus multiple (separate, for each pgSQL cluster) masters and you want to spread/balance those across HA cluster (or in other words - avoid having more than 1 pgsql master per HA node)

These below, I've tried, those move the master onto chosen node but.. then the issues I mentioned.

-> $ pcs constraint location PGSQL-PAF-5438-clone prefers ubusrv1=1002
or
-> $ pcs constraint colocation set PGSQL-PAF-5435-clone PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master require-all=false setoptions score=-1000

Wanted to share an observation - not a measurement of anything, I did not take those - about a different, latest pgSQL version which I put in place of version 14, which I had been using all this time (with that upgrade - from Postgres' own repos - also came an update of PAF).
So, with pgSQL ver. 16 and everything else the same, the paf/pgSQL resources now behave a lot, lot better and survive just fine all those cases - with ! extra constraints of course - where previously there were replication failures.

___
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] trigger something at ?
On 01/02/2024 15:02, Jehan-Guillaume de Rorthais wrote: On Wed, 31 Jan 2024 18:23:40 +0100 lejeczek via Users wrote: On 31/01/2024 17:13, Jehan-Guillaume de Rorthais wrote: On Wed, 31 Jan 2024 16:37:21 +0100 lejeczek via Users wrote: On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote: On Wed, 31 Jan 2024 16:02:12 +0100 lejeczek via Users wrote: On 29/01/2024 17:22, Ken Gaillot wrote: On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote: Hi guys. Is it possible to trigger some... action - I'm thinking specifically at shutdown/start. If not within the cluster then - if you do that - perhaps outside. I would like to create/remove constraints, when cluster starts & stops, respectively. many thanks, L. You could use node status alerts for that, but it's risky for alert agents to change the configuration (since that may result in more alerts and potentially some sort of infinite loop). Pacemaker has no concept of a full cluster start/stop, only node start/stop. You could approximate that by checking whether the node receiving the alert is the only active node. Another possibility would be to write a resource agent that does what you want and order everything else after it. However it's even more risky for a resource agent to modify the configuration. Finally you could write a systemd unit to do what you want and order it after pacemaker. What's wrong with leaving the constraints permanently configured? yes, that would be for a node start/stop I struggle with using constraints to move pgsql (PAF) master onto a given node - seems that co/locating paf's master results in troubles (replication brakes) at/after node shutdown/reboot (not always, but way too often) What? What's wrong with colocating PAF's masters exactly? How does it brake any replication? What's these constraints you are dealing with? Could you share your configuration? Constraints beyond/above of what is required by PAF agent itself, say... 
you have multiple pgSQL cluster with PAF - thus multiple (separate, for each pgSQL cluster) masters and you want to spread/balance those across HA cluster (or in other words - avoid having more than 1 pgsql master per HA node)

ok

These below, I've tried, those move the master onto chosen node but.. then the issues I mentioned.

You just mentioned it breaks the replication, but there is so little information about your architecture and configuration, it's impossible to imagine how this could break the replication. Could you add details about the issues?

-> $ pcs constraint location PGSQL-PAF-5438-clone prefers ubusrv1=1002
or
-> $ pcs constraint colocation set PGSQL-PAF-5435-clone PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master require-all=false setoptions score=-1000

I suppose the "colocation" constraint is the way to go, not the "location" one.

This should be easy to replicate, 3 x VMs, Ubuntu 22.04 in my case

No, this is not easy to replicate. I have no idea how you set up your PostgreSQL replication, nor do I have your full pacemaker configuration. Please provide either detailed setups and/or ansible and/or terraform and/or vagrant, then a detailed scenario showing how it breaks. This is how you can help and motivate devs to reproduce your issue and work on it. I will not try to poke around for hours until I find an issue that might not even be the same as yours.

How about you start with the basics - a strange inclination to complicate things when they are not, I hear from you - that's what I did when I "stumbled" upon these "issues".
How about just:
a) do a vanilla-default pgSQL in Ubuntu (or perhaps any other OS of your choice), I use _pg_createcluster_
b) follow the PAF official guide (a single PAF resource should suffice)
Have a healthy pgSQL cluster, OS _reboot_ the nodes - play with that, all should be ok, moving around/electing a master should work a-ok.
Then... add and play with "additional" co/location constraints, then OS reboots - things should begin breaking.
I have a 3-node HA cluster & a 3-node PAF resource = 1 master + 2 slaves.
The only thing I deliberately set, to alleviate pgsql replication problems, was _wal_keep_size_ - I increased that, but this is subjective.

It's fine with me if you don't feel like doing this.
Re: [ClusterLabs] trigger something at ?
On Thu, 2024-02-01 at 14:31 +0100, lejeczek via Users wrote: > > On 31/01/2024 18:11, Ken Gaillot wrote: > > On Wed, 2024-01-31 at 16:37 +0100, lejeczek via Users wrote: > > > On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote: > > > > On Wed, 31 Jan 2024 16:02:12 +0100 > > > > lejeczek via Users wrote: > > > > > > > > > On 29/01/2024 17:22, Ken Gaillot wrote: > > > > > > On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users > > > > > > wrote: > > > > > > > Hi guys. > > > > > > > > > > > > > > Is it possible to trigger some... action - I'm thinking > > > > > > > specifically > > > > > > > at shutdown/start. > > > > > > > If not within the cluster then - if you do that - perhaps > > > > > > > outside. > > > > > > > I would like to create/remove constraints, when cluster > > > > > > > starts & > > > > > > > stops, respectively. > > > > > > > > > > > > > > many thanks, L. > > > > > > > > > > > > > You could use node status alerts for that, but it's risky > > > > > > for > > > > > > alert > > > > > > agents to change the configuration (since that may result > > > > > > in > > > > > > more > > > > > > alerts and potentially some sort of infinite loop). > > > > > > > > > > > > Pacemaker has no concept of a full cluster start/stop, only > > > > > > node > > > > > > start/stop. You could approximate that by checking whether > > > > > > the > > > > > > node > > > > > > receiving the alert is the only active node. > > > > > > > > > > > > Another possibility would be to write a resource agent that > > > > > > does what > > > > > > you want and order everything else after it. However it's > > > > > > even > > > > > > more > > > > > > risky for a resource agent to modify the configuration. > > > > > > > > > > > > Finally you could write a systemd unit to do what you want > > > > > > and > > > > > > order it > > > > > > after pacemaker. > > > > > > > > > > > > What's wrong with leaving the constraints permanently > > > > > > configured? 
> > > > > yes, that would be for a node start/stop > > > > > I struggle with using constraints to move pgsql (PAF) master > > > > > onto a given node - seems that co/locating paf's master > > > > > results in troubles (replication brakes) at/after node > > > > > shutdown/reboot (not always, but way too often) > > > > What? What's wrong with colocating PAF's masters exactly? How > > > > does > > > > it brake any > > > > replication? What's these constraints you are dealing with? > > > > > > > > Could you share your configuration? > > > Constraints beyond/above of what is required by PAF agent > > > itself, say... > > > you have multiple pgSQL cluster with PAF - thus multiple > > > (separate, for each pgSQL cluster) masters and you want to > > > spread/balance those across HA cluster > > > (or in other words - avoid having more that 1 pgsql master > > > per HA node) > > > These below, I've tried, those move the master onto chosen > > > node but.. then the issues I mentioned. > > > > > > -> $ pcs constraint location PGSQL-PAF-5438-clone prefers > > > ubusrv1=1002 > > > or > > > -> $ pcs constraint colocation set PGSQL-PAF-5435-clone > > > PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master > > > require-all=false setoptions score=-1000 > > > > > Anti-colocation sets tend to be tricky currently -- if the first > > resource can't be assigned to a node, none of them can. We have an > > idea > > for a better implementation: > > > > https://projects.clusterlabs.org/T383 > > > > In the meantime, a possible workaround is to use placement- > > strategy=balanced and define utilization for the clones only. The > > promoted roles will each get a slight additional utilization, and > > the > > cluster should spread them out across nodes whenever possible. I > > don't > > know if that will avoid the replication issues but it may be worth > > a > > try. 
> using _balanced_ causes a small mayhem to PAF/pgsql:
>
> -> $ pcs property
> Cluster Properties:
> REDIS-6380_REPL_INFO: ubusrv3
> REDIS-6381_REPL_INFO: ubusrv2
> REDIS-6382_REPL_INFO: ubusrv2
> REDIS-6385_REPL_INFO: ubusrv1
> REDIS_REPL_INFO: ubusrv1
> cluster-infrastructure: corosync
> cluster-name: ubusrv
> dc-version: 2.1.2-ada5c3b36e2
> have-watchdog: false
> last-lrm-refresh: 1706711588
> placement-strategy: default
> stonith-enabled: false
>
> -> $ pcs resource utilization PGSQL-PAF-5438 cpu="20"
>
> -> $ pcs property set placement-strategy=balanced # when resource stops:
> I change it back:
> -> $ pcs property set placement-strategy=default
> and pgSQL/paf works again
>
> I've not used _utilization_ nor _placement-strategy_ before,
> thus chance that I'm missing something is solid.
>

See:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/utilization.html

You have to define the node capacities as well as the resource utilization. I'd define the capacities high enough to run all the resources (assuming you want that capability if all other nodes are in standby or whatever).
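As a rough illustration of defining both sides, here is a minimal sketch. It assumes a single `cpu` attribute, the three node names seen in the property output above, and made-up values (100 per node, 20 for the resource) that are not from the thread. `pcs node utilization` and `pcs resource utilization` are the pcs subcommands for these attributes; the `run` helper, `set_capacities` and the `DRY_RUN` variable are hypothetical names for this example. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: prints the pcs commands unless DRY_RUN=0 is set.
# Node names come from the 'pcs property' output quoted above; the
# cpu values are illustrative assumptions, not recommendations.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Hypothetical helper: echo the command, execute it only when DRY_RUN=0.
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

set_capacities() {
    # Give every listed node the same capacity for the 'cpu' attribute.
    capacity="$1"; shift
    for node in "$@"; do
        run pcs node utilization "$node" "cpu=$capacity"
    done
}

# Node capacities first, then the resource utilization from the thread,
# and only then switch the placement strategy.
set_capacities 100 ubusrv1 ubusrv2 ubusrv3
run pcs resource utilization PGSQL-PAF-5438 cpu=20
run pcs property set placement-strategy=balanced
```

Setting the capacities before switching to `balanced` is a deliberate ordering in this sketch: a resource with a utilization but no node with a matching capacity may have nowhere to run, which would be consistent with the resource stopping as described above.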
Re: [ClusterLabs] trigger something at ?
On Wed, 31 Jan 2024 18:23:40 +0100 lejeczek via Users wrote: > On 31/01/2024 17:13, Jehan-Guillaume de Rorthais wrote: > > On Wed, 31 Jan 2024 16:37:21 +0100 > > lejeczek via Users wrote: > > > >> > >> On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote: > >>> On Wed, 31 Jan 2024 16:02:12 +0100 > >>> lejeczek via Users wrote: > >>> > On 29/01/2024 17:22, Ken Gaillot wrote: > > On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote: > >> Hi guys. > >> > >> Is it possible to trigger some... action - I'm thinking specifically > >> at shutdown/start. > >> If not within the cluster then - if you do that - perhaps outside. > >> I would like to create/remove constraints, when cluster starts & > >> stops, respectively. > >> > >> many thanks, L. > >> > > You could use node status alerts for that, but it's risky for alert > > agents to change the configuration (since that may result in more > > alerts and potentially some sort of infinite loop). > > > > Pacemaker has no concept of a full cluster start/stop, only node > > start/stop. You could approximate that by checking whether the node > > receiving the alert is the only active node. > > > > Another possibility would be to write a resource agent that does what > > you want and order everything else after it. However it's even more > > risky for a resource agent to modify the configuration. > > > > Finally you could write a systemd unit to do what you want and order it > > after pacemaker. > > > > What's wrong with leaving the constraints permanently configured? > yes, that would be for a node start/stop > I struggle with using constraints to move pgsql (PAF) master > onto a given node - seems that co/locating paf's master > results in troubles (replication brakes) at/after node > shutdown/reboot (not always, but way too often) > >>> What? What's wrong with colocating PAF's masters exactly? How does it > >>> brake any replication? What's these constraints you are dealing with? 
> >>>
> >>> Could you share your configuration?
> >> Constraints beyond/above of what is required by PAF agent
> >> itself, say...
> >> you have multiple pgSQL cluster with PAF - thus multiple
> >> (separate, for each pgSQL cluster) masters and you want to
> >> spread/balance those across HA cluster
> >> (or in other words - avoid having more than 1 pgsql master
> >> per HA node)
> > ok
> >
> >> These below, I've tried, those move the master onto chosen
> >> node but.. then the issues I mentioned.
> > You just mentioned it breaks the replication, but there is so little
> > information about your architecture and configuration, it's impossible to
> > imagine how this could break the replication.
> >
> > Could you add details about the issues?
> >
> >> -> $ pcs constraint location PGSQL-PAF-5438-clone prefers
> >> ubusrv1=1002
> >> or
> >> -> $ pcs constraint colocation set PGSQL-PAF-5435-clone
> >> PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master
> >> require-all=false setoptions score=-1000
> > I suppose the "colocation" constraint is the way to go, not the "location"
> > one.
> This should be easy to replicate, 3 x VMs, Ubuntu 22.04 in
> my case

No, this is not easy to replicate. I have no idea how you set up your PostgreSQL replication, nor do I have your full pacemaker configuration. Please provide either detailed setups and/or ansible and/or terraform and/or vagrant, then a detailed scenario showing how it breaks.

This is how you can help and motivate devs to reproduce your issue and work on it.

I will not try to poke around for hours until I find an issue that might not even be the same as yours.
Re: [ClusterLabs] trigger something at ?
On 31/01/2024 18:11, Ken Gaillot wrote: On Wed, 2024-01-31 at 16:37 +0100, lejeczek via Users wrote: On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote: On Wed, 31 Jan 2024 16:02:12 +0100 lejeczek via Users wrote: On 29/01/2024 17:22, Ken Gaillot wrote: On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote: Hi guys. Is it possible to trigger some... action - I'm thinking specifically at shutdown/start. If not within the cluster then - if you do that - perhaps outside. I would like to create/remove constraints, when cluster starts & stops, respectively. many thanks, L. You could use node status alerts for that, but it's risky for alert agents to change the configuration (since that may result in more alerts and potentially some sort of infinite loop). Pacemaker has no concept of a full cluster start/stop, only node start/stop. You could approximate that by checking whether the node receiving the alert is the only active node. Another possibility would be to write a resource agent that does what you want and order everything else after it. However it's even more risky for a resource agent to modify the configuration. Finally you could write a systemd unit to do what you want and order it after pacemaker. What's wrong with leaving the constraints permanently configured? yes, that would be for a node start/stop I struggle with using constraints to move pgsql (PAF) master onto a given node - seems that co/locating paf's master results in troubles (replication brakes) at/after node shutdown/reboot (not always, but way too often) What? What's wrong with colocating PAF's masters exactly? How does it brake any replication? What's these constraints you are dealing with? Could you share your configuration? Constraints beyond/above of what is required by PAF agent itself, say... 
you have multiple pgSQL cluster with PAF - thus multiple (separate, for each pgSQL cluster) masters and you want to spread/balance those across HA cluster (or in other words - avoid having more than 1 pgsql master per HA node)

These below, I've tried, those move the master onto chosen node but.. then the issues I mentioned.

-> $ pcs constraint location PGSQL-PAF-5438-clone prefers ubusrv1=1002
or
-> $ pcs constraint colocation set PGSQL-PAF-5435-clone PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master require-all=false setoptions score=-1000

Anti-colocation sets tend to be tricky currently -- if the first resource can't be assigned to a node, none of them can. We have an idea for a better implementation:

https://projects.clusterlabs.org/T383

In the meantime, a possible workaround is to use placement-strategy=balanced and define utilization for the clones only. The promoted roles will each get a slight additional utilization, and the cluster should spread them out across nodes whenever possible. I don't know if that will avoid the replication issues but it may be worth a try.

using _balanced_ causes a small mayhem to PAF/pgsql:

-> $ pcs property
Cluster Properties:
REDIS-6380_REPL_INFO: ubusrv3
REDIS-6381_REPL_INFO: ubusrv2
REDIS-6382_REPL_INFO: ubusrv2
REDIS-6385_REPL_INFO: ubusrv1
REDIS_REPL_INFO: ubusrv1
cluster-infrastructure: corosync
cluster-name: ubusrv
dc-version: 2.1.2-ada5c3b36e2
have-watchdog: false
last-lrm-refresh: 1706711588
placement-strategy: default
stonith-enabled: false

-> $ pcs resource utilization PGSQL-PAF-5438 cpu="20"

-> $ pcs property set placement-strategy=balanced # when resource stops:
I change it back:
-> $ pcs property set placement-strategy=default
and pgSQL/paf works again

I've not used _utilization_ nor _placement-strategy_ before, thus the chance that I'm missing something is solid.
Re: [ClusterLabs] trigger something at ?
On 31/01/2024 17:13, Jehan-Guillaume de Rorthais wrote: On Wed, 31 Jan 2024 16:37:21 +0100 lejeczek via Users wrote: On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote: On Wed, 31 Jan 2024 16:02:12 +0100 lejeczek via Users wrote: On 29/01/2024 17:22, Ken Gaillot wrote: On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote: Hi guys. Is it possible to trigger some... action - I'm thinking specifically at shutdown/start. If not within the cluster then - if you do that - perhaps outside. I would like to create/remove constraints, when cluster starts & stops, respectively. many thanks, L. You could use node status alerts for that, but it's risky for alert agents to change the configuration (since that may result in more alerts and potentially some sort of infinite loop). Pacemaker has no concept of a full cluster start/stop, only node start/stop. You could approximate that by checking whether the node receiving the alert is the only active node. Another possibility would be to write a resource agent that does what you want and order everything else after it. However it's even more risky for a resource agent to modify the configuration. Finally you could write a systemd unit to do what you want and order it after pacemaker. What's wrong with leaving the constraints permanently configured? yes, that would be for a node start/stop I struggle with using constraints to move pgsql (PAF) master onto a given node - seems that co/locating paf's master results in troubles (replication brakes) at/after node shutdown/reboot (not always, but way too often) What? What's wrong with colocating PAF's masters exactly? How does it brake any replication? What's these constraints you are dealing with? Could you share your configuration? Constraints beyond/above of what is required by PAF agent itself, say... 
you have multiple pgSQL cluster with PAF - thus multiple (separate, for each pgSQL cluster) masters and you want to spread/balance those across HA cluster (or in other words - avoid having more that 1 pgsql master per HA node) ok These below, I've tried, those move the master onto chosen node but.. then the issues I mentioned. You just mentioned it breaks the replication, but there so little information about your architecture and configuration, it's impossible to imagine how this could break the replication. Could you add details about the issues ? -> $ pcs constraint location PGSQL-PAF-5438-clone prefers ubusrv1=1002 or -> $ pcs constraint colocation set PGSQL-PAF-5435-clone PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master require-all=false setoptions score=-1000 I suppose "collocation" constraint is the way to go, not the "location" one. This should be easy to replicate, 3 x VMs, Ubuntu 22.04 in my case -> $ pcs resource config PGSQL-PAF-5438-clone Clone: PGSQL-PAF-5438-clone Meta Attrs: failure-timeout=60s master-max=1 notify=true promotable=true Resource: PGSQL-PAF-5438 (class=ocf provider=heartbeat type=pgsqlms) Attributes: bindir=/usr/lib/postgresql/16/bin datadir=/var/lib/postgresql/16/paf-5438 maxlag=1000 pgdata=/etc/postgresql/16/paf-5438 pgport=5438 Operations: demote interval=0s timeout=120s (PGSQL-PAF-5438-demote-interval-0s) methods interval=0s timeout=5 (PGSQL-PAF-5438-methods-interval-0s) monitor interval=15s role=Master timeout=10s (PGSQL-PAF-5438-monitor-interval-15s) monitor interval=16s role=Slave timeout=10s (PGSQL-PAF-5438-monitor-interval-16s) notify interval=0s timeout=60s (PGSQL-PAF-5438-notify-interval-0s) promote interval=0s timeout=30s (PGSQL-PAF-5438-promote-interval-0s) reload interval=0s timeout=20 (PGSQL-PAF-5438-reload-interval-0s) start interval=0s timeout=60s (PGSQL-PAF-5438-start-interval-0s) stop interval=0s timeout=60s (PGSQL-PAF-5438-stop-interval-0s) so, regarding PAF - 1 master + 2 slaves, have a healthy pqSQL/PAF 
cluster to begin with, then make the resource prefer a specific node (with the simplest variant of the constraints I tried):

-> $ pcs constraint location PGSQL-PAF-5438-clone prefers ubusrv1=1002

and play with it, rebooting node(s) with the OS' _reboot_.
At some point I get the HA/resource unable to start pgSQL, unable to elect a master (logs saying replication is broken), and I have to "fix" the pgSQL cluster outside of PAF, using _pg_basebackup_.
Re: [ClusterLabs] trigger something at ?
On Wed, 2024-01-31 at 16:37 +0100, lejeczek via Users wrote: > > On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote: > > On Wed, 31 Jan 2024 16:02:12 +0100 > > lejeczek via Users wrote: > > > > > On 29/01/2024 17:22, Ken Gaillot wrote: > > > > On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote: > > > > > Hi guys. > > > > > > > > > > Is it possible to trigger some... action - I'm thinking > > > > > specifically > > > > > at shutdown/start. > > > > > If not within the cluster then - if you do that - perhaps > > > > > outside. > > > > > I would like to create/remove constraints, when cluster > > > > > starts & > > > > > stops, respectively. > > > > > > > > > > many thanks, L. > > > > > > > > > You could use node status alerts for that, but it's risky for > > > > alert > > > > agents to change the configuration (since that may result in > > > > more > > > > alerts and potentially some sort of infinite loop). > > > > > > > > Pacemaker has no concept of a full cluster start/stop, only > > > > node > > > > start/stop. You could approximate that by checking whether the > > > > node > > > > receiving the alert is the only active node. > > > > > > > > Another possibility would be to write a resource agent that > > > > does what > > > > you want and order everything else after it. However it's even > > > > more > > > > risky for a resource agent to modify the configuration. > > > > > > > > Finally you could write a systemd unit to do what you want and > > > > order it > > > > after pacemaker. > > > > > > > > What's wrong with leaving the constraints permanently > > > > configured? > > > yes, that would be for a node start/stop > > > I struggle with using constraints to move pgsql (PAF) master > > > onto a given node - seems that co/locating paf's master > > > results in troubles (replication brakes) at/after node > > > shutdown/reboot (not always, but way too often) > > What? What's wrong with colocating PAF's masters exactly? 
How does
> > it break any
> > replication? What are these constraints you are dealing with?
> >
> > Could you share your configuration?
> Constraints beyond/above of what is required by PAF agent
> itself, say...
> you have multiple pgSQL cluster with PAF - thus multiple
> (separate, for each pgSQL cluster) masters and you want to
> spread/balance those across HA cluster
> (or in other words - avoid having more than 1 pgsql master
> per HA node)
> These below, I've tried, those move the master onto chosen
> node but.. then the issues I mentioned.
>
> -> $ pcs constraint location PGSQL-PAF-5438-clone prefers
> ubusrv1=1002
> or
> -> $ pcs constraint colocation set PGSQL-PAF-5435-clone
> PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master
> require-all=false setoptions score=-1000
>

Anti-colocation sets tend to be tricky currently -- if the first resource can't be assigned to a node, none of them can. We have an idea for a better implementation:

https://projects.clusterlabs.org/T383

In the meantime, a possible workaround is to use placement-strategy=balanced and define utilization for the clones only. The promoted roles will each get a slight additional utilization, and the cluster should spread them out across nodes whenever possible. I don't know if that will avoid the replication issues but it may be worth a try.

--
Ken Gaillot
Re: [ClusterLabs] trigger something at ?
On Wed, 31 Jan 2024 16:37:21 +0100 lejeczek via Users wrote: > > > On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote: > > On Wed, 31 Jan 2024 16:02:12 +0100 > > lejeczek via Users wrote: > > > >> > >> On 29/01/2024 17:22, Ken Gaillot wrote: > >>> On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote: > Hi guys. > > Is it possible to trigger some... action - I'm thinking specifically > at shutdown/start. > If not within the cluster then - if you do that - perhaps outside. > I would like to create/remove constraints, when cluster starts & > stops, respectively. > > many thanks, L. > > >>> You could use node status alerts for that, but it's risky for alert > >>> agents to change the configuration (since that may result in more > >>> alerts and potentially some sort of infinite loop). > >>> > >>> Pacemaker has no concept of a full cluster start/stop, only node > >>> start/stop. You could approximate that by checking whether the node > >>> receiving the alert is the only active node. > >>> > >>> Another possibility would be to write a resource agent that does what > >>> you want and order everything else after it. However it's even more > >>> risky for a resource agent to modify the configuration. > >>> > >>> Finally you could write a systemd unit to do what you want and order it > >>> after pacemaker. > >>> > >>> What's wrong with leaving the constraints permanently configured? > >> yes, that would be for a node start/stop > >> I struggle with using constraints to move pgsql (PAF) master > >> onto a given node - seems that co/locating paf's master > >> results in troubles (replication brakes) at/after node > >> shutdown/reboot (not always, but way too often) > > What? What's wrong with colocating PAF's masters exactly? How does it brake > > any replication? What's these constraints you are dealing with? > > > > Could you share your configuration? > Constraints beyond/above of what is required by PAF agent > itself, say... 
> you have multiple pgSQL cluster with PAF - thus multiple
> (separate, for each pgSQL cluster) masters and you want to
> spread/balance those across HA cluster
> (or in other words - avoid having more than 1 pgsql master
> per HA node)

ok

> These below, I've tried, those move the master onto chosen
> node but.. then the issues I mentioned.

You just mentioned it breaks the replication, but there is so little information about your architecture and configuration, it's impossible to imagine how this could break the replication.

Could you add details about the issues?

> -> $ pcs constraint location PGSQL-PAF-5438-clone prefers
> ubusrv1=1002
> or
> -> $ pcs constraint colocation set PGSQL-PAF-5435-clone
> PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master
> require-all=false setoptions score=-1000

I suppose the "colocation" constraint is the way to go, not the "location" one.
Re: [ClusterLabs] trigger something at ?
On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote:
> On Wed, 31 Jan 2024 16:02:12 +0100 lejeczek via Users wrote:
>> On 29/01/2024 17:22, Ken Gaillot wrote:
>>> On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
>>>> Hi guys.
>>>>
>>>> Is it possible to trigger some... action - I'm thinking specifically
>>>> at shutdown/start.
>>>> If not within the cluster then - if you do that - perhaps outside.
>>>> I would like to create/remove constraints, when cluster starts &
>>>> stops, respectively.
>>>>
>>>> many thanks, L.
>>>
>>> You could use node status alerts for that, but it's risky for alert
>>> agents to change the configuration (since that may result in more
>>> alerts and potentially some sort of infinite loop).
>>>
>>> Pacemaker has no concept of a full cluster start/stop, only node
>>> start/stop. You could approximate that by checking whether the node
>>> receiving the alert is the only active node.
>>>
>>> Another possibility would be to write a resource agent that does what
>>> you want and order everything else after it. However it's even more
>>> risky for a resource agent to modify the configuration.
>>>
>>> Finally you could write a systemd unit to do what you want and order it
>>> after pacemaker.
>>>
>>> What's wrong with leaving the constraints permanently configured?
>>
>> yes, that would be for a node start/stop
>> I struggle with using constraints to move pgsql (PAF) master
>> onto a given node - seems that co/locating paf's master
>> results in troubles (replication breaks) at/after node
>> shutdown/reboot (not always, but way too often)
>
> What? What's wrong with colocating PAF's masters exactly? How does it break any
> replication? What are these constraints you are dealing with?
>
> Could you share your configuration?

Constraints beyond/above what is required by the PAF agent itself. Say you have multiple pgSQL clusters with PAF - thus multiple (separate, one for each pgSQL cluster) masters - and you want to spread/balance those across the HA cluster (or in other words - avoid having more than 1 pgsql master per HA node).

These below, I've tried. Those move the master onto the chosen node but.. then the issues I mentioned.

-> $ pcs constraint location PGSQL-PAF-5438-clone prefers ubusrv1=1002
or
-> $ pcs constraint colocation set PGSQL-PAF-5435-clone PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master require-all=false setoptions score=-1000
Re: [ClusterLabs] trigger something at ?
On Wed, 31 Jan 2024 16:02:12 +0100 lejeczek via Users wrote:
> On 29/01/2024 17:22, Ken Gaillot wrote:
> > On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
> >> Hi guys.
> >>
> >> Is it possible to trigger some... action - I'm thinking specifically
> >> at shutdown/start.
> >> If not within the cluster then - if you do that - perhaps outside.
> >> I would like to create/remove constraints, when cluster starts &
> >> stops, respectively.
> >>
> >> many thanks, L.
> >>
> > You could use node status alerts for that, but it's risky for alert
> > agents to change the configuration (since that may result in more
> > alerts and potentially some sort of infinite loop).
> >
> > Pacemaker has no concept of a full cluster start/stop, only node
> > start/stop. You could approximate that by checking whether the node
> > receiving the alert is the only active node.
> >
> > Another possibility would be to write a resource agent that does what
> > you want and order everything else after it. However it's even more
> > risky for a resource agent to modify the configuration.
> >
> > Finally you could write a systemd unit to do what you want and order it
> > after pacemaker.
> >
> > What's wrong with leaving the constraints permanently configured?
> yes, that would be for a node start/stop
> I struggle with using constraints to move pgsql (PAF) master
> onto a given node - seems that co/locating paf's master
> results in troubles (replication breaks) at/after node
> shutdown/reboot (not always, but way too often)

What? What's wrong with colocating PAF's masters exactly? How does it break any replication? What are these constraints you are dealing with?

Could you share your configuration?
Re: [ClusterLabs] trigger something at ?
On 29/01/2024 17:22, Ken Gaillot wrote:
> On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
>> Hi guys.
>>
>> Is it possible to trigger some... action - I'm thinking specifically
>> at shutdown/start.
>> If not within the cluster then - if you do that - perhaps outside.
>> I would like to create/remove constraints, when cluster starts &
>> stops, respectively.
>>
>> many thanks, L.
>
> You could use node status alerts for that, but it's risky for alert
> agents to change the configuration (since that may result in more
> alerts and potentially some sort of infinite loop).
>
> Pacemaker has no concept of a full cluster start/stop, only node
> start/stop. You could approximate that by checking whether the node
> receiving the alert is the only active node.
>
> Another possibility would be to write a resource agent that does what
> you want and order everything else after it. However it's even more
> risky for a resource agent to modify the configuration.
>
> Finally you could write a systemd unit to do what you want and order it
> after pacemaker.
>
> What's wrong with leaving the constraints permanently configured?

Yes, that would be for a node start/stop.
I struggle with using constraints to move the pgsql (PAF) master onto a given node - it seems that co/locating PAF's master results in trouble (replication breaks) at/after node shutdown/reboot (not always, but way too often).

Ideally I'm hoping that, at node stop, the stopping node could check whether it's PAF's master and, if so, remove the given constraints.
Re: [ClusterLabs] trigger something at ?
On Mon, Jan 29, 2024 at 5:22 PM Ken Gaillot wrote:
> On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
> > Hi guys.
> >
> > Is it possible to trigger some... action - I'm thinking specifically
> > at shutdown/start.
> > If not within the cluster then - if you do that - perhaps outside.
> > I would like to create/remove constraints, when cluster starts &
> > stops, respectively.
> >
> > many thanks, L.
> >
>
> You could use node status alerts for that, but it's risky for alert
> agents to change the configuration (since that may result in more
> alerts and potentially some sort of infinite loop).
>
> Pacemaker has no concept of a full cluster start/stop, only node
> start/stop. You could approximate that by checking whether the node
> receiving the alert is the only active node.
>
> Another possibility would be to write a resource agent that does what
> you want and order everything else after it. However it's even more
> risky for a resource agent to modify the configuration.
>
> Finally you could write a systemd unit to do what you want and order it
> after pacemaker.
>
> What's wrong with leaving the constraints permanently configured?
>

My gut feeling tells me there is something wrong with the constraints that will probably hit you as well when recovering from a problem.
But maybe it would be easier with some kind of example.

Klaus

> --
> Ken Gaillot
Re: [ClusterLabs] trigger something at ?
On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
> Hi guys.
>
> Is it possible to trigger some... action - I'm thinking specifically
> at shutdown/start.
> If not within the cluster then - if you do that - perhaps outside.
> I would like to create/remove constraints, when cluster starts &
> stops, respectively.
>
> many thanks, L.
>

You could use node status alerts for that, but it's risky for alert agents to change the configuration (since that may result in more alerts and potentially some sort of infinite loop).

Pacemaker has no concept of a full cluster start/stop, only node start/stop. You could approximate that by checking whether the node receiving the alert is the only active node.

Another possibility would be to write a resource agent that does what you want and order everything else after it. However it's even more risky for a resource agent to modify the configuration.

Finally you could write a systemd unit to do what you want and order it after pacemaker.

What's wrong with leaving the constraints permanently configured?

--
Ken Gaillot
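For illustration only, the node status alert idea could be sketched roughly as below. This is a minimal sketch, not something posted in the thread: it assumes the agent is run by Pacemaker with the usual CRM_alert_* environment variables set (CRM_alert_kind, CRM_alert_desc, CRM_alert_node), and on_node_join / on_node_leave are hypothetical hooks. Given the warning above about alert agents changing the configuration, the hooks here only print a message and do not touch any constraints.

```shell
#!/bin/sh
# Sketch of a Pacemaker alert agent reacting to node membership changes.
# Pacemaker passes event details in CRM_alert_* environment variables.
# on_node_join and on_node_leave are hypothetical hooks (not from the
# thread); they only print a message and never modify the cluster.

on_node_join() {
    echo "node $1 joined"
}

on_node_leave() {
    echo "node $1 left"
}

handle_alert() {
    # Ignore everything except node alerts (other kinds include
    # fencing, resource and attribute alerts).
    [ "${CRM_alert_kind:-}" = "node" ] || return 0

    # For node alerts the description carries the node's new state.
    case "${CRM_alert_desc:-}" in
        member) on_node_join "${CRM_alert_node:-unknown}" ;;
        lost)   on_node_leave "${CRM_alert_node:-unknown}" ;;
        *)      : ;;
    esac
}

handle_alert
```

Run outside the cluster with no CRM_alert_* variables set, the script prints nothing. Whether the node receiving the alert is the only active one would still have to be checked separately inside the hooks.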
[ClusterLabs] trigger something at ?
Hi guys.

Is it possible to trigger some... action - I'm thinking specifically at shutdown/start.
If not within the cluster then - if you do that - perhaps outside.
I would like to create/remove constraints, when cluster starts & stops, respectively.

many thanks, L.