On 08/03/2022 10:21, Jehan-Guillaume de Rorthais wrote:
>> op start timeout=60s \
>> op stop timeout=60s \
>> op promote timeout=30s \
>> op demote timeout=120s \
>> op monitor interval=15s timeout=10s role="Master" meta master-max=1 \
>> op monitor interval=16s timeout=10s role="Slave" \
>> op notify timeout=60s meta notify=true
>
> Because "op" appears, we are back in resource ("pgsqld") context,
> anything after is interpreted as resource and operation attributes,
> even the "meta notify=true". That's why your pgsqld-clone doesn't
> have the meta attribute "notify=true" set.

Here is a one-liner that should do it. As per the 'debug-' suggestion, it adds 'master-max=1':

-> $ pcs resource create pgsqld ocf:heartbeat:pgsqlms bindir=/usr/bin pgdata=/var/lib/pgsql/data op start timeout=60s op stop timeout=60s op promote timeout=30s op demote timeout=120s op monitor interval=15s timeout=10s role="Master" op monitor interval=16s timeout=10s role="Slave" op notify timeout=60s promotable notify=true master-max=1 && pcs constraint colocation add HA-10-1-1-226 with master pgsqld-clone INFINITY && pcs constraint order promote pgsqld-clone then start HA-10-1-1-226 symmetrical=false kind=Mandatory && pcs constraint order demote pgsqld-clone then stop HA-10-1-1-226 symmetrical=false kind=Mandatory

But... this "issue" is reproducible! Once you have a working 'pgsqlms', do:

-> $ pcs resource delete pgsqld

The '-clone' should get removed too, so now there are no 'pgsqld' resource(s), but the cluster, weirdly in my mind, leaves the node attributes on. I see 'master-pgsqld' on each node and do not see why node attributes should be kept (let alone shown) for non-existent resources, when those attributes are intrinsic only to those resources. So, if you want to "clean" that up, perhaps because for now you are not going to have/use 'pgsqlms', you can do that with:

-> $ pcs node attribute node1 master-pgsqld="" # same for remaining nodes
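The "same for remaining nodes" step could be scripted roughly as below. This is only a sketch: the function name clear_master_attr and the PCS override variable are made up for illustration, and any node names other than node1 are placeholders for your own. Setting PCS="echo pcs" gives a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: clear a leftover node attribute (e.g. 'master-pgsqld') on several nodes.
# PCS is a hypothetical override; PCS="echo pcs" prints instead of executing.
PCS="${PCS:-pcs}"

clear_master_attr() {
    attr="$1"
    shift
    for node in "$@"; do
        # Setting the attribute to an empty value removes it from the node.
        $PCS node attribute "$node" "${attr}=" || return 1
    done
}

# Example call (node2 and node3 are placeholder names):
#   clear_master_attr master-pgsqld node1 node2 node3
```

Running it once with PCS="echo pcs" first shows exactly which commands would be issued before touching the cluster.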

Now... repeat your one-liner, which worked just a moment ago, and you should get the exact same or similar errors (while all nodes are stuck on 'slave'):

-> $ pcs resource debug-promote pgsqld
crm_resource: Error performing operation: Error occurred
Operation force-promote for pgsqld (ocf:heartbeat:pgsqlms) returned 1 (error: Can not get current node LSN location)
/tmp:5432 - accepting connections

ocf-exit-reason:Can not get current node LSN location
--------------------
You have to 'cib-push' to "fix" this very problem.
In my (admin's) opinion this is a 100% candidate for a bug, whether in PCS or PAF. Perhaps the authors may wish to comment?
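For anyone wondering what the 'cib-push' route might look like, here is a rough sketch. It assumes that 'cib-push' means the usual offline workflow: dump the CIB to a file, apply the changes to that file with 'pcs -f', then push the file back in one go. The file name cluster1.xml, the function name and the PCS override variable are placeholders of mine, and the constraints from the one-liner would be added to the file the same way before the push.

```shell
#!/bin/sh
# Sketch of an offline CIB edit followed by a single push.
# PCS and CIB_FILE are hypothetical overrides; PCS="echo pcs" gives a dry run.
PCS="${PCS:-pcs}"
CIB_FILE="${CIB_FILE:-cluster1.xml}"

create_pgsqld_via_cib_push() {
    # 1. Dump the live CIB into a local file.
    $PCS cluster cib "$CIB_FILE" || return 1

    # 2. Create the resource in the file only (-f), not in the live cluster.
    #    Arguments are copied from the one-liner earlier in this thread.
    $PCS -f "$CIB_FILE" resource create pgsqld ocf:heartbeat:pgsqlms \
        bindir=/usr/bin pgdata=/var/lib/pgsql/data \
        op start timeout=60s op stop timeout=60s \
        op promote timeout=30s op demote timeout=120s \
        op monitor interval=15s timeout=10s role="Master" \
        op monitor interval=16s timeout=10s role="Slave" \
        op notify timeout=60s \
        promotable notify=true master-max=1 || return 1

    # 3. Push the edited file back to the cluster as one change.
    $PCS cluster cib-push "$CIB_FILE"
}
```

The point of going through a file is that the cluster sees the resource and its meta attributes arrive together, rather than reacting to each intermediate step.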

many thanks, L.


