Hello community, I would be very grateful for some help.
I have configured a PostgreSQL cluster with Pacemaker + PAF. The Pacemaker configuration is the following (from https://clusterlabs.github.io/PAF/Quick_Start-CentOS-7.html):

    # pgsqld
    pcs -f cluster1.xml resource create pgsqld ocf:heartbeat:pgsqlms \
        bindir=/usr/pgsql-9.6/bin pgdata=/var/lib/pgsql/9.6/data \
        op start timeout=60s \
        op stop timeout=60s \
        op promote timeout=30s \
        op demote timeout=120s \
        op monitor interval=15s timeout=10s role="Master" \
        op monitor interval=16s timeout=10s role="Slave" \
        op notify timeout=60s

    # pgsql-ha
    pcs -f cluster1.xml resource master pgsql-ha pgsqld notify=true

    pcs -f cluster1.xml resource create pgsql-master-ip ocf:heartbeat:IPaddr2 \
        ip=192.168.122.50 cidr_netmask=24 op monitor interval=10s

    pcs -f cluster1.xml constraint colocation add pgsql-master-ip with master pgsql-ha INFINITY
    pcs -f cluster1.xml constraint order promote pgsql-ha then start pgsql-master-ip symmetrical=false kind=Mandatory
    pcs -f cluster1.xml constraint order demote pgsql-ha then stop pgsql-master-ip symmetrical=false kind=Mandatory

I use the fence_xvm fencing agent, with the following configuration:

    pcs -f cluster1.xml stonith create fence1 fence_xvm pcmk_host_check="static-list" pcmk_host_list="srv1" port="srv-m1" multicast_address=224.0.0.2
    pcs -f cluster1.xml stonith create fence2 fence_xvm pcmk_host_check="static-list" pcmk_host_list="srv2" port="srv-m2" multicast_address=224.0.0.2

    pcs -f cluster1.xml constraint location fence1 avoids srv1=INFINITY
    pcs -f cluster1.xml constraint location fence2 avoids srv2=INFINITY

The cluster is behaving in a strange way. When I manually fence the master node (or shut it down ungracefully) and then unfence/start it again, the node ends up with status Failed/blocked and is constantly fenced (restarted) by the fencing agent. Shouldn't fencing bring the cluster back to a working Master/Slave state without problems?

The error log says that the demote action on the node has failed:

    warning: Action 10 (pgsqld_demote_0) on server1 failed (target: 0 vs. rc: 1): Error
    warning: Processing failed op demote for pgsqld:1 on server1: unknown error (1)
    warning: Forcing pgsqld:1 to stop after a failed demote action

Is this a cluster misconfiguration? Any ideas would be greatly appreciated.

Thank you in advance,
Aleksandra
