Hi Brett

2012/12/5 Brett Maton <brett.ma...@googlemail.com>:
> Ok, almost there :)
>
> I'm having some trouble with VIPs either not starting or starting on the
> wrong node (so something isn't right :)).
>
> Lab04 should be the master (vipMaster), lab05 slave (vipSlave)
>
> (Postgres is up and running as a replication slave on lab05, although it's
> being reported as stopped...)
>
> Output from crm_mon -Af
>
> Last updated: Wed Dec  5 09:35:58 2012
> Last change: Wed Dec  5 09:35:57 2012 via crm_attribute on lab04
> Stack: openais
> Current DC: lab04 - partition with quorum
> Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
> 2 Nodes configured, 2 expected votes
> 6 Resources configured.
> ============
>
> Online: [ lab05 lab04 ]
>
>  Master/Slave Set: msPostgreSQL [pgsql]
>      Masters: [ lab04 ]
>      Stopped: [ pgsql:1 ]
>  vipSlave   (ocf::heartbeat:IPaddr2):   Started lab04
>  Clone Set: clnPingCheck [pingCheck]
>      Started: [ lab04 ]
>      Stopped: [ pingCheck:1 ]
>  vipMaster  (ocf::heartbeat:IPaddr2):   Started lab04
>
> Node Attributes:
> * Node lab05:
>     + master-pgsql:0         : -INFINITY
>     + master-pgsql:1         : 100
>     + pgsql-data-status      : STREAMING|SYNC
>     + pgsql-status           : STOP
> * Node lab04:
>     + master-pgsql:0         : 1000
>     + pgsql-data-status      : LATEST
>     + pgsql-master-baseline  : 000000000A000200
>     + pgsql-status           : PRI
>     + pingNodes              : 200
>
> Migration summary:
> * Node lab04:
> * Node lab05:

This isn't a normal status: pgsql:1 is stopped, yet
pgsql-data-status is "STREAMING|SYNC".
Did you start PostgreSQL on lab05 manually? If so, that confuses the RA.
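In that case, something like this should recover the slave (a rough
sketch assuming crmsh and the default data directory
/var/lib/pgsql/data -- adjust paths to your environment):

----
# stop the manually-started PostgreSQL on lab05
su - postgres -c 'pg_ctl -D /var/lib/pgsql/data stop -m fast'

# clear the stale resource state so Pacemaker starts pgsql:1 itself
crm resource cleanup msPostgreSQL

# watch the attributes: pgsql-status on lab05 should become HS:sync
crm_mon -Af
----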
> How do I migrate vipSlave to node lab05?

If you are using my sample configuration, that's impossible because of
this location constraint:

----
location rsc_location-1 vip-slave \
    rule 200: pgsql-status eq "HS:sync" \
    rule 100: pgsql-status eq "PRI" \
    rule -inf: not_defined pgsql-status \
    rule -inf: pgsql-status ne "HS:sync" and pgsql-status ne "PRI"
----

This means that vip-slave can't run on a node whose pgsql-status is
neither "HS:sync" nor "PRI". Once lab05's pgsql-status becomes
"HS:sync", vip-slave will move there automatically; there is no need to
migrate it by hand.

> I've tried
> # crm resource migrate vipSlave lab05
>
> I did find this in the corosync log
>
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:
>     Operation monitor found resource vipMaster active on lab04
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:
>     Operation monitor found resource pgsql:0 active in master mode on lab04
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:
>     Operation monitor found resource vipSlave active on lab04
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:
>     Operation monitor found resource pingCheck:0 active on lab04
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:
>     Operation monitor found resource pgsql:1 active on lab05
> Dec 05 09:35:58 [2064] lab04    pengine:  warning: common_apply_stickiness:
>     Forcing clnPingCheck away from lab05 after 1 failures (max=1)
> Dec 05 09:35:58 [2064] lab04    pengine:  warning: common_apply_stickiness:
>     Forcing clnPingCheck away from lab05 after 1 failures (max=1)
>
> If it helps, pingCheck config:
>
> primitive pingCheck ocf:pacemaker:ping \
>     params \
>         name="pingNodes" \
>         host_list="192.168.0.12 192.168.0.13" \
>         multiplier="100" \
>     op start   interval="0"  timeout="60s" on-fail="restart" \
>     op monitor interval="10" timeout="60s" on-fail="restart" \
>     op stop    interval="0"  timeout="60s" on-fail="ignore"
>
> Thanks again,
> Brett

Thanks,
Takatoshi MATSUO
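P.S. Your log also shows that clnPingCheck is being kept off lab05
because of an earlier failure ("Forcing clnPingCheck away from lab05
after 1 failures (max=1)"). Something like this should clear it (a
sketch assuming crmsh; for anonymous clones you may need the instance
name, e.g. pingCheck:1):

----
# clear the recorded failure so pingCheck can run on lab05 again
crm resource cleanup clnPingCheck lab05

# or reset the failcount directly
crm resource failcount pingCheck delete lab05
----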