Never mind, I have fixed it now. I just stopped pgsql on the c node and restarted corosync on the c node.
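For anyone who ends up in the same state: the symptom in the status quoted below was that the c node carried two master-pgsql attributes (master-pgsql:1 and master-pgsql:2) while pgsql-status stayed at STOP. Here is a rough Python sketch, not part of Pacemaker or the pgsql RA, that scans text in the "Node Attributes" layout shown in this thread and reports nodes with more than one master-pgsql:N attribute. The function names, the regular expressions, and the SAMPLE string (copied from the status in this thread) are my own assumptions for illustration.

```python
import re
from collections import defaultdict

# Node header lines, e.g. "* Node c.mydomain.com:"
NODE_RE = re.compile(r"^\*\s+Node\s+(\S+):\s*$")
# Attribute lines, e.g. "+ master-pgsql:1 : -INFINITY"
ATTR_RE = re.compile(r"^\+\s+(\S+)\s+:\s+(\S+)\s*$")

# Sample text copied from the node attributes posted in this thread.
SAMPLE = """\
Node Attributes:
* Node a.mydomain.com:
    + master-pgsql:0 : 1000
    + pgsql-data-status : LATEST
    + pgsql-master-baseline : 000000001C000160
    + pgsql-status : PRI
* Node c.mydomain.com:
    + master-pgsql:1 : -INFINITY
    + master-pgsql:2 : -INFINITY
    + pgsql-data-status : STREAMING|ASYNC
    + pgsql-status : STOP
* Node b.mydomain.com:
    + master-pgsql:1 : -INFINITY
    + pgsql-data-status : STREAMING|ASYNC
    + pgsql-status : HS:async
"""


def parse_node_attributes(text):
    """Return {node: {attribute: value}} for every '+ name : value' line."""
    attrs = defaultdict(dict)
    node = None
    for raw in text.splitlines():
        line = raw.strip()
        header = NODE_RE.match(line)
        if header:
            node = header.group(1)
            continue
        attr = ATTR_RE.match(line)
        if attr and node is not None:
            attrs[node][attr.group(1)] = attr.group(2)
    return dict(attrs)


def nodes_with_duplicate_master_scores(attrs, resource="pgsql"):
    """Return {node: [attribute names]} for nodes with more than one
    master-<resource>:N attribute."""
    prefix = "master-" + resource + ":"
    duplicates = {}
    for node, values in attrs.items():
        names = sorted(name for name in values if name.startswith(prefix))
        if len(names) > 1:
            duplicates[node] = names
    return duplicates


if __name__ == "__main__":
    found = nodes_with_duplicate_master_scores(parse_node_attributes(SAMPLE))
    for node, names in found.items():
        print(node + ": " + ", ".join(names))
```

To use it on a live cluster, the SAMPLE string could be replaced with the saved output of crm_mon -A -1. The check only points at the suspicious node; the actual fix in my case was stopping pgsql and restarting corosync there.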

2013/12/20 Andrey Rogovsky <[email protected]>

> I didn't get an answer, so I tried a manual cleanup of the c node a few times,
> but it did not help. I have this status:
> Node Attributes:
> * Node a.mydomain.com:
>     + master-pgsql:0 : 1000
>     + pgsql-data-status : LATEST
>     + pgsql-master-baseline : 000000001C000160
>     + pgsql-status : PRI
> * Node c.mydomain.com:
>     + master-pgsql:1 : -INFINITY
>     + master-pgsql:2 : -INFINITY
>     + pgsql-data-status : STREAMING|ASYNC
>     + pgsql-status : STOP
> * Node b.mydomain.com:
>     + master-pgsql:1 : -INFINITY
>     + pgsql-data-status : STREAMING|ASYNC
>     + pgsql-status : HS:async
>
>
> Maybe someone knows how to switch from STOP to HS:async?
>
>
>
> 2013/12/14 Andrey Rogovsky <[email protected]>
>
>> OK, I stopped pgsql on all nodes, deleted the lock file, and manually started
>> replication from b and c to a:
>> root@a:~# sudo -u postgres psql
>> could not change directory to "/root": Permission denied
>> psql (9.3.2)
>> Type "help" for help.
>>
>> postgres=# select client_addr,sync_state from pg_stat_replication;
>>  client_addr  | sync_state
>> --------------+------------
>>  192.168.10.3 | async
>>  192.168.10.2 | async
>> (2 rows)
>>
>>
>> After that I ran a cleanup and got this:
>> root@a:~# crm_mon -VAf -1
>> ============
>> Last updated: Sat Dec 14 20:24:04 2013
>> Last change: Sat Dec 14 20:23:57 2013 via crm_attribute on a.mydomain.com
>> Stack: openais
>> Current DC: a.mydomain.com - partition with quorum
>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 3 Nodes configured, 3 expected votes
>> 6 Resources configured.
>> ============
>>
>> Online: [ a.mydomain.com c.mydomain.com b.mydomain.com ]
>>
>>  Resource Group: master
>>      pgsql-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
>>  Master/Slave Set: msPostgresql [pgsql]
>>      Masters: [ a.mydomain.com ]
>>      Slaves: [ c.mydomain.com ]
>>      Stopped: [ pgsql:2 ]
>>  apache-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
>>  apache (ocf::heartbeat:apache): Started a.mydomain.com
>>
>> Node Attributes:
>> * Node a.mydomain.com:
>>     + master-pgsql:0 : 1000
>>     + pgsql-data-status : LATEST
>>     + pgsql-master-baseline : 000000001C000160
>>     + pgsql-status : PRI
>> * Node c.mydomain.com:
>>     + master-pgsql:1 : -INFINITY
>>     + master-pgsql:2 : -INFINITY
>>     + pgsql-data-status : STREAMING|ASYNC
>>     + pgsql-status : STOP
>> * Node b.mydomain.com:
>>     + master-pgsql:1 : -INFINITY
>>     + pgsql-data-status : DISCONNECT
>>     + pgsql-status : STOP
>>
>> Migration summary:
>> * Node a.mydomain.com:
>> * Node b.mydomain.com:
>>    pgsql:1: migration-threshold=1 fail-count=1000000
>> * Node c.mydomain.com:
>>
>> Failed actions:
>>     pgsql:1_start_0 (node=b.mydomain.com, call=86, rc=1, status=complete): unknown error
>>
>> It also broke sync on the b node:
>> postgres=# select client_addr,sync_state from pg_stat_replication;
>>  client_addr  | sync_state
>> --------------+------------
>>  192.168.10.3 | async
>> (1 row)
>>
>> postgres=#
>>
>> Okay, I ran the cleanup again. And...
>>
>> ============
>> Last updated: Sat Dec 14 20:26:13 2013
>> Last change: Sat Dec 14 20:26:08 2013 via crm_attribute on a.mydomain.com
>> Stack: openais
>> Current DC: a.mydomain.com - partition with quorum
>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 3 Nodes configured, 3 expected votes
>> 6 Resources configured.
>> ============
>>
>> Online: [ a.mydomain.com c.mydomain.com b.mydomain.com ]
>>
>>  Resource Group: master
>>      pgsql-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
>>  Master/Slave Set: msPostgresql [pgsql]
>>      Masters: [ a.mydomain.com ]
>>      Slaves: [ b.mydomain.com c.mydomain.com ]
>>  apache-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
>>  apache (ocf::heartbeat:apache): Started a.mydomain.com
>>
>> Node Attributes:
>> * Node a.mydomain.com:
>>     + master-pgsql:0 : 1000
>>     + pgsql-data-status : LATEST
>>     + pgsql-master-baseline : 000000001C000160
>>     + pgsql-status : PRI
>> * Node c.mydomain.com:
>>     + master-pgsql:1 : -INFINITY
>>     + master-pgsql:2 : -INFINITY
>>     + pgsql-data-status : STREAMING|ASYNC
>>     + pgsql-status : STOP
>> * Node b.mydomain.com:
>>     + master-pgsql:1 : -INFINITY
>>     + pgsql-data-status : STREAMING|ASYNC
>>     + pgsql-status : HS:async
>>
>> Migration summary:
>> * Node a.mydomain.com:
>> * Node b.mydomain.com:
>> * Node c.mydomain.com:
>>
>> and:
>> postgres=# select client_addr,sync_state from pg_stat_replication;
>>  client_addr  | sync_state
>> --------------+------------
>>  192.168.10.3 | async
>>  192.168.10.2 | async
>> (2 rows)
>>
>> postgres=#
>>
>> So the problem is the double master-pgsql on the c node. How can I fix it?
>>
>>
>>
>> 2013/12/14 Takehiro Matsushima <[email protected]>
>>
>>> > About your questions:
>>> >
>>> > I have two questions.
>>> >
>>> > 1. As you can see, I now do not have two master statuses on one node.
>>> > 2. My node_list contains a.mydomain.com b.mydomain.com c.mydomain.com
>>>
>>> Thank you, it is no problem.
>>>
>>>
>>> > I tried to start pgsql and clean up pgsql, and got the same error. Why does
>>> > the RA bring down pgsql on a node?
>>> > I tried the cleanup a few times and got this:
>>> > ...
>>> > Migration summary:
>>> > * Node a.mydomain.com:
>>> > * Node b.mydomain.com:
>>> >    pgsql:1: migration-threshold=1 fail-count=1000000
>>> > * Node c.mydomain.com:
>>> >
>>> > Failed actions:
>>> >     pgsql:1_start_0 (node=b.mydomain.com, call=64, rc=1, status=complete): unknown error
>>>
>>> Did you remove the "PGSQL.lock" file before the cleanup?
>>> If this lock file exists, PostgreSQL cannot start on the node,
>>> neither as master nor as slave.
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
