Hold on, I have fixed it now. I just stopped pgsql on node c and restarted
corosync on node c.
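
For anyone who hits the same state, the Node Attributes section printed by
crm_mon -A can be scanned for nodes that carry more than one master-pgsql
attribute or report pgsql-status STOP, both before and after such a restart.
This is only a minimal sketch in Python, assuming the output layout shown in
the quoted messages below. The function names and the SAMPLE text are
hypothetical, and whether a duplicate attribute is really the cause is only
the suspicion raised in this thread.

```python
import re
from collections import defaultdict

# Layout assumed from the crm_mon output quoted in this thread.
NODE_RE = re.compile(r"^\*\s+Node\s+(\S+):\s*$")
ATTR_RE = re.compile(r"^\+\s+(\S+)\s*:\s*(\S+)\s*$")


def parse_node_attributes(text):
    """Return {node: [(attribute, value), ...]} from the Node Attributes section."""
    nodes = defaultdict(list)
    current = None
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Node Attributes:"):
            in_section = True
            continue
        if not in_section:
            continue
        if not line:
            # A blank line ends the section (Migration summary follows it).
            break
        node_match = NODE_RE.match(line)
        if node_match:
            current = node_match.group(1)
            continue
        attr_match = ATTR_RE.match(line)
        if attr_match and current is not None:
            nodes[current].append((attr_match.group(1), attr_match.group(2)))
    return dict(nodes)


def find_problems(attrs):
    """List (node, description) pairs for the two symptoms seen in this thread."""
    problems = []
    for node, pairs in attrs.items():
        masters = [name for name, _ in pairs if name.startswith("master-pgsql:")]
        if len(masters) > 1:
            problems.append((node, "duplicate master-pgsql attributes: " + ", ".join(masters)))
        if dict(pairs).get("pgsql-status") == "STOP":
            problems.append((node, "pgsql-status is STOP"))
    return problems


# Hypothetical sample, shortened from the status quoted below.
SAMPLE = """\
Node Attributes:
* Node a.mydomain.com:
    + master-pgsql:0                   : 1000
    + pgsql-status                     : PRI
* Node c.mydomain.com:
    + master-pgsql:1                   : -INFINITY
    + master-pgsql:2                   : -INFINITY
    + pgsql-status                     : STOP
"""

if __name__ == "__main__":
    for node, problem in find_problems(parse_node_attributes(SAMPLE)):
        print(f"{node}: {problem}")
```

In practice the text would come from the saved output of crm_mon -A -1 rather
than from SAMPLE.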



2013/12/20 Andrey Rogovsky <[email protected]>

> I did not get an answer, so I tried a manual cleanup of node c a few times,
> but this did not help. I have this status:
> Node Attributes:
> * Node a.mydomain.com:
>     + master-pgsql:0                   : 1000
>     + pgsql-data-status               : LATEST
>     + pgsql-master-baseline           : 000000001C000160
>     + pgsql-status                     : PRI
> * Node c.mydomain.com:
>     + master-pgsql:1                   : -INFINITY
>     + master-pgsql:2                   : -INFINITY
>     + pgsql-data-status               : STREAMING|ASYNC
>     + pgsql-status                     : STOP
> * Node b.mydomain.com:
>     + master-pgsql:1                   : -INFINITY
>     + pgsql-data-status               : STREAMING|ASYNC
>     + pgsql-status                     : HS:async
>
>
> Maybe someone knows how to switch from STOP to HS:async?
>
>
>
> 2013/12/14 Andrey Rogovsky <[email protected]>
>
>> OK, I stopped pgsql on all nodes, deleted the lock file, and manually
>> started replication from b and c to a:
>> root@a:~#  sudo -u postgres psql
>> could not change directory to "/root": Permission denied
>> psql (9.3.2)
>> Type "help" for help.
>>
>> postgres=# select client_addr,sync_state from pg_stat_replication;
>>  client_addr  | sync_state
>> --------------+------------
>>  192.168.10.3 | async
>>  192.168.10.2 | async
>> (2 rows)
>>
>>
>> After that I ran a cleanup and got this:
>> root@a:~# crm_mon  -VAf -1
>> ============
>> Last updated: Sat Dec 14 20:24:04 2013
>> Last change: Sat Dec 14 20:23:57 2013 via crm_attribute on a.mydomain.com
>> Stack: openais
>> Current DC: a.mydomain.com - partition with quorum
>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 3 Nodes configured, 3 expected votes
>> 6 Resources configured.
>> ============
>>
>> Online: [ a.mydomain.com c.mydomain.com b.mydomain.com ]
>>
>>  Resource Group: master
>>      pgsql-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
>>  Master/Slave Set: msPostgresql [pgsql]
>>      Masters: [ a.mydomain.com ]
>>      Slaves: [ c.mydomain.com ]
>>      Stopped: [ pgsql:2 ]
>>  apache-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
>>  apache (ocf::heartbeat:apache): Started a.mydomain.com
>>
>> Node Attributes:
>> * Node a.mydomain.com:
>>     + master-pgsql:0                   : 1000
>>     + pgsql-data-status               : LATEST
>>     + pgsql-master-baseline           : 000000001C000160
>>     + pgsql-status                     : PRI
>> * Node c.mydomain.com:
>>     + master-pgsql:1                   : -INFINITY
>>     + master-pgsql:2                   : -INFINITY
>>     + pgsql-data-status               : STREAMING|ASYNC
>>     + pgsql-status                     : STOP
>> * Node b.mydomain.com:
>>     + master-pgsql:1                   : -INFINITY
>>     + pgsql-data-status               : DISCONNECT
>>     + pgsql-status                     : STOP
>>
>> Migration summary:
>> * Node a.mydomain.com:
>> * Node b.mydomain.com:
>>    pgsql:1: migration-threshold=1 fail-count=1000000
>> * Node c.mydomain.com:
>>
>> Failed actions:
>>     pgsql:1_start_0 (node=b.mydomain.com, call=86, rc=1, status=complete): unknown error
>>
>> It also broke sync on node b:
>> postgres=# select client_addr,sync_state from pg_stat_replication;
>>  client_addr  | sync_state
>> --------------+------------
>>  192.168.10.3 | async
>> (1 row)
>>
>> postgres=#
>>
>> Okay, I ran a cleanup again, and...
>>
>> ============
>> Last updated: Sat Dec 14 20:26:13 2013
>> Last change: Sat Dec 14 20:26:08 2013 via crm_attribute on a.mydomain.com
>> Stack: openais
>> Current DC: a.mydomain.com - partition with quorum
>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 3 Nodes configured, 3 expected votes
>> 6 Resources configured.
>> ============
>>
>> Online: [ a.mydomain.com c.mydomain.com b.mydomain.com ]
>>
>>  Resource Group: master
>>      pgsql-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
>>  Master/Slave Set: msPostgresql [pgsql]
>>      Masters: [ a.mydomain.com ]
>>      Slaves: [ b.mydomain.com c.mydomain.com ]
>>  apache-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
>>  apache (ocf::heartbeat:apache): Started a.mydomain.com
>>
>> Node Attributes:
>> * Node a.mydomain.com:
>>     + master-pgsql:0                   : 1000
>>     + pgsql-data-status               : LATEST
>>     + pgsql-master-baseline           : 000000001C000160
>>     + pgsql-status                     : PRI
>> * Node c.mydomain.com:
>>     + master-pgsql:1                   : -INFINITY
>>     + master-pgsql:2                   : -INFINITY
>>     + pgsql-data-status               : STREAMING|ASYNC
>>     + pgsql-status                     : STOP
>> * Node b.mydomain.com:
>>     + master-pgsql:1                   : -INFINITY
>>     + pgsql-data-status               : STREAMING|ASYNC
>>     + pgsql-status                     : HS:async
>>
>> Migration summary:
>> * Node a.mydomain.com:
>> * Node b.mydomain.com:
>> * Node c.mydomain.com:
>>
>> and:
>> postgres=# select client_addr,sync_state from pg_stat_replication;
>>  client_addr  | sync_state
>> --------------+------------
>>  192.168.10.3 | async
>>  192.168.10.2 | async
>> (2 rows)
>>
>> postgres=#
>>
>> So the problem is the duplicate master-pgsql attribute on node c. How can I fix it?
>>
>>
>>
>> 2013/12/14 Takehiro Matsushima <[email protected]>
>>
>>> > About your questions:
>>> >
>>> > I have two questions.
>>> >
>>> > 1. As you can see, I do not have two master statuses on one node now
>>> > 2. My node_list contains a.mydomain.com b.mydomain.com c.mydomain.com
>>>
>>> Thank you, it is no problem.
>>>
>>>
>>> > I tried to start pgsql and clean up pgsql, and got the same error. Why does
>>> > the RA bring down pgsql on a node?
>>> > I tried the cleanup a few times and got this:
>>> > ...
>>> > Migration summary:
>>> > * Node a.mydomain.com:
>>> > * Node b.mydomain.com:
>>> >    pgsql:1: migration-threshold=1 fail-count=1000000
>>> > * Node c.mydomain.com:
>>> >
>>> > Failed actions:
>>> >     pgsql:1_start_0 (node=b.mydomain.com, call=64, rc=1, status=complete): unknown error
>>>
>>> Did you remove the "PGSQL.lock" file before the cleanup?
>>> If this lock file exists, PostgreSQL cannot start on the node, neither
>>> as master nor as slave.
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>
>>
>>
>
