Re: [Linux-HA] How to tell master-slave group set one node to master?

Andrey Rogovsky Fri, 13 Dec 2013 23:01:34 -0800

Hi

I have a similar config, but I get these errors:

Dec 14 10:48:19 a pengine: [29591]: ERROR: unpack_rsc_op: Preventing msPostgresql from re-starting on a.mydomain.com: operation monitor failed 'invalid parameter' (rc=2)
Dec 14 10:48:19 a pengine: [29591]: ERROR: unpack_rsc_op: Preventing msPostgresql from re-starting on b.mydomain.com: operation monitor failed 'invalid parameter' (rc=2)
Dec 14 10:48:19 a pengine: [29591]: ERROR: unpack_rsc_op: Preventing msPostgresql from re-starting on c.mydomain.com: operation monitor failed 'invalid parameter' (rc=2)

Online: [ a.mydomain.com c.mydomain.com b.mydomain.com ]

 apache-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
 apache (ocf::heartbeat:apache): Started a.mydomain.com

Node Attributes:
* Node a.mydomain.com:
    + pgsql-data-status               : LATEST
* Node c.mydomain.com:
    + pgsql-data-status               : STREAMING|ASYNC
    + pgsql-status                     : HS:async
* Node b.mydomain.com:
    + pgsql-data-status               : STREAMING|ASYNC
    + pgsql-status                     : HS:async

Migration summary:
* Node a.mydomain.com:
* Node b.mydomain.com:
* Node c.mydomain.com:

Failed actions:
    pgsql:0_monitor_0 (node=a.mydomain.com, call=31, rc=2, status=complete): invalid parameter
    pgsql:0_monitor_0 (node=b.mydomain.com, call=26, rc=2, status=complete): invalid parameter
    pgsql:0_monitor_0 (node=c.mydomain.com, call=22, rc=2, status=complete): invalid parameter

How can I debug the monitor operation on the pgsql primitive to understand
which parameter is invalid and why?

I just grepped the logs and found a lot of errors:
Dec 13 20:39:32 a lrmd: [29589]: info: RA output: (pgsql:0:probe:stderr) /usr/lib/ocf/resource.d//heartbeat/pgsql: 1642: /usr/lib/ocf/resource.d//heartbeat/pgsql: Bad substitution
Dec 13 20:40:30 a lrmd: [29589]: info: RA output: (pgsql:0:probe:stderr) /usr/lib/ocf/resource.d//heartbeat/pgsql: 1642: /usr/lib/ocf/resource.d//heartbeat/pgsql: Bad substitution
Dec 13 20:43:55 a lrmd: [29589]: info: RA output: (pgsql:0:probe:stderr) /usr/lib/ocf/resource.d//heartbeat/pgsql: 1642: /usr/lib/ocf/resource.d//heartbeat/pgsql: Bad substitution
Dec 13 20:48:04 a lrmd: [29589]: info: RA output: (pgsql:0:probe:stderr) /usr/lib/ocf/resource.d//heartbeat/pgsql: 1642: /usr/lib/ocf/resource.d//heartbeat/pgsql: Bad substitution
Dec 13 20:48:16 a lrmd: [29589]: info: RA output: (pgsql:0:probe:stderr) /usr/lib/ocf/resource.d//heartbeat/pgsql: 1642: /usr/lib/ocf/resource.d//heartbeat/pgsql: Bad substitution

But I can't understand what this code does:
                if grep -q "$rep_mode_string" $OCF_RESKEY_config; then
                    ocf_log info "deleting include directive from $OCF_RESKEY_config"
                    sed -i "/${rep_mode_string//\//\\/}/d" $OCF_RESKEY_config
                fi
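
A note on the fragment above, offered as a reading rather than a confirmed
diagnosis: ${rep_mode_string//\//\\/} is a bash-only expansion that replaces
every "/" in the string with "\/", so that the result can be used inside a
sed /pattern/d address to delete the matching include line from the config
file. Shells such as dash (the default /bin/sh on Debian) do not implement
this expansion and report "Bad substitution", which would fit the log lines
above if the RA is being run by /bin/sh. Below is a minimal sketch of the
same escaping done portably; escape_slashes and the example value of
rep_mode_string are made up for illustration and are not taken from the RA.

```shell
#!/bin/sh
# Hypothetical helper (not part of the pgsql RA): escape every "/" as "\/"
# with sed, which works in any POSIX shell. In bash the same result comes
# from the expansion ${1//\//\\/}.
escape_slashes() {
    printf '%s\n' "$1" | sed 's|/|\\/|g'
}

# Example value only; the real rep_mode_string is set inside the RA.
rep_mode_string="include '/var/lib/postgresql/tmp/rep_mode.conf'"
escaped=$(escape_slashes "$rep_mode_string")

# Build a throwaway config file containing the include line, then delete
# that line the same way the RA fragment does (without -i, so the result
# goes to stdout instead of modifying the file).
conf=$(mktemp)
printf '%s\n' "listen_addresses = '*'" "$rep_mode_string" > "$conf"
sed "/$escaped/d" "$conf"   # prints only the listen_addresses line
rm -f "$conf"
```

If that reading is right, running the RA with bash, or replacing the
expansion with a portable form like the one sketched here, should make the
"Bad substitution" message go away; whether that also clears the
'invalid parameter' result is an assumption that still needs checking.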



2013/12/14 Takehiro Matsushima <[email protected]>

> I built it on Debian 7.2 in VirtualBox, following
> http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster
> In postgresql.conf, "replication_timeout" has been renamed to
> "wal_sender_timeout" as of PostgreSQL 9.3.
> eth0 is used for service traffic, eth1 for heartbeat and replication.
>
> It works well.
>
> My crm configuration is as follows:
> node pg1 \
>         attributes pgsql-data-status="LATEST"
> node pg2 \
>         attributes pgsql-data-status="STREAMING|ASYNC"
> node pg3 \
>         attributes pgsql-data-status="STREAMING|ASYNC"
> primitive pgsql ocf:heartbeat:pgsql \
>         params pgctl="/usr/lib/postgresql/9.3/bin/pg_ctl"
> psql="/usr/lib/postgresql/9.3/bin/psql"
> pgdata="/var/lib/postgresql/9.3/main/" start_opt="-p 5432"
> rep_mode="async" node_list="pg1 pg2 pg3"
> tmpdir="/var/lib/postgresql/tmp" restore_command="cp
> /var/lib/postgresql/9.3/main/pg_archive/%f %p"
> primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5
> keepalives_count=5" master_ip="192.168.111.4"
> restart_on_promote="true"
> config="/etc/postgresql/9.3/main/postgresql.conf" \
>         op start interval="0s" timeout="60s" on-fail="restart" \
>         op monitor interval="4s" timeout="60s" on-fail="restart" \
>         op monitor interval="3s" role="Master" timeout="60s"
> on-fail="restart" \
>         op promote interval="0s" timeout="60s" on-fail="restart" \
>         op demote interval="0s" timeout="60s" on-fail="stop" \
>         op stop interval="0s" timeout="60s" on-fail="block" \
>         op notify interval="0s" timeout="60s"
> primitive vip-master ocf:heartbeat:IPaddr2 \
>         params ip="192.168.110.4" nic="eth0" cidr_netmask="24" \
>         op start interval="0s" timeout="60s" on-fail="stop" \
>         op monitor interval="10s" timeout="60s" on-fail="restart" \
>         op stop interval="0s" timeout="60s" on-fail="block"
> primitive vip-rep ocf:heartbeat:IPaddr2 \
>         params ip="192.168.111.4" nic="eth1" cidr_netmask="24" \
>         meta migration-threshold="0" \
>         op start interval="0s" timeout="60s" on-fail="restart" \
>         op monitor interval="10s" timeout="60s" on-fail="restart" \
>         op stop interval="0s" timeout="60s" on-fail="block"
> group master-group vip-master vip-rep
> ms msPostgresql pgsql \
>         meta master-max="1" master-node-max="1" clone-max="3"
> clone-node-max="1" notify="true"
> colocation rsc_colocation-1 inf: master-group msPostgresql:Master
> order rsc_order-1 0: msPostgresql:promote master-group:start
> symmetrical=false
> order rsc_order-2 0: msPostgresql:demote master-group:stop
> symmetrical=false
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>         cluster-infrastructure="openais" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         expected-quorum-votes="3" \
>         last-lrm-refresh="1386979195"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="INFINITY" \
>         migration-threshold="1"
>
>
> And "crm_mon -A" shows the following:
>
> ============
> Last updated: Sat Dec 14 09:00:12 2013
> Last change: Sat Dec 14 09:00:10 2013 via crm_attribute on pg1
> Stack: openais
> Current DC: pg1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 3 Nodes configured, 3 expected votes
> 5 Resources configured.
> ============
>
> Online: [ pg1 pg2 pg3 ]
>
>  Resource Group: master-group
>      vip-master (ocf::heartbeat:IPaddr2):       Started pg1
>      vip-rep    (ocf::heartbeat:IPaddr2):       Started pg1
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ pg1 ]
>      Slaves: [ pg2 pg3 ]
>
> Node Attributes:
> * Node pg1:
>     + master-pgsql:0                    : 1000
>     + pgsql-data-status                 : LATEST
>     + pgsql-master-baseline             : 0000000009000084
>     + pgsql-status                      : PRI
> * Node pg2:
>     + master-pgsql:1                    : -INFINITY
>     + pgsql-data-status                 : STREAMING|ASYNC
>     + pgsql-status                      : HS:async
> * Node pg3:
>     + master-pgsql:2                    : -INFINITY
>     + pgsql-data-status                 : STREAMING|ASYNC
>     + pgsql-status                      : HS:async
>
>
> "pkill postgresql" on pg1, then...
>
>  Resource Group: master-group
>      vip-master (ocf::heartbeat:IPaddr2):       Started pg2
>      vip-rep    (ocf::heartbeat:IPaddr2):       Started pg2
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ pg2 ]
>      Slaves: [ pg3 ]
>      Stopped: [ pgsql:0 ]
>
> Node Attributes:
> * Node pg1:
>     + master-pgsql:0                    : -INFINITY
>     + pgsql-data-status                 : DISCONNECT
>     + pgsql-status                      : STOP
> * Node pg2:
>     + master-pgsql:1                    : 1000
>     + pgsql-data-status                 : LATEST
>     + pgsql-master-baseline             : 00000000090000E4
>     + pgsql-status                      : PRI
> * Node pg3:
>     + master-pgsql:2                    : -INFINITY
>     + pgsql-data-status                 : DISCONNECT
>     + pgsql-status                      : HS:alone
>
> Failed actions:
>     pgsql:0_monitor_3000 (node=pg1, call=72, rc=7, status=complete): not
> running
>
>
>
> The master did not switch when I killed PostgreSQL running on an async
> slave node.
> It works very well.
>
> I hope this will help.
>
>
> 2013/12/14 Andrey Rogovsky <[email protected]>:
> > 1. Of course, I did that. PostgreSQL replication has now been rebuilt,
> > and both standby servers are running async.
> >
> >
> >
> > 2013/12/13 Takehiro Matsushima <[email protected]>
> >
> >> 1. Well, it means rebuilding PostgreSQL replication cluster by using
> >> pg_basebackup or rsync or something.
> >> 2. Thanks, but I'll try first.
> >>
> >> 2013/12/14 Andrey Rogovsky <[email protected]>:
> >> > 1. Did you mean crm resource cleanup or something else?
> >> >
> >> > 2. If you want - I can give you logs.
> >> >
> >> >
> >> >
> >> > 2013/12/13 Takehiro Matsushima <[email protected]>
> >> >
> >> >> 1. For now, how about completely cleaning up all nodes once, so that
> >> >> the master is "a" and the slaves are "b" and "c"?
> >> >>
> >> >> 2. It looks like it is caused by the RA... umm... I'll try building a
> >> >> cluster on Debian 7.
> >> >>
> >> >> 2013/12/14 Andrey Rogovsky <[email protected]>:
> >> >> > 1. How can I find the status in the log? What exactly do I need to search for?
> >> >> >
> >> >> > 2. I did it and have this situation:
> >> >> > On a node:
> >> >> > root@a:~# sudo -u postgres psql
> >> >> > could not change directory to "/root": Permission denied
> >> >> > psql (9.3.2)
> >> >> > Type "help" for help.
> >> >> >
> >> >> > postgres=# select client_addr,sync_state from pg_stat_replication;
> >> >> >  client_addr  | sync_state
> >> >> > --------------+------------
> >> >> >  192.168.10.2 | async
> >> >> >  192.168.10.3 | async
> >> >> > (2 rows)
> >> >> >
> >> >> > So, pgsql is correct. But...
> >> >> > root@a:~# crm_mon -VAf -1
> >> >> > crm_mon[16456]: 2013/12/13_22:15:30 ERROR: unpack_rsc_op:
> Preventing
> >> >> > msPostgresql from re-starting on a.mydomain.com: operation monitor
> >> >> failed
> >> >> > 'invalid parameter' (rc=2)
> >> >> > crm_mon[16456]: 2013/12/13_22:15:30 ERROR: unpack_rsc_op:
> Preventing
> >> >> > msPostgresql from re-starting on b.mydomain.com: operation monitor
> >> >> failed
> >> >> > 'invalid parameter' (rc=2)
> >> >> > crm_mon[16456]: 2013/12/13_22:15:30 ERROR: unpack_rsc_op:
> Preventing
> >> >> > msPostgresql from re-starting on c.mydomain.com: operation monitor
> >> >> failed
> >> >> > 'invalid parameter' (rc=2)
> >> >> > ============
> >> >> > Last updated: Fri Dec 13 22:15:30 2013
> >> >> > Last change: Fri Dec 13 20:48:18 2013 via crmd on c.mydomain.com
> >> >> > Stack: openais
> >> >> > Current DC: a.mydomain.com - partition with quorum
> >> >> > Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> >> >> > 3 Nodes configured, 3 expected votes
> >> >> > 6 Resources configured.
> >> >> > ============
> >> >> >
> >> >> > Online: [ a.mydomain.com c.mydomain.com b.mydomain.com ]
> >> >> >
> >> >> >  apache-master-ip (ocf::heartbeat:IPaddr2): Started a.mydomain.com
> >> >> >  apache (ocf::heartbeat:apache): Started a.mydomain.com
> >> >> >
> >> >> > Node Attributes:
> >> >> > * Node a.mydomain.com:
> >> >> >     + pgsql-data-status               : LATEST
> >> >> > * Node c.mydomain.com:
> >> >> >     + pgsql-data-status               : STREAMING|ASYNC
> >> >> >     + pgsql-status                     : HS:async
> >> >> > * Node b.mydomain.com:
> >> >> >     + pgsql-data-status               : STREAMING|ASYNC
> >> >> >     + pgsql-status                     : HS:async
> >> >> >
> >> >> > Migration summary:
> >> >> > * Node a.mydomain.com:
> >> >> > * Node b.mydomain.com:
> >> >> > * Node c.mydomain.com:
> >> >> >
> >> >> > Failed actions:
> >> >> >     pgsql:0_monitor_0 (node=a.mydomain.com, call=31, rc=2,
> >> >> > status=complete): invalid parameter
> >> >> >     pgsql:0_monitor_0 (node=b.mydomain.com, call=26, rc=2,
> >> >> > status=complete): invalid parameter
> >> >> >     pgsql:0_monitor_0 (node=c.mydomain.com, call=22, rc=2,
> >> >> > status=complete): invalid parameter
> >> >> > root@a:~#
> >> >> >
> >> >> > How can I fix it?
> >> >> >
> >> >> >
> >> >> >
> >> >> > 2013/12/13 Takehiro Matsushima <[email protected]>
> >> >> >
> >> >> >> 1. Excuse me, could you tell me the status before a.mydomain.com fails?
> >> >> >>
> >> >> >> 2. Sorry, replace rep_mode="sync" with rep_mode="async" in the definition
> >> >> >> of primitive pgsql.
> >> >> >>
> >> >> >> 2013/12/14 Andrey Rogovsky <[email protected]>:
> >> >> >> > 1. If fall down:
> >> >> >> > ============
> >> >> >> > Last updated: Fri Dec 13 19:06:51 2013
> >> >> >> > Last change: Fri Dec 13 10:06:49 2013 via cibadmin on
> >> a.mydomain.com
> >> >> >> > Stack: openais
> >> >> >> > Current DC: c.mydomain.com - partition with quorum
> >> >> >> > Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> >> >> >> > 3 Nodes configured, 3 expected votes
> >> >> >> > 6 Resources configured.
> >> >> >> > ============
> >> >> >> >
> >> >> >> > Online: [ a.mydomain.com c.mydomain.com b.mydomain.com ]
> >> >> >> >
> >> >> >> > Full list of resources:
> >> >> >> >
> >> >> >> >  Resource Group: master
> >> >> >> >      pgsql-master-ip (ocf::heartbeat:IPaddr2): Started
> >> b.mydomain.com
> >> >> >> >  Master/Slave Set: msPostgresql [pgsql]
> >> >> >> >      Masters: [ b.mydomain.com ]
> >> >> >> >      Slaves: [ c.mydomain.com ]
> >> >> >> >      Stopped: [ pgsql:0 ]
> >> >> >> >  apache-master-ip (ocf::heartbeat:IPaddr2): Started
> b.mydomain.com
> >> >> >> >  apache (ocf::heartbeat:apache): Started b.mydomain.com
> >> >> >> >
> >> >> >> > Node Attributes:
> >> >> >> > * Node a.mydomain.com:
> >> >> >> >     + master-pgsql:0                   : -INFINITY
> >> >> >> >     + master-pgsql:1                   : 1000
> >> >> >> >     + pgsql-data-status               : DISCONNECT
> >> >> >> >     + pgsql-status                     : STOP
> >> >> >> > * Node c.mydomain.com:
> >> >> >> >     + master-pgsql:2                   : 100
> >> >> >> >     + pgsql-data-status               : STREAMING|SYNC
> >> >> >> >     + pgsql-status                     : HS:sync
> >> >> >> > * Node b.mydomain.com:
> >> >> >> >     + master-pgsql:0                   : -INFINITY
> >> >> >> >     + master-pgsql:1                   : 1000
> >> >> >> >     + pgsql-data-status               : LATEST
> >> >> >> >     + pgsql-master-baseline           : 000000000F000090
> >> >> >> >     + pgsql-status                     : PRI
> >> >> >> >
> >> >> >> > Migration summary:
> >> >> >> > * Node a.mydomain.com:
> >> >> >> >    pgsql:0: migration-threshold=1 fail-count=1
> >> >> >> > * Node c.mydomain.com:
> >> >> >> > * Node b.mydomain.com:
> >> >> >> >
> >> >> >> > Failed actions:
> >> >> >> >     pgsql:0_monitor_4000 (node=a.mydomain.com, call=89, rc=7,
> >> >> >> > status=complete): not running
> >> >> >> >
> >> >> >> > This is in the log file on a node:
> >> >> >> > Dec 10 20:49:57 a pgsql[903]: INFO: Don't check
> >> >> >> > /var/lib/postgresql/9.3/main during probe
> >> >> >> > Dec 10 20:49:57 a crmd: [893]: info: process_lrm_event: LRM
> >> operation
> >> >> >> > pgsql-master-ip_monitor_0 (call=2, rc=7, cib-update=7,
> >> confirmed=true)
> >> >> >> not
> >> >> >> > running
> >> >> >> > Dec 10 20:49:57 a pgsql[903]: INFO: PostgreSQL is down
> >> >> >> > Dec 10 20:49:57 a lrmd: [890]: info: operation monitor[3] on
> >> pgsql:1
> >> >> for
> >> >> >> > client 893: pid 903 exited with return code 7
> >> >> >> > Dec 10 20:49:57 a crmd: [893]: info: process_lrm_event: LRM
> >> operation
> >> >> >> > pgsql:1_monitor_0 (call=3, rc=7, cib-update=8, confirmed=true)
> not
> >> >> >> running
> >> >> >> > Dec 10 20:49:57 a attrd: [891]: notice: attrd_trigger_update:
> >> Sending
> >> >> >> flush
> >> >> >> > op to all hosts for: probe_complete (true)
> >> >> >> > Dec 10 20:49:57 a lrmd: [890]: info: rsc:pgsql:1 start[4] (pid
> 986)
> >> >> >> > Dec 10 20:49:57 a pgsql[986]: INFO: Changing pgsql-status on
> >> >> >> > a.mydomain.com: ->STOP.
> >> >> >> > Dec 10 20:49:57 a attrd: [891]: notice: attrd_trigger_update:
> >> Sending
> >> >> >> flush
> >> >> >> > op to all hosts for: pgsql-status (STOP)
> >> >> >> > Dec 10 20:49:57 a attrd: [891]: notice: attrd_trigger_update:
> >> Sending
> >> >> >> flush
> >> >> >> > op to all hosts for: master-pgsql:1 (-INFINITY)
> >> >> >> > Dec 10 20:49:57 a pgsql[986]: INFO: Set all nodes into async
> mode.
> >> >> >> > Dec 10 20:49:57 a pgsql[986]: INFO: server starting
> >> >> >> > Dec 10 20:49:57 a pgsql[986]: INFO: PostgreSQL start command
> sent.
> >> >> >> > Dec 10 20:49:58 a lrmd: [890]: info: RA output:
> >> (pgsql:1:start:stderr)
> >> >> >> > psql: FATAL:  the database system is starting up
> >> >> >> > Dec 10 20:49:58 a pgsql[986]: WARNING: Can't get PostgreSQL
> >> recovery
> >> >> >> > status. rc=2
> >> >> >> > Dec 10 20:49:58 a pgsql[986]: WARNING: Connection error
> >> (connection to
> >> >> >> the
> >> >> >> > server went bad and the session was not interactive) occurred
> while
> >> >> >> > executing the psql command.
> >> >> >> > Dec 10 20:49:59 a pgsql[986]: INFO: PostgreSQL is started.
> >> >> >> > Dec 10 20:49:59 a pgsql[986]: INFO: Changing pgsql-status on
> >> >> >> > a.mydomain.com: ->HS:alone.
> >> >> >> > Dec 10 20:49:59 a attrd: [891]: notice: attrd_trigger_update:
> >> Sending
> >> >> >> flush
> >> >> >> > op to all hosts for: pgsql-status (HS:alone)
> >> >> >> > Dec 10 20:49:59 a lrmd: [890]: info: operation start[4] on
> pgsql:1
> >> for
> >> >> >> > client 893: pid 986 exited with return code 0
> >> >> >> > Dec 10 20:49:59 a crmd: [893]: info: process_lrm_event: LRM
> >> operation
> >> >> >> > pgsql:1_start_0 (call=4, rc=0, cib-update=9, confirmed=true) ok
> >> >> >> > Dec 10 20:49:59 a lrmd: [890]: info: rsc:pgsql:1 notify[5] (pid
> >> 1163)
> >> >> >> > Dec 10 20:49:59 a lrmd: [890]: info: operation notify[5] on
> pgsql:1
> >> >> for
> >> >> >> > client 893: pid 1163 exited with return code 0
> >> >> >> > Dec 10 20:49:59 a crmd: [893]: info: process_lrm_event: LRM
> >> operation
> >> >> >> > pgsql:1_notify_0 (call=5, rc=0, cib-update=0, confirmed=true) ok
> >> >> >> > Dec 10 20:49:59 a lrmd: [890]: info: rsc:pgsql:1 monitor[6] (pid
> >> 1207)
> >> >> >> > Dec 10 20:49:59 a attrd: [891]: notice: attrd_trigger_update:
> >> Sending
> >> >> >> flush
> >> >> >> > op to all hosts for: pgsql-status (HS:alone)
> >> >> >> >
> >> >> >> > I think this is wrong, because there are 2 live nodes. One of them can
> >> >> >> > stay as master.
> >> >> >> >
> >> >> >> > Also this is in postgresql log on a node:
> >> >> >> > 2013-12-06 10:56:53 MSK WARNING:  archive_mode enabled, yet
> >> >> >> archive_command
> >> >> >> > is not set
> >> >> >> > 2013-12-06 10:57:37 MSK LOG:  received SIGHUP, reloading
> >> configuration
> >> >> >> files
> >> >> >> > 2013-12-06 10:57:37 MSK LOG:  parameter "archive_command"
> changed
> >> to
> >> >> "cp
> >> >> >> %p
> >> >> >> > /var/lib/postgresql/9.3/pg_archive/%f"
> >> >> >> > 2013-12-06 10:57:43 MSK ERROR:  a backup is not in progress
> >> >> >> > 2013-12-06 10:57:43 MSK STATEMENT:  SELECT pg_stop_backup()
> >> >> >> > 2013-12-07 10:24:22 MSK LOG:  received fast shutdown request
> >> >> >> > 2013-12-07 10:24:22 MSK LOG:  aborting any active transactions
> >> >> >> > 2013-12-07 10:24:22 MSK LOG:  autovacuum launcher shutting down
> >> >> >> > 2013-12-07 10:24:22 MSK LOG:  shutting down
> >> >> >> > 2013-12-07 10:24:22 MSK LOG:  database system is shut down
> >> >> >> > 2013-12-07 10:24:29 MSK LOG:  database system was shut down at
> >> >> 2013-12-07
> >> >> >> > 10:24:22 MSK
> >> >> >> > 2013-12-07 10:24:29 MSK LOG:  autovacuum launcher started
> >> >> >> > 2013-12-07 10:24:29 MSK LOG:  database system is ready to accept
> >> >> >> connections
> >> >> >> > 2013-12-07 10:24:29 MSK LOG:  incomplete startup packet
> >> >> >> > 2013-12-07 10:24:34 MSK LOG:  received fast shutdown request
> >> >> >> > 2013-12-07 10:24:34 MSK LOG:  aborting any active transactions
> >> >> >> > 2013-12-07 10:24:34 MSK LOG:  autovacuum launcher shutting down
> >> >> >> > 2013-12-07 10:24:34 MSK LOG:  shutting down
> >> >> >> > 2013-12-07 10:24:34 MSK LOG:  database system is shut down
> >> >> >> > 2013-12-07 14:31:11 MSK LOG:  database system was shut down in
> >> >> recovery
> >> >> >> at
> >> >> >> > 2013-12-07 14:29:19 MSK
> >> >> >> > cp: cannot stat
> >> >> `/var/lib/postgresql/9.3/pg_archive/00000002.history': No
> >> >> >> > such file or directory
> >> >> >> > 2013-12-07 14:31:11 MSK LOG:  entering standby mode
> >> >> >> > cp: cannot stat
> >> >> >> > `/var/lib/postgresql/9.3/pg_archive/000000010000000000000007':
> No
> >> such
> >> >> >> file
> >> >> >> > or directory
> >> >> >> > 2013-12-07 14:31:11 MSK LOG:  consistent recovery state reached
> at
> >> >> >> 0/7000090
> >> >> >> > 2013-12-07 14:31:11 MSK LOG:  record with zero length at
> 0/7000090
> >> >> >> > 2013-12-07 14:31:11 MSK LOG:  database system is ready to accept
> >> read
> >> >> >> only
> >> >> >> > connections
> >> >> >> > 2013-12-07 14:31:12 MSK LOG:  incomplete startup packet
> >> >> >> > 2013-12-07 14:31:14 MSK FATAL:  could not connect to the primary
> >> >> server:
> >> >> >> > could not connect to server: No route to host
> >> >> >> >                 Is the server running on host "192.168.10.200"
> and
> >> >> >> accepting
> >> >> >> >                 TCP/IP connections on port 5432?
> >> >> >> >
> >> >> >> > Why master not got shutdown request? It is life.
> >> >> >> >
> >> >> >> >
> >> >> >> > 2. There is my config:
> >> >> >> > node a.mydomain.com \
> >> >> >> >         attributes pgsql-data-status="DISCONNECT"
> >> >> >> > node b.mydomain.com \
> >> >> >> >         attributes pgsql-data-status="LATEST" pgsql-status="HS:async"
> >> >> >> > node c.mydomain.com \
> >> >> >> >         attributes pgsql-data-status="STREAMING|SYNC" pgsql-status="HS:async"
> >> >> >> > primitive apache ocf:heartbeat:apache \
> >> >> >> >         params configfile="/etc/apache2/apache2.conf" \
> >> >> >> >         op monitor interval="1min"
> >> >> >> > primitive apache-master-ip ocf:heartbeat:IPaddr2 \
> >> >> >> >         params ip="192.168.10.100" nic="peervpn0" \
> >> >> >> >         op monitor interval="30s"
> >> >> >> > primitive pgsql ocf:heartbeat:pgsql \
> >> >> >> >         params pgctl="/usr/lib/postgresql/9.3/bin/pg_ctl"
> >> >> >> > psql="/usr/bin/psql" pgdata="/var/lib/postgresql/9.3/main"
> >> >> >> > start_opt="-p 5432" rep_mode="sync"
> >> >> >> > node_list="a.mydomain.com b.mydomain.com c.mydomain.com"
> >> >> >> > restore_command="cp /var/lib/postgresql/9.3/pg_archive/%f %p"
> >> >> >> > master_ip="192.168.10.200" restart_on_promote="true"
> >> >> >> > config="/etc/postgresql/9.3/main/postgresql.conf" \
> >> >> >> >         op start interval="0s" timeout="60s" on-fail="restart" \
> >> >> >> >         op monitor interval="4s" timeout="60s" on-fail="restart" \
> >> >> >> >         op monitor interval="3s" role="Master" timeout="60s" on-fail="restart" \
> >> >> >> >         op promote interval="0s" timeout="60s" on-fail="restart" \
> >> >> >> >         op demote interval="0s" timeout="60s" on-fail="stop" \
> >> >> >> >         op stop interval="0s" timeout="60s" on-fail="block" \
> >> >> >> >         op notify interval="0s" timeout="60s"
> >> >> >> > primitive pgsql-master-ip ocf:heartbeat:IPaddr2 \
> >> >> >> >         params ip="192.168.10.200" nic="peervpn0" \
> >> >> >> >         op start interval="0s" timeout="60s" on-fail="restart" \
> >> >> >> >         op monitor interval="10s" timeout="60s" on-fail="restart" \
> >> >> >> >         op stop interval="0s" timeout="60s" on-fail="block" \
> >> >> >> >         meta target-role="Started"
> >> >> >> > group master pgsql-master-ip
> >> >> >> > ms msPostgresql pgsql \
> >> >> >> >         meta master-max="1" master-node-max="1" clone-max="3"
> >> >> >> > clone-node-max="1" target-role="Master" notify="true"
> >> >> >> > location prefer-apache-node apache 150: b.mydomain.com
> >> >> >> > colocation apache-with-ip inf: apache apache-master-ip
> >> >> >> > colocation set_ip inf: master msPostgresql:Master
> >> >> >> > order apache-after-ip inf: apache-master-ip apache
> >> >> >> > order ip_down 0: msPostgresql:demote master:stop symmetrical=false
> >> >> >> > order ip_up 0: msPostgresql:promote master:start symmetrical=false
> >> >> >> > property $id="cib-bootstrap-options" \
> >> >> >> >         dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
> >> >> >> >         cluster-infrastructure="openais" \
> >> >> >> >         expected-quorum-votes="3" \
> >> >> >> >         stonith-enabled="false" \
> >> >> >> >         crmd-transition-delay="0" \
> >> >> >> >         last-lrm-refresh="1386751770"
> >> >> >> > rsc_defaults $id="rsc-options" \
> >> >> >> >         resource-stickiness="100" \
> >> >> >> >         migration-threshold="1"
> >> >> >> >
> >> >> >> > Where should I add rep_mode="async"? In each slave node's attributes?
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > 2013/12/13 Takehiro Matsushima <[email protected]>
> >> >> >> >
> >> >> >> >> Hello,
> >> >> >> >>
> >> >> >> >> 1. Has it been working stably since then? Does failover work
> >> >> >> >> correctly, too?
> >> >> >> >>
> >> >> >> >> 2. I see. In this case, specify rep_mode="async" in the crm config;
> >> >> >> >> then all slaves run in async mode.
> >> >> >> >>
> >> >> >> >> --
> >> >> >> >> Regards,
> >> >> >> >> Takehiro Matsushima
> >> >> >> >> _______________________________________________
> >> >> >> >> Linux-HA mailing list
> >> >> >> >> [email protected]
> >> >> >> >> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >> >> >> >> See also: http://linux-ha.org/ReportingProblems
