>>> This doesn't look good?
>>>
>>> 25314 2011-01-21 10:00:43 CET ERROR: function pgpool_walrecrunning() does
>>> not exist at character 32

Have you installed pgpool_walrecrunning()?
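For readers hitting the same error: pgpool_walrecrunning() is not a
PostgreSQL built-in. It ships as a C function in the pgpool-II source
tree and has to be built and registered on every backend node, in the
database pgpool actually connects to. A rough install sequence, assuming
the 3.1-era source layout and that pg_config for the target PostgreSQL
is in PATH:

    # Build and install the shared library on each PostgreSQL node
    cd pgpool-II/sql/pgpool-walrecrunning
    make && make install

    # Register the function; doing it in template1 as well means newly
    # created databases inherit it (database choice is an assumption,
    # match it to the database pgpool connects with)
    psql -f pgpool-walrecrunning.sql -U postgres template1
    psql -f pgpool-walrecrunning.sql -U postgres postgres

The generated SQL is essentially a one-liner of the form
CREATE OR REPLACE FUNCTION pgpool_walrecrunning() RETURNS bool
AS '$libdir/pgpool-walrecrunning' LANGUAGE C STRICT;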
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> No answer here?
>
> That's a real problem for me.
> I would really appreciate any help or suggestion.
>
> Thanks.
>
> On Fri, Jan 21, 2011 at 9:21 AM, Armin Nesiren <[email protected]> wrote:
>
>> This is what bothers me
>> from node1 postgresql.log:
>>
>> 25319 2011-01-21 10:01:23 CET LOG: statement: SELECT pg_current_xlog_location()
>> 25319 2011-01-21 10:01:23 CET ERROR: recovery is in progress
>> 25319 2011-01-21 10:01:23 CET HINT: WAL control functions cannot be executed during recovery.
>>
>> What does this error mean?
>> The SELECT cannot be executed on node1, yet node1 has primary status.
>> And where did I tell pgpool to switch primaries from node2 -> node1?
>>
>> new status:
>>
>> # psql -p 9999 -c "show pool_nodes;" -Upostgres
>>  node_id | hostname | port | status | lb_weight | state
>> ---------+----------+------+--------+-----------+-------
>>        0 | node1    | 5432 |      2 |  0.500000 | P
>>        1 | node2    | 5432 |      2 |  0.500000 | S
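The error itself is expected on a standby: PostgreSQL 9.0 refuses WAL
control functions such as pg_current_xlog_location() during recovery,
and the standby-side counterparts are pg_last_xlog_receive_location()
and pg_last_xlog_replay_location(). So this log only proves node1 is
still a standby; the real problem is that pgpool marks it P. A quick
manual check, assuming psql access to node1 on port 5432:

    -- On node1: returns 't' while the node is a standby
    SELECT pg_is_in_recovery();

    -- These work during recovery (standby equivalents of the
    -- current WAL location)
    SELECT pg_last_xlog_receive_location();
    SELECT pg_last_xlog_replay_location();

    -- Fails with "recovery is in progress" on a standby, by design
    SELECT pg_current_xlog_location();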
>>
>> On Fri, Jan 21, 2011 at 9:15 AM, Tatsuo Ishii <[email protected]> wrote:
>>
>>> This doesn't look good?
>>>
>>> 25314 2011-01-21 10:00:43 CET ERROR: function pgpool_walrecrunning() does
>>> not exist at character 32
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>>
>>> > Ok, this is before starting pcp_recovery_node:
>>> >
>>> > # psql -p 9999 -c "show pool_nodes;" -Upostgres
>>> >  node_id | hostname | port | status | lb_weight | state
>>> > ---------+----------+------+--------+-----------+-------
>>> >        0 | node1    | 5432 |      3 |  0.500000 | S
>>> >        1 | node2    | 5432 |      2 |  0.500000 | P
>>> >
>>> > then, I type:
>>> >
>>> > pcp_recovery_node 20 localhost 9898 postgres <password> 0 -d
>>> >
>>> > (I have disabled creating the trigger file in failback for this example)
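Since the trigger file keeps coming up: the script pgpool runs on
failover/failback only receives whatever %-macros you configure in
pgpool.conf, so the "create the trigger file" logic lives entirely in
that script. The failback.sh invocation visible in the pgpool.log below
passes eight values; which macros they correspond to depends on your
failback_command line. A minimal sketch of the shape such a script can
take, using three common macros; the macro choice, ssh user, and
trigger path are assumptions and must match pgpool.conf and the
standby's recovery.conf:

    #!/bin/sh
    # Hypothetical helper, wired up as e.g.:
    #   failover_command = '/usr/local/pgpool-II/etc/failover.sh %d %P %H'
    FAILED_NODE_ID=$1    # %d: id of the node that triggered the event
    OLD_PRIMARY_ID=$2    # %P: id of the primary before the event
    NEW_MASTER_HOST=$3   # %H: hostname of the new master candidate

    # Must match trigger_file in recovery.conf (path assumed)
    TRIGGER_FILE=/pgdata/data90.C/trigger

    # Promote the surviving standby only when the primary itself failed;
    # re-attaching a recovered standby must NOT create the trigger file.
    if [ "$FAILED_NODE_ID" = "$OLD_PRIMARY_ID" ]; then
        ssh postgres@"$NEW_MASTER_HOST" "touch $TRIGGER_FILE"
    fi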
>>> >
>>> > pgpool.log (I have removed some duplicated debug messages):
>>> >
>>> > 2011-01-21 08:58:18 LOG: pid 4467: starting recovering node 0
>>> > 2011-01-21 08:58:18 LOG: pid 4467: starting recovery command: "SELECT pgpool_recovery('basebackup.sh', 'puma.intera.si', '/pgdata/data90.C/')"
>>> > 2011-01-21 08:58:19 DEBUG: pid 4434: starting health checking
>>> > 2011-01-21 08:58:19 DEBUG: pid 4434: health_check: 0 th DB node status: 3
>>> > 2011-01-21 08:58:19 DEBUG: pid 4434: health_check: 1 th DB node status: 2
>>> > 2011-01-21 08:58:56 LOG: pid 4467: 1st stage is done
>>> > 2011-01-21 08:58:56 LOG: pid 4467: check_postmaster_started: try to connect to postmaster on hostname:node1 database:postgres user:postgres (retry 0 times)
>>> > 2011-01-21 08:58:56 LOG: pid 4467: check_postmaster_started: failed to connect to postmaster on hostname:node1 database:postgres user:postgres
>>> > 2011-01-21 08:58:59 LOG: pid 4467: check_postmaster_started: try to connect to postmaster on hostname:puma.intera.si database:postgres user:postgres (retry 1 times)
>>> > 2011-01-21 08:58:59 LOG: pid 4467: 0 node restarted
>>> > 2011-01-21 08:58:59 LOG: pid 4467: send_failback_request: fail back 0 th node request from pid 4467
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler called
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler: starting to select new master node
>>> > 2011-01-21 08:58:59 LOG: pid 4434: starting fail back. reconnect host node1(5432)
>>> > 2011-01-21 08:58:59 LOG: pid 4434: execute command: /usr/local/pgpool-II/etc/failback.sh 0 "node1" 5432 /pgdata/data90.C/ 0 1 "node2" 1
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler: kill 4944
>>> > 2011-01-21 08:58:59 DEBUG: pid 4944: child received shutdown request signal 3
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler: kill 4945
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler: kill 4946
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler: kill 4947
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler: kill 4948
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler: kill 4974
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: failover_handler: kill 4975
>>> > 2011-01-21 08:58:59 LOG: pid 4434: failover_handler: set new master node: 0
>>> > 2011-01-21 08:58:59 DEBUG: pid 5583: I am 5583
>>> > 2011-01-21 08:58:59 DEBUG: pid 4975: child received shutdown request signal 3
>>> > 2011-01-21 08:58:59 DEBUG: pid 4971: child received shutdown request signal 3
>>> > 2011-01-21 08:58:59 DEBUG: pid 4945: child received shutdown request signal 3
>>> > 2011-01-21 08:58:59 DEBUG: pid 5613: I am 5613
>>> > 2011-01-21 08:58:59 LOG: pid 4434: failback done. reconnect host node1(5432)
>>> > 2011-01-21 08:58:59 DEBUG: pid 5614: I am 5614
>>> > 2011-01-21 08:58:59 DEBUG: pid 4964: child received shutdown request signal 3
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: pool_ssl: SSL requested but SSL support is not available
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: s_do_auth: auth kind: 0
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: s_do_auth: parameter status data received
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: s_do_auth: parameter status data received
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: do_query: kind: E
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: do_query: kind: Z
>>> > 2011-01-21 08:58:59 LOG: pid 4434: find_primary_node: do_query returns no rows
>>> > 2011-01-21 08:58:59 LOG: pid 4434: find_primary_node: do_query returns no data
>>> > 2011-01-21 08:58:59 LOG: pid 4434: find_primary_node: primary node id is 0
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: reap_handler called
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: reap_handler: call wait3
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: child 4944 exits with status 0
>>> > 2011-01-21 08:58:59 DEBUG: pid 4434: reap_handler: normally exited
>>> > 2011-01-21 08:58:59 LOG: pid 4467: recovery done
>>> > 2011-01-21 08:58:59 DEBUG: pid 4468: do_query: kind: E
>>> > 2011-01-21 08:58:59 ERROR: pid 4468: pool_read: EOF encountered with backend
>>> > 2011-01-21 08:58:59 ERROR: pid 4468: do_query: error while reading message kind
>>> > 2011-01-21 08:58:59 ERROR: pid 4468: check_replication_time_lag: SELECT pg_current_xlog_location() failed
>>> > 2011-01-21 08:59:09 DEBUG: pid 4434: starting health checking
>>> > 2011-01-21 08:59:09 DEBUG: pid 4434: health_check: 0 th DB node status: 1
>>> > 2011-01-21 08:59:09 DEBUG: pid 4434: health_check: 1 th DB node status: 2
>>> > 2011-01-21 08:59:09 DEBUG: pid 4468: pool_ssl: SSL requested but SSL support is not available
>>> > 2011-01-21 08:59:09 DEBUG: pid 4468: s_do_auth: auth kind: 0
>>> > 2011-01-21 09:03:40 DEBUG: pid 4468: do_query: kind: T
>>> > 2011-01-21 09:03:40 DEBUG: pid 4468: num_fileds: 1
>>> > 2011-01-21 09:03:40 DEBUG: pid 4468: do_query: kind: E
>>> > 2011-01-21 09:03:40 DEBUG: pid 4468: do_query: kind: Z
>>> > 2011-01-21 09:03:40 ERROR: pid 4468: check_replication_time_lag: SELECT pg_current_xlog_location() returns no rows
>>> > 2011-01-21 09:03:49 DEBUG: pid 4434: starting health checking
>>> > 2011-01-21 09:03:49 DEBUG: pid 4434: health_check: 0 th DB node status: 1
>>> > 2011-01-21 09:03:50 DEBUG: pid 4434: health_check: 1 th DB node status: 2
>>> > 2011-01-21 09:03:50 DEBUG: pid 4468: do_query: kind: T
>>> > 2011-01-21 09:03:50 DEBUG: pid 4468: num_fileds: 1
>>> > 2011-01-21 09:03:50 DEBUG: pid 4468: do_query: kind: E
>>> > 2011-01-21 09:03:50 DEBUG: pid 4468: do_query: kind: Z
>>> > 2011-01-21 09:03:50 ERROR: pid 4468: check_replication_time_lag: SELECT pg_current_xlog_location() returns no rows
>>> > 2011-01-21 09:04:00 DEBUG: pid 4434: starting health checking
>>> > 2011-01-21 09:04:00 DEBUG: pid 4434: health_check: 0 th DB node status: 1
>>> > 2011-01-21 09:04:00 DEBUG: pid 4434: health_check: 1 th DB node status: 2
>>> > 2011-01-21 09:04:00 DEBUG: pid 4468: do_query: kind: T
>>> > 2011-01-21 09:04:00 DEBUG: pid 4468: num_fileds: 1
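The suspicious part of this log is the find_primary_node sequence:
pgpool decides which backend is the primary by running
SELECT pg_is_in_recovery() AND pgpool_walrecrunning() against each node
(the statement is visible in node1's postgresql.log below). Because
pgpool_walrecrunning() is not installed, that query errors out
("do_query: kind: E"), find_primary_node gets no rows back, and it
appears to fall back to reporting "primary node id is 0" even though
node1 is a streaming standby. You can reproduce the check by hand; a
sketch, assuming password-less psql access to both nodes:

    #!/bin/sh
    # Run the same role-detection query pgpool uses against each backend.
    # A streaming standby should answer "t", the primary "f"; an ERROR
    # here means pgpool_walrecrunning() is missing on that node.
    for host in node1 node2; do
        printf '%s: ' "$host"
        psql -h "$host" -p 5432 -U postgres -At \
             -c "SELECT pg_is_in_recovery() AND pgpool_walrecrunning()" postgres
    done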
>>> >
>>> > postgresql.log (node1):
>>> >
>>> > 25308 2011-01-21 10:00:40 CET LOG: database system was interrupted; last known up at 2011-01-21 09:58:57 CET
>>> > 25308 2011-01-21 10:00:40 CET LOG: creating missing WAL directory "pg_xlog/archive_status"
>>> > 25308 2011-01-21 10:00:40 CET LOG: entering standby mode
>>> > 25309 2011-01-21 10:00:40 CET LOG: streaming replication successfully connected to primary
>>> > 25308 2011-01-21 10:00:41 CET LOG: redo starts at 1/E4000020
>>> > 25308 2011-01-21 10:00:41 CET LOG: consistent recovery state reached at 1/E5000000
>>> > 25306 2011-01-21 10:00:41 CET LOG: database system is ready to accept read only connections
>>> > 25314 2011-01-21 10:00:43 CET LOG: statement: SELECT pg_is_in_recovery() AND pgpool_walrecrunning()
>>> > 25314 2011-01-21 10:00:43 CET ERROR: function pgpool_walrecrunning() does not exist at character 32
>>> > 25314 2011-01-21 10:00:43 CET HINT: No function matches the given name and argument types. You might need to add explicit type casts.
>>> > 25314 2011-01-21 10:00:43 CET STATEMENT: SELECT pg_is_in_recovery() AND pgpool_walrecrunning()
>>> > 25314 2011-01-21 10:00:43 CET LOG: unexpected EOF on client connection
>>> > 25319 2011-01-21 10:00:53 CET LOG: statement: SELECT pg_current_xlog_location()
>>> > 25319 2011-01-21 10:00:53 CET ERROR: recovery is in progress
>>> > 25319 2011-01-21 10:00:53 CET HINT: WAL control functions cannot be executed during recovery.
>>> > 25319 2011-01-21 10:00:53 CET STATEMENT: SELECT pg_current_xlog_location()
>>> > 25319 2011-01-21 10:01:03 CET LOG: statement: SELECT pg_current_xlog_location()
>>> > 25319 2011-01-21 10:01:03 CET ERROR: recovery is in progress
>>> > 25319 2011-01-21 10:01:03 CET HINT: WAL control functions cannot be executed during recovery.
>>> > 25319 2011-01-21 10:01:03 CET STATEMENT: SELECT pg_current_xlog_location()
>>> > 25319 2011-01-21 10:01:13 CET LOG: statement: SELECT pg_current_xlog_location()
>>> > 25319 2011-01-21 10:01:13 CET ERROR: recovery is in progress
>>> > 25319 2011-01-21 10:01:13 CET HINT: WAL control functions cannot be executed during recovery.
>>> > 25319 2011-01-21 10:01:13 CET STATEMENT: SELECT pg_current_xlog_location()
>>> > 25319 2011-01-21 10:01:23 CET LOG: statement: SELECT pg_current_xlog_location()
>>> > 25319 2011-01-21 10:01:23 CET ERROR: recovery is in progress
>>> > 25319 2011-01-21 10:01:23 CET HINT: WAL control functions cannot be executed during recovery.
>>> > 25319 2011-01-21 10:01:23 CET STATEMENT: SELECT pg_current_xlog_location()
>>> > 25319 2011-01-21 10:01:33 CET LOG: statement: SELECT pg_current_xlog_location()
>>> >
>>> > postgresql.log (node2):
>>> >
>>> > 5853 2011-01-21 09:58:57 CET LOG: statement: SELECT pgpool_recovery('basebackup.sh', 'node1', '/pgdata/data90.C/')
>>> > 5856 2011-01-21 09:58:57 CET LOG: statement: SELECT pg_start_backup('Streaming Replication', true)
>>> >  pg_start_backup
>>> > -----------------
>>> >  1/E4000020
>>> > (1 row)
>>> >
>>> > 5887 2011-01-21 09:59:34 CET LOG: statement: SELECT pg_stop_backup()
>>> > NOTICE: pg_stop_backup complete, all required WAL segments have been archived
>>> >  pg_stop_backup
>>> > ----------------
>>> >  1/E40000D8
>>> > (1 row)
>>> >
>>> > 5853 2011-01-21 09:59:35 CET LOG: statement: SELECT pgpool_remote_start('node1', '/pgdata/data90.C/')
>>> > 5893 2011-01-21 09:59:36 CET LOG: replication connection authorized: user=postgres host=192.168.10.120 port=45066
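For reference, the node2 log above is the expected first-stage sequence:
pgpool_recovery() runs basebackup.sh on the current primary between
pg_start_backup() and pg_stop_backup(), and pgpool_remote_start() then
starts the recovered node's postmaster. A first-stage script in the
spirit of the simple_sr_setting document looks roughly like this; the
rsync options, ssh user, and recovery.conf handling are simplified
assumptions, not Armin's actual script:

    #!/bin/sh
    # basebackup.sh -- first-stage online recovery, run on the current
    # primary via:
    #   SELECT pgpool_recovery('basebackup.sh', 'node1', '/pgdata/data90.C/')
    DATADIR=$1      # primary's database cluster path
    DEST_HOST=$2    # node being recovered
    DEST_DIR=$3     # database cluster path on that node

    psql -c "SELECT pg_start_backup('Streaming Replication', true)" postgres

    # Ship the cluster; the target must still receive a recovery.conf
    # whose primary_conninfo points back at this node (omitted here).
    rsync -C -a --delete \
          --exclude postmaster.pid --exclude pg_xlog \
          "$DATADIR"/ "$DEST_HOST":"$DEST_DIR"/

    psql -c "SELECT pg_stop_backup()" postgres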
>>> >
>>> > On Fri, Jan 21, 2011 at 8:44 AM, Tatsuo Ishii <[email protected]> wrote:
>>> >
>>> >> Ok. Can you show the pgpool log and PostgreSQL log (both node 0 and 1)
>>> >> when you did:
>>> >>
>>> >> >> > pcp_recovery_node 20 localhost 9898 postgres <password> 0 -d
>>> >> --
>>> >> Tatsuo Ishii
>>> >> SRA OSS, Inc. Japan
>>> >> English: http://www.sraoss.co.jp/index_en.php
>>> >> Japanese: http://www.sraoss.co.jp
>>> >>
>>> >> > Actually, yesterday in desperate moments I did that: I got the
>>> >> > CVS version of pgpool.
>>> >> > I think I have the same situation.
>>> >> > I have added a failback script that creates the trigger file on my
>>> >> > new primary node, but the standby is acting like a primary.
>>> >> >
>>> >> > On Fri, Jan 21, 2011 at 8:19 AM, Tatsuo Ishii <[email protected]> wrote:
>>> >> >
>>> >> >> > I'm having a problem with pcp_recovery_node.
>>> >> >> > I have a three-computer architecture: one computer has pgpool
>>> >> >> > installed, and the other two are nodes.
>>> >> >> > The pgpool version installed is 3.0.1, and postgres is 9.0.2.
>>> >> >> > I have streaming replication configured.
>>> >> >> >
>>> >> >> > I'm trying to set up automatic recovery of nodes.
>>> >> >> > I set one server to be the master, and the other a slave
>>> >> >> > (read-only connections).
>>> >> >> > When the primary fails, the slave is used as the master, and this
>>> >> >> > transition from one node to the other is automatic; everything
>>> >> >> > works great.
>>> >> >> > But when I execute the pcp_recovery_node command:
>>> >> >> >
>>> >> >> > pcp_recovery_node 20 localhost 9898 postgres <password> 0 -d
>>> >> >> >
>>> >> >> > pgpool tries to make my first node (the one that I'm trying to
>>> >> >> > recover) the master node, and everything went well except creating
>>> >> >> > the trigger file.
>>> >> >>
>>> >> >> That is the limitation of pgpool-II 3.0.x. pgpool-II 3.1 will make
>>> >> >> this possible. You can try the to-be-pgpool-II 3.1 source code from
>>> >> >> CVS HEAD or a tarball from
>>> >> >>
>>> >> >> http://pgpool.projects.postgresql.org/contrib_docs/simple_sr_setting/index.html
>>> >> >> --
>>> >> >> Tatsuo Ishii
>>> >> >> SRA OSS, Inc. Japan
>>> >> >> English: http://www.sraoss.co.jp/index_en.php
>>> >> >> Japanese: http://www.sraoss.co.jp
>>> >> >>
>>> >> >> > So, there are no errors in the log file, failover.sh is not
>>> >> >> > executed, and node1 (the just-recovered node) acts as if it is the
>>> >> >> > master, but it is not until I manually create the trigger file.
>>> >> >> >
>>> >> >> > Why is pgpool trying to make my just-recovered node the master,
>>> >> >> > instead of just bringing it up and setting it to act as a slave?
>>> >> >> > Can I configure this?
>>> >> >> >
>>> >> >> > How can I know what actions are performed by executing
>>> >> >> > pcp_recovery_node?
>>> >> >> >
>>> >> >> > I hope that I described my problem well.
>>> >
>>> >
>>> > --
>>> > Armin
>>
>>
>> --
>> Armin
>
>
> --
> Armin
_______________________________________________
Pgpool-general mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-general
