Hello,
Thanks for your info!
I was able to make some progress with node recovery by using
pgpool_recovery for both recovery commands.
Recovery works most of the time, but sometimes it fails with
the following error:
$ pcp_recovery_node -d 90 localhost 9898 postgres ******* 2
DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
DEBUG: recv: tos="e", len=20, data=recovery failed
DEBUG: command failed. reason=recovery failed
BackendError
DEBUG: send: tos="X", len=4
pgpool log:
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'M'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: salt sent to the client
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'R'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: authentication OK
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'O'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: start online recovery
2009-12-15 20:10:56 LOG: pid 8747: starting recovering node 2
2009-12-15 20:10:56 DEBUG: pid 8747: exec_checkpoint: start checkpoint
2009-12-15 20:10:56 DEBUG: pid 8747: exec_checkpoint: finish checkpoint
2009-12-15 20:10:56 LOG: pid 8747: CHECKPOINT in the 1st stage done
2009-12-15 20:10:56 LOG: pid 8747: starting recovery command: "SELECT pgpool_recovery('pgpool_recovery', 'im-pp3', '/usr/local/pgsql/data')"
2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: start recovery
2009-12-15 20:10:56 ERROR: pid 8747: exec_recovery: pgpool_recovery command failed at 1st stage
2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: finish recovery
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'X'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: client disconnecting. close connection
2009-12-15 20:11:22 DEBUG: pid 8446: starting health checking
Unfortunately, I am not sure what this error means. Did it fail at
"SELECT pgpool_recovery('pgpool_recovery', 'im-pp3', '/usr/local/pgsql/data')"?
How can I find the reason?
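As a rough aid for reading logs like the one above, here is a minimal sketch that pairs each ERROR entry with the last LOG entry from the same pid, so the step that was running when the failure happened stands out. The function names `parse_entries` and `errors_with_context` are made up for this illustration, and the line format is assumed to be exactly the one shown in this message. It only locates the failing step in the pgpool log; the actual cause is probably reported elsewhere, for example in the PostgreSQL server log of the node that ran pgpool_recovery.

```python
import re

# Assumed entry format, taken from the log in this message:
# "2009-12-15 20:10:56 ERROR: pid 8747: message"
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+): pid (\d+): (.*)$"
)


def parse_entries(text):
    """Return a list of (timestamp, level, pid, message) tuples.

    A line that does not start with a timestamp is treated as the wrapped
    tail of the previous entry and is appended to it after a single space.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = ENTRY.match(line)
        if match:
            ts, level, pid, msg = match.groups()
            entries.append([ts, level, int(pid), msg])
        elif entries:
            entries[-1][3] += " " + line
    return [tuple(entry) for entry in entries]


def errors_with_context(entries):
    """Pair each ERROR with the closest preceding LOG entry of the same pid."""
    result = []
    last_log = {}  # pid -> most recent LOG message seen for that pid
    for _ts, level, pid, msg in entries:
        if level == "LOG":
            last_log[pid] = msg
        elif level == "ERROR":
            result.append((pid, last_log.get(pid), msg))
    return result


if __name__ == "__main__":
    sample = (
        "2009-12-15 20:10:56 LOG: pid 8747: starting recovering node 2\n"
        "2009-12-15 20:10:56 ERROR: pid 8747: exec_recovery: pgpool_recovery\n"
        "command failed at 1st stage\n"
    )
    for pid, context, error in errors_with_context(parse_entries(sample)):
        print(pid, "|", context, "|", error)
```

Because wrapped lines are rejoined with a space, a path that was split by the mail client (such as the data directory above) will contain a stray space after joining; that is acceptable for spotting the failing step but not for copying the command back out.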
Best Regards,
---
Fernando Marcelo
www.consultorpc.com
[email protected]
On 15/12/2009, at 13:36, Jaume Sabater wrote:
On Tue, Dec 15, 2009 at 4:20 PM, Fernando Morgenstern
<[email protected]> wrote:
While reading the pgpool manual I found this:
Note that there is a restriction about online recovery. If pgpool-II
works on multiple hosts, online recovery does not work correctly,
because pgpool-II stops clients on the 2nd stage of online recovery.
If there are some pgpool hosts, pgpool-II excepted for receiving
online recovery request cannot block connections.
That restriction refers to running two or more pgpool-II instances
simultaneously, which won't be your case: with Heartbeat, you'll
configure pgpool-II as a resource, so it will only be active on one
node at a given time.
--
Jaume Sabater
http://linuxsilo.net/
"Ubi sapientas ibi libertas"
_______________________________________________
Pgpool-general mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-general