Hello,

Thanks for the info!

I have made some progress with node recovery by using pgpool_recovery for both recovery commands.

Recovery works most of the time, but sometimes it fails with the following error:

$ pcp_recovery_node  -d 90 localhost 9898 postgres ******* 2
DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
DEBUG: recv: tos="e", len=20, data=recovery failed
DEBUG: command failed. reason=recovery failed
BackendError
DEBUG: send: tos="X", len=4

pgpool log

2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'M'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: salt sent to the client
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'R'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: authentication OK
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'O'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: start online recovery
2009-12-15 20:10:56 LOG:   pid 8747: starting recovering node 2
2009-12-15 20:10:56 DEBUG: pid 8747: exec_checkpoint: start checkpoint
2009-12-15 20:10:56 DEBUG: pid 8747: exec_checkpoint: finish checkpoint
2009-12-15 20:10:56 LOG:   pid 8747: CHECKPOINT in the 1st stage done
2009-12-15 20:10:56 LOG: pid 8747: starting recovery command: "SELECT pgpool_recovery('pgpool_recovery', 'im-pp3', '/usr/local/pgsql/data')"
2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: start recovery
2009-12-15 20:10:56 ERROR: pid 8747: exec_recovery: pgpool_recovery command failed at 1st stage
2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: finish recovery
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'X'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: client disconnecting. close connection
2009-12-15 20:11:22 DEBUG: pid 8446: starting health checking

Unfortunately, I am not sure what this error means. Did it fail at "SELECT pgpool_recovery('pgpool_recovery', 'im-pp3', '/usr/local/pgsql/data')"? How can I find the reason?
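
As a starting point for narrowing this down, here is a minimal Python sketch that pairs each ERROR line in the pgpool log with the recovery command last started by the same pid. The function name and the log-line pattern are assumptions based only on the excerpt above; they are not part of pgpool-II.

```python
import re

# Assumed line layout, taken from the excerpt above:
#   2009-12-15 20:10:56 ERROR: pid 8747: exec_recovery: ... failed at 1st stage
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+):\s+pid (?P<pid>\d+): (?P<msg>.*)$"
)


def failed_recovery_commands(lines):
    """Return (timestamp, pid, command, error message) for every ERROR line.

    `command` is the recovery command most recently logged by the same pid,
    or None if that pid logged no recovery command before the error.
    """
    last_command = {}  # pid -> last "starting recovery command" text
    failures = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        pid = int(match.group("pid"))
        msg = match.group("msg")
        if msg.startswith("starting recovery command:"):
            # Keep only the SQL text, without the surrounding double quotes.
            last_command[pid] = msg.split(":", 1)[1].strip().strip('"')
        elif match.group("level") == "ERROR":
            failures.append((match.group("ts"), pid, last_command.get(pid), msg))
    return failures
```

Calling `failed_recovery_commands(open("pgpool.log"))` (the file name is hypothetical) returns one tuple per ERROR line, which shows which command was running when the failure was logged. It does not by itself explain why the command failed.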

Best Regards,
---

Fernando Marcelo
www.consultorpc.com
[email protected]


On 15/12/2009, at 13:36, Jaume Sabater wrote:

On Tue, Dec 15, 2009 at 4:20 PM, Fernando Morgenstern
<[email protected]> wrote:

While reading the pgpool manual I found this:
Note that there is a restriction about online recovery. If pgpool-II works
on multiple hosts, online recovery does not work correctly, because
pgpool-II stops clients on the 2nd stage of online recovery. If there are
some pgpool hosts, pgpool-II excepted for receiving online recovery request
cannot block connections.

It means running two or more pgpool-II instances simultaneously, which
won't be your case since, with Heartbeat, you'll configure pgpool-II
as a resource, hence it will only be active in one node at a given
time.

--
Jaume Sabater
http://linuxsilo.net/

"Ubi sapientas ibi libertas"

_______________________________________________
Pgpool-general mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-general
