[HACKERS] Postgresql 9.1 replication failing

2011-12-01 Thread Jim Buttafuoco
All,

I have a large PG 9.1.1 server (over 1TB of data) and replica using log shipping. I had some hardware issues on the replica system and now I am getting the following in my pg_log/* files. Same 2 lines over and over since yesterday.

2011-12-01 07:46:30 EST LOG: restored log file "0001028E00E5" from archive
2011-12-01 07:46:30 EST LOG: incorrect resource manager data checksum in record at 28E/E555E1B8

Anything I can do on the replica or do I have to start over?

Finally, I know this is not the correct list, I tried general with no answer.

Thanks
Jim
___
Jim Buttafuoco
j...@contacttelecom.com
603-647-7170 ext. - Office
603-490-3409 - Cell
jimbuttafuoco - Skype

Re: [HACKERS] Postgresql 9.1 replication failing

2011-12-01 Thread Robert Haas
On Thu, Dec 1, 2011 at 1:41 PM, Jim Buttafuoco j...@contacttelecom.com wrote:
 2011-12-01 07:46:30 EST  LOG:  restored log file 0001028E00E5 
 from archive
 2011-12-01 07:46:30 EST  LOG:  incorrect resource manager data checksum in 
 record at 28E/E555E1B8

 Anything I can do on the replica or do I have to start over?

I think you want to rebuild the standby.  Even if you could repair the
damaged WAL record, how can you have any confidence that there is no
other corruption?

Note that rsync has some options to only copy the changed data, which
might greatly accelerate resyncing the standby from the master.
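
A minimal sketch of what such an rsync invocation could look like. The
snippet only builds the command line; it does not run anything. The host
name, the data directory paths, and the exclusion list are assumptions made
for illustration and are not details from this thread. On 9.1 a copy like
this would normally be bracketed by pg_start_backup() and pg_stop_backup()
on the master.

```python
import shlex


def build_resync_command(master_host, master_pgdata, standby_pgdata):
    """Build an rsync argv that transfers only data differing from the master."""
    return [
        "rsync",
        "--archive",    # preserve permissions, ownership and timestamps
        "--checksum",   # decide what to send by file content, not size/mtime
        "--inplace",    # update destination files in place, no temp copy
        "--delete",     # remove files that no longer exist on the master
        "--exclude=postmaster.pid",  # assumed exclusions, adjust as needed
        "--exclude=pg_xlog/*",
        f"{master_host}:{master_pgdata.rstrip('/')}/",
        f"{standby_pgdata.rstrip('/')}/",
    ]


# Hypothetical host and paths, shown only to illustrate the call.
cmd = build_resync_command(
    "master.example.com",
    "/var/lib/pgsql/9.1/data",
    "/var/lib/pgsql/9.1/data",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

With --checksum both sides still read every file, so the saving is in
network transfer rather than disk I/O.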

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgresql 9.1 replication failing

2011-12-01 Thread Jerry Sievers
Jim Buttafuoco j...@contacttelecom.com writes:

 All,

 I have a large PG 9.1.1 server (over 1TB of data) and replica using log 
 shipping.  I had some hardware issues on the
 replica system and now I am getting the following in my pg_log/* files.  Same 
 2 lines over and over since yesterday.

 2011-12-01 07:46:30 EST  LOG:  restored log file 0001028E00E5 
 from archive
 2011-12-01 07:46:30 EST  LOG:  incorrect resource manager data checksum in 
 record at 28E/E555E1B8

 Anything I can do on the replica or do I have to start over?

Inspect that WAL segment, or possibly the one immediately following it,
by comparing it to another copy, if you still have one on the master or in a
central WAL repository.
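
To work out which segment files are involved, the record position from the
log message can be mapped to a segment file name. The sketch below assumes
timeline 1, the default 16 MB segment size, and the pre-9.3 naming scheme in
which each logical log file holds 255 segments; none of these are confirmed
in this thread.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024                 # assumed default, 16 MB
SEGMENTS_PER_LOG = 0xFFFFFFFF // WAL_SEGMENT_SIZE   # 255 before PostgreSQL 9.3


def segment_for_lsn(lsn, timeline=1):
    """Return the WAL segment file name containing a position like '28E/E555E1B8'."""
    log_id_hex, offset_hex = lsn.split("/")
    log_id = int(log_id_hex, 16)
    seg = int(offset_hex, 16) // WAL_SEGMENT_SIZE
    return f"{timeline:08X}{log_id:08X}{seg:08X}"


def next_segment(name):
    """Return the name of the segment that follows the given one."""
    timeline, log_id, seg = (int(name[i:i + 8], 16) for i in (0, 8, 16))
    seg += 1
    if seg >= SEGMENTS_PER_LOG:   # roll over into the next logical log file
        log_id, seg = log_id + 1, 0
    return f"{timeline:08X}{log_id:08X}{seg:08X}"


current = segment_for_lsn("28E/E555E1B8")
print(current)                # 000000010000028E000000E5
print(next_segment(current))  # 000000010000028E000000E6
```

Under those assumptions the failing record lies in the segment that was
just restored from the archive.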

A standby crashing while copying in a WAL segment and/or syncing
one to disk could result in random corruption.

If you have another copy of the segment and it does not compare equal to
the one your standby is trying to read, try that other copy.
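
One way to do that comparison is sketched below: it checks the size of each
copy and compares SHA-256 digests. The function names and the 16 MB expected
size are assumptions for illustration; any byte-for-byte comparison tool
would serve equally well.

```python
import hashlib
import os

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_segments(path_a, path_b):
    """Compare two copies of a WAL segment by size and by content."""
    report = {
        "size_a": os.path.getsize(path_a),
        "size_b": os.path.getsize(path_b),
    }
    # A truncated copy (for example after a crash mid-transfer) shows up here.
    report["sizes_ok"] = (
        report["size_a"] == report["size_b"] == WAL_SEGMENT_SIZE
    )
    report["identical"] = file_digest(path_a) == file_digest(path_b)
    return report
```

Calling compare_segments() with the standby's copy and the archive's copy
returns a small dictionary; if "identical" is false, the two copies differ
and the other one is worth trying.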

 Finally, I know this is not the correct list, I tried general with no answer.

The admin list is the right one for such a post probably.

HTH

 Thanks
 Jim

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 305.321.1144



Re: [HACKERS] Postgresql 9.1 replication failing

2011-12-01 Thread Jim Buttafuoco
The WAL file on the master is long gone; how would one inspect the WAL segment? Any way to have PG "move" on?

Re: [HACKERS] Postgresql 9.1 replication failing

2011-12-01 Thread Simon Riggs
On Thu, Dec 1, 2011 at 7:09 PM, Jim Buttafuoco j...@contacttelecom.com wrote:

 the WAL file on the master is long gone, how would one inspect the WAL
 segment?  Any way to have PG move on?


Regenerate the master.



 --

 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Postgresql 9.1 replication failing

2011-12-01 Thread Simon Riggs
On Thu, Dec 1, 2011 at 9:08 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, Dec 1, 2011 at 7:09 PM, Jim Buttafuoco j...@contacttelecom.com
 wrote:

 the WAL file on the master is long gone, how would one inspect the WAL
 segment?  Any way to have PG move on?


 Regenerate the master.

typo: regenerate *from* the master

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Postgresql 9.1 replication failing

2011-12-01 Thread Jim Buttafuoco
Simon,

What do you mean, start over with a base backup?

Jim



Re: [HACKERS] Postgresql 9.1 replication failing

2011-12-01 Thread desmodemone
Hello Jim,
   I think you do not have any other option: if the archives are
corrupted and there is no way to restore them,
you need to recreate the standby starting from a base backup.
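
After a fresh base backup has been copied over, a 9.1 log-shipping standby
needs a recovery.conf in its data directory before it is started. A minimal
sketch that renders such a file is shown below; the archive path in the
restore_command is a hypothetical example, not a value from this thread.

```python
def render_recovery_conf(restore_command, standby_mode=True):
    """Render a minimal recovery.conf for a log-shipping standby."""

    def quote(value):
        # A single quote inside a value is written as two single quotes.
        return "'" + value.replace("'", "''") + "'"

    lines = [
        f"standby_mode = {quote('on' if standby_mode else 'off')}",
        f"restore_command = {quote(restore_command)}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical archive location, shown only to illustrate the output.
print(render_recovery_conf("cp /mnt/wal_archive/%f %p"), end="")
```

In restore_command, %f is replaced by the name of the requested WAL file
and %p by the path to copy it to.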

Kind Regards


2011/12/1 Jim Buttafuoco j...@contacttelecom.com

 Simon,

 What do you mean, start over with a base backup?

 Jim

 On Dec 1, 2011, at 4:08 PM, Simon Riggs wrote:

 On Thu, Dec 1, 2011 at 7:09 PM, Jim Buttafuoco j...@contacttelecom.com wrote:

 the WAL file on the master is long gone, how would one inspect the WAL
 segment?  Any way to have PG move on?


 Regenerate the master.



 --

  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

