Hi,

Yes, the error logs did come, but only once DB1 was corrupted. Here is an
excerpt from the logs.

LOG:  database system was shut down at 2013-06-19 09:40:55 UTC
LOG:  restored log file "00000004.history" from archive
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or
directory
LOG:  restored log file "00000004.history" from archive
LOG:  entering standby mode
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/000000030000000000000013': No such
file or directory
LOG:  consistent recovery state reached at 0/13000080
LOG:  record with zero length at 0/13000080
LOG:  database system is ready to accept read only connections
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/000000030000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or
directory
LOG:  streaming replication successfully connected to primary
ERROR:  cannot execute CREATE ROLE in a read-only transaction
STATEMENT:  CREATE USER replicationuser REPLICATION;
LOG:  invalid record length at 0/13000080
FATAL:  terminating walreceiver process due to administrator command
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such
file or directory
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or
directory
LOG:  invalid record length at 0/13000080

This is what happened: when DB1 came up as the new standby, it connected
via streaming replication to DB2 (the new master), and after some time its
walreceiver process was terminated. The "invalid record length" messages
then repeated until I took a base backup on DB1 and restarted it.
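
For reference, rebuilding DB1 looked roughly like this (a minimal sketch;
"db2" is just a placeholder for the new master's address, and recovery.conf
with standby_mode and primary_conninfo has to be put back before the start):

    # stop the corrupted standby and re-seed its data directory from DB2
    pg_ctl -D /data/pgsql/9.2/data stop -m fast
    rm -rf /data/pgsql/9.2/data/*
    pg_basebackup -h db2 -U replicationuser -D /data/pgsql/9.2/data -X stream -P
    # restore recovery.conf, then start DB1 as a standby again
    pg_ctl -D /data/pgsql/9.2/data start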


Also, on DB1 (before it was started in standby mode) I ran: pg_controldata
/data/pgsql/9.2/data/

pg_control version number:            922
Catalog version number:               201204301
Database system identifier:           5891091665573732008
Database cluster state:               shut down
pg_control last modified:             Wed Jun 19 09:40:55 2013
Latest checkpoint location:           0/13000020
Prior checkpoint location:            0/124F1BC0
Latest checkpoint's REDO location:    0/13000020
Latest checkpoint's TimeLineID:       3
Latest checkpoint's full_page_writes: off
Latest checkpoint's NextXID:          0/8409
Latest checkpoint's NextOID:          18470
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        669
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Time of latest checkpoint:            Wed Jun 19 09:40:54 2013
Minimum recovery ending location:     0/0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
Current wal_level setting:            hot_standby
Current max_connections setting:      300
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
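
The one field my question below hinges on can be filtered out directly:

    pg_controldata /data/pgsql/9.2/data/ | grep 'Latest checkpoint location'
    Latest checkpoint location:           0/13000020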


Would it be wrong to compare the "last replayed xlog location" on DB2
(the new master) with the "Latest checkpoint location" on DB1 before
starting DB1 in standby mode? If the "Latest checkpoint location" on DB1
(the old master) is greater than the "last replayed xlog location" on DB2
(the new master), then I have to go for a base backup.
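
A minimal sketch of the check I have in mind ("db2" again stands for the
new master's address):

    # on DB2 (new master): the last WAL location it has replayed
    psql -h db2 -At -c "SELECT pg_last_xlog_replay_location();"

Compare that value with the "Latest checkpoint location" grepped from DB1's
pg_control above; 9.2 has no lsn type, so the two halves of each X/Y value
have to be compared as hex numbers by hand. If DB1's value is the greater,
DB1 wrote WAL that DB2 never replayed, and I would go for the base backup.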


regards,
Prakhar.


On Wed, Jun 19, 2013 at 1:11 PM, Albe Laurenz <laurenz.a...@wien.gv.at> wrote:

> prakhar jauhari wrote:
> > I am facing a similar kind of problem, but in a two-node setup.
> > Streaming replication is being used with a high wal_keep_segments,
> > using log shipping to bump up timelines on the standby server to set
> > up SR (streaming replication).
> > DB1 - master
> >
> > DB2 - standby
> >
> >
> > When a switchover happens, DB2 becomes the new master, and when DB1
> > comes up it will act as the standby to the new master (.history files
> > from the new master are copied to DB1 to bump up its timeline so as
> > to set up SR). DB1 is not recreated from scratch. This runs fine in a
> > normal switchover, but there seems to be a problem in the following
> > situation, leading to database corruption:
> >
> >
> > Current state :
> > DB1 - master
> > DB2 - standby
> >
> >
> > Now the failing scenario:
> >
> >
> >
> > 1. DB2 machine goes down.
> >
> > 2. After some time DB1 machine also goes down (DB2 is still down).
> >
> > 3. Now DB2 comes up (it will join the cluster as master, since DB1 is
> > still down).
> >
> > 4. PostgreSQL on DB2 is started as master.
> >
> > 5. Now DB1 comes up (it will join the cluster as standby to DB2).
> >
> > 6. Now when DB1 attempts to set up SR with DB2, DB1 gets corrupted.
> >
> >
> > I looked into the issue and found that when DB1 went down initially,
> > it had created some WAL segments that were not synced to DB2, as DB2
> > was already down.
> >
> > When DB2 then started as master, it still had not replayed the last
> > few WAL segments created by DB1 (when DB1 was master). DB2 starts as
> > master properly.
> >
> > When DB1 came up as standby to DB2, it bumped its timeline using the
> > history file from DB2, but when SR was set up with DB2, DB1 got
> > corrupted.
> >
> >
> > Now the question is:
> >
> >
> > 1. Is this a theoretically valid approach?
> >
> > 2. If it is a valid approach, then how can I detect such a scenario
> > (where SR will corrupt the DB), so that I can go for a base backup in
> > such a situation?
>
> If you want to use the old primary as the new standby without a new
> backup, you have to ascertain that all transactions from the old
> primary have been replayed on the new primary.
>
> To figure out where the primary currently is, you can
>    SELECT pg_current_xlog_location();
>
> To figure how much the standby has replayed, you can
>    SELECT pg_last_xlog_replay_location();
>
> Of course this only works if both are up.
>
> I think it would be tricky to automate that; I'd choose to make
> a new backup after each failover.
>
> In the event of a controlled failover it might be an option.
>
> I am surprised that the scenario you described leads to
> corruption; I would have expected an error message.
>
> Yours,
> Laurenz Albe
>
