The script I use to check the lag time between the primary and the standby would have shown if the standby server was not even close to being caught up, right?
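The script itself is not shown in the thread, so the following is only a guess at its shape: a minimal lag check run on the standby, using a function that does exist in 9.2.

    -- Run on the standby: wall-clock age of the last replayed transaction's commit
    SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;

One caveat: pg_last_xact_replay_timestamp() reports the commit time of the last transaction replayed, so on an idle primary the apparent lag keeps growing even though the standby is fully caught up.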
Paula

On Sat, Jun 20, 2015 at 9:51 AM, Jerry Sievers <gsiever...@comcast.net> wrote:

> Adrian Klaver <adrian.kla...@aklaver.com> writes:
>
> > On 06/19/2015 01:05 PM, Paula Price wrote:
> >
> >> On Fri, Jun 19, 2015 at 12:01 PM, Adrian Klaver
> >> <adrian.kla...@aklaver.com> wrote:
> >>
> >>     On 06/18/2015 05:45 PM, Paula Price wrote:
> >>
> >>         I have PostgreSQL 9.2.10 streaming replication set up with log
> >>         shipping in case the replication falls behind. I discovered that
> >>         the log-shipping had been disabled at some point in time. I
> >>         enabled the log shipping again.
> >>
> >>         If at some point in time the streaming replication fell behind
> >>         and the standby server was not able to retrieve the necessary
> >>         WAL file(s) from the primary, would the standby server continue
> >>         to function normally? Do I need to rebuild the standby server?
> >>         I have restarted the standby server and it is up and running
> >>         with no issues.
> >>
> >>     Well that seems at odds with it being unable to retrieve the WAL
> >>     files. This leads to these questions:
> >>
> >>     1) What makes you think it did not retrieve the WAL files via
> >>     streaming?
> >>
> >> It may _not_ have fallen behind via replication. We do have standby
> >> servers that fall behind, but since we have log-shipping it is not a
> >> concern. On this server, I have no idea how long we were running
> >> without log-shipping. I have no idea how many log files I would have
> >> to go through to find out when log-shipping stopped.
> >>
> >> My basic question is:
> >> If a standby server falls behind with streaming replication AND the
> >> standby server cannot obtain the WAL file needed from the primary,
> >> will you get an error from the standby server? Or does it just hiccup
> >> and try to carry on?
> >
> > No it will fall over:
>
> I wouldn't describe it that way...
>
> To a user, the standby will function and appear normal, unless they
> notice that the data is not current.
>
> In the server logs, there will be indications that replication is stuck
> waiting for WAL.
>
> HTH
>
> > http://www.postgresql.org/docs/9.2/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-SENDER
> >
> > wal_keep_segments (integer)
> >
> >     Specifies the minimum number of past log file segments kept in the
> >     pg_xlog directory, in case a standby server needs to fetch them for
> >     streaming replication. Each segment is normally 16 megabytes. If a
> >     standby server connected to the sending server falls behind by more
> >     than wal_keep_segments segments, the sending server might remove a
> >     WAL segment still needed by the standby, in which case the
> >     replication connection will be terminated. Downstream connections
> >     will also eventually fail as a result. (However, the standby server
> >     can recover by fetching the segment from archive, if WAL archiving
> >     is in use.)
> >
> >     This sets only the minimum number of segments retained in pg_xlog;
> >     the system might need to retain more segments for WAL archival or
> >     to recover from a checkpoint. If wal_keep_segments is zero (the
> >     default), the system doesn't keep any extra segments for standby
> >     purposes, so the number of old WAL segments available to standby
> >     servers is a function of the location of the previous checkpoint
> >     and status of WAL archiving. This parameter can only be set in the
> >     postgresql.conf file or on the server command line.
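For reference, one way to confirm what wal_keep_segments is actually set to on the sending server (a sketch; on 9.2 the value itself can only be changed in postgresql.conf or on the server command line, not through SQL):

    -- Run on the primary: current value and where it came from
    SELECT name, setting, source
    FROM pg_settings
    WHERE name = 'wal_keep_segments';

With the default of 0, how long a stalled standby can still catch up over streaming depends entirely on checkpoint timing and WAL archiving, as the quoted documentation says.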
> >
> > When you started up, if the necessary WAL files were not on the server
> > you would have seen Postgres throwing errors in the log.
> >
> > I would check out the below to verify:
> >
> > http://www.postgresql.org/docs/9.2/interactive/warm-standby.html#STREAMING-REPLICATION
> >
> > 25.2.5.2. Monitoring
> >
> >>     2) What does the postgres log show at the time you restarted the
> >>     standby?
> >>
> >> 2015-06-18 01:12:41.871 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG: 00000: database system was shut down in recovery at 2015-06-18 01:12:14 UTC
> >> 2015-06-18 01:12:41.871 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION: StartupXLOG, xlog.c:6298
> >> 2015-06-18 01:12:41.904 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG: 00000: entering standby mode
> >> 2015-06-18 01:12:41.904 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION: StartupXLOG, xlog.c:6384
> >> 2015-06-18 01:12:41.987 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG: 00000: redo starts at 867/FDF32E18
> >> 2015-06-18 01:12:41.987 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION: StartupXLOG, xlog.c:6855
> >> 2015-06-18 01:12:42.486 UTC::[unknown]@[unknown]:[28213]:2015-06-18 01:12:42 UTC: LOG: 00000: connection received: host=[local]
> >> 2015-06-18 01:12:42.486 UTC::[unknown]@[unknown]:[28213]:2015-06-18 01:12:42 UTC: LOCATION: BackendInitialize, postmaster.c:3501
> >> 2015-06-18 01:12:42.486 UTC:[local]:postgres@postgres:[28213]:2015-06-18 01:12:42 UTC: FATAL: 57P03: the database system is starting up
> >> 2015-06-18 01:12:42.486 UTC:[local]:postgres@postgres:[28213]:2015-06-18 01:12:42 UTC: LOCATION: ProcessStartupPacket, postmaster.c:1792
> >> 2015-06-18 01:12:43.488 UTC::[unknown]@[unknown]:[28270]:2015-06-18 01:12:43 UTC: LOG: 00000: connection received: host=[local]
> >> 2015-06-18 01:12:43.488 UTC::[unknown]@[unknown]:[28270]:2015-06-18 01:12:43 UTC: LOCATION: BackendInitialize, postmaster.c:3501
> >> 2015-06-18 01:12:43.488 UTC:[local]:postgres@postgres:[28270]:2015-06-18 01:12:43 UTC: FATAL: 57P03: the database system is starting up
> >> 2015-06-18 01:12:43.488 UTC:[local]:postgres@postgres:[28270]:2015-06-18 01:12:43 UTC: LOCATION: ProcessStartupPacket, postmaster.c:1792
> >> 2015-06-18 01:12:44.489 UTC::[unknown]@[unknown]:[28327]:2015-06-18 01:12:44 UTC: LOG: 00000: connection received: host=[local]
> >> 2015-06-18 01:12:44.489 UTC::[unknown]@[unknown]:[28327]:2015-06-18 01:12:44 UTC: LOCATION: BackendInitialize, postmaster.c:3501
> >> 2015-06-18 01:12:44.489 UTC:[local]:postgres@postgres:[28327]:2015-06-18 01:12:44 UTC: FATAL: 57P03: the database system is starting up
> >> 2015-06-18 01:12:44.489 UTC:[local]:postgres@postgres:[28327]:2015-06-18 01:12:44 UTC: LOCATION: ProcessStartupPacket, postmaster.c:1792
> >> 2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG: 00000: consistent recovery state reached at 868/112AF7F8
> >> 2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION: CheckRecoveryConsistency, xlog.c:7405
> >> 2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG: 00000: invalid record length at 868/112AFB00
> >> 2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION: ReadRecord, xlog.c:4078
> >> 2015-06-18 01:12:44.490 UTC::@:[28166]:2015-06-18 01:12:41 UTC: LOG: 00000: database system is ready to accept read only connections
> >> 2015-06-18 01:12:44.490 UTC::@:[28166]:2015-06-18 01:12:41 UTC: LOCATION: sigusr1_handler, postmaster.c:4314
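A minimal sketch of one standby-side check along the lines of the Monitoring section referenced above, using functions that exist in 9.2 (note that pg_last_xlog_receive_location() returns NULL unless WAL is arriving over streaming replication):

    -- Run on the standby: received vs. replayed WAL positions
    SELECT pg_is_in_recovery()             AS in_recovery,
           pg_last_xlog_receive_location() AS receive_location,
           pg_last_xlog_replay_location()  AS replay_location;

If replay_location keeps advancing across repeated runs, the standby is still applying WAL, whether it arrives by streaming or from the archive.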
> >>
> >> I need to know if the data integrity has been compromised.
> >>
> >> I have run this query to determine the lag time for the standby (in
> >> case this tells me anything):
> >>
> >> SELECT now(), now() - pg_last_xact_replay_timestamp() AS time_lag;
> >>
> >> RESULT:
> >> "2015-06-19 00:40:48.83701+00";"00:00:01.078616"
> >>
> >> Thank you,
> >> Paula P
> >>
> >> --
> >> Adrian Klaver
> >> adrian.kla...@aklaver.com
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com
>
> --
> Jerry Sievers
> Postgres DBA/Development Consulting
> e: postgres.consult...@comcast.net
> p: 312.241.7800
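For completeness, a sketch of the corresponding check from the primary side, using the pg_stat_replication view as it exists in 9.2 (pg_xlog_location_diff() was also added in 9.2; the application_name shown depends on the standby's primary_conninfo):

    -- Run on the primary: where each connected standby is in the WAL stream
    SELECT application_name, state,
           sent_location, write_location, flush_location, replay_location,
           pg_xlog_location_diff(pg_current_xlog_location(),
                                 replay_location) AS replay_lag_bytes
    FROM pg_stat_replication;

A row in state 'streaming' with a small replay_lag_bytes, alongside the roughly one-second time_lag shown above, would point to a standby that is caught up rather than one that has silently stalled.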