Re: [HACKERS] archive_keepalive_command
Where are we on this?

---

On Mon, Jan 16, 2012 at 01:52:35AM +, Simon Riggs wrote:

On Fri, Dec 16, 2011 at 3:01 PM, Simon Riggs si...@2ndquadrant.com wrote:

archive_command and restore_command describe how to ship WAL files to/from an archive.

When there is nothing to ship, we delay sending WAL files. When no WAL files, the standby has no information at all.

To provide some form of keepalive on quiet systems the archive_keepalive_command provides a generic hook to implement keepalives. This is implemented as a separate command to avoid storing keepalive messages in the archive, or at least allow overwrites using a single filename like keepalive.

Examples

archive_keepalive_command = 'arch_cmd keepalive'   # sends a file called keepalive to archive, overwrites allowed
archive_keepalive_command = 'arch_cmd %f.%t.keepalive'   # sends a file like 0001000ABFE.20111216143517.keepalive

If there is no WAL file to send, then we send a keepalive file instead. Keepalive is a small file that contains same contents as a streaming keepalive message (re: other patch on that).

If no WAL file is available and we are attempting to restore in standby_mode, then we execute restore_keepalive_command to see if a keepalive file is available. Checks for a file in the specific keepalive format and then uses that to update last received info from master.

e.g.
restore_keepalive_command = 'restore_cmd keepalive'   # gets a file called keepalive to archive, overwrites allowed

Patch.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index 5acfa57..fab288c 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -43,6 +43,13 @@
 #
 #restore_command = ''        # e.g. 'cp /mnt/server/archivedir/%f %p'
 #
+# restore_keepalive_command
+#
+# specifies an optional shell command to download keepalive files
+# e.g. archive_keepalive_command = 'cp -f %p $ARCHIVE/keepalive </dev/null'
+# e.g. restore_keepalive_command = 'cp $ARCHIVE/keepalive %p'
+#
+#restore_keepalive_command = ''
 #
 # archive_cleanup_command
 #
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ce659ec..2729141 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -73,8 +73,10 @@ int CheckPointSegments = 3;
 int         wal_keep_segments = 0;
 int         XLOGbuffers = -1;
 int         XLogArchiveTimeout = 0;
+int         XLogArchiveKeepaliveTimeout = 10; /* XXX set to 60 before commit */
 bool        XLogArchiveMode = false;
 char       *XLogArchiveCommand = NULL;
+char       *XLogArchiveKeepaliveCommand = NULL;
 bool        EnableHotStandby = false;
 bool        fullPageWrites = true;
 bool        log_checkpoints = false;
@@ -188,6 +190,7 @@ static bool restoredFromArchive = false;

 /* options taken from recovery.conf for archive recovery */
 static char *recoveryRestoreCommand = NULL;
+static char *recoveryRestoreKeepaliveCommand = NULL;
 static char *recoveryEndCommand = NULL;
 static char *archiveCleanupCommand = NULL;
 static RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
@@ -634,6 +637,7 @@ static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
 static void XLogFileClose(void);
 static bool RestoreArchivedFile(char *path, const char *xlogfname,
                     const char *recovername, off_t expectedSize);
+static void RestoreKeepaliveFile(void);
 static void ExecuteRecoveryCommand(char *command, char *commandName,
                        bool failOnerror);
 static void PreallocXlogFiles(XLogRecPtr endptr);
@@ -2718,7 +2722,10 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
                                                       RECOVERYXLOG,
                                                       XLogSegSize);
             if (!restoredFromArchive)
+            {
+                RestoreKeepaliveFile();
                 return -1;
+            }
             break;

         case XLOG_FROM_PG_XLOG:
@@ -3179,6 +3186,192 @@ not_available:
     return false;
 }

+static void
+RestoreKeepaliveFile(void)
+{
+    char        keepalivepath[MAXPGPATH];
+    char        keepaliveRestoreCmd[MAXPGPATH];
+    char       *dp;
+    char       *endp;
+    const char *sp;
+    int
Re: [HACKERS] archive_keepalive_command
On Mon, Aug 27, 2012 at 9:48 AM, Bruce Momjian br...@momjian.us wrote:

Where are we on this?

It didn't make it into 9.2, and the patch hasn't been resubmitted for 9.3. It's still not really 100% clear to me what problem it lets us solve that we can't solve otherwise. Maybe that is just a question of adding documentation; I don't know.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] archive_keepalive_command
On Mon, Mar 5, 2012 at 11:55 AM, Simon Riggs si...@2ndquadrant.com wrote:

On Sun, Mar 4, 2012 at 1:20 AM, Jeff Janes jeff.ja...@gmail.com wrote:

Does this patch have any user-visible effect? I thought it would make pg_last_xact_replay_timestamp() advance, but it does not seem to. I looked through the source a bit, and as best I can tell this only sets some internal state which is never used, except under DEBUG2

Thanks for the review. I'll look into that.

Simon, are you still hoping to get this done for this release?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] archive_keepalive_command
On Sun, Mar 4, 2012 at 1:20 AM, Jeff Janes jeff.ja...@gmail.com wrote:

Does this patch have any user-visible effect? I thought it would make pg_last_xact_replay_timestamp() advance, but it does not seem to. I looked through the source a bit, and as best I can tell this only sets some internal state which is never used, except under DEBUG2

Thanks for the review. I'll look into that.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] archive_keepalive_command
On Sun, Jan 15, 2012 at 5:52 PM, Simon Riggs si...@2ndquadrant.com wrote:

On Fri, Dec 16, 2011 at 3:01 PM, Simon Riggs si...@2ndquadrant.com wrote:

archive_command and restore_command describe how to ship WAL files to/from an archive.

When there is nothing to ship, we delay sending WAL files. When no WAL files, the standby has no information at all.

To provide some form of keepalive on quiet systems the archive_keepalive_command provides a generic hook to implement keepalives. This is implemented as a separate command to avoid storing keepalive messages in the archive, or at least allow overwrites using a single filename like keepalive.

Patch.

Preliminary review:

Applies with several hunks, and with some fuzz in xlog.h
Builds cleanly and passes make check.
Does not provide documentation, which is needed.
Does not include regression tests, but there is no framework for testing archiving.

Usability testing:

Does this patch have any user-visible effect? I thought it would make pg_last_xact_replay_timestamp() advance, but it does not seem to. I looked through the source a bit, and as best I can tell this only sets some internal state which is never used, except under DEBUG2

The example archive_keepalive_command given in postgresql.conf.sample is not usable as given. If the file is named %f, then there is no easy way for restore_keepalive_command to retrieve the file because it would not know the name to use. So the example given in postgresql.conf.sample should be more like the one given in recovery.conf.sample, where it uses a hard-coded name rather than %f. But in that case, it is not clear what %f might be useful for.

Cheers,

Jeff
Re: [HACKERS] archive_keepalive_command
On Fri, Dec 16, 2011 at 3:01 PM, Simon Riggs si...@2ndquadrant.com wrote:

archive_command and restore_command describe how to ship WAL files to/from an archive.

When there is nothing to ship, we delay sending WAL files. When no WAL files, the standby has no information at all.

To provide some form of keepalive on quiet systems the archive_keepalive_command provides a generic hook to implement keepalives. This is implemented as a separate command to avoid storing keepalive messages in the archive, or at least allow overwrites using a single filename like keepalive.

Examples

archive_keepalive_command = 'arch_cmd keepalive'   # sends a file called keepalive to archive, overwrites allowed
archive_keepalive_command = 'arch_cmd %f.%t.keepalive'   # sends a file like 0001000ABFE.20111216143517.keepalive

If there is no WAL file to send, then we send a keepalive file instead. Keepalive is a small file that contains same contents as a streaming keepalive message (re: other patch on that).

If no WAL file is available and we are attempting to restore in standby_mode, then we execute restore_keepalive_command to see if a keepalive file is available. Checks for a file in the specific keepalive format and then uses that to update last received info from master.

e.g.
restore_keepalive_command = 'restore_cmd keepalive'   # gets a file called keepalive to archive, overwrites allowed

Patch.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index 5acfa57..fab288c 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -43,6 +43,13 @@
 #
 #restore_command = ''        # e.g. 'cp /mnt/server/archivedir/%f %p'
 #
+# restore_keepalive_command
+#
+# specifies an optional shell command to download keepalive files
+# e.g. archive_keepalive_command = 'cp -f %p $ARCHIVE/keepalive </dev/null'
+# e.g. restore_keepalive_command = 'cp $ARCHIVE/keepalive %p'
+#
+#restore_keepalive_command = ''
 #
 # archive_cleanup_command
 #
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ce659ec..2729141 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -73,8 +73,10 @@ int CheckPointSegments = 3;
 int         wal_keep_segments = 0;
 int         XLOGbuffers = -1;
 int         XLogArchiveTimeout = 0;
+int         XLogArchiveKeepaliveTimeout = 10; /* XXX set to 60 before commit */
 bool        XLogArchiveMode = false;
 char       *XLogArchiveCommand = NULL;
+char       *XLogArchiveKeepaliveCommand = NULL;
 bool        EnableHotStandby = false;
 bool        fullPageWrites = true;
 bool        log_checkpoints = false;
@@ -188,6 +190,7 @@ static bool restoredFromArchive = false;

 /* options taken from recovery.conf for archive recovery */
 static char *recoveryRestoreCommand = NULL;
+static char *recoveryRestoreKeepaliveCommand = NULL;
 static char *recoveryEndCommand = NULL;
 static char *archiveCleanupCommand = NULL;
 static RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
@@ -634,6 +637,7 @@ static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
 static void XLogFileClose(void);
 static bool RestoreArchivedFile(char *path, const char *xlogfname,
                     const char *recovername, off_t expectedSize);
+static void RestoreKeepaliveFile(void);
 static void ExecuteRecoveryCommand(char *command, char *commandName,
                        bool failOnerror);
 static void PreallocXlogFiles(XLogRecPtr endptr);
@@ -2718,7 +2722,10 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
                                                       RECOVERYXLOG,
                                                       XLogSegSize);
             if (!restoredFromArchive)
+            {
+                RestoreKeepaliveFile();
                 return -1;
+            }
             break;

         case XLOG_FROM_PG_XLOG:
@@ -3179,6 +3186,192 @@ not_available:
     return false;
 }

+static void
+RestoreKeepaliveFile(void)
+{
+    char        keepalivepath[MAXPGPATH];
+    char        keepaliveRestoreCmd[MAXPGPATH];
+    char       *dp;
+    char       *endp;
+    const char *sp;
+    int         rc;
+    bool        signaled;
+    struct stat stat_buf;
+
+    /* In standby mode, restore_command might not be supplied */
+    if (recoveryRestoreKeepaliveCommand == NULL)
+        return;
+
+    snprintf(keepalivepath, MAXPGPATH, XLOGDIR "/archive_status/KEEPALIVE");
+
+    /*
+     * Make sure there is no existing file in keepalivepath
+     */
+    if (stat(keepalivepath, &stat_buf) == 0)
+    {
+        if (unlink(keepalivepath) != 0)
+            ereport(FATAL,
+                    (errcode_for_file_access(),
+                     errmsg("could not remove file \"%s\": %m",
+                            keepalivepath)));
+    }
+
+    /*
+     * construct the command to be executed
+     */
+    dp = keepaliveRestoreCmd;
+    endp = keepaliveRestoreCmd + MAXPGPATH - 1;
+    *endp = '\0';
+
+    for (sp = recoveryRestoreKeepaliveCommand; *sp; sp++)
+    {
+        if (*sp == '%')
+        {
+            switch (sp[1])
+            {
+                case 'p':
+                    /* %p: relative path of target file */
+                    sp++;
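The patch text breaks off inside the loop that builds the shell command from restore_keepalive_command. As a rough standalone sketch of that kind of placeholder expansion (not the patch's actual code): the function name expand_keepalive_command, its signature, and the choice to report overflow with -1 are assumptions made for illustration, and only %p and %% are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand a restore_keepalive_command-style template into out.
 *   %p is replaced by target_path, %% becomes a single '%',
 *   and every other character (including an unknown %x) is copied as-is.
 * Returns 0 on success, -1 if the result does not fit in outlen bytes.
 * Hypothetical helper: the name and signature are not from the patch.
 */
int
expand_keepalive_command(const char *tmpl, const char *target_path,
                         char *out, size_t outlen)
{
    char       *dp = out;
    const char *endp;
    const char *sp;

    assert(tmpl != NULL && target_path != NULL && out != NULL && outlen > 0);
    endp = out + outlen - 1;    /* reserve one byte for the terminating NUL */

    for (sp = tmpl; *sp; sp++)
    {
        if (sp[0] == '%' && sp[1] == 'p')
        {
            size_t      len = strlen(target_path);

            if ((size_t) (endp - dp) < len)
                return -1;      /* path would overflow the buffer */
            memcpy(dp, target_path, len);
            dp += len;
            sp++;               /* skip the 'p' */
        }
        else
        {
            if (sp[0] == '%' && sp[1] == '%')
                sp++;           /* collapse "%%" into one '%' */
            if (dp >= endp)
                return -1;      /* no room left for this character */
            *dp++ = *sp;
        }
    }
    *dp = '\0';
    return 0;
}
```

With the template 'cp $ARCHIVE/keepalive %p' from recovery.conf.sample and a target path of pg_xlog/archive_status/KEEPALIVE, the expanded string would be 'cp $ARCHIVE/keepalive pg_xlog/archive_status/KEEPALIVE', which is what would then be handed to the shell.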
Re: [HACKERS] archive_keepalive_command
On Mon, Dec 19, 2011 at 1:02 PM, Simon Riggs si...@2ndquadrant.com wrote:

On Dec 12, you said "It also strikes me that anything that is based on augmenting the walsender/walreceiver protocol leaves anyone who is using WAL shipping out in the cold. I'm not clear from the comments you or Simon have made how important you think that use case still is."

Not wanting to leave anyone out in the cold, I proposed something to enhance file based replication also.

Fair enough.

I am still of the opinion that we ought to commit some version of the pg_last_xact_insert_timestamp patch. I accept that patch isn't going to solve every problem, but I still think it's worth having. If one of these other solutions comes along and turns out to work great, that's fine, too; but I don't think any of them are so compelling that we can credibly say that pg_last_xact_insert_timestamp is useless or obsolete.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] archive_keepalive_command
On Fri, Dec 16, 2011 at 10:01 AM, Simon Riggs si...@2ndquadrant.com wrote:

To provide some form of keepalive on quiet systems the archive_keepalive_command provides a generic hook to implement keepalives. This is implemented as a separate command to avoid storing keepalive messages in the archive, or at least allow overwrites using a single filename like keepalive.

This may be stupid of me, but I don't see the point of this. If you want keepalives, why use log shipping rather than SR? Implementing a really-high-latency method of passing protocol messages through the archive seems like a complex solution to a non-problem (but, like I say, I may be missing something).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] archive_keepalive_command
On 12/19/2011 08:17 AM, Robert Haas wrote:

If you want keepalives, why use log shipping rather than SR? Implementing a really-high-latency method of passing protocol messages through the archive seems like a complex solution to a non-problem

The problem being addressed is how can people using archiving compute time-based lag usefully? Thinking about an answer to that question that made sense for SR drove us toward keepalive timestamp sharing. This is trying to introduce a mechanism good enough to do the same thing for regular archive recovery.

In the archiving case, the worst case waiting to trip you up is always the one where not enough activity happened to generate a new WAL file yet. If people want lag to move correctly in that case anyway, a message needs to be transferred from archiver to recovery system. Simon is suggesting that we do that via shipping a new small file in that case, rather than trying to muck with putting it into the WAL data or something like that. It's a bit hackish, but a) no more hackish than people are used to for PITR, and b) in a way that avoids touching database code in the critical path for SR.

This idea might eliminate the last of the reasons I was speculating on for adding more timestamps into the WAL stream.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
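To make the quiet-system case concrete, here is a minimal sketch of how a monitoring tool might derive a time-based lag once a keepalive timestamp is available alongside the last WAL timestamp. Everything in it is an assumption for illustration: the StandbyProgress struct, the estimate_lag_seconds function, and the use of 0 to mean "no timestamp yet" are hypothetical and do not come from the patch or from PostgreSQL.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical summary of what a standby has heard from the master. */
typedef struct StandbyProgress
{
    time_t      last_wal_time;          /* timestamp from the last replayed WAL, 0 if none */
    time_t      last_keepalive_time;    /* send time in the last keepalive file, 0 if none */
} StandbyProgress;

/*
 * Estimate how many seconds the standby is behind as of 'now'.
 * The newer of the two timestamps is used, so a recent keepalive keeps
 * the lag small even when no new WAL file has been generated.
 * Returns false if neither timestamp is known.
 */
bool
estimate_lag_seconds(const StandbyProgress *p, time_t now, double *lag)
{
    time_t      newest = p->last_wal_time;

    if (p->last_keepalive_time > newest)
        newest = p->last_keepalive_time;
    if (newest == 0)
        return false;           /* nothing received yet */

    *lag = difftime(now, newest);
    if (*lag < 0)
        *lag = 0;               /* tolerate clock skew between hosts */
    return true;
}
```

Without the keepalive timestamp, the same calculation on an idle master would report a lag that grows until the next WAL file is archived, which is the worst case described above.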
Re: [HACKERS] archive_keepalive_command
On Mon, Dec 19, 2011 at 1:17 PM, Robert Haas robertmh...@gmail.com wrote:

On Fri, Dec 16, 2011 at 10:01 AM, Simon Riggs si...@2ndquadrant.com wrote:

To provide some form of keepalive on quiet systems the archive_keepalive_command provides a generic hook to implement keepalives. This is implemented as a separate command to avoid storing keepalive messages in the archive, or at least allow overwrites using a single filename like keepalive.

This may be stupid of me, but I don't see the point of this. If you want keepalives, why use log shipping rather than SR?

On Dec 12, you said "It also strikes me that anything that is based on augmenting the walsender/walreceiver protocol leaves anyone who is using WAL shipping out in the cold. I'm not clear from the comments you or Simon have made how important you think that use case still is."

Not wanting to leave anyone out in the cold, I proposed something to enhance file based replication also.

In any case, multiple others have requested this feature, so it's worth doing even if you have changed your mind.

Implementing a really-high-latency method of passing protocol messages through the archive seems like a complex solution to a non-problem (but, like I say, I may be missing something).

So a) it is a problem, and b) it's not complex.

The proposed method doesn't necessarily use the archive. Allowing users to specify how the keepalive will work makes it a flexible solution to a widely recognised problem.

This proposal doesn't replace the protocol keepalive for streaming replication, it provides exactly the same thing for file based replication users. Many people use both streaming and file-based, so need a way to measure latency that acts similarly no matter which one is currently in use.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services