Hi, I've attached an updated version of the patch. This fixes the issue on checksum calculation for segments after the first one.
To solve it I've added an optional uint32 *segno argument to parse_filename_for_nontemp_relation, so I can know the segment number and calculate the block number correctly. Il 29/01/15 18:57, Robert Haas ha scritto: > On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini > <marco.nenciar...@2ndquadrant.it> wrote: >> The current implementation of copydir function is incompatible with LSN >> based incremental backups. The problem is that new files are created, >> but their blocks are still with the old LSN, so they will not be backed >> up because they are looking old enough. > > I think this is trying to pollute what's supposed to be a pure > fs-level operation ("copy a directory") into something that is aware > of specific details like the PostgreSQL page format. I really think > that nothing in storage/file should know about the page format. If we > need a function that copies a file while replacing the LSNs, I think > it should be a new function living somewhere else. > I've named it copydir_set_lsn and placed it as static function in dbcommands.c. This lefts the copydir and copy_file functions in copydir.c untouched. The copydir function in copydir.c is now unused, while the copy_file function is still used during unlogged tables reinit. > A bigger problem is that you are proposing to stamp those files with > LSNs that are, for lack of a better word, fake. I would expect that > this would completely break if checksums are enabled. Also, unlogged > relations typically have an LSN of 0; this would change that in some > cases, and I don't know whether that's OK. > I've investigate a bit and I have not been able to find any problem here. > The issues here are similar to those in > http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de > - basically, I think we need to make CREATE DATABASE and ALTER > DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is > never going to work right. If we're not going to allow that, we need > to disallow hot backups while those operations are in progress. > As already said the copydir-LSN patch should be treated as a "temporary" until a proper WAL logging of CREATE DATABASE and ALTER DATABASE SET TABLESPACE will be implemented. At that time we could probably get rid of the whole copydir.[ch] file moving the copy_file function inside reinit.c Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c index afd9255..02b5fee 100644 *** a/src/backend/storage/file/reinit.c --- b/src/backend/storage/file/reinit.c *************** static void ResetUnloggedRelationsInTabl *** 28,35 **** int op); static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op); - static bool parse_filename_for_nontemp_relation(const char *name, - int *oidchars, ForkNumber *fork); typedef struct { --- 28,33 ---- *************** ResetUnloggedRelationsInDbspaceDir(const *** 388,446 **** fsync_fname((char *) dbspacedirname, true); } } - - /* - * Basic parsing of putative relation filenames. - * - * This function returns true if the file appears to be in the correct format - * for a non-temporary relation and false otherwise. - * - * NB: If this function returns true, the caller is entitled to assume that - * *oidchars has been set to the a value no more than OIDCHARS, and thus - * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID - * portion of the filename. This is critical to protect against a possible - * buffer overrun. - */ - static bool - parse_filename_for_nontemp_relation(const char *name, int *oidchars, - ForkNumber *fork) - { - int pos; - - /* Look for a non-empty string of digits (that isn't too long). */ - for (pos = 0; isdigit((unsigned char) name[pos]); ++pos) - ; - if (pos == 0 || pos > OIDCHARS) - return false; - *oidchars = pos; - - /* Check for a fork name. */ - if (name[pos] != '_') - *fork = MAIN_FORKNUM; - else - { - int forkchar; - - forkchar = forkname_chars(&name[pos + 1], fork); - if (forkchar <= 0) - return false; - pos += forkchar + 1; - } - - /* Check for a segment number. */ - if (name[pos] == '.') - { - int segchar; - - for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar) - ; - if (segchar <= 1) - return false; - pos += segchar; - } - - /* Now we should be at the end. */ - if (name[pos] != '\0') - return false; - return true; - } --- 386,388 ---- diff --git a/src/common/relpath.c b/src/common/relpath.c index 66dfef1..83a1e3a 100644 *** a/src/common/relpath.c --- b/src/common/relpath.c *************** GetRelationPath(Oid dbNode, Oid spcNode, *** 206,208 **** --- 206,264 ---- } return path; } + + /* + * Basic parsing of putative relation filenames. + * + * This function returns true if the file appears to be in the correct format + * for a non-temporary relation and false otherwise. + * + * NB: If this function returns true, the caller is entitled to assume that + * *oidchars has been set to the a value no more than OIDCHARS, and thus + * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID + * portion of the filename. This is critical to protect against a possible + * buffer overrun. + */ + bool + parse_filename_for_nontemp_relation(const char *name, int *oidchars, + ForkNumber *fork) + { + int pos; + + /* Look for a non-empty string of digits (that isn't too long). */ + for (pos = 0; isdigit((unsigned char) name[pos]); ++pos) + ; + if (pos == 0 || pos > OIDCHARS) + return false; + *oidchars = pos; + + /* Check for a fork name. */ + if (name[pos] != '_') + *fork = MAIN_FORKNUM; + else + { + int forkchar; + + forkchar = forkname_chars(&name[pos + 1], fork); + if (forkchar <= 0) + return false; + pos += forkchar + 1; + } + + /* Check for a segment number. */ + if (name[pos] == '.') + { + int segchar; + + for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar) + ; + if (segchar <= 1) + return false; + pos += segchar; + } + + /* Now we should be at the end. */ + if (name[pos] != '\0') + return false; + return true; + } diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h index a263779..9736a78 100644 *** a/src/include/common/relpath.h --- b/src/include/common/relpath.h *************** extern char *GetDatabasePath(Oid dbNode, *** 52,57 **** --- 52,59 ---- extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode, int backendId, ForkNumber forkNumber); + extern bool parse_filename_for_nontemp_relation(const char *name, + int *oidchars, ForkNumber *fork); /* * Wrapper macros for GetRelationPath. Beware of multiple -- 2.3.0
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 3a753a0..a1db67c 100644 *** a/doc/src/sgml/protocol.sgml --- b/doc/src/sgml/protocol.sgml *************** The commands accepted in walsender mode *** 1882,1888 **** </varlistentry> <varlistentry> ! <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>] <indexterm><primary>BASE_BACKUP</primary></indexterm> </term> <listitem> --- 1882,1888 ---- </varlistentry> <varlistentry> ! <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>] <indexterm><primary>BASE_BACKUP</primary></indexterm> </term> <listitem> *************** The commands accepted in walsender mode *** 1905,1910 **** --- 1905,1928 ---- </varlistentry> <varlistentry> + <term><literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable></term> + <listitem> + <para> + Requests a file-level incremental backup of all files changed after + <replaceable>start_lsn</replaceable>. When operating with + <literal>INCREMENTAL</literal>, the content of every block-organised + file will be analyzed and the file will be sent if at least one + block has a LSN higher than or equal to the provided + <replaceable>start_lsn</replaceable>. + </para> + <para> + The <filename>backup_profile</filename> will contain information on + every file that has been analyzed, even those that have not been sent. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><literal>PROGRESS</></term> <listitem> <para> *************** The commands accepted in walsender mode *** 2022,2028 **** <quote>ustar interchange format</> specified in the POSIX 1003.1-2008 standard) dump of the tablespace contents, except that the two trailing blocks of zeroes specified in the standard are omitted. ! After the tar data is complete, a final ordinary result set will be sent, containing the WAL end position of the backup, in the same format as the start position. </para> --- 2040,2046 ---- <quote>ustar interchange format</> specified in the POSIX 1003.1-2008 standard) dump of the tablespace contents, except that the two trailing blocks of zeroes specified in the standard are omitted. ! After the tar data is complete, an ordinary result set will be sent, containing the WAL end position of the backup, in the same format as the start position. </para> *************** The commands accepted in walsender mode *** 2073,2082 **** the server supports it. </para> <para> ! Once all tablespaces have been sent, a final regular result set will be sent. This result set contains the end position of the backup, given in XLogRecPtr format as a single column in a single row. </para> </listitem> </varlistentry> </variablelist> --- 2091,2162 ---- the server supports it. </para> <para> ! Once all tablespaces have been sent, another regular result set will be sent. This result set contains the end position of the backup, given in XLogRecPtr format as a single column in a single row. </para> + <para> + Finally a last CopyResponse will be sent, containing only the + <filename>backup_profile</filename> file, in tar format. + </para> + <para> + The <filename>backup_profile</filename> file will have the following + format: + <programlisting> + POSTGRESQL BACKUP PROFILE 1 + <backup label content> + FILE LIST + <file list> + </programlisting> + where <replaceable><backup label content></replaceable> is a + verbatim copy of the content of <filename>backup_label</filename> file + and the <replaceable><file list></replaceable> section is made up + of one line per file examined by the backup, having the following format + (standard COPY TEXT file, tab separated): + <programlisting> + tablespace maxlsn included mtime size relpath + </programlisting> + </para> + <para> + The meaning of the fields is the following: + <itemizedlist spacing="compact" mark="bullet"> + <listitem> + <para> + <replaceable>tablespace</replaceable> is the OID of the tablespace + (or <literal>\N</literal> for files in PGDATA) + </para> + </listitem> + <listitem> + <para> + <replaceable>maxlsn</replaceable> is the file's max LSN in case + the file has been skipped, <literal>\N</literal> otherwise + </para> + </listitem> + <listitem> + <para> + <replaceable>included</replaceable> is a <literal>'t'</literal> if + the file is included in the backup, <literal>'f'</literal> otherwise + </para> + </listitem> + <listitem> + <para> + <replaceable>mtime</replaceable> is the timestamp of the last file + modification + </para> + </listitem> + <listitem> + <para> + <replaceable>size</replaceable> is the number of bytes of the file + </para> + </listitem> + <listitem> + <para> + <replaceable>relpath</replaceable> is the path of the file relative + to the tablespace root (PGDATA or the tablespace) + </para> + </listitem> + </itemizedlist> + </para> </listitem> </varlistentry> </variablelist> diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml index 642fccf..a13b188 100644 *** a/doc/src/sgml/ref/pg_basebackup.sgml --- b/doc/src/sgml/ref/pg_basebackup.sgml *************** PostgreSQL documentation *** 158,163 **** --- 158,165 ---- tablespaces, the main data directory will be placed in the target directory, but all other tablespaces will be placed in the same absolute path as they have on the server. + The <filename>backup_profile</filename> file will be placed in + this directory. </para> <para> This is the default format. *************** PostgreSQL documentation *** 174,186 **** data directory will be written to a file named <filename>base.tar</filename>, and all other tablespaces will be named after the tablespace OID. ! </para> <para> If the value <literal>-</literal> (dash) is specified as target directory, the tar contents will be written to standard output, suitable for piping to for example <productname>gzip</productname>. This is only possible if the cluster has no additional tablespaces. </para> </listitem> </varlistentry> --- 176,192 ---- data directory will be written to a file named <filename>base.tar</filename>, and all other tablespaces will be named after the tablespace OID. ! The <filename>backup_profile</filename> file will be placed in ! this directory. ! </para> <para> If the value <literal>-</literal> (dash) is specified as target directory, the tar contents will be written to standard output, suitable for piping to for example <productname>gzip</productname>. This is only possible if the cluster has no additional tablespaces. + In this case, the <filename>backup_profile</filename> file + will be sent to standard output as part of the tar stream. </para> </listitem> </varlistentry> *************** PostgreSQL documentation *** 189,194 **** --- 195,214 ---- </varlistentry> <varlistentry> + <term><option>-I <replaceable class="parameter">directory</replaceable></option></term> + <term><option>--incremental=<replaceable class="parameter">directory</replaceable></option></term> + <listitem> + <para> + Directory containing the backup to use as a start point for a file-level + incremental backup. <application>pg_basebackup</application> will read + the <filename>backup_profile</filename> file and then create an + incremental backup containing only the files which have been modified + after the start point. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>-r <replaceable class="parameter">rate</replaceable></option></term> <term><option>--max-rate=<replaceable class="parameter">rate</replaceable></option></term> <listitem> *************** PostgreSQL documentation *** 588,593 **** --- 608,622 ---- </para> <para> + In order to support file-level incremental backups, a + <filename>backup_profile</filename> file + is generated in the target directory as last step of every backup. This + file will be transparently used by <application>pg_basebackup</application> + when invoked with the option <replaceable>--incremental</replaceable> to start + a new file-level incremental backup. + </para> + + <para> <application>pg_basebackup</application> works with servers of the same or an older major version, down to 9.1. However, WAL streaming mode (-X stream) only works with server version 9.3 and later. diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 629a457..a642a04 100644 *** a/src/backend/access/transam/xlog.c --- b/src/backend/access/transam/xlog.c *************** *** 47,52 **** --- 47,53 ---- #include "replication/snapbuild.h" #include "replication/walreceiver.h" #include "replication/walsender.h" + #include "replication/basebackup.h" #include "storage/barrier.h" #include "storage/bufmgr.h" #include "storage/fd.h" *************** StartupXLOG(void) *** 6164,6169 **** --- 6165,6173 ---- * the latest recovery restartpoint instead of going all the way back * to the backup start point. It seems prudent though to just rename * the file out of the way rather than delete it completely. + * + * Rename also the backup profile if present. This marks the data + * directory as not usable as base for an incremental backup. */ if (haveBackupLabel) { *************** StartupXLOG(void) *** 6173,6178 **** --- 6177,6189 ---- (errcode_for_file_access(), errmsg("could not rename file \"%s\" to \"%s\": %m", BACKUP_LABEL_FILE, BACKUP_LABEL_OLD))); + unlink(BACKUP_PROFILE_OLD); + if (rename(BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD) != 0 + && errno != ENOENT) + ereport(FATAL, + (errcode_for_file_access(), + errmsg("could not rename file \"%s\" to \"%s\": %m", + BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD))); } /* Check that the GUCs used to generate the WAL allow recovery */ *************** XLogFileNameP(TimeLineID tli, XLogSegNo *** 9249,9255 **** * permissions of the calling user! */ XLogRecPtr ! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p, char **labelfile) { bool exclusive = (labelfile == NULL); --- 9260,9267 ---- * permissions of the calling user! */ XLogRecPtr ! do_pg_start_backup(const char *backupidstr, bool fast, ! XLogRecPtr incremental_startpoint, TimeLineID *starttli_p, char **labelfile) { bool exclusive = (labelfile == NULL); *************** do_pg_start_backup(const char *backupids *** 9468,9473 **** --- 9480,9489 ---- (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename); appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n", (uint32) (checkpointloc >> 32), (uint32) checkpointloc); + if (incremental_startpoint > 0) + appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n", + (uint32) (incremental_startpoint >> 32), + (uint32) incremental_startpoint); appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n", exclusive ? "pg_start_backup" : "streamed"); appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n", diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c index 2179bf7..ace84d8 100644 *** a/src/backend/access/transam/xlogfuncs.c --- b/src/backend/access/transam/xlogfuncs.c *************** pg_start_backup(PG_FUNCTION_ARGS) *** 59,65 **** (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), errmsg("must be superuser or replication role to run a backup"))); ! startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL); PG_RETURN_LSN(startpoint); } --- 59,65 ---- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), errmsg("must be superuser or replication role to run a backup"))); ! startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL); PG_RETURN_LSN(startpoint); } diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c index 3058ce9..107d70c 100644 *** a/src/backend/replication/basebackup.c --- b/src/backend/replication/basebackup.c *************** *** 30,40 **** --- 30,42 ---- #include "replication/basebackup.h" #include "replication/walsender.h" #include "replication/walsender_private.h" + #include "storage/bufpage.h" #include "storage/fd.h" #include "storage/ipc.h" #include "utils/builtins.h" #include "utils/elog.h" #include "utils/ps_status.h" + #include "utils/pg_lsn.h" #include "utils/timestamp.h" *************** typedef struct *** 46,56 **** bool nowait; bool includewal; uint32 maxrate; } basebackup_options; ! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces); ! static int64 sendTablespace(char *path, bool sizeonly); static bool sendFile(char *readfilename, char *tarfilename, struct stat * statbuf, bool missing_ok); static void sendFileWithContent(const char *filename, const char *content); --- 48,62 ---- bool nowait; bool includewal; uint32 maxrate; + XLogRecPtr incremental_startpoint; } basebackup_options; ! static int64 sendDir(char *path, int basepathlen, bool sizeonly, ! List *tablespaces, bool has_relfiles, ! XLogRecPtr incremental_startpoint); ! static int64 sendTablespace(char *path, bool sizeonly, ! XLogRecPtr incremental_startpoint); static bool sendFile(char *readfilename, char *tarfilename, struct stat * statbuf, bool missing_ok); static void sendFileWithContent(const char *filename, const char *content); *************** static void parse_basebackup_options(Lis *** 64,69 **** --- 70,80 ---- static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli); static int compareWalFileNames(const void *a, const void *b); static void throttle(size_t increment); + static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf, + XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn); + static void writeBackupProfileLine(const char *filename, struct stat * statbuf, + bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent); + static void sendBackupProfile(const char *labelfile); /* Was the backup currently in-progress initiated in recovery mode? */ static bool backup_started_in_recovery = false; *************** static int64 elapsed_min_unit; *** 93,98 **** --- 104,115 ---- /* The last check of the transfer rate. */ static int64 throttled_last; + /* Temporary file containing the backup profile */ + static File backup_profile_fd = 0; + + /* Tablespace being currently sent. Used in backup profile generation */ + static char *current_tablespace = NULL; + typedef struct { char *oid; *************** perform_base_backup(basebackup_options * *** 132,138 **** backup_started_in_recovery = RecoveryInProgress(); ! startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli, &labelfile); /* * Once do_pg_start_backup has been called, ensure that any failure causes --- 149,159 ---- backup_started_in_recovery = RecoveryInProgress(); ! /* Open a temporary file to hold the profile content. */ ! backup_profile_fd = OpenTemporaryFile(false); ! ! startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, ! opt->incremental_startpoint, &starttli, &labelfile); /* * Once do_pg_start_backup has been called, ensure that any failure causes *************** perform_base_backup(basebackup_options * *** 208,214 **** ti->oid = pstrdup(de->d_name); ti->path = pstrdup(linkpath); ti->rpath = relpath ? pstrdup(relpath) : NULL; ! ti->size = opt->progress ? sendTablespace(fullpath, true) : -1; tablespaces = lappend(tablespaces, ti); #else --- 229,236 ---- ti->oid = pstrdup(de->d_name); ti->path = pstrdup(linkpath); ti->rpath = relpath ? pstrdup(relpath) : NULL; ! ti->size = opt->progress ? sendTablespace(fullpath, true, ! opt->incremental_startpoint) : -1; tablespaces = lappend(tablespaces, ti); #else *************** perform_base_backup(basebackup_options * *** 225,231 **** /* Add a node for the base directory at the end */ ti = palloc0(sizeof(tablespaceinfo)); ! ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1; tablespaces = lappend(tablespaces, ti); /* Send tablespace header */ --- 247,254 ---- /* Add a node for the base directory at the end */ ti = palloc0(sizeof(tablespaceinfo)); ! ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false, ! opt->incremental_startpoint) : -1; tablespaces = lappend(tablespaces, ti); /* Send tablespace header */ *************** perform_base_backup(basebackup_options * *** 267,272 **** --- 290,301 ---- pq_sendint(&buf, 0, 2); /* natts */ pq_endmessage(&buf); + /* + * Save the current tablespace, used in writeBackupProfileLine + * function + */ + current_tablespace = ti->oid; + if (ti->path == NULL) { struct stat statbuf; *************** perform_base_backup(basebackup_options * *** 275,281 **** sendFileWithContent(BACKUP_LABEL_FILE, labelfile); /* ... then the bulk of the files ... */ ! sendDir(".", 1, false, tablespaces); /* ... and pg_control after everything else. */ if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0) --- 304,310 ---- sendFileWithContent(BACKUP_LABEL_FILE, labelfile); /* ... then the bulk of the files ... */ ! sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint); /* ... and pg_control after everything else. */ if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0) *************** perform_base_backup(basebackup_options * *** 284,292 **** errmsg("could not stat control file \"%s\": %m", XLOG_CONTROL_FILE))); sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false); } else ! sendTablespace(ti->path, false); /* * If we're including WAL, and this is the main data directory we --- 313,322 ---- errmsg("could not stat control file \"%s\": %m", XLOG_CONTROL_FILE))); sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false); + writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true); } else ! sendTablespace(ti->path, false, opt->incremental_startpoint); /* * If we're including WAL, and this is the main data directory we *************** perform_base_backup(basebackup_options * *** 501,507 **** FreeFile(fp); ! /* * Mark file as archived, otherwise files can get archived again * after promotion of a new node. This is in line with * walreceiver.c always doing a XLogArchiveForceDone() after a --- 531,540 ---- FreeFile(fp); ! /* Add the WAL file to backup profile */ ! writeBackupProfileLine(pathbuf, &statbuf, false, 0, true); ! ! /* * Mark file as archived, otherwise files can get archived again * after promotion of a new node. This is in line with * walreceiver.c always doing a XLogArchiveForceDone() after a *************** perform_base_backup(basebackup_options * *** 533,538 **** --- 566,574 ---- sendFile(pathbuf, pathbuf, &statbuf, false); + /* Add the WAL file to backup profile */ + writeBackupProfileLine(pathbuf, &statbuf, false, 0, true); + /* unconditionally mark file as archived */ StatusFilePath(pathbuf, fname, ".done"); sendFileWithContent(pathbuf, ""); *************** perform_base_backup(basebackup_options * *** 542,547 **** --- 578,586 ---- pq_putemptymessage('c'); } SendXlogRecPtrResult(endptr, endtli); + + /* Send the profile file. */ + sendBackupProfile(labelfile); } /* *************** parse_basebackup_options(List *options, *** 570,575 **** --- 609,615 ---- bool o_nowait = false; bool o_wal = false; bool o_maxrate = false; + bool o_incremental = false; MemSet(opt, 0, sizeof(*opt)); foreach(lopt, options) *************** parse_basebackup_options(List *options, *** 640,645 **** --- 680,697 ---- opt->maxrate = (uint32) maxrate; o_maxrate = true; } + else if (strcmp(defel->defname, "incremental") == 0) + { + if (o_incremental) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("duplicate option \"%s\"", defel->defname))); + + opt->incremental_startpoint = DatumGetLSN( + DirectFunctionCall1(pg_lsn_in, + CStringGetDatum(strVal(defel->arg)))); + o_incremental = true; + } else elog(ERROR, "option \"%s\" not recognized", defel->defname); *************** sendFileWithContent(const char *filename *** 859,864 **** --- 911,919 ---- MemSet(buf, 0, pad); pq_putmessage('d', buf, pad); } + + /* Write a backup profile entry for this file. */ + writeBackupProfileLine(filename, &statbuf, false, 0, true); } /* *************** sendFileWithContent(const char *filename *** 869,875 **** * Only used to send auxiliary tablespaces, not PGDATA. */ static int64 ! sendTablespace(char *path, bool sizeonly) { int64 size; char pathbuf[MAXPGPATH]; --- 924,930 ---- * Only used to send auxiliary tablespaces, not PGDATA. */ static int64 ! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint) { int64 size; char pathbuf[MAXPGPATH]; *************** sendTablespace(char *path, bool sizeonly *** 902,908 **** size = 512; /* Size of the header just added */ /* Send all the files in the tablespace version directory */ ! size += sendDir(pathbuf, strlen(path), sizeonly, NIL); return size; } --- 957,963 ---- size = 512; /* Size of the header just added */ /* Send all the files in the tablespace version directory */ ! size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint); return size; } *************** sendTablespace(char *path, bool sizeonly *** 914,922 **** * * Omit any directory in the tablespaces list, to avoid backing up * tablespaces twice when they were created inside PGDATA. */ static int64 ! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces) { DIR *dir; struct dirent *de; --- 969,981 ---- * * Omit any directory in the tablespaces list, to avoid backing up * tablespaces twice when they were created inside PGDATA. + * + * If 'has_relfiles' is set, this directory will be checked to identify + * relnode files and compute their maxLSN. */ static int64 ! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces, ! bool has_relfiles, XLogRecPtr incremental_startpoint) { DIR *dir; struct dirent *de; *************** sendDir(char *path, int basepathlen, boo *** 1124,1138 **** } } if (!skip_this_dir) ! size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces); } else if (S_ISREG(statbuf.st_mode)) { bool sent = false; if (!sizeonly) ! sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf, ! true); if (sent || sizeonly) { --- 1183,1243 ---- } } if (!skip_this_dir) ! { ! bool subdir_has_relfiles; ! ! /* ! * Whithin PGDATA relnode files are contained only in "global" ! * and "base" directory ! */ ! subdir_has_relfiles = has_relfiles ! || strcmp(pathbuf, "./global") == 0 ! || strcmp(pathbuf, "./base") == 0; ! ! size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces, ! subdir_has_relfiles, incremental_startpoint); ! } } else if (S_ISREG(statbuf.st_mode)) { bool sent = false; if (!sizeonly) ! { ! bool is_relfile; ! XLogRecPtr filemaxlsn = 0; ! int oidchars; ! ForkNumber forknum; ! ! /* ! * If the current directory can have relnode files, check the file ! * name to see if it is one of them. ! * ! * Only copy the main fork because is the only one ! * where page LSNs are always updated ! */ ! is_relfile = ( has_relfiles ! && parse_filename_for_nontemp_relation(de->d_name, ! &oidchars, ! &forknum) ! && forknum == MAIN_FORKNUM); ! ! if (!is_relfile ! || incremental_startpoint == 0 ! || relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn, ! incremental_startpoint)) ! { ! sent = sendFile(pathbuf, pathbuf + basepathlen + 1, ! &statbuf, true); ! /* Write a backup profile entry for the sent file. */ ! writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf, ! false, 0, sent); ! } ! else ! /* Write a backup profile entry for the skipped file. */ ! writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf, ! true, filemaxlsn, sent); ! } if (sent || sizeonly) { *************** throttle(size_t increment) *** 1333,1335 **** --- 1438,1626 ---- /* Sleep was necessary but might have been interrupted. */ throttled_last = GetCurrentIntegerTimestamp(); } + + /* + * Search in a relnode file for a page with a LSN greater than the threshold. + * If all the blocks in the file are older than the threshold the file can + * be safely skipped during an incremental backup. + */ + static bool + relnodeIsNewerThanLSN(char *filename, struct stat * statbuf, + XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn) + { + FILE *fp; + char buf[BLCKSZ]; + size_t cnt; + pgoff_t len = 0; + XLogRecPtr pagelsn; + + *filemaxlsn = 0; + + fp = AllocateFile(filename, "rb"); + if (fp == NULL) + { + if (errno == ENOENT) + return true; + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not open file \"%s\": %m", filename))); + } + + while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0) + { + pagelsn = PageGetLSN(buf); + + /* Keep the max LSN found */ + if (*filemaxlsn < pagelsn) + *filemaxlsn = pagelsn; + + /* + * If a page with a LSN newer than the threshold stop scanning + * and set the filemaxlsn value to 0 as it is only partial. + */ + if (thresholdlsn <= pagelsn) + { + *filemaxlsn = 0; + FreeFile(fp); + return true; + } + + if (len >= statbuf->st_size) + { + /* + * Reached end of file. The file could be longer, if it was + * extended while we were sending it, but for a base backup we can + * ignore such extended data. It will be restored from WAL. + */ + break; + } + } + + FreeFile(fp); + + /* + * At this point, if *filemaxlsn contains InvalidXLogRecPtr + * the file contains something that doesn't update page LSNs (e.g. FSM) + */ + if (*filemaxlsn == InvalidXLogRecPtr) + return true; + + return false; + } + + /* + * Write an entry in file list section of backup profile. + */ + static void + writeBackupProfileLine(const char *filename, struct stat * statbuf, + bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent) + { + /* + * tablespace oid (10) + max LSN (17) + mtime (10) + size (19) + + * path (MAXPGPATH) + separators (4) + trailing \0 = 65 + */ + char buf[MAXPGPATH + 65]; + char maxlsn[17]; + int rowlen; + + Assert(backup_profile_fd > 0); + + /* Prepare maxlsn */ + if (has_maxlsn) + { + snprintf(maxlsn, sizeof(maxlsn), "%X/%X", + (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn); + } + else + { + strlcpy(maxlsn, "\\N", sizeof(maxlsn)); + } + + rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n", + current_tablespace ? current_tablespace : "\\N", + maxlsn, + sent ? "t" : "f", + (uint32) statbuf->st_mtime, + statbuf->st_size, + filename); + FileWrite(backup_profile_fd, buf, rowlen); + } + + /* + * Send the backup profile. It is wrapped in a tar CopyOutResponse containing + * a tar stream with only one file. + */ + static void + sendBackupProfile(const char *labelfile) + { + StringInfoData msgbuf; + struct stat statbuf; + char buf[TAR_SEND_SIZE]; + size_t cnt; + pgoff_t len = 0; + size_t pad; + char *backup_profile = FilePathName(backup_profile_fd); + + /* Send CopyOutResponse message */ + pq_beginmessage(&msgbuf, 'H'); + pq_sendbyte(&msgbuf, 0); /* overall format */ + pq_sendint(&msgbuf, 0, 2); /* natts */ + pq_endmessage(&msgbuf); + + if (lstat(backup_profile, &statbuf) != 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not stat backup_profile file \"%s\": %m", + backup_profile))); + + /* Set the file position to the beginning. */ + FileSeek(backup_profile_fd, 0, SEEK_SET); + + /* + * Fill the buffer with content of backup profile header section. Being it + * the concatenation of two separator and the backup label, it should be + * shorter of TAR_SEND_SIZE. + */ + cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n", + BACKUP_PROFILE_HEADER, + labelfile, + BACKUP_PROFILE_SEPARATOR); + + /* Add size of backup label and separators */ + statbuf.st_size += cnt; + + _tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf); + + /* Send backup profile header */ + if (pq_putmessage('d', buf, cnt)) + ereport(ERROR, + (errmsg("base backup could not send data, aborting backup"))); + + len += cnt; + throttle(cnt); + + while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0) + { + /* Send the chunk as a CopyData message */ + if (pq_putmessage('d', buf, cnt)) + ereport(ERROR, + (errmsg("base backup could not send data, aborting backup"))); + + len += cnt; + throttle(cnt); + + } + + /* + * Pad to 512 byte boundary, per tar format requirements. (This small + * piece of data is probably not worth throttling.) + */ + pad = ((len + 511) & ~511) - len; + if (pad > 0) + { + MemSet(buf, 0, pad); + pq_putmessage('d', buf, pad); + } + + pq_putemptymessage('c'); /* CopyDone */ + } diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y index 2a41eb1..684cf4d 100644 *** a/src/backend/replication/repl_gram.y --- b/src/backend/replication/repl_gram.y *************** Node *replication_parse_result; *** 75,80 **** --- 75,81 ---- %token K_PHYSICAL %token K_LOGICAL %token K_SLOT + %token K_INCREMENTAL %type <node> command %type <node> base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history *************** base_backup_opt: *** 168,173 **** --- 169,179 ---- $$ = makeDefElem("max_rate", (Node *)makeInteger($2)); } + | K_INCREMENTAL SCONST + { + $$ = makeDefElem("incremental", + (Node *)makeString($2)); + } ; create_replication_slot: diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l index 449c127..a6d0dd8 100644 *** a/src/backend/replication/repl_scanner.l --- b/src/backend/replication/repl_scanner.l *************** TIMELINE_HISTORY { return K_TIMELINE_HIS *** 96,101 **** --- 96,102 ---- PHYSICAL { return K_PHYSICAL; } LOGICAL { return K_LOGICAL; } SLOT { return K_SLOT; } + INCREMENTAL { return K_INCREMENTAL; } "," { return ','; } ";" { return ';'; } diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index fbf7106..c03e7e0 100644 *** a/src/bin/pg_basebackup/pg_basebackup.c --- b/src/bin/pg_basebackup/pg_basebackup.c *************** static bool writerecoveryconf = false; *** 67,72 **** --- 67,74 ---- static int standby_message_timeout = 10 * 1000; /* 10 sec = default */ static pg_time_t last_progress_report = 0; static int32 maxrate = 0; /* no limit by default */ + static XLogRecPtr incremental_startpoint = 0; + static TimeLineID incremental_timeline = 0; /* Progress counters */ *************** static void usage(void); *** 99,107 **** static void disconnect_and_exit(int code); static void verify_dir_is_empty_or_create(char *dirname); static void progress_report(int tablespacenum, const char *filename, bool force); static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum); ! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum); static void GenerateRecoveryConf(PGconn *conn); static void WriteRecoveryConf(void); static void BaseBackup(void); --- 101,111 ---- static void disconnect_and_exit(int code); static void verify_dir_is_empty_or_create(char *dirname); static void progress_report(int tablespacenum, const char *filename, bool force); + static void read_backup_profile_header(const char *profile_path); static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum); ! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum, ! const char *dest_path); static void GenerateRecoveryConf(PGconn *conn); static void WriteRecoveryConf(void); static void BaseBackup(void); *************** usage(void) *** 232,237 **** --- 236,243 ---- printf(_("\nOptions controlling the output:\n")); printf(_(" -D, --pgdata=DIRECTORY receive base backup into directory\n")); printf(_(" -F, --format=p|t output format (plain (default), tar)\n")); + printf(_(" -I, --incremental=DIRECTORY\n" + " incremental backup from an existing backup\n")); printf(_(" -r, --max-rate=RATE maximum transfer rate to transfer data directory\n" " (in kB/s, or use suffix \"k\" or \"M\")\n")); printf(_(" -R, --write-recovery-conf\n" *************** parse_max_rate(char *src) *** 717,722 **** --- 723,794 ---- return (int32) result; } + + /* + * Read incremental_startpoint and incremental_timeline + * from a backup profile. + */ + static void + read_backup_profile_header(const char *reference_path) + { + char profile_path[MAXPGPATH]; + FILE *pfp; + char ch; + uint32 hi, + lo; + + /* The directory must exist and must be not empty */ + if (pg_check_dir(reference_path) < 3) + { + fprintf(stderr, _("%s: invalid incremental base directory \"%s\"\n"), + progname, reference_path); + exit(1); + } + + /* Build the backup profile location */ + join_path_components(profile_path, reference_path, BACKUP_PROFILE_FILE); + + /* See if label file is present */ + pfp = fopen(profile_path, "r"); + if (!pfp) + { + fprintf(stderr, _("%s: could not read file \"%s\": %s\n"), + progname, profile_path, strerror(errno)); + exit(1); + } + + /* Consume the profile header */ + fscanf(pfp, BACKUP_PROFILE_HEADER); + if (fscanf(pfp, "%c", &ch) != 1 || ch != '\n') + { + fprintf(stderr, _("%s: invalid data in file \"%s\"\n"), + progname, profile_path); + exit(1); + } + + /* + * Read and parse the START WAL LOCATION (this code + * is pretty crude, but we are not expecting any variability in the file + * format). + */ + if (fscanf(pfp, "START WAL LOCATION: %X/%X (file %08X%*16s)%c", + &hi, &lo, &incremental_timeline, &ch) != 4 || ch != '\n') + { + fprintf(stderr, _("%s: invalid data in file \"%s\"\n"), + progname, profile_path); + exit(1); + } + incremental_startpoint = ((uint64) hi) << 32 | lo; + + if (ferror(pfp) || fclose(pfp)) + { + fprintf(stderr, _("%s: could not read file \"%s\": %s\n"), + progname, profile_path, strerror(errno)); + exit(1); + } + } + + /* * Write a piece of tar data */ *************** ReceiveTarFile(PGconn *conn, PGresult *r *** 773,784 **** char *copybuf = NULL; FILE *tarfile = NULL; char tarhdr[512]; ! bool basetablespace = PQgetisnull(res, rownum, 0); bool in_tarhdr = true; bool skip_file = false; size_t tarhdrsz = 0; size_t filesz = 0; #ifdef HAVE_LIBZ gzFile ztarfile = NULL; #endif --- 845,866 ---- char *copybuf = NULL; FILE *tarfile = NULL; char tarhdr[512]; ! bool basetablespace; bool in_tarhdr = true; bool skip_file = false; size_t tarhdrsz = 0; size_t filesz = 0; + /* + * If 'res' is NULL, we are appending the backup profile to + * the standard output tar stream. + */ + Assert(res || (strcmp(basedir, "-") == 0)); + if (res) + basetablespace = PQgetisnull(res, rownum, 0); + else + basetablespace = true; + #ifdef HAVE_LIBZ gzFile ztarfile = NULL; #endif *************** ReceiveTarFile(PGconn *conn, PGresult *r *** 939,946 **** WRITE_TAR_DATA(zerobuf, padding); } ! /* 2 * 512 bytes empty data at end of file */ ! WRITE_TAR_DATA(zerobuf, sizeof(zerobuf)); #ifdef HAVE_LIBZ if (ztarfile != NULL) --- 1021,1033 ---- WRITE_TAR_DATA(zerobuf, padding); } ! /* ! * Write the end-of-file blocks unless using stdout ! * and not writing the backup profile (res is NULL). ! */ ! if (!res || strcmp(basedir, "-") != 0) ! /* 2 * 512 bytes empty data at end of file */ ! WRITE_TAR_DATA(zerobuf, sizeof(zerobuf)); #ifdef HAVE_LIBZ if (ztarfile != NULL) *************** get_tablespace_mapping(const char *dir) *** 1128,1136 **** * If the data is for the main data directory, it will be restored in the * specified directory. If it's for another tablespace, it will be restored * in the original or mapped directory. */ static void ! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum) { char current_path[MAXPGPATH]; char filename[MAXPGPATH]; --- 1215,1230 ---- * If the data is for the main data directory, it will be restored in the * specified directory. If it's for another tablespace, it will be restored * in the original or mapped directory. + * + * If 'res' is NULL, the destination directory is taken from the + * 'dest_path' parameter. + * + * When 'dest_path' is specified, progresses are not displayed because the + * content it is not in any tablespace. */ static void ! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum, ! const char *dest_path) { char current_path[MAXPGPATH]; char filename[MAXPGPATH]; *************** ReceiveAndUnpackTarFile(PGconn *conn, PG *** 1141,1153 **** char *copybuf = NULL; FILE *file = NULL; ! basetablespace = PQgetisnull(res, rownum, 0); ! if (basetablespace) ! strlcpy(current_path, basedir, sizeof(current_path)); else ! strlcpy(current_path, ! get_tablespace_mapping(PQgetvalue(res, rownum, 1)), ! sizeof(current_path)); /* * Get the COPY data --- 1235,1262 ---- char *copybuf = NULL; FILE *file = NULL; ! /* 'res' and 'dest_path' are mutually exclusive */ ! Assert(!res != !dest_path); ! ! /* ! * If 'res' is NULL, the destination directory is taken from the ! * 'dest_path' parameter. ! */ ! if (res) ! { ! basetablespace = PQgetisnull(res, rownum, 0); ! if (basetablespace) ! strlcpy(current_path, basedir, sizeof(current_path)); ! else ! strlcpy(current_path, ! get_tablespace_mapping(PQgetvalue(res, rownum, 1)), ! sizeof(current_path)); ! } else ! { ! basetablespace = false; ! strlcpy(current_path, dest_path, sizeof(current_path)); ! } /* * Get the COPY data *************** ReceiveAndUnpackTarFile(PGconn *conn, PG *** 1355,1361 **** disconnect_and_exit(1); } totaldone += r; ! progress_report(rownum, filename, false); current_len_left -= r; if (current_len_left == 0 && current_padding == 0) --- 1464,1472 ---- disconnect_and_exit(1); } totaldone += r; ! /* report progress unless a custom destination is used */ ! if (!dest_path) ! progress_report(rownum, filename, false); current_len_left -= r; if (current_len_left == 0 && current_padding == 0) *************** ReceiveAndUnpackTarFile(PGconn *conn, PG *** 1371,1377 **** } } /* continuing data in existing file */ } /* loop over all data blocks */ ! progress_report(rownum, filename, true); if (file != NULL) { --- 1482,1490 ---- } } /* continuing data in existing file */ } /* loop over all data blocks */ ! /* report progress unless a custom destination is used */ ! if (!dest_path) ! progress_report(rownum, filename, true); if (file != NULL) { *************** BaseBackup(void) *** 1587,1592 **** --- 1700,1706 ---- char *basebkp; char escaped_label[MAXPGPATH]; char *maxrate_clause = NULL; + char *incremental_clause = NULL; int i; char xlogstart[64]; char xlogend[64]; *************** BaseBackup(void) *** 1648,1661 **** if (maxrate > 0) maxrate_clause = psprintf("MAX_RATE %u", maxrate); basebkp = ! psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s", escaped_label, showprogress ? "PROGRESS" : "", includewal && !streamwal ? "WAL" : "", fastcheckpoint ? "FAST" : "", includewal ? "NOWAIT" : "", ! maxrate_clause ? maxrate_clause : ""); if (PQsendQuery(conn, basebkp) == 0) { --- 1762,1801 ---- if (maxrate > 0) maxrate_clause = psprintf("MAX_RATE %u", maxrate); + if (incremental_startpoint > 0) + { + incremental_clause = psprintf("INCREMENTAL '%X/%X'", + (uint32) (incremental_startpoint >> 32), + (uint32) incremental_startpoint); + + /* + * Sanity check: if from a different timeline abort the backup. + */ + if (latesttli != incremental_timeline) + { + fprintf(stderr, + _("%s: incremental backup from a different timeline " + "is not supported: base=%u current=%u\n"), + progname, incremental_timeline, latesttli); + disconnect_and_exit(1); + } + + if (verbose) + fprintf(stderr, _("incremental from point: %X/%X on timeline %u\n"), + (uint32) (incremental_startpoint >> 32), + (uint32) incremental_startpoint, + incremental_timeline); + } + basebkp = ! psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s", escaped_label, showprogress ? "PROGRESS" : "", includewal && !streamwal ? "WAL" : "", fastcheckpoint ? "FAST" : "", includewal ? "NOWAIT" : "", ! maxrate_clause ? maxrate_clause : "", ! incremental_clause ? incremental_clause : ""); if (PQsendQuery(conn, basebkp) == 0) { *************** BaseBackup(void) *** 1769,1775 **** if (format == 't') ReceiveTarFile(conn, res, i); else ! ReceiveAndUnpackTarFile(conn, res, i); } /* Loop over all tablespaces */ if (showprogress) --- 1909,1915 ---- if (format == 't') ReceiveTarFile(conn, res, i); else ! ReceiveAndUnpackTarFile(conn, res, i, NULL); } /* Loop over all tablespaces */ if (showprogress) *************** BaseBackup(void) *** 1803,1808 **** --- 1943,1960 ---- fprintf(stderr, "transaction log end point: %s\n", xlogend); PQclear(res); + /* + * Get the backup profile + * + * If format is tar and we are writing on standard output + * append the backup profile to the stream, otherwise put it + * in the destination directory + */ + if (format == 't' && (strcmp(basedir, "-") == 0)) + ReceiveTarFile(conn, NULL, -1); + else + ReceiveAndUnpackTarFile(conn, NULL, -1, basedir); + res = PQgetResult(conn); if (PQresultStatus(res) != PGRES_COMMAND_OK) { *************** main(int argc, char **argv) *** 1942,1947 **** --- 2094,2100 ---- {"username", required_argument, NULL, 'U'}, {"no-password", no_argument, NULL, 'w'}, {"password", no_argument, NULL, 'W'}, + {"incremental", required_argument, NULL, 'I'}, {"status-interval", required_argument, NULL, 's'}, {"verbose", no_argument, NULL, 'v'}, {"progress", no_argument, NULL, 'P'}, *************** main(int argc, char **argv) *** 1949,1955 **** {NULL, 0, NULL, 0} }; int c; - int option_index; progname = get_progname(argv[0]); --- 2102,2107 ---- *************** main(int argc, char **argv) *** 1970,1976 **** } } ! while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP", long_options, &option_index)) != -1) { switch (c) --- 2122,2128 ---- } } ! while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP", long_options, &option_index)) != -1) { switch (c) *************** main(int argc, char **argv) *** 2088,2093 **** --- 2240,2248 ---- case 'W': dbgetpassword = 1; break; + case 'I': + read_backup_profile_header(optarg); + break; case 's': standby_message_timeout = atoi(optarg) * 1000; if (standby_message_timeout < 0) diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h index 138deaf..4bb261a 100644 *** a/src/include/access/xlog.h --- b/src/include/access/xlog.h *************** extern void SetWalWriterSleeping(bool sl *** 249,255 **** * Starting/stopping a base backup */ extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast, ! TimeLineID *starttli_p, char **labelfile); extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p); extern void do_pg_abort_backup(void); --- 249,256 ---- * Starting/stopping a base backup */ extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast, ! XLogRecPtr incremental_startpoint, ! TimeLineID *starttli_p, char **labelfile); extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p); extern void do_pg_abort_backup(void); diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h index 64f2bd5..08f8e90 100644 *** a/src/include/replication/basebackup.h --- b/src/include/replication/basebackup.h *************** *** 20,25 **** --- 20,30 ---- #define MAX_RATE_LOWER 32 #define MAX_RATE_UPPER 1048576 + /* Backup profile */ + #define BACKUP_PROFILE_HEADER "POSTGRESQL BACKUP PROFILE 1" + #define BACKUP_PROFILE_SEPARATOR "FILE LIST" + #define BACKUP_PROFILE_FILE "backup_profile" + #define BACKUP_PROFILE_OLD "backup_profile.old" extern void SendBaseBackup(BaseBackupCmd *cmd); -- 2.3.0
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c index 5e66961..7409471 100644 *** a/src/backend/commands/dbcommands.c --- b/src/backend/commands/dbcommands.c *************** static bool have_createdb_privilege(void *** 89,94 **** --- 89,98 ---- static void remove_dbtablespaces(Oid db_id); static bool check_db_file_conflict(Oid db_id); static int errdetail_busy_db(int notherbackends, int npreparedxacts); + static void copydir_set_lsn(char *fromdir, char *todir, bool recurse, + XLogRecPtr recptr); + static void copy_file_set_lsn(char *fromfile, char *tofile, + XLogRecPtr recptr); /* *************** createdb(const CreatedbStmt *stmt) *** 586,591 **** --- 590,596 ---- Oid dsttablespace; char *srcpath; char *dstpath; + XLogRecPtr recptr; struct stat st; /* No need to copy global tablespace */ *************** createdb(const CreatedbStmt *stmt) *** 609,621 **** dstpath = GetDatabasePath(dboid, dsttablespace); - /* - * Copy this subdirectory to the new location - * - * We don't need to copy subdirectories - */ - copydir(srcpath, dstpath, false); - /* Record the filesystem change in XLOG */ { xl_dbase_create_rec xlrec; --- 614,619 ---- *************** createdb(const CreatedbStmt *stmt) *** 628,636 **** XLogBeginInsert(); XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec)); ! (void) XLogInsert(RM_DBASE_ID, XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE); } } heap_endscan(scan); heap_close(rel, AccessShareLock); --- 626,641 ---- XLogBeginInsert(); XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec)); ! recptr = XLogInsert(RM_DBASE_ID, XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE); } + + /* + * Copy this subdirectory to the new location + * + * We don't need to copy subdirectories + */ + copydir_set_lsn(srcpath, dstpath, false, recptr); } heap_endscan(scan); heap_close(rel, AccessShareLock); *************** movedb(const char *dbname, const char *t *** 1214,1223 **** PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback, PointerGetDatum(&fparms)); { ! /* ! * Copy files from the old tablespace to the new one ! */ ! copydir(src_dbpath, dst_dbpath, false); /* * Record the filesystem change in XLOG --- 1219,1225 ---- PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback, PointerGetDatum(&fparms)); { ! XLogRecPtr recptr; /* * Record the filesystem change in XLOG *************** movedb(const char *dbname, const char *t *** 1233,1243 **** XLogBeginInsert(); XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec)); ! (void) XLogInsert(RM_DBASE_ID, XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE); } /* * Update the database's pg_database tuple */ ScanKeyInit(&scankey, --- 1235,1250 ---- XLogBeginInsert(); XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec)); ! recptr = XLogInsert(RM_DBASE_ID, XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE); } /* + * Copy files from the old tablespace to the new one + */ + copydir_set_lsn(src_dbpath, dst_dbpath, false, recptr); + + /* * Update the database's pg_database tuple */ ScanKeyInit(&scankey, *************** dbase_redo(XLogReaderState *record) *** 2045,2050 **** --- 2052,2058 ---- if (info == XLOG_DBASE_CREATE) { xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) XLogRecGetData(record); + XLogRecPtr lsn = record->EndRecPtr; char *src_path; char *dst_path; struct stat st; *************** dbase_redo(XLogReaderState *record) *** 2077,2083 **** * * We don't need to copy subdirectories */ ! copydir(src_path, dst_path, false); } else if (info == XLOG_DBASE_DROP) { --- 2085,2091 ---- * * We don't need to copy subdirectories */ ! copydir_set_lsn(src_path, dst_path, false, lsn); } else if (info == XLOG_DBASE_DROP) { *************** dbase_redo(XLogReaderState *record) *** 2128,2130 **** --- 2136,2377 ---- else elog(PANIC, "dbase_redo: unknown op code %u", info); } + + /* + * copydir: copy a directory + * + * If recurse is false, subdirectories are ignored. Anything that's not + * a directory or a regular file is ignored. + * + * If recptr is different from InvalidXlogRecPtr, LSN of pages in the + * destination directory will be updated to recptr. + */ + void + copydir_set_lsn(char *fromdir, char *todir, bool recurse, XLogRecPtr recptr) + { + DIR *xldir; + struct dirent *xlde; + char fromfile[MAXPGPATH]; + char tofile[MAXPGPATH]; + + if (mkdir(todir, S_IRWXU) != 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not create directory \"%s\": %m", todir))); + + xldir = AllocateDir(fromdir); + if (xldir == NULL) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not open directory \"%s\": %m", fromdir))); + + while ((xlde = ReadDir(xldir, fromdir)) != NULL) + { + struct stat fst; + + /* If we got a cancel signal during the copy of the directory, quit */ + CHECK_FOR_INTERRUPTS(); + + if (strcmp(xlde->d_name, ".") == 0 || + strcmp(xlde->d_name, "..") == 0) + continue; + + snprintf(fromfile, MAXPGPATH, "%s/%s", fromdir, xlde->d_name); + snprintf(tofile, MAXPGPATH, "%s/%s", todir, xlde->d_name); + + if (lstat(fromfile, &fst) < 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not stat file \"%s\": %m", fromfile))); + + if (S_ISDIR(fst.st_mode)) + { + /* recurse to handle subdirectories */ + if (recurse) + copydir_set_lsn(fromfile, tofile, true, recptr); + } + else if (S_ISREG(fst.st_mode)) + copy_file_set_lsn(fromfile, tofile, recptr); + } + FreeDir(xldir); + + /* + * Be paranoid here and fsync all files to ensure the copy is really done. + * But if fsync is disabled, we're done. + */ + if (!enableFsync) + return; + + xldir = AllocateDir(todir); + if (xldir == NULL) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not open directory \"%s\": %m", todir))); + + while ((xlde = ReadDir(xldir, todir)) != NULL) + { + struct stat fst; + + if (strcmp(xlde->d_name, ".") == 0 || + strcmp(xlde->d_name, "..") == 0) + continue; + + snprintf(tofile, MAXPGPATH, "%s/%s", todir, xlde->d_name); + + /* + * We don't need to sync subdirectories here since the recursive + * copydir will do it before it returns + */ + if (lstat(tofile, &fst) < 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not stat file \"%s\": %m", tofile))); + + if (S_ISREG(fst.st_mode)) + fsync_fname(tofile, false); + } + FreeDir(xldir); + + /* + * It's important to fsync the destination directory itself as individual + * file fsyncs don't guarantee that the directory entry for the file is + * synced. Recent versions of ext4 have made the window much wider but + * it's been true for ext3 and other filesystems in the past. + */ + fsync_fname(todir, true); + } + + /* + * copy one file + * + * If recptr is different from InvalidXlogRecPtr, the destination file will + * have all its pages with LSN set accordingly + */ + void + copy_file_set_lsn(char *fromfile, char *tofile, XLogRecPtr recptr) + { + char *buffer; + int srcfd; + int dstfd; + int nbytes; + off_t offset; + BlockNumber blkno = 0; + + /* Use palloc to ensure we get a maxaligned buffer */ + #define COPY_BUF_SIZE (8 * BLCKSZ) + + buffer = palloc(COPY_BUF_SIZE); + + /* + * To support incremental backups, we need to update the LSN in + * all relation files we are copying. + * + * We are updating only the MAIN fork because at the moment + * blocks in FSM and VM forks are not guaranteed to have an + * up-to-date LSN + */ + if (recptr != InvalidXLogRecPtr) + { + char *filename = last_dir_separator(fromfile); + ForkNumber fork; + int oidchars; + uint32 segno; + + if (filename && + *(filename + 1) && + parse_filename_for_nontemp_relation(filename + 1, + &oidchars, &fork, &segno) && fork == MAIN_FORKNUM) + blkno = segno * RELSEG_SIZE; + else + recptr = InvalidXLogRecPtr; + } + + /* + * Open the files + */ + srcfd = OpenTransientFile(fromfile, O_RDONLY | PG_BINARY, 0); + if (srcfd < 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not open file \"%s\": %m", fromfile))); + + dstfd = OpenTransientFile(tofile, O_RDWR | O_CREAT | O_EXCL | PG_BINARY, + S_IRUSR | S_IWUSR); + if (dstfd < 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not create file \"%s\": %m", tofile))); + + /* + * Do the data copying. + */ + for (offset = 0;; offset += nbytes) + { + /* If we got a cancel signal during the copy of the file, quit */ + CHECK_FOR_INTERRUPTS(); + + nbytes = read(srcfd, buffer, COPY_BUF_SIZE); + if (nbytes < 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not read file \"%s\": %m", fromfile))); + if (nbytes == 0) + break; + + /* + * If a valid recptr has been provided, the resulting file will have + * all its pages with LSN set accordingly + */ + if (recptr != InvalidXLogRecPtr) + { + char *page; + + /* + * If we are updating LSN of a file, we must be sure that the + * source file is not being extended. + */ + if (nbytes % BLCKSZ != 0) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("file \"%s\" size is not multiple of %d", + fromfile, BLCKSZ))); + + for (page = buffer; page < (buffer + nbytes); page += BLCKSZ, blkno++) + { + /* Update LSN only if the page looks valid */ + if (!PageIsNew(page) && PageIsVerified(page, blkno)) + { + PageSetLSN(page, recptr); + PageSetChecksumInplace(page, blkno); + } + } + } + + errno = 0; + if ((int) write(dstfd, buffer, nbytes) != nbytes) + { + /* if write didn't set errno, assume problem is no disk space */ + if (errno == 0) + errno = ENOSPC; + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not write to file \"%s\": %m", tofile))); + } + + /* + * We fsync the files later but first flush them to avoid spamming the + * cache and hopefully get the kernel to start writing them out before + * the fsync comes. Ignore any error, since it's only a hint. + */ + (void) pg_flush_data(dstfd, offset, nbytes); + } + + if (CloseTransientFile(dstfd)) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not close file \"%s\": %m", tofile))); + + CloseTransientFile(srcfd); + + pfree(buffer); + } diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c index 107d70c..8f85752 100644 *** a/src/backend/replication/basebackup.c --- b/src/backend/replication/basebackup.c *************** sendDir(char *path, int basepathlen, boo *** 1219,1225 **** is_relfile = ( has_relfiles && parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forknum) && forknum == MAIN_FORKNUM); if (!is_relfile --- 1219,1226 ---- is_relfile = ( has_relfiles && parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forknum, ! NULL) && forknum == MAIN_FORKNUM); if (!is_relfile diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c index 02b5fee..2f7dca6 100644 *** a/src/backend/storage/file/reinit.c --- b/src/backend/storage/file/reinit.c *************** ResetUnloggedRelationsInDbspaceDir(const *** 190,196 **** /* Skip anything that doesn't look like a relation data file. */ if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forkNum)) continue; /* Also skip it unless this is the init fork. */ --- 190,196 ---- /* Skip anything that doesn't look like a relation data file. */ if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forkNum, NULL)) continue; /* Also skip it unless this is the init fork. */ *************** ResetUnloggedRelationsInDbspaceDir(const *** 243,249 **** /* Skip anything that doesn't look like a relation data file. */ if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forkNum)) continue; /* We never remove the init fork. */ --- 243,249 ---- /* Skip anything that doesn't look like a relation data file. */ if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forkNum, NULL)) continue; /* We never remove the init fork. */ *************** ResetUnloggedRelationsInDbspaceDir(const *** 313,319 **** /* Skip anything that doesn't look like a relation data file. */ if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forkNum)) continue; /* Also skip it unless this is the init fork. */ --- 313,319 ---- /* Skip anything that doesn't look like a relation data file. */ if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forkNum, NULL)) continue; /* Also skip it unless this is the init fork. */ *************** ResetUnloggedRelationsInDbspaceDir(const *** 364,370 **** /* Skip anything that doesn't look like a relation data file. */ if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forkNum)) continue; /* Also skip it unless this is the init fork. */ --- 364,370 ---- /* Skip anything that doesn't look like a relation data file. */ if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, ! &forkNum, NULL)) continue; /* Also skip it unless this is the init fork. */ diff --git a/src/common/relpath.c b/src/common/relpath.c index 83a1e3a..63972bd 100644 *** a/src/common/relpath.c --- b/src/common/relpath.c *************** GetRelationPath(Oid dbNode, Oid spcNode, *** 213,218 **** --- 213,222 ---- * This function returns true if the file appears to be in the correct format * for a non-temporary relation and false otherwise. * + * The segno parameter can be safely set to NULL. + * It should be of BlockNumber* type, but it is declared as uint32 + * to avoid depending on storage/block.h + * * NB: If this function returns true, the caller is entitled to assume that * *oidchars has been set to the a value no more than OIDCHARS, and thus * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID *************** GetRelationPath(Oid dbNode, Oid spcNode, *** 221,227 **** */ bool parse_filename_for_nontemp_relation(const char *name, int *oidchars, ! ForkNumber *fork) { int pos; --- 225,231 ---- */ bool parse_filename_for_nontemp_relation(const char *name, int *oidchars, ! ForkNumber *fork, uint32 *segno) { int pos; *************** parse_filename_for_nontemp_relation(cons *** 246,257 **** } /* Check for a segment number. */ if (name[pos] == '.') { int segchar; for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar) ! ; if (segchar <= 1) return false; pos += segchar; --- 250,264 ---- } /* Check for a segment number. */ + if (segno) + *segno = 0; if (name[pos] == '.') { int segchar; for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar) ! if (segno) ! *segno = *segno * 10 + name[pos + segchar] - '0'; if (segchar <= 1) return false; pos += segchar; diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h index 9736a78..9dd492f 100644 *** a/src/include/common/relpath.h --- b/src/include/common/relpath.h *************** extern char *GetDatabasePath(Oid dbNode, *** 53,59 **** extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode, int backendId, ForkNumber forkNumber); extern bool parse_filename_for_nontemp_relation(const char *name, ! int *oidchars, ForkNumber *fork); /* * Wrapper macros for GetRelationPath. Beware of multiple --- 53,60 ---- extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode, int backendId, ForkNumber forkNumber); extern bool parse_filename_for_nontemp_relation(const char *name, ! int *oidchars, ForkNumber *fork, ! uint32 *seqno); /* * Wrapper macros for GetRelationPath. Beware of multiple -- 2.3.0
signature.asc
Description: OpenPGP digital signature