Re: [HACKERS] File based Incremental backup v8
On Tue, Jan 26, 2016 at 12:55 AM, David Fetter wrote:
> On Tue, Mar 10, 2015 at 08:25:27AM +0900, Michael Paquier wrote:
>> On Tue, Mar 10, 2015 at 1:50 AM, Robert Haas wrote:
>> > I think there's absolutely no point in spending more time on this for
>> > 9.5. At least 4 committers have looked at it and none of them are
>> > convinced by the current design; feedback from almost half a year ago
>> > hasn't been incorporated; obviously-needed parts of the patch
>> > (pg_restorebackup) are missing weeks after the last CF deadline.
>> > Let's mark this Rejected in the CF app and move on.
>>
>> Agreed. I lost a bit of interest in this patch lately, but if all the
>> necessary parts of the patch were not posted before the CF deadline,
>> that's not something we should consider for integration at this point.
>> Let's give it a couple of months of fresh air and, Gabriele, I am sure
>> you will be able to come back with something far more advanced for the
>> first CF of 9.6.
>
> What's the latest on this patch?

My guess is that Marco and Gabriele are working on something directly for barman, the backup tool they use, with a differential backup implementation based on tracking blocks modified by WAL records (far faster for large data sets than scanning all the relation files of PGDATA).

Regards,
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File based Incremental backup v8
On Tue, Mar 10, 2015 at 08:25:27AM +0900, Michael Paquier wrote:
> On Tue, Mar 10, 2015 at 1:50 AM, Robert Haas wrote:
> > I think there's absolutely no point in spending more time on this for
> > 9.5. At least 4 committers have looked at it and none of them are
> > convinced by the current design; feedback from almost half a year ago
> > hasn't been incorporated; obviously-needed parts of the patch
> > (pg_restorebackup) are missing weeks after the last CF deadline.
> > Let's mark this Rejected in the CF app and move on.
>
> Agreed. I lost a bit of interest in this patch lately, but if all the
> necessary parts of the patch were not posted before the CF deadline,
> that's not something we should consider for integration at this point.
> Let's give it a couple of months of fresh air and, Gabriele, I am sure
> you will be able to come back with something far more advanced for the
> first CF of 9.6.

What's the latest on this patch?

Cheers,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] File based Incremental backup v8
On Tue, Mar 10, 2015 at 1:50 AM, Robert Haas wrote:
> I think there's absolutely no point in spending more time on this for
> 9.5. At least 4 committers have looked at it and none of them are
> convinced by the current design; feedback from almost half a year ago
> hasn't been incorporated; obviously-needed parts of the patch
> (pg_restorebackup) are missing weeks after the last CF deadline.
> Let's mark this Rejected in the CF app and move on.

Agreed. I lost a bit of interest in this patch lately, but if all the necessary parts of the patch were not posted before the CF deadline, that's not something we should consider for integration at this point. Let's give it a couple of months of fresh air and, Gabriele, I am sure you will be able to come back with something far more advanced for the first CF of 9.6.
--
Michael
Re: [HACKERS] File based Incremental backup v8
On Sat, Mar 7, 2015 at 5:45 PM, Gabriele Bartolini wrote:
>> By the way, unless I'm missing something, this patch only seems to
>> include the code to construct an incremental backup, but no tools
>> whatsoever to do anything useful with it once you've got it.
>
> As stated previously, Marco is writing a tool called pg_restorebackup (the
> prototype in Python has already been posted) to be included in core. I
> am in Australia now and not in the office so I cannot confirm it, but I am
> pretty sure he has already written it and was about to send it to the list.

Gabriele, the deadline for the last CommitFest was three weeks ago. Having a patch that you are "about to send to the list" is not good enough at this point.

>> I think that's 100% unacceptable.
>
> I agree, that's why pg_restorebackup written in C is part of this patch.
> See: https://wiki.postgresql.org/wiki/Incremental_backup

No, it *isn't* part of this patch. You may have a plan to add it to this patch, but that's not the same thing.

> The goal has always been to provide "file-based incremental backup". I can
> assure you that this has always been our compass and the direction to follow.

Regardless of community feedback? OK. Let's see how that works out for you.

> I repeat that, using pg_restorebackup, this patch will transparently let
> users benefit from incremental backup even when it is moved to an
> internal block-level logic. Users will continue to execute pg_basebackup and
> pg_restorebackup, ignoring whether - for example in 9.5 - it is file-based
> (saving between 50-70% of space and time) or block-level - for example in 9.6.

I understand that. But I also understand that in other cases it's going to be slower than a full backup. This problem has been pointed out several times, and you're just refusing to admit that it's a real issue.

A user with a bunch of tables where only the rows near the end of the file get updated is going to repeatedly read each of those files until it hits the first modified block, and then rewind and reread the whole file. I pointed this problem out back in early October and suggested some ways of fixing it; Heikki followed up with his own suggestions for modifying my idea. Instead of implementing any of that, or even discussing it, you're still plugging away on a design that no committer has endorsed and that several committers obviously have concerns about.

It's also pretty clear that nobody likes the backup profile, at least in the form it exists today. But it's still here, many patch versions later.

I think there's absolutely no point in spending more time on this for 9.5. At least 4 committers have looked at it and none of them are convinced by the current design; feedback from almost half a year ago hasn't been incorporated; obviously-needed parts of the patch (pg_restorebackup) are missing weeks after the last CF deadline. Let's mark this Rejected in the CF app and move on.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] File based Incremental backup v8
On Sun, Mar 8, 2015 at 09:26:38AM +1100, Gabriele Bartolini wrote:
> Hi Bruce,
>
> 2015-03-08 5:37 GMT+11:00 Bruce Momjian:
> > Desirability -> Design -> Implement -> Test -> Review -> Commit
> >
> > This patch has continued in development without getting agreement on
> > its Desirability or Design, meaning we are going to continue going back
> > to those points until there is agreement. Posting more versions of this
> > patch is not going to change that.
>
> Could you please elaborate on that?
>
> I actually think the approach that has been followed is what makes open source
> and collaborative development work. The initial idea was based on a timestamp
> approach which, thanks to the input of several developers, led Marco to
> develop LSN-based checks and move the feature implementation forward.

OK, if you think everyone just going off on their own and working on patches that have little chance of being accepted is how this should work, you can do it, but that rarely makes successful open source software. You can do whatever you want with your patch.
--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
Re: [HACKERS] File based Incremental backup v8
Hi Robert,

2015-03-07 2:57 GMT+11:00 Robert Haas:
> By the way, unless I'm missing something, this patch only seems to
> include the code to construct an incremental backup, but no tools
> whatsoever to do anything useful with it once you've got it.

As stated previously, Marco is writing a tool called pg_restorebackup (the prototype in Python has already been posted) to be included in core. I am in Australia now and not in the office so I cannot confirm it, but I am pretty sure he has already written it and was about to send it to the list. He has been trying to find more data - see the numbers he has sent - in order to show that even a file-based approach is useful.

> I think that's 100% unacceptable.

I agree, and that's why pg_restorebackup written in C is part of this patch. See: https://wiki.postgresql.org/wiki/Incremental_backup

> Users need to be able to manipulate
> PostgreSQL backups using either standard operating system tools or
> tools provided with PostgreSQL. Some people may prefer to use
> something like repmgr or pitrtools or omniptr in addition, but that
> shouldn't be a requirement for incremental backup to be usable.

Not at all. I believe those tools will have to use pg_basebackup and pg_restorebackup. If they want to use the streaming replication protocol, they will be responsible for adapting their technology if the protocol changes.

> Agile development is good, but that does not mean you can divide a big
> project into arbitrarily small chunks. At some point the chunks are
> too small to be sure that the overall direction is right, and/or
> individually useless.

The goal has always been to provide "file-based incremental backup". I can assure you that this has always been our compass and the direction to follow.

I repeat that, using pg_restorebackup, this patch will transparently let users benefit from incremental backup even when it is moved to an internal block-level logic. Users will continue to execute pg_basebackup and pg_restorebackup, ignoring whether - for example in 9.5 - it is file-based (saving between 50-70% of space and time) or block-level - for example in 9.6.

My proposal is that Marco provides pg_restorebackup according to the initial plan - a matter of hours/days.

Cheers,
Gabriele
Re: [HACKERS] File based Incremental backup v8
Hi Bruce,

2015-03-08 5:37 GMT+11:00 Bruce Momjian:
> Desirability -> Design -> Implement -> Test -> Review -> Commit
>
> This patch has continued in development without getting agreement on
> its Desirability or Design, meaning we are going to continue going back
> to those points until there is agreement. Posting more versions of this
> patch is not going to change that.

Could you please elaborate on that?

I actually think the approach that has been followed is what makes open source and collaborative development work. The initial idea was based on a timestamp approach which, thanks to the input of several developers, led Marco to develop LSN-based checks and move the feature implementation forward.

The numbers that Marco has posted clearly show that a lot of users will benefit from this file-based approach for incremental backup through pg_basebackup. As far as I see it, the only missing bit is the pg_restorebackup tool, which is quite trivial given the existing prototype in Python.

Thanks,
Gabriele
Re: [HACKERS] File based Incremental backup v8
On Thu, Mar 5, 2015 at 11:10:08AM -0500, Robert Haas wrote:
> But I agree with Fujii to the extent that I see little value in
> committing this patch in the form proposed. Being smart enough to use
> the LSN to identify changed blocks, but then sending the entirety of
> every file anyway because you don't want to go to the trouble of
> figuring out how to revise the wire protocol to identify the
> individual blocks being sent and write the tools to reconstruct a full
> backup based on that data, does not seem like enough of a win. As
> Fujii says, if we ship this patch as written, people will just keep
> using the timestamp-based approach anyway. Let's wait until we have
> something that is, at least in some circumstances, a material
> improvement over the status quo before committing anything.

The big problem I have with this patch is that it has not followed the proper process for development, i.e. at the top of the TODO list we have:

Desirability -> Design -> Implement -> Test -> Review -> Commit

This patch has continued in development without getting agreement on its Desirability or Design, meaning we are going to continue going back to those points until there is agreement. Posting more versions of this patch is not going to change that.
--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
Re: [HACKERS] File based Incremental backup v8
Il 05/03/15 05:42, Bruce Momjian ha scritto:
> On Thu, Mar 5, 2015 at 01:25:13PM +0900, Fujii Masao wrote:
>>>> Yeah, it might make the situation better than today. But I'm afraid that
>>>> many users might get disappointed about that behavior of an incremental
>>>> backup after the release...
>>>
>>> I don't get what you mean here. Can you elaborate on this point?
>>
>> The proposed version of LSN-based incremental backup has some limitations
>> (e.g., every database file needs to be read even when there is no
>> modification in the database since the last backup, which may make the
>> backup time longer than users expect) which may disappoint users. So I'm
>> afraid that users who can benefit from the feature might be very limited.
>> IOW, I'm just sticking to the idea of the timestamp-based one :) But I
>> should drop it if the majority on the list prefers the LSN-based one even
>> if it has such limitations.
>
> We need numbers on how effective each level of tracking will be. Until
> then, the patch can't move forward.

I've written a little test script to estimate how much space can be saved by file-level incremental backup, and I've run it on some real backups I have access to.

The script takes two basebackup directories and simulates how much data could be saved in the 2nd backup using incremental backup (using file size/time and LSN). It assumes that every file in base, global and pg_tblspc which matches both size and modification time will also match from the LSN point of view.

The result is that many databases can take advantage of incremental backup, even if not so big, and considering LSNs yields a result almost identical to the approach based on filesystem metadata.
== Very big geographic database (similar to openstreetmap main DB), it contains versioned data, interval two months

First backup size: 13286623850656 (12.1TiB)
Second backup size: 13323511925626 (12.1TiB)
Matching files count: 17094
Matching LSN count: 14580
Matching files size: 9129755116499 (8.3TiB, 68.5%)
Matching LSN size: 9128568799332 (8.3TiB, 68.5%)

== Big on-line store database, old data regularly moved to historic partitions, interval one day

First backup size: 1355285058842 (1.2TiB)
Second backup size: 1358389467239 (1.2TiB)
Matching files count: 3937
Matching LSN count: 2821
Matching files size: 762292960220 (709.9GiB, 56.1%)
Matching LSN size: 762122543668 (709.8GiB, 56.1%)

== Ticketing system database, interval one day

First backup size: 144988275 (138.3MiB)
Second backup size: 146135155 (139.4MiB)
Matching files count: 3124
Matching LSN count: 2641
Matching files size: 76908986 (73.3MiB, 52.6%)
Matching LSN size: 67747928 (64.6MiB, 46.4%)

== Online store, interval one day

First backup size: 20418561133 (19.0GiB)
Second backup size: 20475290733 (19.1GiB)
Matching files count: 5744
Matching LSN count: 4302
Matching files size: 4432709876 (4.1GiB, 21.6%)
Matching LSN size: 4388993884 (4.1GiB, 21.4%)

== Heavily updated database, interval one week

First backup size: 3203198962 (3.0GiB)
Second backup size: 3222409202 (3.0GiB)
Matching files count: 1801
Matching LSN count: 1273
Matching files size: 91206317 (87.0MiB, 2.8%)
Matching LSN size: 69083532 (65.9MiB, 2.1%)

Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it

#!/usr/bin/env python
from __future__ import print_function

import collections
from optparse import OptionParser
import os
import sys

__author__ = 'Marco Nenciarini '

usage = """usage: %prog [options] backup_1 backup_2"""
parser = OptionParser(usage=usage)
(options, args) = parser.parse_args()

# need 2 directories
if len(args) != 2:
    parser.print_help()
    sys.exit(1)

FileItem = collections.namedtuple('FileItem', 'size time path')


def get_files(target_dir):
    """Return a set of FileItem"""
    files = set()
    for dir_path, _, file_names in os.walk(target_dir, followlinks=True):
        for filename in file_names:
            path = os.path.join(dir_path, filename)
            rel_path = path[len(target_dir) + 1:]
            stats = os.stat(path)
            files.add(FileItem(stats.st_size, stats.st_mtime, rel_path))
    return files


def size_fmt(num, suffix='B'):
    """Format a size"""
    for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
        if abs(num) < 1024.0:
            return "%3.1f%s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f%s%s" % (num, 'Yi', suffix)


def percent_fmt(a, b):
    """Format a percent"""
    return "%.1f%%" % (100.0 * a / b)


def get_size(file_set):
    return reduce(lambda x, y: x + y.size, file_set, 0)


def report(a, b):
    # find files that are identical (same size and same time)
    common = a & b
    # for the LSN count, consider only main forks
    common_lsn = set()
    for item in common:
        # skip non-main forks, if any
        if any(suffix in item.path for suffix in ('_fsm', '_vm', '_init')):
            continue
Re: [HACKERS] File based Incremental backup v8
On Fri, Mar 6, 2015 at 9:38 AM, Gabriele Bartolini wrote:
> I believe the main point is to look at a user interface point of view.
> If/When we switch to a block level incremental support, this will be
> completely transparent to the end user, even if we start with a file-level
> approach with LSN check.

I don't think that's true. To have a real file-level incremental backup you need the ability to take the incremental backup, and then you also need the ability to take a full backup + an incremental backup taken later and reassemble a full image of the cluster on which you can run recovery. The means of doing that is going to be different for an approach that only copies certain blocks vs. one that copies whole files. Once we have the block-based approach, nobody will ever use the file-based approach, so whatever code or tools we write to do that will all be dead code, yet we'll still have to support them for many years.

By the way, unless I'm missing something, this patch only seems to include the code to construct an incremental backup, but no tools whatsoever to do anything useful with it once you've got it. I think that's 100% unacceptable. Users need to be able to manipulate PostgreSQL backups using either standard operating system tools or tools provided with PostgreSQL. Some people may prefer to use something like repmgr or pitrtools or omniptr in addition, but that shouldn't be a requirement for incremental backup to be usable.

Agile development is good, but that does not mean you can divide a big project into arbitrarily small chunks. At some point the chunks are too small to be sure that the overall direction is right, and/or individually useless.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
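[Editorial note: the reassembly step Robert describes - combining a full backup with a later file-level incremental into a restorable data directory - can be sketched in a few lines. This is only an illustration of the concept, not the proposed pg_restorebackup tool (which would also need to handle the tar format); the directory layout and the deleted-files list are assumptions, not the patch's actual interface.]

#!/usr/bin/env python
import os
import shutil

def reassemble(full_dir, incr_dir, target_dir, deleted_files=()):
    """Rebuild a restorable directory from a full backup plus one
    file-level incremental: start from the full copy, overlay every
    file the incremental shipped, then drop files recorded as deleted."""
    shutil.copytree(full_dir, target_dir)
    for root, _, names in os.walk(incr_dir):
        for name in names:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, incr_dir)
            dst = os.path.join(target_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # the incremental's version wins
    # files present in the previous backup but deleted since must be removed,
    # which is exactly what the backup profile exists to record
    for rel in deleted_files:
        path = os.path.join(target_dir, rel)
        if os.path.exists(path):
            os.remove(path)

[A block-level incremental would need a very different reconstruction step - which is Robert's point about the two formats not being interchangeable.]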
Re: [HACKERS] File based Incremental backup v8
Hi Robert,

2015-03-06 3:10 GMT+11:00 Robert Haas:
> But I agree with Fujii to the extent that I see little value in
> committing this patch in the form proposed. Being smart enough to use
> the LSN to identify changed blocks, but then sending the entirety of
> every file anyway because you don't want to go to the trouble of
> figuring out how to revise the wire protocol to identify the
> individual blocks being sent and write the tools to reconstruct a full
> backup based on that data, does not seem like enough of a win.

I believe the main point is to look at it from a user interface point of view. If/when we switch to block-level incremental support, this will be completely transparent to the end user, even if we start with a file-level approach with an LSN check. The win is already determined by the average space/time gained by users of VLDBs with a good chunk of read-only data.

Our Barman users with incremental backup (released recently - its algorithm can be compared to the file-level backup proposed by Marco) benefit on average from a data deduplication ratio ranging between 50 and 70% of the cluster size. A tangible example is depicted here, with Navionics saving 8.2TB a week thanks to this approach (and 17 hours instead of 50 for backup time): http://blog.2ndquadrant.com/incremental-backup-barman-1-4-0/

However, even smaller databases will benefit. It is clear that very small databases, as well as frequently updated ones, won't be interested in incremental backup, but that has never been the use case for this feature.

I believe that if we still think that this approach is not worth it, we are making a big mistake. The way I see it, this patch follows an agile approach and is an important step towards incremental backup on a block basis.

> As Fujii says, if we ship this patch as written, people will just keep
> using the timestamp-based approach anyway.

I think that allowing users to back up incrementally through streaming replication (even though based on files) will give more flexibility to system and database administrators for their disaster recovery solutions.

Thanks,
Gabriele
Re: [HACKERS] File based Incremental backup v8
On Wed, Mar 4, 2015 at 11:42 PM, Bruce Momjian wrote:
> On Thu, Mar 5, 2015 at 01:25:13PM +0900, Fujii Masao wrote:
>> >> Yeah, it might make the situation better than today. But I'm afraid that
>> >> many users might get disappointed about that behavior of an incremental
>> >> backup after the release...
>> >
>> > I don't get what you mean here. Can you elaborate on this point?
>>
>> The proposed version of LSN-based incremental backup has some limitations
>> (e.g., every database file needs to be read even when there is no
>> modification in the database since the last backup, which may make the
>> backup time longer than users expect) which may disappoint users. So I'm
>> afraid that users who can benefit from the feature might be very limited.
>> IOW, I'm just sticking to the idea of the timestamp-based one :) But I
>> should drop it if the majority on the list prefers the LSN-based one even
>> if it has such limitations.
>
> We need numbers on how effective each level of tracking will be. Until
> then, the patch can't move forward.

The point is that this is a stepping stone toward what will ultimately be a better solution. You can use timestamps today if (a) whole-file granularity is good enough for you and (b) you trust your system clock to never go backwards. In fact, if you use pg_start_backup() and pg_stop_backup(), you don't even need a server patch; you can just go right ahead and implement whatever you like. A server patch would be needed to make pg_basebackup do a file-time-based incremental backup, but I'm not excited about that because I think the approach is a dead end. If you want block-level granularity, and you should, an approach based on file times is never going to get you there. An approach based on LSNs can. If the first version of the patch requires reading the whole database, fine, it's not going to perform all that terribly well. But we can optimize that later by keeping track of which blocks have been modified since a given LSN.

If we do that, we can get better reliability than the timestamp approach can ever offer, plus excellent transfer and storage characteristics.

What I'm unhappy with about this patch is that it insists on sending the whole file if a single block in that file has changed. That is lame. To get something useful out of this, we should be looking to send only those blocks whose LSNs have actually changed. That would reduce I/O (in the worst case, the current patch reads each file in its entirety twice) and transfer bandwidth as compared to the proposed patch. We'd still have to read the whole database, so it might very well do more I/O than the file-timestamp approach, but it would beat the file-timestamp approach on transfer bandwidth and on the amount of storage required to store the incremental. In many workloads, I expect those savings would be quite significant. If we then went back in a later release and implemented one of the various proposals to avoid needing to read every block, we'd then have a very robust and complete solution.

But I agree with Fujii to the extent that I see little value in committing this patch in the form proposed. Being smart enough to use the LSN to identify changed blocks, but then sending the entirety of every file anyway because you don't want to go to the trouble of figuring out how to revise the wire protocol to identify the individual blocks being sent and write the tools to reconstruct a full backup based on that data, does not seem like enough of a win. As Fujii says, if we ship this patch as written, people will just keep using the timestamp-based approach anyway. Let's wait until we have something that is, at least in some circumstances, a material improvement over the status quo before committing anything.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
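[Editorial note: the block-level direction Robert sketches relies on the fact that every PostgreSQL page records in its header (pd_lsn, the first eight bytes of the page) the LSN of the last WAL record that touched it. The toy illustration below selects changed pages by that field; it assumes the default 8KB block size and native little-endian byte order, and ignores concurrent writes, torn pages, and checksums - none of which a real implementation could ignore.]

import struct

BLCKSZ = 8192  # default PostgreSQL block size

def page_lsn(page):
    """pd_lsn sits at offset 0 as two 32-bit words (xlogid, xrecoff)."""
    xlogid, xrecoff = struct.unpack_from('=II', page, 0)
    return (xlogid << 32) | xrecoff

def changed_blocks(path, threshold_lsn):
    """Yield (block number, page image) for pages whose header LSN is
    newer than the start LSN of the previous backup."""
    with open(path, 'rb') as f:
        blkno = 0
        while True:
            page = f.read(BLCKSZ)
            if len(page) < BLCKSZ:
                break
            if page_lsn(page) > threshold_lsn:
                yield blkno, page
            blkno += 1

[Only the pages this yields would need to be shipped, which is where the transfer and storage savings Robert describes come from - at the cost of still reading every block.]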
Re: [HACKERS] File based Incremental backup v8
On Thu, Mar 5, 2015 at 01:25:13PM +0900, Fujii Masao wrote:
> >> Yeah, it might make the situation better than today. But I'm afraid that
> >> many users might get disappointed about that behavior of an incremental
> >> backup after the release...
> >
> > I don't get what you mean here. Can you elaborate on this point?
>
> The proposed version of LSN-based incremental backup has some limitations
> (e.g., every database file needs to be read even when there is no
> modification in the database since the last backup, which may make the
> backup time longer than users expect) which may disappoint users. So I'm
> afraid that users who can benefit from the feature might be very limited.
> IOW, I'm just sticking to the idea of the timestamp-based one :) But I
> should drop it if the majority on the list prefers the LSN-based one even
> if it has such limitations.

We need numbers on how effective each level of tracking will be. Until then, the patch can't move forward.
--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
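[Editorial note: the timestamp-based alternative Fujii mentions is essentially what rsync's default mode does - trust file size and modification time. A minimal sketch of that selection follows, carrying exactly the trust-the-clock caveat raised elsewhere in the thread; it is an illustration, not anyone's proposed patch.]

import os

def files_changed_since(data_dir, last_backup_start):
    """Select files whose mtime is newer than the start of the previous
    backup. Cheap - no file contents are read - but wrong whenever the
    system clock moves backwards or a file is touched without being
    modified, which is why the thread keeps returning to LSNs."""
    changed = []
    for root, _, names in os.walk(data_dir):
        for name in names:
            path = os.path.join(root, name)
            if os.stat(path).st_mtime > last_backup_start:
                changed.append(os.path.relpath(path, data_dir))
    return sorted(changed)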
Re: [HACKERS] File based Incremental backup v8
On Thu, Mar 5, 2015 at 1:59 AM, Marco Nenciarini wrote:
> Hi Fujii,
>
> Il 03/03/15 11:48, Fujii Masao ha scritto:
>> On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini wrote:
>>> Il 02/03/15 14:21, Fujii Masao ha scritto:
>>>> On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini wrote:
>>>>> Hi,
>>>>>
>>>>> I've attached an updated version of the patch.
>>>>
>>>> basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
>>>> basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
>>>> pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code
>>>
>>> I'll add an explicit cast at those two lines.
>>>
>>>> When I applied the three patches and compiled the code, I got the above warnings.
>>>>
>>>> How can we get the full backup that we can use for the archive recovery, from the first full backup and subsequent incremental backups? What commands should we use for that, for example? It's better to document that.
>>>
>>> I've sent a Python PoC that supports the plain format only (not the tar one). I'm currently rewriting it in C (with tar support as well) and I'll send a new patch containing it ASAP.
>>
>> Yeah, if a special tool is required for that purpose, the patch should include it.
>
> I'm working on it. The interface will be exactly the same as the PoC script I've attached to
> 54c7cdad.6060...@2ndquadrant.it
>
>>>> What does "1" of the heading line in backup_profile mean?
>>>
>>> Nothing. It's a version number. If you think it's misleading I will remove it.
>>
>> A version number of the file format of the backup profile? If it's required for the validation of the backup profile file as a safe-guard, it should be included in the profile file. For example, it might be useful to check whether the pg_basebackup executable is compatible with the "source" backup that you specify. But more info might be needed for such validation.
>
> The current implementation bails out with an error if the header line is different from what it expects. It also reports an error if the 2nd line is not the start WAL location. That's all that pg_basebackup needs to start a new incremental backup. All the other information is useful to reconstruct a full backup in the case of an incremental backup, or maybe to check the completeness of an archived full backup.
> Initially the profile was present only in incremental backups, but after some discussion on the list we agreed to always write it.

Don't we need more checks about the compatibility of the backup-target database cluster and the source incremental backup? Without such checks, I'm afraid we can easily get corrupted incremental backups. For example, shouldn't pg_basebackup emit an error if the target and source have different system IDs, like walreceiver does? What happens if the timeline ID is different between the source and target? What happens if the source was taken from the standby but a new incremental backup will be taken from the master? Do we need to check them? Sorry if this has already been discussed.

>>>> Why is a backup profile file necessary? Maybe it's necessary in the future, but currently seems not.
>>>
>>> It's necessary because it's the only way to detect deleted files.
>>
>> Maybe I'm missing something. Seems we can detect that even without a profile. For example, please imagine the case where the file has been deleted since the last full backup and then the incremental backup is taken. In this case, that deleted file exists only in the full backup. We can detect the deletion of the file by checking both full and incremental backups.
>
> When you take an incremental backup, only changed files are sent. Without the backup_profile in the incremental backup, you cannot detect a deleted file, because it's indistinguishable from a file that is not changed.

Yeah, you're right!

>>>> Have we really gotten consensus about the current design, especially that every file basically needs to be read to check whether it has been modified since the last backup, even when *no* modification happens since the last backup?
>>>
>>> The real problem here is that there is currently no way to detect that a file has not changed since the last backup. We agreed to not use file system timestamps as they are not reliable for that purpose.
>>
>> TBH I prefer the timestamp-based approach in the first version of incremental backup even if it's less reliable than the LSN-based one. I think that some users who are using timestamp-based rsync (i.e., default mode) for the backup would be satisfied with the timestamp-based one.
>
> The original design was to compare size+timestamp+checksums (only if everything else matches and the file has been modified after the start of the backup), but the feedback fr
Re: [HACKERS] File based Incremental backup v8
Hi Fujii,

On 03/03/15 11:48, Fujii Masao wrote:
> On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini wrote:
>> On 02/03/15 14:21, Fujii Masao wrote:
>>> On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini wrote:
>>>> Hi,
>>>>
>>>> I've attached an updated version of the patch.
>>>
>>> basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
>>> basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
>>> pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code
>>
>> I'll add an explicit cast at those two lines.
>>
>>> When I applied the three patches and compiled the code, I got the above warnings.
>>>
>>> How can we get the full backup that we can use for the archive recovery from the first full backup and subsequent incremental backups? What commands should we use for that, for example? It's better to document that.
>>
>> I've sent a Python PoC that supports the plain format only (not the tar one). I'm currently rewriting it in C (with tar support as well) and I'll send a new patch containing it ASAP.
>
> Yeah, if a special tool is required for that purpose, the patch should include it.

I'm working on it. The interface will be exactly the same as the PoC script I've attached to 54c7cdad.6060...@2ndquadrant.it

>>> What does the "1" in the heading line of backup_profile mean?
>>
>> Nothing. It's a version number. If you think it's misleading I will remove it.
>
> A version number of the file format of the backup profile? If it's required for the validation of the backup profile file as a safe-guard, it should be included in the profile file. For example, it might be useful to check whether the pg_basebackup executable is compatible with the "source" backup that you specify. But more info might be needed for such validation.
The current implementation bails out with an error if the header line is different from what it expects. It also reports an error if the 2nd line is not the start WAL location. That's all that pg_basebackup needs to start a new incremental backup. All the other information is useful to reconstruct a full backup in the case of an incremental backup, or maybe to check the completeness of an archived full backup. Initially the profile was present only in incremental backups, but after some discussion on list we agreed to always write it.

>>> Sorry if this has been already discussed so far. Why is a backup profile file necessary? Maybe it's necessary in the future, but currently seems not.
>>
>> It's necessary because it's the only way to detect deleted files.
>
> Maybe I'm missing something. Seems we can detect that even without a profile. For example, please imagine the case where the file has been deleted since the last full backup and then the incremental backup is taken. In this case, that deleted file exists only in the full backup. We can detect the deletion of the file by checking both full and incremental backups.

When you take an incremental backup, only changed files are sent. Without the backup_profile in the incremental backup, you cannot detect a deleted file, because it's indistinguishable from a file that is not changed.

>>> We've really gotten the consensus about the current design, especially that every file basically needs to be read to check whether it has been modified since the last backup, even when *no* modification happens since the last backup?
>>
>> The real problem here is that there is currently no way to detect that a file has not changed since the last backup. We agreed not to use file system timestamps as they are not reliable for that purpose.
>
> TBH I prefer the timestamp-based approach in the first version of incremental backup even if it's less reliable than the LSN-based one.
> I think that some users who are using timestamp-based rsync (i.e., default mode) for the backup would be satisfied with the timestamp-based one.

The original design was to compare size+timestamp+checksums (only if everything else matches and the file has been modified after the start of the backup), but the feedback from the list was that we cannot trust the filesystem mtime and we must use LSN instead.

>> Using LSNs has a significant advantage over using checksums, as we can start the full copy as soon as we find a block with an LSN greater than the threshold. There are two cases: 1) the file is changed, so we can assume that we detect it after reading 50% of the file, then we send it taking advantage of the file system cache; 2) the file is not changed, so we read it without sending anything. It will end up producing I/O comparable to a normal backup.
>
> Yeah, it might make the situation better than today. But I'm afraid that many users might get disappointed about that behavior of an incremental backup after the release...
Re: [HACKERS] File based Incremental backup v8
On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini wrote:
> On 02/03/15 14:21, Fujii Masao wrote:
>> On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini wrote:
>>> Hi,
>>>
>>> I've attached an updated version of the patch.
>>
>> basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
>> basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
>> pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code
>
> I'll add an explicit cast at those two lines.
>
>> When I applied the three patches and compiled the code, I got the above warnings.
>>
>> How can we get the full backup that we can use for the archive recovery from the first full backup and subsequent incremental backups? What commands should we use for that, for example? It's better to document that.
>
> I've sent a Python PoC that supports the plain format only (not the tar one). I'm currently rewriting it in C (with tar support as well) and I'll send a new patch containing it ASAP.

Yeah, if a special tool is required for that purpose, the patch should include it.

>> What does the "1" in the heading line of backup_profile mean?
>
> Nothing. It's a version number. If you think it's misleading I will remove it.

A version number of the file format of the backup profile? If it's required for the validation of the backup profile file as a safe-guard, it should be included in the profile file. For example, it might be useful to check whether the pg_basebackup executable is compatible with the "source" backup that you specify. But more info might be needed for such validation.

>> Sorry if this has been already discussed so far. Why is a backup profile file necessary? Maybe it's necessary in the future, but currently seems not.
>
> It's necessary because it's the only way to detect deleted files.

Maybe I'm missing something. Seems we can detect that even without a profile.
For example, please imagine the case where the file has been deleted since the last full backup and then the incremental backup is taken. In this case, that deleted file exists only in the full backup. We can detect the deletion of the file by checking both full and incremental backups.

>> We've really gotten the consensus about the current design, especially that every file basically needs to be read to check whether it has been modified since the last backup, even when *no* modification happens since the last backup?
>
> The real problem here is that there is currently no way to detect that a file has not changed since the last backup. We agreed not to use file system timestamps as they are not reliable for that purpose.

TBH I prefer the timestamp-based approach in the first version of incremental backup even if it's less reliable than the LSN-based one. I think that some users who are using timestamp-based rsync (i.e., default mode) for the backup would be satisfied with the timestamp-based one.

> Using LSNs has a significant advantage over using checksums, as we can start the full copy as soon as we find a block with an LSN greater than the threshold. There are two cases: 1) the file is changed, so we can assume that we detect it after reading 50% of the file, then we send it taking advantage of the file system cache; 2) the file is not changed, so we read it without sending anything. It will end up producing I/O comparable to a normal backup.

Yeah, it might make the situation better than today. But I'm afraid that many users might get disappointed about that behavior of an incremental backup after the release...

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File based Incremental backup v8
On 02/03/15 14:21, Fujii Masao wrote:
> On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini wrote:
>> Hi,
>>
>> I've attached an updated version of the patch.
>
> basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
> basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
> pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code

I'll add an explicit cast at those two lines.

> When I applied the three patches and compiled the code, I got the above warnings.
>
> How can we get the full backup that we can use for the archive recovery from the first full backup and subsequent incremental backups? What commands should we use for that, for example? It's better to document that.

I've sent a Python PoC that supports the plain format only (not the tar one). I'm currently rewriting it in C (with tar support as well) and I'll send a new patch containing it ASAP.

> What does the "1" in the heading line of backup_profile mean?

Nothing. It's a version number. If you think it's misleading I will remove it.

> Sorry if this has been already discussed so far. Why is a backup profile file necessary? Maybe it's necessary in the future, but currently seems not.

It's necessary because it's the only way to detect deleted files.

> Several pieces of info, like LSN, modification time, size, etc., are tracked in the backup profile file for every backup file, but they are not used for now. If they're not required now, I'm inclined to remove them to simplify the code.

I've put the LSN there mainly for debugging purposes, but it can also be used to check the files during pg_restorebackup execution. The sent field is probably redundant (if sent = False and LSN is not set, we should probably simply avoid writing a line about that file) and I'll remove it in the next patch.
> We've really gotten the consensus about the current design, especially that every file basically needs to be read to check whether it has been modified since the last backup, even when *no* modification happens since the last backup?

The real problem here is that there is currently no way to detect that a file has not changed since the last backup. We agreed not to use file system timestamps as they are not reliable for that purpose.

Using LSNs has a significant advantage over using checksums, as we can start the full copy as soon as we find a block with an LSN greater than the threshold. There are two cases: 1) the file is changed, so we can assume that we detect it after reading 50% of the file, then we send it taking advantage of the file system cache; 2) the file is not changed, so we read it without sending anything. It will end up producing I/O comparable to a normal backup.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it

signature.asc
Description: OpenPGP digital signature
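Marco's two-case argument can be illustrated with a small sketch. It assumes the PostgreSQL page layout in which pd_lsn occupies the first 8 bytes of every page header (two 32-bit halves, little-endian assumed here), and it uses zero-filled fake pages rather than real relation data:

```python
import struct
import tempfile

BLCKSZ = 8192  # PostgreSQL default page size


def page_lsn(page):
    """pd_lsn is the first field of the page header: xlogid then xrecoff,
    each 32 bits (little-endian assumed)."""
    xlogid, xrecoff = struct.unpack_from("<II", page, 0)
    return (xlogid << 32) | xrecoff


def file_changed_since(path, threshold_lsn):
    """Read the file page by page; return True at the *first* page newer
    than the threshold (so a changed file is detected, on average, after
    reading about half of it). Reading the whole file without finding a
    newer page means it is unchanged and nothing needs to be sent."""
    with open(path, "rb") as f:
        while page := f.read(BLCKSZ):
            if page_lsn(page) > threshold_lsn:
                return True
    return False


def fake_page(lsn):
    """A zero-filled page carrying only a pd_lsn, for demonstration."""
    return struct.pack("<II", lsn >> 32, lsn & 0xFFFFFFFF) + b"\x00" * (BLCKSZ - 8)


with tempfile.NamedTemporaryFile(delete=False) as rel:
    rel.write(fake_page(100) + fake_page(500))  # second page is newer

changed = file_changed_since(rel.name, 400)  # True: page 2 exceeds the threshold
```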
Re: [HACKERS] File based Incremental backup v8
On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini wrote:
> Hi,
>
> I've attached an updated version of the patch.

basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
basebackup.c:1565: warning: format '%lld' expects type 'long long int', but argument 8 has type '__off_t'
pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code

When I applied the three patches and compiled the code, I got the above warnings.

How can we get the full backup that we can use for the archive recovery from the first full backup and subsequent incremental backups? What commands should we use for that, for example? It's better to document that.

What does the "1" in the heading line of backup_profile mean?

Sorry if this has been already discussed so far. Why is a backup profile file necessary? Maybe it's necessary in the future, but currently seems not. Several pieces of info, like LSN, modification time, size, etc., are tracked in the backup profile file for every backup file, but they are not used for now. If they're not required now, I'm inclined to remove them to simplify the code.

We've really gotten the consensus about the current design, especially that every file basically needs to be read to check whether it has been modified since the last backup, even when *no* modification happens since the last backup?

Regards,

-- 
Fujii Masao
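Fujii's question about rebuilding a usable backup can be sketched like this: a rough Python sketch of what a pg_restorebackup-style tool would do. The directory layout and the profile format (a plain list of relative file names) are simplified assumptions here, not the patch's actual format:

```python
import pathlib
import shutil
import tempfile


def restore_full_backup(full_dir, incrementals, dest):
    """Rebuild a complete data directory from a full backup plus a chain
    of incremental backups (oldest first). Each incremental is a pair
    (directory, profile): the directory holds only the files that changed,
    and the profile lists every file that existed at backup time, so files
    missing from the profile can be deleted."""
    shutil.copytree(full_dir, dest)
    dest = pathlib.Path(dest)
    for incr_dir, profile in incrementals:
        incr_dir = pathlib.Path(incr_dir)
        for f in incr_dir.rglob("*"):  # overlay changed/new files
            if f.is_file():
                target = dest / f.relative_to(incr_dir)
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(f, target)
        keep = set(profile)  # prune files deleted since the previous backup
        for f in list(dest.rglob("*")):
            if f.is_file() and f.relative_to(dest).as_posix() not in keep:
                f.unlink()


# Tiny demo: file "a" changed, file "b" was deleted after the full backup.
root = pathlib.Path(tempfile.mkdtemp())
(root / "full").mkdir()
(root / "incr").mkdir()
(root / "full" / "a").write_text("old")
(root / "full" / "b").write_text("dropped")
(root / "incr" / "a").write_text("new")
restore_full_backup(root / "full", [(root / "incr", ["a"])], root / "out")
```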
Re: [HACKERS] File based Incremental backup v8
Hi,

I've attached an updated version of the patch. This fixes the issue with checksum calculation for segments after the first one. To solve it I've added an optional uint32 *segno argument to parse_filename_for_nontemp_relation, so I can know the segment number and calculate the block number correctly.

On 29/01/15 18:57, Robert Haas wrote:
> On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini wrote:
>> The current implementation of the copydir function is incompatible with LSN-based incremental backups. The problem is that new files are created, but their blocks still carry the old LSNs, so they will not be backed up because they look old enough.
>
> I think this is trying to pollute what's supposed to be a pure fs-level operation ("copy a directory") into something that is aware of specific details like the PostgreSQL page format. I really think that nothing in storage/file should know about the page format. If we need a function that copies a file while replacing the LSNs, I think it should be a new function living somewhere else.

I've named it copydir_set_lsn and placed it as a static function in dbcommands.c. This leaves the copydir and copy_file functions in copydir.c untouched. The copydir function in copydir.c is now unused, while the copy_file function is still used during unlogged table reinit.

> A bigger problem is that you are proposing to stamp those files with LSNs that are, for lack of a better word, fake. I would expect that this would completely break if checksums are enabled. Also, unlogged relations typically have an LSN of 0; this would change that in some cases, and I don't know whether that's OK.

I've investigated a bit and I have not been able to find any problem here.

> The issues here are similar to those in http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de - basically, I think we need to make CREATE DATABASE and ALTER DATABASE ..
SET TABLESPACE fully WAL-logged operations, or this is > never going to work right. If we're not going to allow that, we need > to disallow hot backups while those operations are in progress. > As already said the copydir-LSN patch should be treated as a "temporary" until a proper WAL logging of CREATE DATABASE and ALTER DATABASE SET TABLESPACE will be implemented. At that time we could probably get rid of the whole copydir.[ch] file moving the copy_file function inside reinit.c Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c index afd9255..02b5fee 100644 *** a/src/backend/storage/file/reinit.c --- b/src/backend/storage/file/reinit.c *** static void ResetUnloggedRelationsInTabl *** 28,35 int op); static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op); - static bool parse_filename_for_nontemp_relation(const char *name, - int *oidchars, ForkNumber *fork); typedef struct { --- 28,33 *** ResetUnloggedRelationsInDbspaceDir(const *** 388,446 fsync_fname((char *) dbspacedirname, true); } } - - /* - * Basic parsing of putative relation filenames. - * - * This function returns true if the file appears to be in the correct format - * for a non-temporary relation and false otherwise. - * - * NB: If this function returns true, the caller is entitled to assume that - * *oidchars has been set to the a value no more than OIDCHARS, and thus - * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID - * portion of the filename. This is critical to protect against a possible - * buffer overrun. - */ - static bool - parse_filename_for_nontemp_relation(const char *name, int *oidchars, - ForkNumber *fork) - { - int pos; - - /* Look for a non-empty string of digits (that isn't too long). 
*/ - for (pos = 0; isdigit((unsigned char) name[pos]); ++pos) - ; - if (pos == 0 || pos > OIDCHARS) - return false; - *oidchars = pos; - - /* Check for a fork name. */ - if (name[pos] != '_') - *fork = MAIN_FORKNUM; - else - { - int forkchar; - - forkchar = forkname_chars(&name[pos + 1], fork); - if (forkchar <= 0) - return false; - pos += forkchar + 1; - } - - /* Check for a segment number. */ - if (name[pos] == '.') - { - int segchar; - -
Re: [HACKERS] File based Incremental backup v8
On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:
> 0001-public-parse_filename_for_nontemp_relation.patch
> 0002-copydir-LSN-v2.patch
> 0003-File-based-incremental-backup-v8.patch

Hi,

It looks like it only compiles with assert enabled. This is perhaps not yet really a problem at this stage but I thought I'd mention it:

make --quiet -j 8
In file included from gram.y:14403:0:
scan.c: In function yy_try_NUL_trans:
scan.c:10174:23: warning: unused variable yyg [-Wunused-variable]
  struct yyguts_t * yyg = (struct yyguts_t*)yyscanner; /* This var may be unused depending upon options. */
basebackup.c: In function writeBackupProfileLine:
basebackup.c:1545:8: warning: format %lld expects argument of type long long int, but argument 8 has type __off_t [-Wformat=]
  filename);
basebackup.c:1545:8: warning: format %lld expects argument of type long long int, but argument 8 has type __off_t [-Wformat=]
pg_basebackup.c: In function ReceiveTarFile:
pg_basebackup.c:858:2: warning: implicit declaration of function assert [-Wimplicit-function-declaration]
  assert(res || (strcmp(basedir, "-") == 0));
pg_basebackup.c:865:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
  gzFile ztarfile = NULL;
pg_basebackup.o: In function `ReceiveAndUnpackTarFile':
pg_basebackup.c:(.text+0x690): undefined reference to `assert'
pg_basebackup.o: In function `ReceiveTarFile':
pg_basebackup.c:(.text+0xeb0): undefined reference to `assert'
pg_basebackup.c:(.text+0x10ad): undefined reference to `assert'
collect2: error: ld returned 1 exit status
make[3]: *** [pg_basebackup] Error 1
make[3]: *** Waiting for unfinished jobs
make[2]: *** [all-pg_basebackup-recurse] Error 2
make[2]: *** Waiting for unfinished jobs
make[1]: *** [all-bin-recurse] Error 2
make: *** [all-src-recurse] Error 2

The configure used was:

./configure \
--prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup \
--bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/bin.fast \
--libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/lib.fast \
--with-pgport=6973 --quiet --enable-depend \
--with-extra-version=_incremental_backup_20150131_1521_08bd0c581158 \
--with-openssl --with-perl --with-libxml --with-libxslt --with-zlib

A build with --enable-cassert and --enable-debug builds fine:

./configure \
--prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup \
--bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/bin \
--libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/lib \
--with-pgport=6973 --quiet --enable-depend \
--with-extra-version=_incremental_backup_20150131_1628_08bd0c581158 \
--enable-cassert --enable-debug \
--with-openssl --with-perl --with-libxml --with-libxslt --with-zlib

I will further test with that.

Thanks,

Erik Rijkers
Re: [HACKERS] File based Incremental backup v8
On 29/01/15 18:57, Robert Haas wrote:
> On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini wrote:
>> The current implementation of the copydir function is incompatible with LSN-based incremental backups. The problem is that new files are created, but their blocks still carry the old LSNs, so they will not be backed up because they look old enough.
>
> I think this is trying to pollute what's supposed to be a pure fs-level operation ("copy a directory") into something that is aware of specific details like the PostgreSQL page format. I really think that nothing in storage/file should know about the page format. If we need a function that copies a file while replacing the LSNs, I think it should be a new function living somewhere else.

Given that the copydir function is used only during CREATE DATABASE and ALTER DATABASE SET TABLESPACE, we could move it/rename it to a better place that clearly marks it as "knowing about the page format". I'm open to suggestions on where to place it and on what the correct name should be. However, the whole copydir patch here should be treated as a "temporary" thing. It is necessary until proper WAL logging of CREATE DATABASE and ALTER DATABASE SET TABLESPACE is implemented to support any form of LSN-based incremental backup.

> A bigger problem is that you are proposing to stamp those files with LSNs that are, for lack of a better word, fake. I would expect that this would completely break if checksums are enabled.

I'm sorry, I completely ignored checksums in the previous patch. The attached one works with checksums enabled.

> Also, unlogged relations typically have an LSN of 0; this would change that in some cases, and I don't know whether that's OK.

It shouldn't be a problem because all the code that uses unlogged relations normally skips all the WAL-related operations.
From the point of view of an incremental backup it is also not a problem, because when restoring the backup the unlogged tables will get reinitialized by the crash recovery procedure.

However, if you think it is worth the effort, I can rewrite copydir as a two-pass operation, detecting the unlogged tables on the first pass and avoiding the LSN update on them. I personally think that it isn't worth the effort unless someone identifies a real path where setting LSNs in unlogged relations leads to an issue.

> The issues here are similar to those in http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de - basically, I think we need to make CREATE DATABASE and ALTER DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is never going to work right. If we're not going to allow that, we need to disallow hot backups while those operations are in progress.

This is right, but the problem Andres reported is orthogonal to the one I'm addressing here. Without this copydir patch (or without proper WAL logging of copydir operations), you cannot take an incremental backup after a CREATE DATABASE or ALTER DATABASE SET TABLESPACE until you take a full backup and use it as a base.
Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it From 3e451077283de8e99c4eceb748d49c34329c6ef8 Mon Sep 17 00:00:00 2001 From: Marco Nenciarini Date: Thu, 29 Jan 2015 12:18:47 +0100 Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation --- src/backend/storage/file/reinit.c | 58 --- src/common/relpath.c | 56 + src/include/common/relpath.h | 2 ++ 3 files changed, 58 insertions(+), 58 deletions(-) diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c index afd9255..02b5fee 100644 *** a/src/backend/storage/file/reinit.c --- b/src/backend/storage/file/reinit.c *** static void ResetUnloggedRelationsInTabl *** 28,35 int op); static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op); - static bool parse_filename_for_nontemp_relation(const char *name, - int *oidchars, ForkNumber *fork); typedef struct { --- 28,33 *** ResetUnloggedRelationsInDbspaceDir(const *** 388,446 fsync_fname((char *) dbspacedirname, true); } } - - /* - * Basic parsing of putative relation filenames. - * - * This function returns true if the file appears to be in the correct format - * for a non-temporary relation and false otherwise. - * - * NB: If this function returns true, the caller is entitled to assume that - * *oidchars has been set to the a value no more than OIDCHARS, and thus - * that a buffer of OIDCHARS+1 characters
Re: [HACKERS] File based Incremental backup v8
On 2015-01-29 12:57:22 -0500, Robert Haas wrote:
> The issues here are similar to those in http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de - basically, I think we need to make CREATE DATABASE and ALTER DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is never going to work right. If we're not going to allow that, we need to disallow hot backups while those operations are in progress.

Yea, the current way is just a hack from the dark ages. Which has some advantages, true, but I don't think they outweigh the disadvantages. I hope to find time to develop a patch to make those properly WAL-logged (for master) sometime not too far away.

Greetings,

Andres Freund

-- 
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] File based Incremental backup v8
On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini wrote:
> The current implementation of the copydir function is incompatible with LSN-based incremental backups. The problem is that new files are created, but their blocks still carry the old LSNs, so they will not be backed up because they look old enough.

I think this is trying to pollute what's supposed to be a pure fs-level operation ("copy a directory") into something that is aware of specific details like the PostgreSQL page format. I really think that nothing in storage/file should know about the page format. If we need a function that copies a file while replacing the LSNs, I think it should be a new function living somewhere else.

A bigger problem is that you are proposing to stamp those files with LSNs that are, for lack of a better word, fake. I would expect that this would completely break if checksums are enabled. Also, unlogged relations typically have an LSN of 0; this would change that in some cases, and I don't know whether that's OK.

The issues here are similar to those in http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de - basically, I think we need to make CREATE DATABASE and ALTER DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is never going to work right. If we're not going to allow that, we need to disallow hot backups while those operations are in progress.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] File based Incremental backup v8
The current implementation of the copydir function is incompatible with LSN-based incremental backups. The problem is that new files are created, but their blocks still carry the old LSNs, so they will not be backed up because they look old enough.

The copydir function is used in:

CREATE DATABASE
ALTER DATABASE SET TABLESPACE

I can imagine two possible solutions:

a) WAL-log the whole copydir operation, setting the LSNs accordingly
b) pass to copydir the LSN of the operation which triggered it, and update the LSN of all the copied blocks

The latter solution is IMO easier to implement and does not deviate much from the current implementation. I've implemented it and it's attached to this message.

I've also moved the parse_filename_for_nontemp_relation function out of reinit.c to make it available both to copydir.c and basebackup.c.

I've also limited the LSN comparison to the MAIN fork only, because:

* the FSM fork doesn't use LSNs
* the VM fork updates the LSN only when the visibility bit is set
* the INIT fork doesn't use LSNs, and it's only one page anyway
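Solution b) can be sketched as follows. This is an illustrative sketch only, not the attached patch: real pages carry more header fields, and as raised elsewhere in the thread, pd_checksum would also need recomputing when data checksums are enabled — that step is omitted here:

```python
import pathlib
import struct
import tempfile

BLCKSZ = 8192  # PostgreSQL default page size


def copy_rel_file_setting_lsn(src, dst, new_lsn):
    """Copy a relation file page by page, overwriting each page's pd_lsn
    (the first 8 bytes of the page header, little-endian assumed) with the
    LSN of the operation that triggered the copy. The copied blocks then
    no longer look "old enough" to be skipped by an LSN-based incremental
    backup."""
    stamped = struct.pack("<II", new_lsn >> 32, new_lsn & 0xFFFFFFFF)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while page := fin.read(BLCKSZ):
            fout.write(stamped + page[8:])


# Demo: a one-page file with an old LSN gets restamped on copy.
tmp = pathlib.Path(tempfile.mkdtemp())
old_page = struct.pack("<II", 0, 100) + b"\x00" * (BLCKSZ - 8)
(tmp / "src").write_bytes(old_page)
copy_rel_file_setting_lsn(tmp / "src", tmp / "dst", new_lsn=(7 << 32) | 42)
xlogid, xrecoff = struct.unpack_from("<II", (tmp / "dst").read_bytes(), 0)
```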
Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it From 087faed899b9afab324aff7fa20e715c4f99eb4a Mon Sep 17 00:00:00 2001 From: Marco Nenciarini Date: Thu, 29 Jan 2015 12:18:47 +0100 Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation --- src/backend/storage/file/reinit.c | 58 --- src/common/relpath.c | 56 + src/include/common/relpath.h | 2 ++ 3 files changed, 58 insertions(+), 58 deletions(-) diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c index afd9255..02b5fee 100644 *** a/src/backend/storage/file/reinit.c --- b/src/backend/storage/file/reinit.c *** static void ResetUnloggedRelationsInTabl *** 28,35 int op); static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op); - static bool parse_filename_for_nontemp_relation(const char *name, - int *oidchars, ForkNumber *fork); typedef struct { --- 28,33 *** ResetUnloggedRelationsInDbspaceDir(const *** 388,446 fsync_fname((char *) dbspacedirname, true); } } - - /* - * Basic parsing of putative relation filenames. - * - * This function returns true if the file appears to be in the correct format - * for a non-temporary relation and false otherwise. - * - * NB: If this function returns true, the caller is entitled to assume that - * *oidchars has been set to the a value no more than OIDCHARS, and thus - * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID - * portion of the filename. This is critical to protect against a possible - * buffer overrun. - */ - static bool - parse_filename_for_nontemp_relation(const char *name, int *oidchars, - ForkNumber *fork) - { - int pos; - - /* Look for a non-empty string of digits (that isn't too long). */ - for (pos = 0; isdigit((unsigned char) name[pos]); ++pos) - ; - if (pos == 0 || pos > OIDCHARS) - return false; - *oidchars = pos; - - /* Check for a fork name. 
*/ - if (name[pos] != '_') - *fork = MAIN_FORKNUM; - else - { - int forkchar; - - forkchar = forkname_chars(&name[pos + 1], fork); - if (forkchar <= 0) - return false; - pos += forkchar + 1; - } - - /* Check for a segment number. */ - if (name[pos] == '.') - { - int segchar; - - for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar) - ; - if (segchar <= 1) - return false; - pos += segchar; - } - - /* Now we should be at the end. */ - if (name[pos] != '\0') - return false; - return true; - } --- 386,388 diff --git a/src/common/relpath.c b/src/common/relpath.c index 66dfef1..83a1e3a 100644 *** a/src/common/relpath.c --- b/src/common/relpath.c *** GetRelationPath(Oid dbNode, Oid spcNode, *** 206,208 --- 206,264 } return path; } + + /* + * Basic parsing of putative relation filenames. + * + * This function returns true if the file appears to be in the correct format + * for a non-temporary relation and false otherwise. + * + * NB: If this function returns true, the caller is entitled to assume that + * *o