In the message dated: Wed, 09 Aug 2006 11:21:44 +0200,
The pithy ruminations from Casper Thomsen on
<Re: [BackupPC-users] Keep the last n revisions of files> were:

=> On Wed, 9 Aug 2006, Ralf Gross wrote:
=> (...)
=> >> What would be really great to have is the possibility to ensure that
=> >> I have the last n revisions of files, no matter how many fulls or
=> >> incrementals.
Interesting idea, probably best suited to a revision control system.

=> (...)
=> >
=> > I also think this would be the job of a revision control system.
=>
=> Or the job of a really smart backup system ;-).
=>
=> (...)
=> >> Any pointers, good ideas, work-arounds or whatever is of course
=> >> appreciated. Thanks in advance!
=> >
=> > How will you ensure that a file has not been changed several times
=> > since the last backup? In what cycle would you start your backup to
=> > get every existing version of that file?
=>
=> I will not ensure that the file has not changed several times between
=> backups. My formulation was inexact, sorry! What I want is the last n
=> revisions of files when they are checked for changes (once a day, week,
=> or whatever). I only want the last n revisions of backups, not the
=> "real" file revisions.

OK. Do you plan to list specific files for which revisions should be
retained (in which case there's much more overhead and a more complicated
config, but the storage requirement would be lower), or apply the
revision settings to every file?

If the former (a list of files), then it sounds like something that's
best handled via a revision control system. In the simplest sense, you
could check files in manually. This could also be automated, so that when
BackupPC connects to the client to initiate a backup, it runs a script
first. The script would traverse the filesystem (e.g., using "find
-newer") or check specified files, and automatically check them in to a
revision control system running on the client prior to the backup. (A
rough sketch of such a script follows at the end of this reply.)

If you're thinking about applying the concept of saving revisions to the
whole system, then it sounds more like you want a filesystem "snapshot"
rather than a file-by-file revision history. This is much more common in
backup systems, and would give you the ability to restore the entire
system (or individual files) to a specified point in time. (See the
second sketch below.) Be aware that there are many files that change
often where you probably don't need or want to keep successive revisions
(caches, mail spool files, mailboxes, config files that maintain a list
of "last used files", etc.).

=>
=> Just to make it totally clear (I hope): if a file has changed when it
=> is being backed up, and there are fewer than n revisions of the file
=> backed up, do nothing (just back it up as usual); otherwise, back it up
=> and delete the oldest revision of the file.

That sounds computationally expensive, and would significantly increase
both storage and I/O. Without a database backend to track file versions
(the number of revisions and the backup number), it would be extremely
impractical.

You're describing a much more traditional backup system, where each
backup is stored on separate volumes (CDs, disk-based files, tapes,
etc.), and each backup has its own file list and expiration period. This
has some advantages (and disadvantages) over BackupPC, and would be much
better suited to your revision scheme.

=>
=> Actually, there would be other possibilities: you could also (1) set a
=> bit that indicates that the file can be deleted, or (2) delete all the
=> oldest revisions such that there are exactly n left, and maybe some
=> other strategy I haven't thought of yet.

Again, that sounds like the act of expiring all backups older than a
given date. Consider looking at a "traditional" backup system (i.e., one
not using BackupPC's concept of pooling and a single storage "volume"),
such as amanda or bacula. (The third sketch below shows why option (2)
amounts to simple expiration.)
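As a first sketch of the automated check-in idea -- the paths here are
examples, not defaults, and RCS is just one choice of revision control
system -- something like this could be run on the client from BackupPC's
$Conf{DumpPreUserCmd} hook:

    #!/bin/bash
    # Pre-backup hook (sketch): check changed files into RCS before
    # BackupPC copies them.  RCS ",v" files land next to the originals
    # unless an RCS/ subdirectory exists.
    STAMP=/var/lib/backup-rcs.stamp   # mtime marker from the last run
    WATCHDIR=/etc                     # tree whose files we want revisioned

    # First run: create the stamp in the distant past so everything matches.
    [ -f "$STAMP" ] || touch -t 197001010000 "$STAMP"

    # Check in every file modified since the last run.  -l re-locks and
    # keeps the working file in place; -t-/-m avoid interactive prompts.
    find "$WATCHDIR" -type f -newer "$STAMP" -print0 |
        while IFS= read -r -d '' f; do
            ci -l -t-"pre-backup" -m"automated pre-backup check-in" "$f"
        done

    touch "$STAMP"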
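The second sketch covers the whole-system "snapshot" approach. One common
mechanism -- assuming the client's data lives on LVM; the volume and
mount point names are made up for illustration -- is:

    # Create a point-in-time view of the volume and back that up,
    # rather than the live (changing) filesystem:
    lvcreate --snapshot --size 1G --name backup-snap /dev/vg0/home
    mount -o ro /dev/vg0/backup-snap /mnt/snap

    # ... run the backup against /mnt/snap, then clean up:
    umount /mnt/snap
    lvremove -f /dev/vg0/backup-snap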
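The third sketch shows why option (2) above is really just expiration. If
revisions were stored as timestamped copies (file.YYYYMMDDHHMMSS -- an
assumption for illustration; BackupPC does not store files this way),
keeping exactly n of them is nothing more than:

    #!/bin/bash
    # Sketch: keep the N newest revisions of a file, expire the rest.
    N=5
    file="$1"

    # List revisions newest-first; everything past the first N expires.
    ls -1t "$file".* 2>/dev/null | tail -n +$((N + 1)) |
        while read -r old; do
            rm -f -- "$old"
        done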
=>
=> The different approaches imply new decisions.
=>
=> Ad (1). When should it be deleted? Should it be decompressed and
=> deleted immediately, once a day, while BackupPC_nightly runs, or only
=> when all (or x percent) of the files in the compressed file are set
=> for removal?

That would add huge overhead. It's much more efficient to deal with an
entire backup on a given day as a single "revision".

=>
=> Ad (2). Maybe it would be possible to have a flag, or even a way of
=> specifying an "algorithm", to decide how many revisions should be
=> deleted (depending on how often the file changes, how many revisions
=> there are, whether the revision changes are spread about uniformly in
=> time or not, etc.). This seems, admittedly, quite strange.

Interesting. I like the idea of the dynamic algorithm. This is similar to
amanda, in that it dynamically chooses which filesets to back up, based
on the queue and backup frequency.

However, I see this as having limited application. The idea of keeping
successive revisions, in addition to basic backups, seems to be at odds
with the idea that older revisions would be deleted dynamically.

Mark

=>
=> > Ralf
=>
=>
=> --
=> Casper Thomsen

-----
Mark Bergman    Biker, Rock Climber, Unix mechanic, IATSE #1 Stagehand

http://wwwkeys.pgp.net:11371/pks/lookup?op=get&search=bergman%40merctech.com

I want a newsgroup with an infinite S/N ratio! Now taking CFV on:
rec.motorcycles.stagehands.pet-bird-owners.pinballers.unix-supporters
15+ So Far--Want to join? Check out: http://www.panix.com/~bergman