Hi all,

Carl Wilhelm Soderstrom wrote on 21.08.2007 at 09:04:40 [[BackupPC-users] wishlist: full backups whenever incrementals get too large]:
> [...]
> Would it be reasonable to have backuppc check the time used by the last
> incremental against the time used by the last full, and if it's taken
> longer to do the incremental, then automatically do a full backup next
> time? (Of course, make a note in the logs as to why this was done).
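For concreteness, the suggestion boils down to a policy check along these lines - a purely illustrative Python sketch, not BackupPC code, with all names invented:

    # Illustrative sketch of the proposed heuristic -- not BackupPC code.
    def next_backup_type(last_full_secs, last_incr_secs, log):
        """Force a full backup if the last incremental took longer
        than the last full; otherwise keep doing incrementals."""
        if last_incr_secs > last_full_secs:
            log("forcing full: last incremental (%ds) took longer "
                "than last full (%ds)" % (last_incr_secs, last_full_secs))
            return "full"
        return "incremental"

    print(next_backup_type(3600, 5400, print))  # -> logs, then "full"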
No, not unconditionally.

1.) Bandwidth or backup duration may not be the primary concern. Maybe an individual setup can tolerate longer backups better than additional server or client load.

2.) The duration of a backup is not an accurate measure of the amount of data transferred. Maybe there is a completely different reason why the incremental backup takes significantly longer (such as server/client load, or even a network problem limiting bandwidth to a fraction of its normal value). The point is: you can't tell whether a full backup would have been faster under the exact circumstances of the incremental.

3.) Running a full backup in the middle of the week (or at any time it's not supposed to run) may be problematic for some setups (e.g. you've tuned your BackupPC server to run full backups for different machines on different weekdays).

Jacob wrote on 21.08.2007 at 11:59:59 [Re: [BackupPC-users] wishlist: full backups whenever incrementals get too large]:
> [...]
> If this is really the way backuppc does incremental backups, I think
> backuppc should be a bit more incremental with its incremental backups.

I don't think so, but you surely remember Craig writing on 30.04.2007 at 22:16:31 -0700 [Re: [BackupPC-users] Incremental transferring same data over and over] with message-id [EMAIL PROTECTED]:
> Unless I'm forgetting a good reason why I did it that way, for
> rsync I should make the reference backup the most recent backup
> in all cases - both full and incremental. It's a pretty simple
> change - I'll add it to the todo list for 3.1.0. The logic is
> correct as is for smb and tar: the reference backup for an
> incremental always is the backup of the next lower level.

The only thing I can think of is a variant of the normal rsync incremental "problem": the only potential misses are changed files with unchanged attributes. You can construct cases where files would be picked up relative to the last backup of the next lower level, but not relative to the most recent backup (because the attributes changed before the most recent backup but not since then) - the sketch below walks through exactly this case. I realise this may be stretching an already unlikely case a bit far, but the point is: on full backups, the optimization of using the most recent backup as reference (for rsync!) is safe, because file contents are rechecked by the block checksum algorithm, so you are guaranteed to get an exact copy of the source tree in any case. For incrementals, you can't say "compare block checksums only for some files we mistrust a bit more than others, but not for the rest", so you would actually be doing a higher level incremental backup than strictly requested.
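To make that corner case concrete, here is a minimal Python sketch of attribute-based change detection - illustrative only, not BackupPC's code, and the data layout is invented:

    # A "backup" here is just {path: (size, mtime)} of recorded attributes.
    def changed_files(current_attrs, reference_backup):
        """Attribute-based detection: transfer a file only if its
        (size, mtime) differ from the reference backup's record."""
        return [path for path, attrs in current_attrs.items()
                if reference_backup.get(path) != attrs]

    last_full   = {"data.db": (1000, 100)}  # attributes at the level-0 full
    latest_incr = {"data.db": (1000, 200)}  # attributes changed after the full
    current     = {"data.db": (1000, 200)}  # contents changed again later,
                                            # attributes staying the same

    print(changed_files(current, latest_incr))  # [] -> change missed
    print(changed_files(current, last_full))    # ['data.db'] -> change caught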
Thinking about that: wouldn't using the most recent backup (of a higher level) as the reference yield an incorrect (unfilled) backup tree? Wouldn't constructing the view of the new backup necessarily require the reference backup? You can probably get around these problems, but it doesn't exactly sound "simple" to me :-).

Jacob:
> Instead of comparing against the last full, it should compare against the
> last full and incremental backups. This would solve this problem and make
> backuppc more efficient anyway, AFAIK.

There are still those of us who consider accuracy, rather than efficiency, the foremost goal of backup software. And there are those who define "efficiency" as "solving the problem in a faster way" - if it's faster but doesn't solve the problem, it's not more efficient.

Let me remind you of a few facts in case they were not clear above:

- Full backups are supposed to get an *exact* copy of everything under all circumstances. There are no compromises for the sake of speeding things up.

- Incremental backups are an optimization *that comes at the cost* of reduced certainty that all contents are correctly backed up. If this were not so, you wouldn't need full backups at all: the first incremental would simply be relative to <nothing>, therefore transferring everything. Case closed. With conventional tape strategies, it is simply not feasible to take a complete snapshot every time. It is also not feasible to read all files of the previous backup(s) back from tape and compare the contents byte by byte. Hence the incrementals, with exactly the same risk of missing changes, since you only have timestamps to go by. In fact, with BackupPC rsync incrementals you run *less risk* than with conventional tape strategies, because the decision is more elaborate than "mtime > last_backup_time".

- The higher the level of the incremental backup, the greater the speedup, but the less certain you can be that no changes were missed. This is why you have all those options for defining your backup strategy. It's *your* decision as backup operator what chances you are willing to take in order to speed things up.

A further point: it's not "BackupPC should compare against whatever" - it's the underlying transport method that has to do it. You're clearly talking about rsync(d), but tar/smb handle only timestamps for incrementals. Perhaps tar/smb make the problem clearer: if you change the reference backup (i.e. the timestamp), you alter the information you obtain. Take, for example, files created after the full and deleted again after the most recent incremental: your backup will contain those files although it shouldn't (the sketch below walks through this case). Your backup won't be a level 1 incremental (in the simple case); it will in fact be a level 2 incremental, possibly stored within BackupPC in a form that can be accessed as though it were a level 1. The same applies to files created after the full and then moved into another directory, or created and dated back to a time before the incremental (except that those files will be missing from your backup although they should be present).

Although rsync misses fewer changes, it *can* miss some, and the same thought applies, although it is more difficult to describe. You simply don't get the benefits of a level 1 incremental over a level 2 incremental if you *do* a level 2 incremental. If you don't need those benefits, then set up a level 2 incremental in the first place!
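Here is that tar/smb case as a small Python sketch - again purely illustrative, with an invented representation, not BackupPC's implementation:

    # A backup is {path: mtime}. Timestamp-based (tar/smb style)
    # incrementals transfer files with mtime newer than the reference
    # backup's start time, and record no deletions.
    def incremental(client_files, ref_time):
        return {p: m for p, m in client_files.items() if m > ref_time}

    def filled_view(chain):
        """Merge a backup chain (oldest first) into the browsable view."""
        view = {}
        for b in chain:
            view.update(b)
        return view

    T_FULL, T_INCR1 = 100, 200
    full  = {"report.txt": 90}
    incr1 = incremental({"report.txt": 90, "temp.log": 150}, T_FULL)
    # incr1 == {'temp.log': 150}: created after the full, caught by incr1.

    # By now, temp.log has been deleted and report.txt modified (t=240):
    client_now = {"report.txt": 240}

    # Level 1 incremental, reference = the full:
    print(filled_view([full, incremental(client_now, T_FULL)]))
    # {'report.txt': 240} -- temp.log correctly absent

    # Reference = most recent incremental (really a level 2):
    print(filled_view([full, incr1, incremental(client_now, T_INCR1)]))
    # {'report.txt': 240, 'temp.log': 150} -- deleted file wrongly present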
Jacob wrote on 24.08.2007 at 08:50:20 [Re: [BackupPC-users] wishlist: full backups whenever incrementals get too large]:
> [...]
> Maybe it's time for new principles? ;)

Sure - when strategies are well understood and proven by long experience, it's obviously time to mandatorily switch everyone from one strategy to another, just because the second one is faster :-).

> With large files, though, it is an absolute time- and space-waster.

Go and re-read the BackupPC documentation, then tell me where *space* is wasted - unless you are confusing "space" with "space divided by time" (i.e. bandwidth). Though I'm repeating myself: concerning "waster", it's only a waste if you know beyond doubt that it's unnecessary, which, *in the general case*, you don't. As a human, you might know that you copied foo.iso to bar.img, so it's a waste to transfer bar.img when you know you already have the contents on the other side. Backup software can't know that, so it's only doing its job in the way it's designed to.

> I could easily see myself wanting to backup a 2GB .ISO, but wouldn't want
> it to take 4x the actual size of the ISO just because of the way it's backed
> up. :s

You've got something wrong there.

Regards,
Holger