Hi shorvath,

> My question is whether or not checksumming the files is even necessary
> (either with rsync or rdiff-backup) at all. Doesn't rsync as
> well as rdiff-backup ensure file integrity in its basic operation and
> if so can anyone elaborate or point me to some reading material to back
> this up?

I guess we all agree up to the point that you should periodically live-test your backups, e.g. restore them to a second machine and verify everything still works.

With rsync (and presumably using the --link-dest option), I do not think you have to checksum at all, because the underlying algorithm already works using checksums. The only errors you would catch with your own checksumming are bit errors in RAM or bad sectors on the target disk. And note that either error probably corrupts only one single version of a file, because rsync has no increments, only hardlinks or full files.

With rdiff-backup the situation is a little different. There is built-in checksumming as well, which almost guarantees that the _last_ backup run had no errors whatsoever. Please note the emphasis. But as rdiff-backup stores every change as a compressed increment relative to the last run, you need every increment between the current mirror and the state you want in order to restore an old version of a file. So if you run into bit errors or bad sectors that did _not_ affect the newest state of a file, you might not just end up with one corrupt version of that file; you will lose all the versions older than the damaged increment, too. rdiff-backup's approach saves you quite some disk space, but this is, apart from performance issues, the most important drawback to consider.

The creators were aware of this and built in the --verify-at-time TIME option. Running it after each backup does a test restore of every file back to the given timestamp and thus verifies each increment involved, giving you the guarantee that you will be able to restore up to that state. Please note that:

a) this will take _a lot_ of time on big repositories, and
b) you should have /tmp (or wherever your $TEMP points) mounted as a RAM disk big enough to hold the biggest file in the repository.

We are heavy users of both rdiff-backup and rsync. We mainly use the former for the OS (max 20G) and the latter for all the other (huge amounts of) data. That gives us reasonable tradeoffs between security, reliability, storage needs and backup/verify speed.
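
To make the difference between the two layouts a bit more concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the XOR "delta", the equal-length versions and all function names are made up for illustration, and this is not how rdiff-backup really encodes its increments. It only shows how one damaged increment affects every older version, while one damaged full copy affects only itself.

```python
# Toy model of a reverse-increment chain versus independent full copies.
# Assumption: XOR stands in for a real delta format, so all versions must
# have the same length. None of these names come from rdiff-backup or rsync.


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Hypothetical 'delta': XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def build_reverse_chain(versions):
    """Return (mirror, increments) for versions ordered oldest to newest.

    increments[i] turns version i+1 back into version i, so restoring an
    old version means applying every increment from the newest one down.
    """
    mirror = versions[-1]
    increments = [xor_bytes(versions[i + 1], versions[i])
                  for i in range(len(versions) - 1)]
    return mirror, increments


def restore(mirror, increments, index):
    """Rebuild version `index` by walking back from the mirror."""
    data = mirror
    for inc in reversed(increments[index:]):
        data = xor_bytes(data, inc)
    return data


def flip_bit(data: bytes, pos: int = 0) -> bytes:
    """Simulate a single bit error in the byte at position `pos`."""
    return data[:pos] + bytes([data[pos] ^ 0x01]) + data[pos + 1:]


versions = [b"aaaa", b"bbbb", b"cccc", b"dddd"]  # oldest .. newest

# Increment layout: one mirror plus a chain of reverse increments.
mirror, increments = build_reverse_chain(versions)
increments[2] = flip_bit(increments[2])  # damage the newest increment
chain_ok = [restore(mirror, increments, i) == v
            for i, v in enumerate(versions)]

# Full-copy layout: every changed version is stored as a complete file.
snapshots = list(versions)
snapshots[2] = flip_bit(snapshots[2])  # damage one stored copy
snapshot_ok = [s == v for s, v in zip(snapshots, versions)]

print(chain_ok)     # [False, False, False, True]
print(snapshot_ok)  # [True, True, False, True]
```

In the chain case the newest state is still fine, but everything older than the damaged increment is gone. In the full-copy case only the one damaged version is lost.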

Regards
Florian

_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki