On Sun, 13 Nov 2005, Ben Escoto wrote:

> >>>>> Sheldon Hearn <[EMAIL PROTECTED]>
> >>>>> wrote the following on Sat, 12 Nov 2005 12:02:45 +0200
> > At the extreme, every object gets a serial number which is used as
> > its name in the backup store's filesystem.
...
> But this is basically the way most backup programs work, and from the
> beginning the premise of rdiff-backup was that it made a mirror.

i've been thinking along the lines of sheldon's proposal for a while
actually... most of the time i want the mirror functionality it's only
for a small subtree at a time -- if i need the entire backup it's really
better for me to restore it (due to metadata).

for folks with posix filesystems with hardlinks there's a relatively
fast option to mirror subtrees out of a "virtualized" mirror like
sheldon is proposing -- rdiff-backup could hardlink the existing mirror
files into an appropriate directory on the same partition.  then i can
look about in that hardlinked tree and remove it when i'm done perusing
or copying whatever it is i was interested in.

you'd have to be careful about accidentally modifying the hardlinked
mirror, but no more careful than you have to be today when perusing the
mirror itself.  (actually we could remove write perms from all files and
directories by default no matter what we do about long filenames...
something to consider.)

this doesn't help people using less featureful filesystems though.

while the mirror property was something which initially attracted me to
rdiff-backup, it wasn't the primary factor ... i was more attracted by
rsync-style network usage, (relatively) efficient storage of increments,
and storing backups on hard disk instead of tape.  (i was already doing
backups to hd with tar incrementals because i can't stand tape... but it
wasn't as space or network efficient as rdiff-backup.)

> The best backup system that used your scheme would have a different
> architecture, and might not have all that much in common with a
> mirror+increment system.  (For instance, if I were to your scheme from
> scratch I would optimize for random access of older data, so it could
> be mounted with FUSE or similar with decent performance.)

you know i'm not so sure the goals conflict... even if you wanted to do
something like FUSE as you suggest you're going to need to generate the
already-patched blocks of older files ...
so you'd probably end up keeping a cache around at the FUSE level.

the only real optimisation i can think of is to store all the deltas for
a particular object together -- so that you only need to go to one place
to rebuild whatever ancestor you're interested in.  but realistically
even if you concatenate them together the filesystem isn't generally
going to be able to avoid fragmentation...

if i weren't lazy and were doing something like this from scratch i'd
optimize assuming FUSE exists and use a compact single-file-per-backup
representation and let a FUSE layer provide a mirror interface if it's
desired... i'd go for a single-file-per-backup compressed in chunks with
zlib (to enable random access), and append an index.

a few motivations for doing this:

- reduce I/O overhead on the backup server by eliminating a lot of disk
  seeks.  (and probably reduce CPU overhead as well due to inode
  reduction.)

- simplify archival to DVD/wherever (even archival to another hard drive
  would be fast due to all sequential read/write).

- fold in the functionality of duplicity -- you could read/write through
  a gpg filter (chunks would be necessary still)

- my largest backup is throwing away at least 0.6GB disk space just for
  the tail fragments on all the rdiff-backup-data inodes ... this is for
  28 days of increments on a 1.5M inode fs -- there are an additional
  0.5M inodes in rdiff-backup-data, of which 0.3M have a non-zero size,
  and so on avg waste 2048 bytes (4KiB blocks).

not that i'm trying to convince you either way... i'm just babbling
really.

-dean

_______________________________________________
rdiff-backup-users mailing list at [email protected]
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki
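
A minimal sketch of the single-file-per-backup layout described above:
the data is compressed in independent zlib chunks and an index is
appended, so a reader can seek straight to the chunks it needs. The
function names, the 64 KiB chunk size, the JSON index and the 8-byte
length trailer are all assumptions made for illustration; none of this
is rdiff-backup's actual on-disk format.

```python
import io
import json
import struct
import zlib

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, not from the original post


def write_archive(out, data, chunk_size=CHUNK_SIZE):
    """Write data as independently compressed chunks, then append an index."""
    index = []  # one [offset, compressed_length] entry per chunk
    for start in range(0, len(data), chunk_size):
        blob = zlib.compress(data[start:start + chunk_size])
        index.append([out.tell(), len(blob)])
        out.write(blob)
    meta = {"chunk_size": chunk_size, "size": len(data), "chunks": index}
    index_bytes = json.dumps(meta).encode("utf-8")
    out.write(index_bytes)
    # Fixed-size trailer (assumed layout) so the index can be found from the end.
    out.write(struct.pack("<Q", len(index_bytes)))


def read_range(f, offset, length):
    """Return data[offset:offset+length], decompressing only the chunks needed."""
    f.seek(-8, io.SEEK_END)
    trailer_pos = f.tell()
    (index_len,) = struct.unpack("<Q", f.read(8))
    f.seek(trailer_pos - index_len)
    meta = json.loads(f.read(index_len))

    chunk_size = meta["chunk_size"]
    end = min(offset + length, meta["size"])
    if offset >= end:
        return b""
    first = offset // chunk_size
    last = (end - 1) // chunk_size

    parts = []
    for chunk_offset, chunk_len in meta["chunks"][first:last + 1]:
        f.seek(chunk_offset)
        parts.append(zlib.decompress(f.read(chunk_len)))
    joined = b"".join(parts)
    skip = offset - first * chunk_size
    return joined[skip:skip + (end - offset)]


if __name__ == "__main__":
    buf = io.BytesIO()
    payload = b"some backup payload " * 10000
    write_archive(buf, payload, chunk_size=4096)
    # Reads only the chunks that overlap the requested range.
    print(read_range(buf, 5000, 20))
```

Because every chunk is compressed on its own, a reader never has to
decompress from the start of the file, which is what makes the random
access (and a per-chunk gpg filter) possible. The cost is a somewhat
worse compression ratio than one continuous zlib stream.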

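
The hardlinked-subtree idea above can be sketched roughly as follows.
This is not an existing rdiff-backup feature: `hardlink_subtree` and its
arguments are hypothetical, and the sketch assumes a filesystem with
hardlink support and both directories on the same partition.

```python
import os


def hardlink_subtree(mirror_subdir, dest_dir):
    """Recreate mirror_subdir under dest_dir, hardlinking regular files.

    Hypothetical helper: directories are created fresh, regular files are
    hardlinked to the mirror's copies, so no file data is duplicated.
    """
    for root, dirs, files in os.walk(mirror_subdir):
        rel = os.path.relpath(root, mirror_subdir)
        target_root = os.path.normpath(os.path.join(dest_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            if os.path.islink(src) or not os.path.isfile(src):
                continue  # this sketch skips symlinks and special files
            # Raises OSError if src and the target are on different filesystems.
            os.link(src, os.path.join(target_root, name))
```

The linked files share inodes with the mirror, so editing one in place
would change the mirror too; deleting the temporary tree afterwards only
drops the extra links.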