On Wednesday, 02 September 2009 at 12:10 +0200, Pieter Wuille wrote:
> Hello everyone,
>
> Trying to come up with a way to efficiently synchronise a BackupPC
> archive on one server with a remote, encrypted offsite backup, the
> following problems arise:
> * As often pointed out on this list, filesystem-level synchronisation
>   is extremely CPU- and memory-intensive. Not actually impossible, but
>   depending on the scale of your backups it may not be a practical
>   solution. In our case of a 350 GiB pool containing 4 million
>   directories and 20 million inodes, simply copying the whole pool
>   locally using cp/rsync/xfsdump/whatever thrashes, gets killed by the
>   OOM killer, or at best takes days, longer than I find reasonable for
>   a remote synchronisation run.
> * Furthermore, we want our offsite backup to be encrypted, ideally
>   using a secret key that is at no moment known at the remote location:
>   only encrypted files should ever be sent there and stored there.
>   Doing this encryption at the file level, given such a massive number
>   of small files, is very serious additional overhead.
> * The alternative to file-level synchronisation is (block-)device-level
>   synchronisation. Many possibilities exist here, including ZFS
>   send/receive (if you use ZFS), using snapshots (e.g. LVM), or
>   temporarily stopping backups and doing a full copy of the pool to the
>   remote side (if you have enough bandwidth), etc. Not everyone is
>   willing to use these, or is prepared to convert to such a system.
> * We would like to use rsync for this, since it skips identical parts
>   yet guarantees that the whole file ends up byte-for-byte identical to
>   the original. Unfortunately, as far as I know, rsync doesn't support
>   syncing the data on a block device, only the device node itself.
> In addition to that, rsync needs to read and process the whole file on
> the receiver side, calculate checksums, send them all to the sender,
> wait for the sender to reconstruct the data using those checksums, send
> this reconstruction, and apply it on the receiver side. For a single
> file, this requires at least the sum of the times needed to read
> through the whole data on both sides (correct me if I'm wrong, I don't
> know rsync internals). Data hardly moves on-disk in a BackupPC pool, so
> we would like to disable, or at least limit, the range in which rsync
> searches for matching data.
>
> To overcome this issue, I wrote a Perl/FUSE filesystem that lets you
> "mount" a block device (or regular file) as a directory containing
> files part0001.img, part0002.img, ..., each representing 1 GiB of data
> of the original device:
>
> https://svn.ulyssis.org/repos/sipa/backuppc-fuse/devfiles.pl
>
> This directory can be rsynced in the normal way with an "ordinary"
> directory on an offsite backup. If a restore is necessary, doing
> 'ssh remote "cat /backup/part*.img" >/dev/sdXY' (or equivalent)
> suffices. Although devfiles.pl has (limited) write support, rsyncing
> *to* the resulting directory is not yet possible; I can try to get this
> working if people need it, which would allow restoration by simply
> rsyncing in the opposite direction.
> Doing the synchronisation in chunks of 1 GiB keeps rsync from searching
> too far for matches, and splitting the device into multiple files
> allows some parallelism (the sender transmits data to the receiver
> while the receiver already checksums the next file; this is heavily
> limited by disk I/O, however).
>
> In our case, the BackupPC pool is stored on an XFS filesystem on an LVM
> volume, allowing an xfs_freeze / sync / snapshot / xfs_freeze -u
> sequence, then running devfiles.pl on the snapshot. Instead of
> freeze/unfreeze, a backuppc stop / umount + mount / backuppc start is
> also possible.
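The part-file idea described above can be sanity-checked locally with plain POSIX tools. A minimal sketch, using hypothetical file names and a 1 MiB part size instead of the 1 GiB devfiles.pl uses, splits an image into fixed-size parts and verifies that concatenating them (as the 'cat part*.img' restore would) reproduces the original byte-for-byte:

```shell
#!/bin/sh
# Sketch: split an image into fixed-size parts, reassemble, and verify.
# Hypothetical names; part size is 1 MiB here purely for the demo.
set -e
PART=$((1024 * 1024))                      # demo part size: 1 MiB

# create a 3.5 MiB test "device" image from random data
dd if=/dev/urandom of=disk.img bs=1024 count=3584 2>/dev/null

mkdir -p parts
i=1
off=0
size=$(wc -c < disk.img)
while [ "$off" -lt "$size" ]; do
    dd if=disk.img of=$(printf 'parts/part%04d.img' "$i") \
       bs="$PART" skip=$((off / PART)) count=1 2>/dev/null
    off=$((off + PART))
    i=$((i + 1))
done

# restore: concatenate the parts back, as 'cat part*.img > /dev/sdXY' would
cat parts/part*.img > restored.img

# the reassembled image must be byte-for-byte identical to the original
cmp disk.img restored.img && echo "identical"
```

The zero-padded part names matter: they make the shell glob sort the parts in the right order for the restore.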
> If no snapshot mechanism is available, you would need to suspend
> BackupPC during the whole synchronisation.
> In fact, the BackupPC volume is already encrypted on our backup server
> itself, which makes encrypted offsite backups very cheap: simply not
> sending the keyfile to the remote side is enough...
>
> The result: our 400 GiB pool, containing 350 GiB of data of which about
> 2 GiB changes daily, is synchronised five times a week with the offsite
> backup in 12-15 hours, requiring very little bandwidth. This seems
> mostly limited by the slow disk I/O on the receiver side (25 MiB/s).
>
> Hope you find this interesting/useful,
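The low bandwidth figure follows from the fact that a daily change touches only a few of the part files, and rsync only moves real data for those. That can be illustrated locally (hypothetical names, a tiny 64 KiB part size for the demo; rsync does the equivalent comparison internally with rolling checksums):

```shell
#!/bin/sh
# Sketch: after a small modification to the underlying image, only the
# part covering the modified region differs, so syncing the part
# directory moves roughly one part's worth of data. Hypothetical names.
set -e
PART=$((64 * 1024))                        # demo part size: 64 KiB

dd if=/dev/urandom of=pool.img bs=1024 count=256 2>/dev/null  # 256 KiB "pool"

split_parts() {                            # split $1 into $2/partNNNN.img
    mkdir -p "$2"
    n=$(( ($(wc -c < "$1") + PART - 1) / PART ))
    i=0
    while [ "$i" -lt "$n" ]; do
        dd if="$1" of=$(printf '%s/part%04d.img' "$2" $((i + 1))) \
           bs="$PART" skip="$i" count=1 2>/dev/null
        i=$((i + 1))
    done
}

split_parts pool.img before

# simulate one day's change: overwrite 1 KiB inside the third part
dd if=/dev/urandom of=pool.img bs=1024 seek=130 count=1 conv=notrunc 2>/dev/null

split_parts pool.img after

# count parts whose checksum changed; only those need real transfer
changed=0
for f in before/part*.img; do
    [ "$(cksum < "$f")" = "$(cksum < "after/${f#before/}")" ] \
        || changed=$((changed + 1))
done
echo "changed parts: $changed of $(ls before | wc -l)"
```

With four 64 KiB parts and a 1 KiB write at offset 130 KiB, only the third part's checksum changes, so a sync would checksum everything but transfer essentially one part.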
Hi. This seems to be an interesting approach to solving the offsite
backup problem. I'll try to test it when I have some time. Thanks!

> --
> Pieter

--
Daniel Berteaud
FIREWALL-SERVICES SARL.
Société de Services en Logiciels Libres
Technopôle Montesquieu
33650 MARTILLAC
Tel : 05 56 64 15 32
Fax : 05 56 64 15 32
Mail: dan...@firewall-services.com
Web : http://www.firewall-services.com

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
trial. Simplify your report design, integration and deployment - and focus on
what you do best, core application coding. Discover what's new with
Crystal Reports now. http://p.sf.net/sfu/bobj-july
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/