Hi,

Boniforti Flavio wrote on 2011-06-07 11:00:24 +0200 [Re: [BackupPC-users] Backup of VM images]:
> [...]
> So I'm right when thinking that rsync *does* transfer only the bits of a
> file (no matter how big) which have changed, and *not* the whole file?

Usually that's correct, presuming rsync *can* determine which parts have changed, and presuming these parts *can* be transferred efficiently. For example, changing every second byte in a file obviously *won't* reduce the transfer bandwidth by 50%, because rsync matches whole blocks and no block would be left unchanged. So it really depends on *how* your files change.

> [...]
> Well, size is a critical parameter, because I can suppose that VM images
> are quite *big* files.
> But if the data transfer could be reduced by using rsync (over ssh of
> course), there's no problem because the initial transfer would be done
> by "importing" the VM images from a USB HDD. Therefore, only subsequent
> "backups" (rsyncs) would transfer data.
>
> What do you think?

First of all, you keep saying "VM images", but you don't mention which VM product they come from. Nothing says that VM images are simple file-based images of what the virtual disk looks like. They're some opaque structure optimized for whatever the individual VM product wants to handle efficiently (which is probably *not* rsyncability). Black boxes, so to speak. There are probably people on this list who can tell you from experience how VMware virtual disks behave (or VirtualBox or whatever), and it may well be that they all behave in similar ways (such as changing roughly the same amount of the virtual disk file for the same amount of changes within the virtual machine), but there's really no guarantee of that. You should try it out and see what happens in your case.

Secondly, you say that the images are already somewhere, and your responsibility is simply to back them up. Hopefully, your client didn't have the smart idea to also encrypt the images and simply forget to tell you. Encryption would pretty much guarantee 0% rsync savings.

Thirdly, as long as things work as they are supposed to, you are probably fine.
But what if something malfunctions and, say, your client mistakenly drops an empty (0-byte) file in place of an image one day (some partition may have been full and an automated script didn't notice)? The backup of the 0-byte file will be quite efficient, but I don't want to think about the next backup. That may only be a problem if the 0-byte file actually lands in a backup that is used as a reference backup, but the example is meant to illustrate that you *could* end up transferring the whole data set, and you probably won't notice until it congests your links.

Nothing will ever malfunction? OK, but a virtual host is still perfectly capable of actually *changing* the complete virtual disk contents if directed to (system update, encrypting the virtual host's file systems, file system defragmentation utility, malicious clobbering of data by an intruder ...).

rsync bandwidth savings are a fine thing. Relying on them when you have no control over the data you are transferring may not be wise, though. And BackupPC itself may not be the best place to handle such problems. For instance, if you first made a local copy of the images and then backed up that *copy*, you could script just about any checks you want to, use bandwidth limiting, abort transfers of single images that take too long, use a specialized tool that handles your VM images more efficiently than rsync, split your images after transferring ... it really depends on what guarantees you are making, what constraints you want (or need) to apply, how much effort you want to invest (and probably other things I've forgotten).

Hope that helps.

Regards,
Holger
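
To illustrate the point about *how* files change, here is a rough sketch of block matching in Python. It is not rsync's actual implementation: it compares raw blocks instead of rolling and strong checksums, and the function names, block size and test data are all made up for the example. It only counts how many bytes would have to be sent literally because no known block matches.

```python
import random


def make_data(size: int, seed: int = 0) -> bytes:
    """Hypothetical helper: reproducible pseudo-random stand-in for an image."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def literal_bytes(old: bytes, new: bytes, block_size: int = 64) -> int:
    """Count bytes of `new` that match no aligned block of `old`.

    Simplification: real rsync compares checksums of blocks, not the
    blocks themselves, but the matching idea is the same.
    """
    # Blocks the receiving side already has (aligned blocks of the old file).
    known = {
        old[i:i + block_size]
        for i in range(0, len(old) - block_size + 1, block_size)
    }
    literal = 0
    pos = 0
    while pos < len(new):
        window = new[pos:pos + block_size]
        if len(window) == block_size and window in known:
            pos += block_size   # whole block can be referenced, not sent
        else:
            literal += 1        # this byte has to go over the wire
            pos += 1
    return literal


def flip_range(data: bytes, start: int, length: int) -> bytes:
    """Invert `length` bytes starting at `start` (a localized change)."""
    changed = bytearray(data)
    for i in range(start, start + length):
        changed[i] ^= 0xFF
    return bytes(changed)


def flip_every_second_byte(data: bytes) -> bytes:
    """Invert every second byte (a change spread over the whole file)."""
    return bytes(b ^ 0xFF if i % 2 else b for i, b in enumerate(data))


if __name__ == "__main__":
    old = make_data(4096)
    print("localized edit:   ", literal_bytes(old, flip_range(old, 1000, 10)))
    print("every second byte:", literal_bytes(old, flip_every_second_byte(old)))
```

With these inputs, the localized edit should cost about one block of literal data, while the every-second-byte change costs the entire file, even though only half of the bytes differ.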
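
The idea of copying the images to a local staging area first, with checks, bandwidth limiting and a time limit, could be sketched roughly as follows. Everything here is an assumption for illustration: the function names, the 50% shrink threshold, the limits and the paths are hypothetical and would need adapting. The only external tool used is rsync, with its `--archive`, `--partial` and `--bwlimit` options.

```python
import os
import subprocess
from typing import List, Optional


def size_problems(current_size: int,
                  previous_size: Optional[int],
                  min_ratio: float = 0.5) -> List[str]:
    """Return reasons not to transfer an image; an empty list means it looks sane.

    min_ratio is an arbitrary example threshold: an image that shrank to
    less than half its previous size is treated as suspicious.
    """
    problems = []
    if current_size == 0:
        problems.append("image is empty (0 bytes)")
    elif previous_size and current_size < previous_size * min_ratio:
        problems.append(
            f"image shrank from {previous_size} to {current_size} bytes"
        )
    return problems


def rsync_command(src: str, dest: str, bwlimit_kbps: int) -> List[str]:
    """Build the rsync call; --bwlimit caps the transfer rate in KBytes/s."""
    return ["rsync", "--archive", "--partial",
            f"--bwlimit={bwlimit_kbps}", src, dest]


def stage_image(src: str, dest: str,
                bwlimit_kbps: int = 2000,
                max_seconds: int = 3600) -> bool:
    """Copy one image to the staging area; give up if it takes too long."""
    # The existing staged copy (if any) serves as the size reference.
    previous = os.path.getsize(dest) if os.path.isfile(dest) else None
    problems = size_problems(os.path.getsize(src), previous)
    if problems:
        print(f"skipping {src}: {'; '.join(problems)}")
        return False
    try:
        # subprocess.run kills rsync and raises if max_seconds is exceeded.
        subprocess.run(rsync_command(src, dest, bwlimit_kbps),
                       check=True, timeout=max_seconds)
    except subprocess.TimeoutExpired:
        print(f"aborted {src}: took longer than {max_seconds} seconds")
        return False
    except subprocess.CalledProcessError as err:
        print(f"rsync failed for {src} with exit code {err.returncode}")
        return False
    return True


if __name__ == "__main__":
    # Hypothetical paths; replace with wherever the images really live.
    stage_image("client:/srv/vm/images/disk0.img", "/srv/staging/disk0.img")
```

BackupPC would then back up only the staging directory, so a skipped or aborted image leaves the previous good copy in place instead of replacing it with an empty or clobbered file.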
