Re: rsync and moving files [Re: backup w/ snapshots]
* Charles Swiger [2005-08-30 10:49 -0400]
> On Aug 30, 2005, at 3:32 AM, Svein Halvor Halvorsen wrote:
> > Yes, that's correct! But let's say I keep more than one snapshot around. I maybe didn't mention this, but this is the sole purpose of using snapshots: for me to have more full backups lying around.
>
> A snapshot on the same disk does not qualify as a reliable backup of your data. Using rsync to copy a tree of stuff to another machine would.

Please read the entire thread. I use rsync to mirror my disks remotely, then make snapshots on that remote computer. The snapshots are mounted read-only and NFS-exported back to the original computer. This satisfies the need for offsite storage of backups, the need for incremental backups, and the need for all previous backups to be randomly accessible from the original computer.

Thanks for your concern, though.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: rsync and moving files [Re: backup w/ snapshots]
* Garance A Drosihn [2005-08-30 12:50 -0400]
> Fwiw, I understand the problem you're trying to describe. And the basic issue is that rsync keeps no information between separate runs of it. It has no way of knowing that a given file on the source volume used to be at a different location. It does not even know that the destination volume was sync'ed by a previous run of rsync, so it does not even know that the file at the old location on the destination is the same as the file at the old location on the source. It knows nothing more than the information it has at the moment of any given run of rsync.
>
> You could kinda fudge that information for rsync by creating a lot of hard links, but that is probably going to create more of a mess than it will solve.
>
> So, you're left with doing something else outside of rsync. The script you are suggesting would probably be fairly easy to write in something like ruby, perl, or python. Use a key made up of the inode number + lastchange date, or maybe inode number + file size. Then save away the key-to-filename(s) mapping for every file. On the next run of rsync, see which files have moved on the source directory. If the destination volume has a file at the old location which matches the file-size or lastchange date (depending on which key you used...), then move it to the new location on the destination volume.

Thanks! I think I will try to implement this, then!
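
The bookkeeping described in the quoted message (a key built from inode number + file size, saved per run and compared on the next run) could be sketched roughly as below. This is only an illustration under assumptions: the function names `build_index` and `find_moves` are made up here, the key is a plain `"inode:size"` string, and only the unambiguous case of one name before and one name after is reported. Hard-linked files and recycled inodes are deliberately left for rsync to sort out.

```python
import os
import stat


def build_index(root):
    """Map an "inode:size" key to the sorted relative paths that carry it."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets and other non-regular files
            key = f"{st.st_ino}:{st.st_size}"
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            index.setdefault(key, []).append(rel)
    return {key: sorted(paths) for key, paths in index.items()}


def find_moves(old_index, new_index):
    """Return sorted (old_path, new_path) pairs for keys whose only name changed."""
    moves = []
    for key, old_paths in old_index.items():
        new_paths = new_index.get(key)
        # Only the unambiguous case: exactly one name before and one after.
        if new_paths and len(old_paths) == 1 and len(new_paths) == 1:
            if old_paths[0] != new_paths[0]:
                moves.append((old_paths[0], new_paths[0]))
    return sorted(moves)
```

Because the keys are strings, the index from one run can be written out with the standard `json` module and read back before the next run. Each reported pair would then be renamed on the destination before rsync starts, so rsync finds the file already in place.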
Re: rsync and moving files [Re: backup w/ snapshots]
* Greg Barniskis [2005-08-29 11:45 -0500]
> Eh? Bad assumptions about snapshots, I think. If a snapshot occupied even a tenth of the space of the data that it represented, we would quickly fill all our disks and the snapshot technology would be almost as painful as useful. A snapshot is essentially only an index of occupied disk space, not a copy of the actual data, and a snapshot is therefore much, much, much, much smaller than the data files that have changed.
>
> Read the relevant man pages and handbook sections again, and test your assumptions by measuring the actual change in snapshot size. I don't think your perceived problem really exists.

Yes, that's correct! But let's say I keep more than one snapshot around. I maybe didn't mention this, but this is the sole purpose of using snapshots: for me to have more full backups lying around.

If I change the disk a lot between snapshots, e.g. I rsync moved files (yes, within the same fs), this will result in a lot of file deletion and creation. Next, when I make the snapshot, a new list of occupied disk space will be made, and all of these blocks will be marked in use, and therefore take up a lot of disk space. In reality the information didn't change much at all between the two snapshots, but the effect remains: my disk can no longer store two snapshots (unless the backup disk is twice as large, which it is not).

The solution: Somehow, I need to mirror all the move ops on the remote system before doing the rsync. This could probably be done by making a hash table of inode/filename pairs (or triplets, etc.) each time I sync. Then the next time, I could compare the old table with the new to find out which files are the same only with new names, then find those names on the remote system, change them to the new ones, and then rsync.

If the inodes are recycled for brand new files between syncs, I don't think that would be a problem. The following rsync job would recognize the diffs and sync them, which it would have done anyway if the file were new.

What do you think?
Re: rsync and moving files [Re: backup w/ snapshots]
* Norberto Meijome [2005-08-30 02:14 +1000]
> I take your word wrt how it works. Assuming of course that you move within the same filesystem.

Yes, I'm talking about the same filesystem.

> (touche). yup, that's what would happen... but that's the nature of the beast :) don't keep too many snapshots ? ;) it'd be great if you could keep a log of all local-mv operations, and then replay them remotely via ssh.

Yes, I thought about that myself. Only I thought I'd keep a list of filename/inode pairs from each sync, so before I do a sync I could compare the lists to find out which files appear to be the same, only with a new name. Then rename those files remotely. In cases where an inode match does not represent a relink, but just plain inode recycling, so what? Rsync will make the new file up to date.
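
Replaying detected renames on the backup host over ssh, as discussed above, might look something like the following sketch. Everything named here is an assumption for illustration: the function `remote_mv_commands`, the host name `backuphost` and the remote root `/backup` are hypothetical. The function only builds the command lines and does not execute them, so the list can be inspected first.

```python
import posixpath
import shlex


def remote_mv_commands(moves, host, remote_root):
    """Build one ssh command (as an argv list) per (old, new) rename pair."""
    commands = []
    for old, new in moves:
        src = posixpath.join(remote_root, old)
        dst = posixpath.join(remote_root, new)
        # Quote every path: the remote shell parses the command string again.
        remote = "mkdir -p {} && mv -- {} {}".format(
            shlex.quote(posixpath.dirname(dst)),
            shlex.quote(src),
            shlex.quote(dst),
        )
        commands.append(["ssh", host, remote])
    return commands


# Hypothetical usage: print the commands instead of running them.
for argv in remote_mv_commands([("foo/test", "bar/test")], "backuphost", "/backup"):
    print(argv)
```

Each list could be handed to `subprocess.run(argv, check=True)` once the output looks right. The quoting matters for names like `/music/artist - album/`, which contain spaces.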
Re: rsync and moving files [Re: backup w/ snapshots]
* Bob Johnson [2005-08-29 12:44 -0400]
> Use a ggated(8) + ggatec(8) pair to establish a remote volume that looks local, then use gmirror to make it a mirror of the local drive. The big gotcha is that ggated/c only moves i/o requests and data via the net, it doesn't move ioctls, so some things just won't work remotely. Or at least, that's what I've read.

Do you think this is all right for a 4M/640K link? The upstream bandwidth to the backup server is 4 Mbps.
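
For a rough feel of what a 4 Mbps link means for a block-level mirror, here is a back-of-the-envelope calculation. The 80 GB volume size is purely an assumed figure (no disk size is given in the thread), and protocol overhead and latency are ignored, so real transfers would be slower.

```python
def transfer_seconds(num_bytes, link_bits_per_second):
    """Ideal transfer time: payload bits divided by the raw link rate."""
    return num_bytes * 8 / link_bits_per_second


LINK_BPS = 4_000_000          # 4 Mbps, the figure given in the message
ONE_GB = 10**9                # decimal gigabyte
ASSUMED_VOLUME = 80 * ONE_GB  # hypothetical disk size, not from the thread

per_gb = transfer_seconds(ONE_GB, LINK_BPS)
full_volume_hours = transfer_seconds(ASSUMED_VOLUME, LINK_BPS) / 3600

print(f"1 GB takes about {per_gb:.0f} s ({per_gb / 60:.1f} min)")
print(f"A full copy of the assumed 80 GB volume takes about {full_volume_hours:.1f} h")
```

At 4 Mbps the link carries 0.5 MB per second, so each gigabyte written to the mirror costs about 2000 seconds, and an initial full copy of a volume of the assumed size would run for almost two days.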
Re: rsync and moving files [Re: backup w/ snapshots]
Svein Halvor Halvorsen [EMAIL PROTECTED] writes:
> Only I thought I'd keep a list of filename/inode pairs from each sync, so before I do a sync I could compare the lists to find out which files appear to be the same, only with a new name.

Doesn't dump(8)/restore(8) do pretty much this? But less crudely...
Re: rsync and moving files [Re: backup w/ snapshots]
Svein Halvor Halvorsen wrote:
> * Greg Barniskis [2005-08-29 11:45 -0500]
> > Eh? Bad assumptions about snapshots, I think. If a snapshot occupied even a tenth of the space of the data that it represented, we would quickly fill all our disks and the snapshot technology would be almost as painful as useful. A snapshot is essentially only an index of occupied disk space, not a copy of the actual data, and a snapshot is therefore much, much, much, much smaller than the data files that have changed.
> >
> > Read the relevant man pages and handbook sections again, and test your assumptions by measuring the actual change in snapshot size. I don't think your perceived problem really exists.
>
> Yes, that's correct! But let's say I keep more than one snapshot around. I maybe didn't mention this, but this is the sole purpose of using snapshots: for me to have more full backups lying around.

Ah. That does change things a bit, I guess. A previous post indicated file renames and replication followed by taking a new snapshot, and I thought it was implied your older snapshots were going away.

> If I change the disk a lot between snapshots, e.g. I rsync moved files (yes, within the same fs), this will result in a lot of file deletion and creation. Next, when I make the snapshot, a new list of occupied disk space will be made, and all of these blocks will be marked in use, and therefore take up a lot of disk space. In reality the information didn't change much at all between the two snapshots, but the effect remains: my disk can no longer store two snapshots (unless the backup disk is twice as large, which it is not).
>
> The solution: Somehow, I need to mirror all the move ops on the remote system before doing the rsync. This could probably be done by making a hash table of inode/filename pairs (or triplets, etc.) each time I sync. Then the next time, I could compare the old table with the new to find out which files are the same only with new names, then find those names on the remote system, change them to the new ones, and then rsync.
>
> If the inodes are recycled for brand new files between syncs, I don't think that would be a problem. The following rsync job would recognize the diffs and sync them, which it would have done anyway if the file were new.
>
> What do you think?

This is admittedly beyond my ken, at least within the limited number of brain cycles I can offer to the problem. Hopefully someone else will provide clues for you.

Personally, I think you're violating the KISS principle unless there's a really compelling need to keep your previous file system states accessible online. Dumping older states to offline media and reclaiming that space would be my first order of business, but that's just me. Or just buy some whopping big disks appropriate to the task, since that's generally cheaper than admin time to create workarounds (unless you just consider this fun =).

Good luck,

--
Greg Barniskis, Computer Systems Integrator
South Central Library System (SCLS)
Library Interchange Network (LINK)
gregb at scls.lib.wi.us, (608) 266-6348
Re: rsync and moving files [Re: backup w/ snapshots]
On Aug 30, 2005, at 3:32 AM, Svein Halvor Halvorsen wrote:
> Yes, that's correct! But let's say I keep more than one snapshot around. I maybe didn't mention this, but this is the sole purpose of using snapshots: for me to have more full backups lying around.

A snapshot on the same disk does not qualify as a reliable backup of your data. Using rsync to copy a tree of stuff to another machine would.

--
-Chuck
Re: rsync and moving files [Re: backup w/ snapshots]
At 9:32 AM +0200 8/30/05, Svein Halvor Halvorsen wrote:
> The solution: Somehow, I need to mirror all the move ops on the remote system before doing the rsync. This could probably be done by making a hash table of inode/filename pairs (or triplets, etc.) each time I sync. Then the next time, I could compare the old table with the new to find out which files are the same only with new names, then find those names on the remote system, change them to the new ones, and then rsync.

Fwiw, I understand the problem you're trying to describe. And the basic issue is that rsync keeps no information between separate runs of it. It has no way of knowing that a given file on the source volume used to be at a different location. It does not even know that the destination volume was sync'ed by a previous run of rsync, so it does not even know that the file at the old location on the destination is the same as the file at the old location on the source. It knows nothing more than the information it has at the moment of any given run of rsync.

You could kinda fudge that information for rsync by creating a lot of hard links, but that is probably going to create more of a mess than it will solve.

So, you're left with doing something else outside of rsync. The script you are suggesting would probably be fairly easy to write in something like ruby, perl, or python. Use a key made up of the inode number + lastchange date, or maybe inode number + file size. Then save away the key-to-filename(s) mapping for every file. On the next run of rsync, see which files have moved on the source directory. If the destination volume has a file at the old location which matches the file-size or lastchange date (depending on which key you used...), then move it to the new location on the destination volume.

<vague_rambling>
Hmm. Thinking about this a little more, it's probably possible for rsync to catch some of these cases itself. It would require some coding changes to rsync, but it could take the list of files that it is deleting, compare it to the list of files that it is adding, and if the MD5-checksum + size of some to-be-deleted file is the same as some to-be-added file, it could try doing a 'mv' of that file before it does the remainder of its processing. I wonder how hard that would be to do.
</vague_rambling>

--
Garance Alistair Drosehn            =   [EMAIL PROTECTED]
Senior Systems Programmer           or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute    or  [EMAIL PROTECTED]
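
The matching step in that idea (pair a to-be-deleted file with a to-be-added file when size and MD5 checksum agree) can be illustrated outside of rsync. The sketch below is not rsync code and assumes nothing about rsync's internals. The names `file_key` and `match_moves` are made up, and the two input dictionaries stand in for the delete list and add list that rsync would compute itself.

```python
import hashlib


def file_key(path):
    """Return (size, md5 hex digest) of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return (size, digest.hexdigest())


def match_moves(deleted, added):
    """Pair paths from `deleted` and `added` (dicts of path -> key) with equal keys.

    Each to-be-deleted path is used at most once; unmatched paths are left
    for the normal delete and copy handling.
    """
    by_key = {}
    for path in sorted(deleted):
        by_key.setdefault(deleted[path], []).append(path)
    moves = []
    for path in sorted(added):
        candidates = by_key.get(added[path])
        if candidates:
            moves.append((candidates.pop(0), path))
    return moves
```

Note that the checksum has to be computed on both sides, which means reading every candidate file in full. The inode-based key avoids that cost but only works when the file was renamed rather than copied.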
Re: rsync and moving files [Re: backup w/ snapshots]
Charles Swiger [EMAIL PROTECTED] writes:
> On Aug 30, 2005, at 3:32 AM, Svein Halvor Halvorsen wrote:
> > Yes, that's correct! But let's say I keep more than one snapshot around. I maybe didn't mention this, but this is the sole purpose of using snapshots: for me to have more full backups lying around.
>
> A snapshot on the same disk does not qualify as a reliable backup of your data.

No, but it is convenient to restore from, when it's intact. This is actually a very common case: accidental file deletions.

> Using rsync to copy a tree of stuff to another machine would.

And as long as one is doing that, there's no reason not to use snapshots as well.
Re: rsync and moving files [Re: backup w/ snapshots]
Svein Halvor Halvorsen wrote:
> * Svein Halvor Halvorsen [2005-08-28 23:53 +0200]
> > Does this sound reasonable? Are there any precautions I should take? Are there any other tools better suited for the task at hand?
>
> I'm responding to my own message. Let's say I happen to move all music from /music/artist - album/ to /music/artist/album. Even though a local snapshot would handle this well, rsync would create new files on the remote machine, and when I then take a snapshot there, it will be HUGE!

Isn't that the whole point of having a backup? To have *another* copy of your files?

And I guess that yes, if the files are new in the remote system, when you take a snapshot the difference with the previous snapshot will be the size of the new data (only guessing from how snapshots work in Linux, so feel free to flame ..err..correct me :) )

Beto
Re: rsync and moving files [Re: backup w/ snapshots]
On 8/29/05, Norberto Meijome [EMAIL PROTECTED] wrote:
> Svein Halvor Halvorsen wrote:
> > * Svein Halvor Halvorsen [2005-08-28 23:53 +0200]
> > > Does this sound reasonable? Are there any precautions I should take? Are there any other tools better suited for the task at hand?
> >
> > I'm responding to my own message. Let's say I happen to move all music from /music/artist - album/ to /music/artist/album. Even though a local snapshot would handle this well, rsync would create new files on the remote machine, and when I then take a snapshot there, it will be HUGE!
>
> Isn't that the whole point of having a backup? To have *another* copy of your files?
>
> And I guess that yes, if the files are new in the remote system, when you take a snapshot the difference with the previous snapshot will be the size of the new data (only guessing from how snapshots work in Linux, so feel free to flame ..err..correct me :) )
>
> Beto

cat /usr/ports/sysutils/rsnapshot/pkg-descr

-Erik-
Re: rsync and moving files [Re: backup w/ snapshots]
* Norberto Meijome [2005-08-30 00:32 +1000]
> isn't that the whole point of having a backup? to have *another* copy of your files?

Well, yes and no. The idea is that I have a main computer that I want to back up. I want the backup to be (a) remote, (b) incremental and (c) randomly accessible. So I thought that every day my backup server could rsync my main computer, creating a mirror of the relevant directory trees. Then, as soon as the rsync job completes, it takes a snapshot of the filesystem. This snapshot could be mounted r/o and NFS-exported back to the original computer.

Now: Say I have a file /foo/test on my main computer. After the first rsync job this file will be copied, assigned an inode and put on the disk somewhere. If I change this file, a local snapshot will be smart enough to just store the changed sectors that this file now occupies.

But: If I move the file from /foo/test to /bar/test on my main computer, rsync will create a BRAND NEW FILE in /bar (and delete the file in /foo, since I used the --delete option). Now this NEW file will have a new inode, and cover new sectors on disk. The snapshot will then take considerably more disk space. If I move a large directory tree this way, this will occupy huge amounts of disk space.

If, however, I make the snapshot on my local disk, this is not a problem, as on this local filesystem /bar/test is not a new file.

So how can I make rsync know that the files were just moved (renamed, relinked), and make rsync reflect this fact on the remote mirror?

> and I guess that yes, if the files are new in the remote system, when you take a snapshot the difference with the previous snapshot will be the size of the new data

The files aren't new. Their names are!
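
The difference between a local rename and what the remote mirror sees can be checked with a small experiment. The sketch below uses a temporary directory and made-up file names mirroring /foo/test and /bar/test. It shows that a rename within one filesystem keeps the inode, while a fresh copy of the same data gets a different one. It says nothing about how much space a snapshot actually consumes, which still has to be measured on the real system.

```python
import os
import shutil
import tempfile


def inode_experiment():
    """Return (original, after_rename, after_copy) inode numbers."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "foo"))
        os.mkdir(os.path.join(root, "bar"))
        source = os.path.join(root, "foo", "test")
        with open(source, "wb") as handle:
            handle.write(b"some file data")
        original = os.stat(source).st_ino

        # A rename within one filesystem only changes the directory entry.
        moved = os.path.join(root, "bar", "test")
        os.rename(source, moved)
        after_rename = os.stat(moved).st_ino

        # A copy allocates a new file, much like a mirror that re-creates
        # the file at its new name instead of renaming it.
        copied = os.path.join(root, "bar", "test.copy")
        shutil.copy2(moved, copied)
        after_copy = os.stat(copied).st_ino
        return original, after_rename, after_copy


original, after_rename, after_copy = inode_experiment()
print("rename kept inode:", original == after_rename)
print("copy got a new inode:", after_copy != after_rename)
```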
Re: rsync and moving files [Re: backup w/ snapshots]
* Hornet [2005-08-29 11:11 -0400]
> cat /usr/ports/sysutils/rsnapshot/pkg-descr

It seems this is just a wrapper around the tools I was already planning on using. In this regard, it's a nice port. But won't this perl script suffer from the same shortcomings that rsync will? Or does it use rsync in more clever ways than I do?
Re: rsync and moving files [Re: backup w/ snapshots]
Svein Halvor Halvorsen [EMAIL PROTECTED] writes:
> * Svein Halvor Halvorsen [2005-08-28 23:53 +0200]
> > Does this sound reasonable? Are there any precautions I should take? Are there any other tools better suited for the task at hand?
>
> I'm responding to my own message. Let's say I happen to move all music from /music/artist - album/ to /music/artist/album. Even though a local snapshot would handle this well, rsync would create new files on the remote machine, and when I then take a snapshot there, it will be HUGE! Can I resolve this?

dump/restore is the only method I can think of offhand that will handle this problem. And even then, those huge moves have to stay within a single filesystem.
Re: rsync and moving files [Re: backup w/ snapshots]
Svein Halvor Halvorsen wrote:
> * Norberto Meijome [2005-08-30 00:32 +1000]
> > isn't that the whole point of having a backup? to have *another* copy of your files?
>
> Well, yes and no. The idea is that I have a main computer that I want to back up. I want the backup to be (a) remote, (b) incremental and (c) randomly accessible. So I thought that every day my backup server could rsync my main computer, creating a mirror of the relevant directory trees. Then, as soon as the rsync job completes, it takes a snapshot of the filesystem. This snapshot could be mounted r/o and NFS-exported back to the original computer.

Yes, that sounds like it would meet your criteria.

> Now: Say I have a file /foo/test on my main computer. After the first rsync job this file will be copied, assigned an inode and put on the disk somewhere. If I change this file, a local snapshot will be smart enough to just store the changed sectors that this file now occupies.

I take your word wrt how it works. Assuming of course that you move within the same filesystem.

> But: If I move the file from /foo/test to /bar/test on my main computer,

( /foo/ and /bar/ being in the same filesystem)

> rsync will create a BRAND NEW FILE in /bar (and delete the file in /foo, since I used the --delete option). Now this NEW file will have a new inode, and cover new sectors on disk. The snapshot will then take considerably more disk space. If I move a large directory tree this way, this will occupy huge amounts of disk space.

(touche). Yup, that's what would happen... but that's the nature of the beast :) Don't keep too many snapshots? ;)

It'd be great if you could keep a log of all local-mv operations, and then replay them remotely via ssh.

<hack>
Replace mv with your own version which does the local mv and either does a remote mv over ssh or sends a msg to a service to perform the transaction.
</hack>

(Yes, there may be other conditions that trigger the same effect as an mv... I just don't know which ones.)

I guess the proper way to do this (if you are REALLY REALLY worried about that extra space used for snapshots in the remote site) would be to implement a GEOM class that knows about the remote site and the 'mv' condition and communicates it to the remote end.

> So how can I make rsync know that the files were just moved (renamed, relinked), and make rsync reflect this fact on the remote mirror?

rsync would then be the wrong tool for the job. I would suggest that you just get more or larger drives for the remote site and live with the waste :) (though the GEOM class would be cool :D )

Hmm... what about network file systems like AFS or CODA (not that I know much about them, I just read some stuff on those being distributed, etc.)?

Let us know how you solve this.

Regards,
Beto
Re: rsync and moving files [Re: backup w/ snapshots]
On 8/29/05, Norberto Meijome [EMAIL PROTECTED] wrote:
> Svein Halvor Halvorsen wrote:
> > * Norberto Meijome [2005-08-30 00:32 +1000]
>
> I guess the proper way to do this (if you are REALLY REALLY worried about that extra space used for snapshots in the remote site) would be to implement a GEOM class that knows about the remote site and the 'mv' condition and communicates it to the remote end.

In principle, it already exists. In practice, it might need more work.

Use a ggated(8) + ggatec(8) pair to establish a remote volume that looks local, then use gmirror to make it a mirror of the local drive. The big gotcha is that ggated/c only moves i/o requests and data via the net, it doesn't move ioctls, so some things just won't work remotely. Or at least, that's what I've read.

[...]

> hmm...what about network operating systems like AFS or CODA (not that I know much about them, I just read some stuff on those being distributed, etc.)

Although CODA would probably work, I think the GEOM solution, if it works, would be far easier to implement.

> Let us know how you solve this.
>
> Regards,
> Beto
Re: rsync and moving files [Re: backup w/ snapshots]
Svein Halvor Halvorsen wrote:
> But: If I move the file from /foo/test to /bar/test on my main computer, rsync will create a BRAND NEW FILE in /bar (and delete the file in /foo, since I used the --delete option). Now this NEW file will have a new inode, and cover new sectors on disk. The snapshot will then take considerably more disk space. If I move a large directory tree this way, this will occupy huge amounts of disk space.

Eh? Bad assumptions about snapshots, I think. If a snapshot occupied even a tenth of the space of the data that it represented, we would quickly fill all our disks and the snapshot technology would be almost as painful as useful. A snapshot is essentially only an index of occupied disk space, not a copy of the actual data, and a snapshot is therefore much, much, much, much smaller than the data files that have changed.

Read the relevant man pages and handbook sections again, and test your assumptions by measuring the actual change in snapshot size. I don't think your perceived problem really exists.

--
Greg Barniskis, Computer Systems Integrator
South Central Library System (SCLS)
Library Interchange Network (LINK)
gregb at scls.lib.wi.us, (608) 266-6348