Re: Understanding btrfs and backups => automatic snapshot script
Marc MERLIN posted on Thu, 20 Mar 2014 22:57:33 -0700 as excerpted:

> On Sun, Mar 16, 2014 at 10:42:24PM -0700, Marc MERLIN wrote:
>> On Thu, Mar 06, 2014 at 09:33:24PM +, Duncan wrote:
>> > However, best snapshot management practice does progressive snapshot
>> > thinning, so you never have more than a few hundred snapshots to
>> > manage at once.
>>
>> I'm happy to share my script with others if that helps:
>> http://marc.merlins.org/linux/scripts/btrfs-snaps
>
> Now added to
> http://marc.merlins.org/perso/btrfs/post_2014-03-21_Btrfs-Tips_-How-To-Setup-Netapp-Style-Snapshots.html

Hmm... I hadn't actually looked that closely at scripted snapshotting. Now that I did, and see how easy it is to manage both snapshotting and thinning, I just might.

But I recently switched to systemd, including replacing my crons with timer-unit scripts (which I set up like cron.hourly.d, daily.d, etc, but didn't have but those two to worry about, so didn't set up weekly or beyond). I've not actually unmerged cron yet, but I probably will one of these days.

Anyway, I might well find myself setting up weekly/quarterly/whatever too, with your script or something like it modified for systemd-timer usage. It'd give me an excuse to practice my unit-file setup skills some more. =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Understanding btrfs and backups => automatic snapshot script
On Sun, Mar 16, 2014 at 10:42:24PM -0700, Marc MERLIN wrote:
> On Thu, Mar 06, 2014 at 09:33:24PM +, Duncan wrote:
> > However, best snapshot management practice does progressive snapshot
> > thinning, so you never have more than a few hundred snapshots to manage
> > at once. Think of it this way. If you realize you deleted something you
> > needed yesterday, you might well remember about when you deleted it and
> > can thus pick the correct snapshot to mount and copy it back from. But
> > if you don't realize you need it until a year later, say when you're
> > doing your taxes, how likely are you to remember the specific hour, or
> > even the specific day, you deleted it? A year later, getting a copy from
> > the correct week, or perhaps the correct month, will probably suffice,
> > and even if you DID still have every single hour's snapshots a year
> > later, how would you ever know which one to pick? So while a day out,
> > hourly snapshots are nice, a year out, they're just noise.
>
> I'm happy to share my script with others if that helps:
> http://marc.merlins.org/linux/scripts/btrfs-snaps

Now added to
http://marc.merlins.org/perso/btrfs/post_2014-03-21_Btrfs-Tips_-How-To-Setup-Netapp-Style-Snapshots.html
(mostly to seed google and the archives)

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Re: Understanding btrfs and backups => automatic snapshot script
On Thu, Mar 06, 2014 at 09:33:24PM +, Duncan wrote:
> However, best snapshot management practice does progressive snapshot
> thinning, so you never have more than a few hundred snapshots to manage
> at once. Think of it this way. If you realize you deleted something you
> needed yesterday, you might well remember about when you deleted it and
> can thus pick the correct snapshot to mount and copy it back from. But
> if you don't realize you need it until a year later, say when you're
> doing your taxes, how likely are you to remember the specific hour, or
> even the specific day, you deleted it? A year later, getting a copy from
> the correct week, or perhaps the correct month, will probably suffice,
> and even if you DID still have every single hour's snapshots a year
> later, how would you ever know which one to pick? So while a day out,
> hourly snapshots are nice, a year out, they're just noise.

I'm happy to share my script with others if that helps:
http://marc.merlins.org/linux/scripts/btrfs-snaps

Or for the list archives/google:

#!/bin/bash

# By Marc MERLIN
# License GPL-2 or BSD at your option.

# This lets you create sets of snapshots at any interval (I use hourly,
# daily, and weekly) and delete the older ones automatically.

# Usage:
# This is called from /etc/cron.d like so:
# 0 * * * * root btrfs-snaps hourly 3 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'
# 1 0 * * * root btrfs-snaps daily 4 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'
# 2 0 * * 0 root btrfs-snaps weekly 4 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'

: ${BTRFSROOT:=/mnt/btrfs_pool1}
DATE="$(date '+%Y%m%d_%H:%M:%S')"

type=${1:-hourly}
keep=${2:-3}

cd "$BTRFSROOT"

for i in $(btrfs subvolume list -q . | grep "parent_uuid -" | awk '{print $11}')
do
    # Skip duplicate dirs once a year on DST 1h rewind.
    test -d "$BTRFSROOT/${i}_${type}_$DATE" && continue
    echo "Making snapshot of $type"
    /sbin/btrfs subvolume snapshot "$BTRFSROOT"/$i "$BTRFSROOT/${i}_${type}_$DATE"
    count="$(ls -d ${i}_${type}_* | wc -l)"
    clip=$(( $count - $keep ))
    if [ $clip -gt 0 ]; then
        echo "Will delete the oldest $clip snapshots for $type"
        for sub in $(ls -d ${i}_${type}_* | head -n $clip)
        do
            #echo "Will delete $sub"
            /sbin/btrfs subvolume delete "$sub"
        done
    fi
done

--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
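The thinning step of the script above can be tried on its own, without touching a real filesystem. The following is only a minimal sketch: the function name `oldest_to_prune` is hypothetical and not part of Marc's script, and it assumes snapshot names sort chronologically, which the `%Y%m%d_%H:%M:%S` suffix used above should give you.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the names of the
# oldest snapshots that exceed the retention count.
# Usage: oldest_to_prune KEEP NAME...
oldest_to_prune() {
    local keep=$1
    shift
    local clip=$(( $# - keep ))
    # Nothing to prune when we hold KEEP snapshots or fewer.
    [ "$clip" -gt 0 ] || return 0
    # Date-stamped names sort oldest-first, so the first $clip are the oldest.
    printf '%s\n' "$@" | LC_ALL=C sort | head -n "$clip"
}
```

Each name it prints would then be handed to `btrfs subvolume delete`, as in the inner loop of the script.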
Re: Understanding btrfs and backups
On Mar 7, 2014, at 7:03 AM, Eric Mesa wrote:
>
> Duncan - thanks for this comprehensive explanation. For a huge portion of
> your reply...I was all wondering why you and others were saying snapshots
> aren't backups. They certainly SEEMED like backups. But now I see that the
> problem is one of precise terminology vs colloquialisms. In other words,
> snapshots are not backups in and of themselves. They are like Mac's Time
> Machine. BUT if you take these snapshots and then put them on another media
> - whether that's local or not - THEN you have backups. Am I right, or am I
> still missing something subtle?

Hmm, yes because snapshots on a mirrored drive are on another media but that's still not considered a backup. I think what makes a backup is a separate device and a separate file system. That's because the top vectors for data loss are: user induced, device failure, and file system corruption. These are substantially mitigated by having backup files located on both a separate file system and a separate device.

Also, Time Machine qualifies as a backup because it copies files to a separate device with a separate file system. (There is a feature in recent OS X versions that stores hourly incremental backups on the local drive when the usual target device isn't available - these are arguably not backups but rather snapshots that are pending backups. Once the target device is available, the snapshots are copied over to it.)

If you have data you feel is really important, my suggestion is that you have a completely different backup/restore method than what you're talking about. It needs to be bulletproof and well tested. And consider all the Btrfs send/receive work you're doing as testing/work-in-progress. There are still cases on the list where people have had problems with send/receive, and both the send and receive code have a lot of churn, so I don't know that anyone can definitively tell you that a btrfs send/receive only based backup is going to reliably restore in one month let alone three years. Should it? Yes of course. Will it?

Chris Murphy
Re: Understanding btrfs and backups
On Sun, 9 Mar 2014 03:30:44 PM Duncan wrote:
> While I realize that was in reference to the "up in flames" comment and
> presumably if there's a need to worry about that, offsite backup /is/ of
> some value, for some people, offsite backup really isn't that valuable.

Actually I missed that comment altogether, it was really just an illustration of why people should think about it - and then come to a decision about whether or not it makes sense for them.

In your case maybe not, but for me (and my wife) it certainly does.

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: Understanding btrfs and backups
Eric Mesa posted on Fri, 07 Mar 2014 14:03:44 + as excerpted: > Duncan - thanks for this comprehensive explanation. For a huge portion > of your reply...I was all wondering why you and others were saying > snapshots aren't backups. They certainly SEEMED like backups. But now I > see that the problem is one of precise terminology vs colloquialisms. In > other words, snapsshots are not backups in and of themselves. They are > like Mac's Time Machine. BUT if you take these snapshots and then put > them on another media - whether that's local or not - THEN you have > backups. Am I right, or am I still missing something subtle? You got it. =:^) Tho as I just mentioned in a reply on a different subthread, it's worth noting that btrfs send/receive is still a bit buggy at present and is giving people with corner-cases some errors. To my knowledge, if both the send and receive sides complete without error, it's a perfectly reliable backup. The problem is, they aren't always completing without errors at present, and I'd hate to have to actually need a current backup shortly after those send/receives started triggering errors, before I had a chance to put a different solution in place. So at this point I'd recommend having that other solution in place from the beginning, just in case. IOW, it's fine to play with send/receive right now, but don't depend on it with your life, or the life of your data! In a year or even six months, hopefully those bugs should be worked out and it'll be reliable as the sun rise, but I wouldn't count on that for my own data ATM, and I'd recommend you don't either. Tho as I said, to the best of my knowledge, if both sides complete without error, it's as reliable as btrfs itself is ATM. 
(Tho while kernel 3.13 did tone down the "might-eat-your-babies" warning on the kernel's btrfs config option, it's still what I'd classify as "semi- stable", so keep those backups updated and tested, and run current kernels since older kernels do still mean known bugs that are fixed in current!) > I think the most important thing you said was at the end and I'd like a > little clarification on that if it's OK with you. > > "As with local snapshots, old ones can >> be deleted on both the send and receive ends, as long as at least one >> common reference snapshot is maintained on both ends, so diffs taken >> against the send side reference can be applied to an appropriately >> identical receive side reference, thereby updating the receive side to >> match the new read-only snapshot on the send side." > > So, let's say I have everything set up. This means I created the > read-only shot on my home btrfs volume and sent it to the backup drive. > I'm making hourly snapshots and after each snapshot is made, it's sent > to the backup drive. So, obviously the backup drive needs to be at least > as big as the home drive so it can store what's on home plus the > snapshot-diffs. Now let's be extreme and say that in the course of a > year I touch and somehow change every single file on the home drive. > That means if I only had one snapshot I'd need home drive x 2 space. > (for used space, not unused space, naturally) Well, not strictly as you said. If you changed every BLOCK of every file over that year, THEN you'd need 2X the space. But if a lot of those files are say half-gig-plus ISOs and you only changed say one word of one file on each ISO, then no, it wouldn't be the whole files changed, only a single individual (btrfs size, 4 KiB AFAIK) block within the file, and 4 KiB out of half a gig is under 1/10 of 1 percent, so you wouldn't need 2X the space in a scenario like that. 
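The arithmetic behind that estimate can be checked quickly. This is just a sketch: the 512 MiB file size and 4 KiB block size are the illustrative figures from the paragraph above, not measured values, and `changed_percent` is a made-up helper name.

```shell
# One changed 4 KiB block in a 512 MiB file, as a percentage of the file.
changed_percent() {
    # $1 = changed bytes, $2 = file size in bytes
    awk -v c="$1" -v t="$2" 'BEGIN { printf "%.5f\n", 100 * c / t }'
}

changed_percent $((4 * 1024)) $((512 * 1024 * 1024))
```

That prints 0.00076, i.e. far below the "1/10 of 1 percent" bound quoted above.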
> So I might want my backups to have last year's data, but wouldn't want
> to need to upgrade the size of my actual home drive. So I would want to
> maintain less snapshots on my home drive than my backup drive. (It's
> possible I'm missing something here...something subtle that makes this
> not necessary) So do I only need to make sure I have the latest snapshot
> or maybe latest plus n-1 on the home drive while the backup drive can
> have all snapshots since the beginning? I THINK that can be the case
> based on reading your sentence, but I just want to make sure.

In general, yes. Tho if you're doing hourly snapshots I'd probably keep say a day's worth locally, plus one a day for a week, and 1 weekly snapshot before that, just to cover the case of my needing to recover a backup and finding that the remote backup just keeled over 12 hours ago. Unless you're writing/erasing heavily, snapshots take up very nearly zero space, so keeping a few extra around isn't going to hurt a whole lot.

Meanwhile, however, I'd suggest a reasonable thinning down script on the remote backup as well, because at least at present, there are overhead issues once you get over several hundred snapshots. But realistically, if you decide you need a file 11 months old, are you really going to care or even kno
Re: Understanding btrfs and backups
Wolfgang Mader posted on Fri, 07 Mar 2014 11:13:51 +0100 as excerpted: > Duncan, thank you for this comprehensive post. Really helpful as always! > > [...] > >> As for restoring, since a snapshot is a copy of the filesystem as it >> existed at that point, and the method btrfs exposes for accessing them >> is to mount that specific snapshot, to restore an individual file from >> a snapshot, you simply mount the snapshot you want somewhere and copy >> the file as it existed in that snapshot over top of your current >> version > > Please, how do I list mounted snapshots only? > > [...] I personally don't use snapshots a whole lot (tho I like the concept) as they don't really fit my use-case. So in general I won't try to answer usage-detail questions such as that. That said, see the "Managing snapshots" section on the sysadmin guide page on the wiki, for some general snapshot management hints. https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Managing_snapshots The main point from there is to leave the top level of the filesystem empty but for the subvolumes/snapshots (see the tree diagrams) and to set a default subvolume that will be your normal subvolume-mount if you don't specify one. Then you can mount the root subvolume (subvolid=0, see the fstab line for /media/btrfs) when you want to manage snapshots. But the example there is full snapshot rollback. To restore an individual file instead of that, you'd just mount the root subvolume and the snapshots would all appear as subdirs, such that you could browse them as you would a normal filesystem, diving into the snapshot and its subdirs until you find the file you want to restore, and then copying it over to the working copy/snapshot. 
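A sketch of that single-file restore, under stated assumptions: the device, mount point, snapshot name and file paths are whatever the caller passes in (the ones in the usage line are placeholders, not anything from this thread), and the commands are only printed unless `DRY_RUN=0` is set. `subvolid=0` is the value used in this thread; newer documentation usually writes the top-level subvolume as `subvolid=5`.

```shell
#!/bin/bash
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# restore_file DEVICE MOUNTPOINT SNAPSHOT RELATIVE_PATH DESTINATION
# (hypothetical helper; all values come from the caller, nothing is probed)
restore_file() {
    local dev=$1 mnt=$2 snap=$3 rel=$4 dest=$5
    # Mount the top level so every snapshot shows up as a subdirectory.
    run mount -o subvolid=0 "$dev" "$mnt"
    # Copy the old version of the file back over the working copy.
    run cp -a "$mnt/$snap/$rel" "$dest"
    run umount "$mnt"
}
```

For example, `restore_file /dev/sdX1 /mnt/btrfs_root home_daily_20140306 eric/taxes.ods /home/eric/taxes.ods` prints the three commands it would run, which is a cheap way to check the paths before doing it for real.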
That doesn't directly answer how to list mounted snapshots only, but given the above tree layout, I don't really see that you'd /need/ to list mounted snapshots only, since presumably you'd have only the default mounted, plus the root subvolume, where you could browse into all the snapshots just as if they were normal directories. Also see the subvolumes and snapshots section of the FAQ: https://btrfs.wiki.kernel.org/index.php/FAQ#Subvolumes >> Since a snapshot is an image of the filesystem as it was at that >> particular point in time, and btrfs by nature copies blocks elsewhere >> when they are modified, all (well, not "all" as there's metadata like >> file owner, permissions and group, too, but that's handled the same >> way) the snapshot does is map what blocks composed each file at the >> time the snapshot was taken. > > Is it correct, that e.g. ownership is recorded separately from the data > itself, so if I would change the owner of all my files, the respective > snapshot would only store the old owner information? Yes. If you change the owner of the files in your "current" subvolume, the previous snapshots will retain their old ownership. Owner/ permissions/etc are metadata, stored separately from the actual data, with both data and metadata being snapshotted. [ on btrfs send/receive ] > > Is the receiving side a complete file system in its own right? Normally, yes. However, send normally serializes its output to STDOUT and that output can be sent to a specific file on some other filesystem (like ext4), or to tape or whatever, instead. In this case you can read back from that file using cat (or netcat if it's over the network, or whatever), directing its output to btrfs receive, to turn that data back into a filesystem. Used like this, you can think of the original send as a full backup (to tape or whatever), and child sends as incremental backups. 
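Storing the send stream as a file might look roughly like the sketch below. It uses the `-f` option of `btrfs send` and `btrfs receive` in place of shell redirection and `cat`, and `-p` to name the parent snapshot for an incremental stream. The helper names and every path are hypothetical, the snapshots passed in are assumed to be read-only already, and nothing is executed unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# full_to_file SNAPSHOT OUTFILE: serialize a whole read-only snapshot.
full_to_file() {
    run btrfs send -f "$2" "$1"
}

# incr_to_file PARENT SNAPSHOT OUTFILE: only the differences against PARENT.
incr_to_file() {
    run btrfs send -p "$1" -f "$3" "$2"
}

# restore_from_file INFILE DEST: replay a stored stream into a btrfs
# filesystem mounted at DEST.
restore_from_file() {
    run btrfs receive -f "$1" "$2"
}
```

When restoring, the full stream has to be replayed first and the incremental streams after it, in the order they were taken, since each one is a diff against its parent.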
Obviously, if stored in this form, in order to restore the incrementals you'd need the full backup they were based upon, just as you would if doing the same thing using conventional backup to tape or whatever.

> If so, I only need to maintain one common reference in order to apply
> the received snapshot, right. If I would in any way get the send and
> receive side out of sync, such that they do not share a common
> reference any more, only the send/receive would fail, but I still would
> have the complete filesystem on the receiving side, and could copy it
> all over (cp, rsync) to the send side in case of a disaster on the
> send side. Is this correct?

In the normal case (not stored as a file or serialized data stream as described above), yes.

Meanwhile, given that we're talking of btrfs send/receive in the context of backups, it's worth explicitly making note of the current on-list reports and bugfixes in the area of send/receive. In general, we're talking about an in-principle feature that should eventually be reliable enough to use as backup in the way discussed. However, at present, if it's data you'd really miss were it to disappear, please back it up using another method (say rsync or conventional backups) as well. To my knowledge, if the send and receive both occur without error, it should be a
Re: Understanding btrfs and backups
Chris Samuel posted on Sun, 09 Mar 2014 15:13:42 +1100 as excerpted:

> On Fri, 7 Mar 2014 04:14:16 PM Sander wrote:
>
>> But if the filesystem or underlaying disk goes up in flames, the
>> snapshots are toast as well. So you need additional backups, preferably
>> not on the same hardware, for real protection against data loss.
>
> ...and don't forget to think about off-site backups too.
>
> http://www.flickr.com/photos/94482242@N00/7746409996/

While I realize that was in reference to the "up in flames" comment and presumably if there's a need to worry about that, offsite backup /is/ of some value, for some people, offsite backup really isn't that valuable.

I figure if something like that happens here, I'll have FAR more pressing things to worry about for awhile than restoring my computer. And by the time life does get somewhat back to normal and I can think about the data that was on the computer, I might as well do over from scratch, like I will have done with much of the rest of my life by that point. The real valuable data is backed up where it counts -- to my head -- and if I lose that, well, I won't be very worried about it any more, will I?

Of course if I were a bush doctor like the guy who owned the computer in that photo apparently was, then there'd be other people's medical records and the like to worry about too, and having offsite backups of that /would/ be important! And of course the same would apply if I had a bunch of family pictures on the computer to worry about, but for that I'd need a family first...

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: Understanding btrfs and backups
On Fri, 7 Mar 2014 04:14:16 PM Sander wrote:
> But if the filesystem or underlaying disk goes up in flames, the
> snapshots are toast as well. So you need additional backups,
> preferably not on the same hardware, for real protection against
> data loss.

...and don't forget to think about off-site backups too.

http://www.flickr.com/photos/94482242@N00/7746409996/

cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: Understanding btrfs and backups
Eric Mesa wrote (ao):
> Duncan - thanks for this comprehensive explanation. For a huge portion of
> your reply...I was all wondering why you and others were saying snapshots
> aren't backups. They certainly SEEMED like backups. But now I see that the
> problem is one of precise terminology vs colloquialisms. In other words,
> snapshots are not backups in and of themselves. They are like Mac's Time
> Machine. BUT if you take these snapshots and then put them on another media
> - whether that's local or not - THEN you have backups. Am I right, or am I
> still missing something subtle?

Snapshots are backups, but only protect you against a limited number of disasters.

Snapshots are very convenient to quickly go back in time for some or all files and directories. But if the filesystem or underlying disk goes up in flames, the snapshots are toast as well. So you need additional backups, preferably not on the same hardware, for real protection against data loss.

The convenience of snapshots is that you can (almost) make them as often as you want, fully automated, with (almost) no impact on performance, without the need for extra hardware, and a restore is no more than a simple copy.

Sander
Re: Understanding btrfs and backups
Duncan <1i5t5.duncan cox.net> writes: > *But*, btrfs snapshots by themselves remain on the existing btrfs > filesystem, and thus are subject to many of the same risks as the > filesystem itself. As you mentioned raid is redundancy not backup, > snapshots aren't backup either; snapshots are multiple logical copies > thus protecting you from accidental deletion or bad editing, but pointed > at the same data blocks without redundancy, and if those data blocks or > the entire physical media go bad... > > Which is where real backups, separate copies on separate physical media, > come in, which is where btrfs send/receive, as the ars-technica article > was describing, comes in. > > The idea is to make a read-only snapshot on the local filesystem, read- > only so it can't change while it's being sent, and then use btrfs send to > send that snapshot to be stored on some other media, which can optionally > be over the network to a machine and media at a different site, altho it > can be to a different device on the same machine, as well. > > The first time you do this, there's no existing copy at the other end, so > btrfs send sends a full copy and btrfs receive writes it out. After > that, the receive side has a snapshot identical to the one created on the > send side and further btrfs send/receives to the same set simply > duplicate the differences between the reference and the new snapshot from > the send end to the receive end. As with local snapshots, old ones can > be deleted on both the send and receive ends, as long as at least one > common reference snapshot is maintained on both ends, so diffs taken > against the send side reference can be applied to an appropriately > identical receive side reference, thereby updating the receive side to > match the new read-only snapshot on the send side. > > Hopefully that's clearer now. =:^) > Duncan - thanks for this comprehensive explanation. 
For a huge portion of your reply...I was all wondering why you and others were saying snapshots aren't backups. They certainly SEEMED like backups. But now I see that the problem is one of precise terminology vs colloquialisms. In other words, snapsshots are not backups in and of themselves. They are like Mac's Time Machine. BUT if you take these snapshots and then put them on another media - whether that's local or not - THEN you have backups. Am I right, or am I still missing something subtle? I think the most important thing you said was at the end and I'd like a little clarification on that if it's OK with you. "As with local snapshots, old ones can > be deleted on both the send and receive ends, as long as at least one > common reference snapshot is maintained on both ends, so diffs taken > against the send side reference can be applied to an appropriately > identical receive side reference, thereby updating the receive side to > match the new read-only snapshot on the send side." So, let's say I have everything set up. This means I created the read-only shot on my home btrfs volume and sent it to the backup drive. I'm making hourly snapshots and after each snapshot is made, it's sent to the backup drive. So, obviously the backup drive needs to be at least as big as the home drive so it can store what's on home plus the snapshot-diffs. Now let's be extreme and say that in the course of a year I touch and somehow change every single file on the home drive. That means if I only had one snapshot I'd need home drive x 2 space. (for used space, not unused space, naturally) So I might want my backups to have last's year's data, but wouldn't want to need to upgrade the size of my actual home drive. So I would want to maintain less snapshots on my home drive than my backup drive. 
(It's possible I'm missing something here...something subtle that makes this not necessary) So do I only need to make sure I have the latest snapshot or maybe latest plus n-1 on the home drive while the backup drive can have all snapshots since the beginning? I THINK that can be the case based on reading your sentence, but I just want to make sure. In case you were wondering, this is based on what's happened to me with Back in Time. I had to reduce the number of backups I was keeping because my home drive wasn't at 100%, but the backupdrive was at 100% because I'd added and deleted some VMs and other large files (video files I think). And Back in Time intelligently does not remove the oldest backup off the top until it knows it has made a new backup - which it couldn't do because it was at 100%. So I had to delete the top 1 or 2 backups and then tell it to keep less backups. Your description of snapshots makes it seems much less likely that this would be an issue. Although Back in Time is an incremental backup, its takes up more space. If I may venture to see if I've learned something from your response, is it because when I change a file Back in Time stores the entire changed file while btrfs only stores the bi
Re: Understanding btrfs and backups
Duncan, thank you for this comprehensive post. Really helpful as always! [...] > As for restoring, since a snapshot is a copy of the filesystem as it > existed at that point, and the method btrfs exposes for accessing them is > to mount that specific snapshot, to restore an individual file from a > snapshot, you simply mount the snapshot you want somewhere and copy the > file as it existed in that snapshot over top of your current version > (which will have presumably already been mounted elsewhere, before you > mounted the snapshot to retrieve the file from), then unmount the > snapshot and go about your day. =:^) Please, how do I list mounted snapshots only? [...] > > Since a snapshot is an image of the filesystem as it was at that > particular point in time, and btrfs by nature copies blocks elsewhere > when they are modified, all (well, not "all" as there's metadata like > file owner, permissions and group, too, but that's handled the same way) > the snapshot does is map what blocks composed each file at the time the > snapshot was taken. Is it correct, that e.g. ownership is recorded separately from the data itself, so if I would change the owner of all my files, the respective snapshot would only store the old owner information? [...] > > The first time you do this, there's no existing copy at the other end, so > btrfs send sends a full copy and btrfs receive writes it out. After > that, the receive side has a snapshot identical to the one created on the > send side and further btrfs send/receives to the same set simply > duplicate the differences between the reference and the new snapshot from > the send end to the receive end. 
> As with local snapshots, old ones can
> be deleted on both the send and receive ends, as long as at least one
> common reference snapshot is maintained on both ends, so diffs taken
> against the send side reference can be applied to an appropriately
> identical receive side reference, thereby updating the receive side to
> match the new read-only snapshot on the send side.

Is the receiving side a complete file system in its own right? If so, I only need to maintain one common reference in order to apply the received snapshot, right? If I would in any way get the send and receive side out of sync, such that they do not share a common reference any more, only the send/receive would fail, but I still would have the complete filesystem on the receiving side, and could copy it all over (cp, rsync) to the send side in case of a disaster on the send side. Is this correct?

Thank you!
Best,
Wolfgang

--
Wolfgang Mader
wolfgang.ma...@fdm.uni-freiburg.de
Telefon: +49 (761) 203-7710
Institute of Physics
Hermann-Herder Str. 3, 79104 Freiburg, Germany
Office: 207
Re: Understanding btrfs and backups
Eric Mesa posted on Thu, 06 Mar 2014 18:18:15 + as excerpted:

> apologies if this is a resend - it appeared to me that it was rejected
> because of something in how Gmail was formatting the message. I can't
> find it in the Gmane archives which leads me to believe it was never
> delivered.

Probably HTML-formatted. AFAIK vger.kernel.org (the list-serv for many kernel lists) is set to reject that. Too bad more list-servs don't do likewise. =:^(

> I was hoping to gain some clarification on btrfs snapshots and how they
> function as backups.

Looking at the below it does indeed appear you are confused, but this is the place to post the questions necessary to get unconfused. =:^)

> I did a bit of Googling and found lots of examples of bash commands, but
> no one seemed to explain what was going on to a level that would satisfy
> me for my data needs.

You don't mention whether you've seen/read the btrfs wiki or not. That's the most direct and authoritative place to look... and to bookmark. =:^)

https://btrfs.wiki.kernel.org

> I read this Ars Technica article today
> http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/
>
> First of all, the btrfs-raid1 sounds awesome. Because it helps protect
> against one of RAID1's failings - bit rot issues. But raid1 is not
> backup, it's just redundancy.
>
> Second, the article mentions using snapshots as a backup method.

Well, this is where you start to be confused. Snapshots are not backups either, altho they're sort of opposite raid in that while raid is redundancy-only, snapshots are rollback-only, without the redundancy (I'll explain...).

> Page 3 section: Using the features.
>
> He makes a snapshot and sends that. Then he sends what changed the
> second time. He mentions that because btrfs knows what's changed it's a
> quick process.

OK, what that is discussing is btrfs send/receive, with snapshots simply part of the process of doing that.
Think rsync in effect, but btrfs-specific and much more efficient. Btrfs send/receive does use snapshots, but only as part of making the send/receive process more reliable and efficient. I'll discuss snapshots (and COW) first, below, then bring in btrfs send/receive at the end.

> Right now on my Linux computer I use Back in Time which, I think, is
> just an rsync frontend. It takes a long time to complete the backup for
> my 1 TB /home drive. The copy part is nice and quick, but the comparison
> part takes a long time and hammers the CPU. I have it set up to run at
> night because if it runs while I'm using the computer, things can crawl.
>
> So I was wondering if btrfs snapshots are a substitute for this. Right
> now if I realize I deleted a file 5 days ago, I can go into Back in Time
> (the gui) or just navigate to it on the backup drive and restore that
> one file.
> From what I've read about btrfs, I'd have to restore the entire home
> drive, right? Which means I'd lose all the changes from the past five
> days. If that's the case, it wouldn't really solve my problem -
> although maybe I'm just not thinking creatively.

No, in snapshot terms you don't restore the entire drive. Rather, the snapshots are taken on the local filesystem, storing (like one still frame in a series that makes a movie, thus the term snapshot) the state of the filesystem at the point the snapshot was taken. Files can be created/deleted/moved/altered after the snapshot, and only the differences between snapshots, and between the last snapshot and the current state, need to be stored.

The fact that btrfs is a copy-on-write (COW) filesystem makes snapshotting very easy... trivial... since it's a byproduct of the COW nature of the filesystem and thus comes very nearly for free. All that's needed in order to get snapshotting is some way to hook into specific bits of functionality that are already there.
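Why a snapshot is nearly free on a COW filesystem can be sketched with a toy Python model. Everything here is an assumption made for illustration (the `CowFile` class, the one-letter "blocks", the address counter); it says nothing about how btrfs lays out data on disk. The only point is that a snapshot copies the block map, not the blocks, and a later write goes to a new address instead of overwriting the old one.

```python
class CowFile:
    """Toy copy-on-write file: a block map pointing into shared storage."""

    def __init__(self, data):
        self.storage = {}    # address -> block contents (never overwritten)
        self.next_addr = 1
        self.block_map = [self._alloc(block) for block in data]

    def _alloc(self, block):
        # Every write lands at a fresh address.
        addr = self.next_addr
        self.next_addr += 1
        self.storage[addr] = block
        return addr

    def snapshot(self):
        # A snapshot is just a copy of the map; no data blocks are copied.
        return list(self.block_map)

    def write(self, index, block):
        # Copy-on-write: repoint the map, leave the old block in place.
        self.block_map[index] = self._alloc(block)

    def read(self, block_map=None):
        block_map = self.block_map if block_map is None else block_map
        return "".join(self.storage[addr] for addr in block_map)


f = CowFile("abcdefgxijklmnop")   # sample data, one letter per block
snap = f.snapshot()
f.write(7, "h")                   # fix the block that held "x"

shared = sum(1 for a, b in zip(snap, f.block_map) if a == b)
print(f.read())       # abcdefghijklmnop
print(f.read(snap))   # abcdefgxijklmnop
print(shared)         # 15
```

The snapshot still reads the old contents because the old block was never touched, and 15 of the 16 blocks are shared between the snapshot and the current file. Only the one changed block costs extra space.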
A copy-on-write illustration (please view with a monospace font for proper alignment):

Suppose each letter of the following string represents a block of a particular size (say 4KiB) of a file, with the corresponding block addresses noted as well:

    0000000001111111
    1234567890123456
    abcdefgxijklmnop

It's the first bit of the alphabet, but notice the x where h belongs. Now someone notices and edits the file, correcting the problem:

    abcdefghijklmnop

Except when they save the file, a COW-based filesystem will make the change like this:

    0000000501111111
    1234567390123456
          |||
    abcdefg ijklmnop
           |
           h

The unchanged blocks of the file all remain in place. The only change is to the one block, which unlike normal filesystems, isn't edited in-place, but rather, is written into a new location, and the filesystem simply notes that the new location (53) should be used to read that file block now, instead of the old location (08).

Of course as illustrated here, the addresses each take up two characters while the data
Re: Understanding btrfs and backups
On 2014/03/06 09:27 PM, Eric Mesa wrote:
> Brian Wong wrote:
>> a snapshot is different than a backup
> [snip]
> ...
> Three hard drives: A, B, and C.
>
> Hard drives A and B - btrfs RAID-1 so that if one drive dies I can keep
> using my system until the replacement for the raid arrives.
>
> Hard drive C - gets (hourly/daily/weekly/or some combination of the
> above) snapshots from the RAID. (Starting with the initial state
> snapshot) Each timepoint another snapshot is copied to hard drive C.
> [snip]...
> So if that's what I'm doing, do snapshots become a way to do backups?

An important distinction for anyone joining the conversation is that snapshots are *not* backups, in a similar way that you mentioned that RAID is not a backup. If a hard drive implodes, its snapshots go with it.

Snapshots can (and should) be used as part of a backup methodology - and your example is almost exactly the same as previous good backup examples. I think most of the time there's mention of an external "backup server" keeping the backups, which is the only major difference compared to the process you're looking at.

Btrfs send/receive with snapshots can make the process far more efficient compared to rsync. Rsync doesn't have any record as to what information has changed so it has to compare all the data (causing heavy I/O). Btrfs keeps a record and can skip to the part of sending the data.

I do something similar to what you have described on my Archlinux desktop - however I haven't updated my (very old) backup script to take advantage of btrfs' send/receive functionality. I'm still using rsync. :-/

/ and /home are on btrfs-raid1 on two smallish disks
/mnt/btrfs-backup is on btrfs single/dup on a single larger disk

See https://btrfs.wiki.kernel.org/index.php/Incremental_Backup for a basic incremental methodology using btrfs send/receive

--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
Re: Understanding btrfs and backups
Brian Wong wrote:
> a snapshot is different than a backup, with a snapshot you're still
> accessing a read-only version of the live filesystem. i don't know the
> specifics of btrfs but if you take daily snapshots, you should be able
> to restore a single file from the five-days-ago snapshot by browsing
> that snapshot's directory tree and then copying the file to the live
> version of the filesystem, if that makes sense.
>
> in the snapshot case the live filesystem serves the same function as
> the full backup would if you did full backups then incrementals. the
> snapshots are the incrementals of the live filesystem, only going
> backwards in time whereas with backup you would take a full backup then
> go forward in time with incrementals. the filesystem takes care of
> making sure every snapshot is complete.
>
> in the snapshot case redundancy is then more important because you may
> not have a bunch of full backups (i.e. full copies) lying around. so
> full backups still are useful.

OK, I THINK I understand things a bit better. So from the point of view of restoring a single file, that functionality is there. Excellent.

And I guess you're saying that because the snapshots are diffs off the live system, I'd need a backup of the live system - i.e. snapshots wouldn't be enough.

But what if my first snapshot was a clone of the system at that point (as it seems from the article), and I back that up to a separate drive? Let me illustrate with what I plan to do exactly.

Three hard drives: A, B, and C.

Hard drives A and B - btrfs RAID-1 so that if one drive dies I can keep using my system until the replacement for the raid arrives.

Hard drive C - gets (hourly/daily/weekly/or some combination of the above) snapshots from the RAID. (Starting with the initial state snapshot) Each timepoint another snapshot is copied to hard drive C.

So in the case of a file disappearing on me or being over-written or whatever - I reach into the directory of the snapshot that contains the file just as I would now with the backup.

So if that's what I'm doing, do snapshots become a way to do backups?

Thanks
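The "reach into the snapshot directory" step is an ordinary file copy, since a snapshot shows up as a normal directory tree. A minimal Python sketch, assuming hypothetical paths: the `restore_file` helper and the directory names are invented for illustration, and the demo uses a temporary directory to stand in for a real snapshot and a live /home.

```python
import shutil
import tempfile
from pathlib import Path


def restore_file(snapshot_root, live_root, relative_path):
    """Copy one file out of a snapshot tree back into the live tree."""
    src = Path(snapshot_root) / relative_path
    dst = Path(live_root) / relative_path
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)   # copy2 also preserves timestamps and mode
    return dst


# Demo with stand-in directories (not a real btrfs snapshot).
with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp) / "snapshots" / "home-5-days-ago"
    live = Path(tmp) / "home"
    (snap / "docs").mkdir(parents=True)
    live.mkdir()
    (snap / "docs" / "taxes.txt").write_text("2013 figures")

    restored = restore_file(snap, live, "docs/taxes.txt")
    restored_text = restored.read_text()

print(restored_text)   # 2013 figures
```

Only the one file is copied back; everything else in the live tree, including the last five days of changes, is left alone.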