Hi Craig,

If you'd like a better understanding of deduplication, I very much recommend:
http://www.tarsnap.com/deduplication-explanation.html
(new page from 6 weeks ago, so regular tarsnap users probably haven't seen it)

I made up an example of "wordification" (where we make multiple backups of a
short phrase that changes slightly) as an analogy to backing up a hard drive.
The whole point of that page is to clarify questions like yours, so please let
me know if anything isn't clear on that page!  :)

One specific thing to correct:

> Now, even though (technically speaking) FILE.EXT was not in the newest
> archive because it has not changed since the initial back-up,

FILE.EXT is definitely part of that archive!  You can test this for yourself
by listing all of its filenames with:

    tarsnap -t -f NEWEST-ARCHIVE

I very much agree with Jamie's email where he says "it's far more simple than
you think".  Let's pretend that you make a daily backup of one directory.
Then each backup is "what did that directory contain on that day".

- Want to see it from 3 days ago?  Restore that archive.
- Want to see it from yesterday?  Restore that archive.
- Do you care about the contents from 2 days ago?  No?  Ok, delete that
  archive.  Your ability to see the version from 1 or 3 days ago is
  completely unaffected by your deleting the 2-day-old version.

Cheers,
- Graham

On Sat, Nov 24, 2018 at 03:57:03AM -0800, Craig Hartnett wrote:
> Hi Niels,
>
> Thanks for your reply. Yes, one of my questions -- about intentionally
> deleting a file and then wanting it gone from all back-ups too -- was
> hypothetical, but the knowledge of how to accomplish that (if necessary)
> is (of course) useful, both to more fully understand Tarsnap and (in
> case the need ever arises) to actually do it.
>
> And since this is just a laptop, RAID is not a realistic option. An
> interesting one, yes, but not practical.
>
> Further to my original email, in another thread I tried to restore a
> file that has not changed since I did my initial back-up, but I
> specified the most recent archive:
>
>     tarsnap -x -f NEWEST-ARCHIVE media/USER/PATH/DIRECTORY/FILE.EXT
>
> Now, even though (technically speaking) FILE.EXT was not in the newest
> archive because it has not changed since the initial back-up, the
> restore command still worked. I assume Tarsnap is just smart enough to
> know that I'm stupid and specified the "wrong" archive, and got the file
> for me from wherever it was residing. But I assume, going back to one of
> my original questions, that if I had deleted the initial archive, that
> file would not have been there for Tarsnap to find until after my next
> scheduled back-up.
>
>
> Craig
>
>
>
> On Sat, 2018-11-24 at 08:05 +0100, Niels Kobschaetzki wrote:
> > That is one of the ideas of backups: protect you from accidentally deleting
> > files (they protect you also from hardware failure but redundancy and RAIDs
> > are a better choice here because of the possibility to continue using the device
> > during the “outage”)
> > Thus if you truly want to have a file deleted you need to delete it also
> > from the backups. Most backup systems in my experience only know the
> > concept of volumes which need to be deleted then. Thus a file is only gone
> > when all volumes are gone that contain the file. Thus in that case you have
> > to wait until the file is rotated away or destroy all the volumes.
> > Rotation has the added benefit of saving on space which means in the end
> > saving money (with any backup system because with other systems you will
> > need more drives, tapes whatever with time).
> >
> > Niels
> >
> >
> > > On 24. Nov 2018, at 04:04, Craig Hartnett <cr...@1811.spamslip.com> wrote:
> > >
> > > Hi again,
> > >
> > > OK, so I did read that I'm supposed to forget everything I know about
> > > back-ups, but frankly that wasn't much.
> > > :) Not that I know nothing, but
> > > it hasn't been something I've spent a *lot* of time thinking about.
> > >
> > > But as I think about Tarsnap, deleted files, rotating/deleting archives,
> > > daily storage charges (increasing, of course, as the amount of data
> > > stored slowly increases), etc., I start wondering about what happens to
> > > files I intentionally delete from my hard drive. If I understand Tarsnap
> > > correctly, a file that I backed up in my initial back-up and that hasn't
> > > since changed only exists in that initial back-up archive because (a) it
> > > hasn't changed so there has been no need to re-upload any part of it and
> > > (b) archives are immutable. If I delete that initial archive I assume (I
> > > could be wrong, so this is part of my series of questions) that Tarsnap
> > > will realise that and back up those files again. Am I right?
> > >
> > > So if I delete my initial archive today, Tarsnap will realise that it
> > > has to upload pretty much everything -- not everything, but almost
> > > everything -- again, right?
> > >
> > > And what if I delete a file -- any file -- on my hard drive that has
> > > been backed up in the past? Of course Tarsnap won't upload a null file,
> > > but does that file continue to exist in the archives unless or until I
> > > delete the last archive that contains it? In other words, it's *my*
> > > responsibility to curate my archives, right? (I'm quite happy to curate
> > > my own stuff. Just want to make sure.)
> > >
> > > And what if I want to delete a file from my hard drive *and* my
> > > back-ups? Since the archives are immutable, and this file was in my
> > > initial back-up, am I right that there is no way to delete that single
> > > file from the back-up archives without deleting the whole archive, and
> > > consequently re-uploading most of the original archive again?
> > >
> > > Which leads me to the conclusion that I should pick a time frame -- say,
> > > 90 days -- or come up with some traditional, staggered rotation system,
> > > and start deleting archives older than that *except* the initial
> > > archive, right?
> > >
> > > Or am I completely out to lunch here? :)
> > >
> > > Thanks for any light you can shed on this, via links to documentation
> > > that covers it of course if I have missed it.
> > >
> > >
> > > Craig
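
For anyone who prefers code to analogies, the point made at the top of this thread (every archive lists all of its files, and deleting one archive never harms another) can be modelled with a toy Python sketch. This is not how Tarsnap is implemented: the class ToyStore and its methods are hypothetical names, and it treats each whole file as a single block keyed by its SHA-256 hash, an assumption made purely to keep the example short.

```python
import hashlib


class ToyStore:
    """Toy deduplicating store (hypothetical, NOT Tarsnap's real design).

    Each archive is a full listing of file names; the data itself is
    stored once per unique content, however many archives refer to it.
    """

    def __init__(self):
        self.chunks = {}    # content hash -> data, stored only once
        self.archives = {}  # archive name -> {file name: content hash}

    def create(self, archive, files):
        """Store an archive; return the number of bytes newly uploaded."""
        uploaded = 0
        listing = {}
        for name, data in files.items():
            key = hashlib.sha256(data).hexdigest()
            if key not in self.chunks:
                # Only content the store has never seen costs an upload.
                self.chunks[key] = data
                uploaded += len(data)
            # Unchanged files are still listed in the new archive.
            listing[name] = key
        self.archives[archive] = listing
        return uploaded

    def list_files(self, archive):
        """Return the sorted file names contained in one archive."""
        return sorted(self.archives[archive])

    def extract(self, archive, name):
        """Return the contents of one file from one archive."""
        return self.chunks[self.archives[archive][name]]

    def delete(self, archive):
        """Delete an archive, then drop data no remaining archive uses."""
        del self.archives[archive]
        still_used = {
            key
            for listing in self.archives.values()
            for key in listing.values()
        }
        for key in list(self.chunks):
            if key not in still_used:
                del self.chunks[key]


store = ToyStore()
day1 = {"FILE.EXT": b"unchanged contents", "notes.txt": b"first draft"}
day2 = {"FILE.EXT": b"unchanged contents", "notes.txt": b"second draft"}

first_upload = store.create("initial", day1)
second_upload = store.create("newest", day2)  # only notes.txt is new data

store.delete("initial")
restored = store.extract("newest", "FILE.EXT")  # still works
```

In this model the second archive only uploads the changed notes.txt, yet it still lists FILE.EXT, and FILE.EXT can still be restored from it after the initial archive has been deleted. Only the data that no remaining archive refers to (the old notes.txt) is discarded.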