Hi Graham,

That definitely sounds useful. Is the proposed probability intended to be a user-configurable value? I'm thinking of keeping it low for normal backups, as you suggested, then ramping it up to a higher value once every x backups. It's a similar idea to running deltas most of the time, then a full backup once per period.

Cheers,
Scott

On 25 June 2019 7:09:54 am AEST, Graham Percival <gperc...@tarsnap.com> wrote:
>Hi Scott,
>
>Thanks for clarifying the use case! Colin had the idea of using the Tarsnap
>cache to detect disk errors. Namely, if the filesystem reports that a file
>hasn't changed, there would be a random chance that the tarsnap client would
>read the file anyway, and compare the chunk hashes against the expected
>values (from previous backups). If you wanted to be paranoid, you could
>specify a probability of 100%, but more likely you'd pick a value like 10%
>so that it didn't impact performance too much.
>
>This wouldn't warn you about a disk failure which changed the file
>modification time or size, but it would be perfect for a disk which flipped
>a few bits in a file.
>https://github.com/Tarsnap/tarsnap/issues/19
>
>The good news is that I have a proof-of-concept implementation of this.
>I ended up putting it on the back-burner, but I've been looking at the code
>this morning and I still think it's plausible. Does this sound useful?
>
>Cheers,
>- Graham
>
>On Sun, Jun 23, 2019 at 06:16:52PM +1000, Scott Dickinson wrote:
>> Thanks Colin & Jacob.
>>
>> With several hundred GB of data being archived, the local tarball
>> option is probably not going to work for me.
>>
>> Does "tarsnap -t -f" show the file modification date based on what the
>> filesystem is reporting, or on when tarsnap detects a change?
>>
>> To provide more details, I had a number of sectors on an SSD silently
>> fail, so I needed to identify and restore the files that were corrupted
>> by this event. The filesystem did not report any change in modification
>> date on these files, so I couldn't rely on that to identify which files
>> to restore. Hence my question around reporting on the files impacted by
>> block changes between archives, to both identify an unexpected change
>> and recover from it.
>> If tarsnap can't do this, perhaps I need to start capturing a hash of
>> each file at the time of backup, and compare those between archives.
>>
>> Cheers,
>> Scott
>>
>> On 19/6/19 7:15 am, Jacob Larsen wrote:
>>
>> I had the same issue a while back. I was told it was not easily
>> fixed due to the layers in Tarsnap. I ended up making a regular
>> tarball and feeding that to tarsnap. That way I had a local tarball
>> that matched the actual data in the archive. Then I could extract it
>> and compare at the next backup. It is a somewhat data-heavy process,
>> but it gave me what I needed. It is scriptable, so it is possible to
>> let your backup script log the changed files on each backup run, but
>> it has a pretty high cost in disk I/O, plus you need to keep a copy of
>> your data around between backups.
>>
>> /Jacob
>>
>> On 18/06/2019 13.49, Scott Dickinson wrote:
>>
>> Hi,
>>
>> I'm trying to work out how to generate a report on files that are
>> new or changed in a particular archive. I can't seem to find an easy
>> way to do this, so I'm hoping someone can help.
>>
>> Here is the scenario I'm working through.
>>
>> 1. Back up directory "x" on 1st May 2019. First-time archive, all
>>    10 GB are sent as expected.
>> 2. Back up directory "x" on 1st June 2019. Second archive, 25 MB
>>    are sent.
>>
>> How can I report on which files that 25 MB of deltas belongs to? In
>> this scenario, I wasn't expecting any changes to the files over the
>> month, so I'm surprised there was anything beyond the metadata to
>> back up. My understanding is that Tarsnap needs to know which files
>> the changed blocks belong to, so in theory this metadata should be
>> extractable.
>>
>> The closest I've found is "tarsnap -t -f 'x' -v --iso-dates", but
>> this doesn't natively provide the details I'm after. Ideally I'd
>> like tarsnap to be able to report which files were uploaded at the
>> time of archiving, with an option similar to --print-stats.
>>
>> Anyone got any ideas?
>>
>> Cheers,
>> Scott
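
For anyone who wants to try the workaround discussed in this thread before anything lands in tarsnap itself, here is a minimal Python sketch. It is not tarsnap's implementation and does not use the Tarsnap cache or chunk hashes; it assumes a whole-file SHA-256 manifest kept by your own backup script. It combines the per-file hash idea with the random re-read idea: files whose size and modification time are unchanged are re-hashed with a given probability, and a hash mismatch on such a file is reported as suspected silent corruption. The names `file_digest`, `verify_probability`, and `scan`, and the "full check every 10th run" default, are all hypothetical.

```python
import hashlib
import os
import random


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def verify_probability(run_number, normal=0.10, full_every=10):
    """Low probability on most runs, 100% on every `full_every`-th run.

    The defaults are illustrative assumptions, not tarsnap settings.
    """
    return 1.0 if run_number % full_every == 0 else normal


def scan(root, previous, probability, rng=random):
    """Build a new manifest for `root` and list files that look corrupted.

    `previous` maps relative path -> (size, mtime_ns, sha256) from the
    last run. Returns (manifest, suspects).
    """
    manifest, suspects = {}, []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and special files
            rel = os.path.relpath(path, root)
            st = os.stat(path)
            old = previous.get(rel)
            unchanged = (
                old is not None
                and old[0] == st.st_size
                and old[1] == st.st_mtime_ns
            )
            # Trust the filesystem most of the time for unchanged files.
            if unchanged and rng.random() >= probability:
                manifest[rel] = old
                continue
            digest = file_digest(path)
            if unchanged and digest != old[2]:
                # Same size and mtime but different content.
                suspects.append(rel)
                manifest[rel] = old  # keep the known-good hash until restored
            else:
                manifest[rel] = (st.st_size, st.st_mtime_ns, digest)
    return manifest, suspects
```

A backup script could call `scan(root, previous, verify_probability(run_number))` before each tarsnap run, save the returned manifest (for example as JSON) for the next run, and restore any paths listed in `suspects` from the previous archive. As with the proposal above, this would not catch a failure that also changed the file's size or modification time.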