Hi Graham,

That definitely sounds useful. Is the proposed probability value intended to be
user-configurable? I'm thinking of keeping it low for normal backups, as you
suggested, then ramping it up to a higher value once every x backups. It's a
similar idea to running deltas most of the time, then a full backup every period.
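
To make the schedule concrete, here is a rough Python sketch of what I have in
mind. It is only an illustration: the names (recheck_probability, base,
full_every, should_recheck) are made up for this email and are not tarsnap
options, and I'm assuming the backup script keeps its own run counter.

```python
import random


def recheck_probability(backup_number, base=0.10, full_every=30):
    """Chance of re-reading an unchanged file on this backup run.

    Hypothetical helper: every `full_every`-th run verifies everything
    (probability 1.0); all other runs use the low `base` probability.
    """
    if full_every > 0 and backup_number % full_every == 0:
        return 1.0
    return base


def should_recheck(probability, rng=random):
    """Decide, per unchanged file, whether to read it and compare hashes."""
    # rng.random() returns a float in [0.0, 1.0), so 1.0 always rechecks
    # and 0.0 never does.
    return rng.random() < probability
```

With these defaults, run 30 would re-read every file, while run 31 would
re-read roughly one unchanged file in ten.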

Cheers,
Scott

On 25 June 2019 7:09:54 am AEST, Graham Percival <gperc...@tarsnap.com> wrote:
>Hi Scott,
>
>Thanks for clarifying the use case!  Colin had the idea of using the Tarsnap
>cache to detect disk errors.  Namely, if the filesystem reports that a file
>hasn't changed, there would be a random chance that the tarsnap client would
>read the file anyway, and compare the chunk hashes against the expected values
>(from previous backups).  If you wanted to be paranoid, you could specify a
>probability of 100%, but more likely you'd pick a value like 10% so that it
>didn't impact performance too much.
>
>This wouldn't warn you about a disk failure which changed the file
>modification time or size, but it would be perfect for a disk which flipped a
>few bits in a file.
>https://github.com/Tarsnap/tarsnap/issues/19
>
>The good news is that I have a proof-of-concept implementation of this.
>I ended up putting it on the back-burner, but I've been looking at the code
>this morning and I still think it's plausible.  Does this sound useful?
>
>Cheers,
>- Graham
>
>On Sun, Jun 23, 2019 at 06:16:52PM +1000, Scott Dickinson wrote:
>>    Thanks Colin & Jacob.
>>    With several hundred GB of data being archived, the local tarball
>>    option is probably not going to work for me.
>>    Does "tarsnap -t -f" show the file modification date based on what the
>>    filesystem is reporting, or on when tarsnap detects a change?
>>    To provide more details, I had a number of sectors on an SSD silently
>>    fail, so I needed to identify and restore files that were corrupted by
>>    this event. The filesystem did not report any change in modification
>>    date on these files, so I couldn't rely on this to identify which files
>>    to restore. Hence my question around reporting on the files impacted by
>>    block changes between archives, to both identify an unexpected change
>>    and recover from it.
>>    If tarsnap can't do this, perhaps I need to start capturing a hash of
>>    each file at the time of backup, and compare those between archives.
>>    Cheers,
>>    Scott
>> 
>>    On 19/6/19 7:15 am, Jacob Larsen wrote:
>> 
>>      I had the same issue a while back. I was told it was not easily
>>      fixed due to the layers in Tarsnap. I ended up making a regular
>>      tarball and feeding that to tarsnap. That way I had a local tarball
>>      that matched the actual data in the archive. Then I could extract it
>>      and compare at the next backup. A bit of a data-heavy process, but it
>>      gave me what I needed. It is scriptable, so it is possible to let your
>>      backup script log the changed files on each backup run, but it has a
>>      pretty high cost in disk I/O, plus you need to keep a copy of your
>>      data around between backups.
>>      /Jacob
>>      On 18/06/2019 13.49, Scott Dickinson wrote:
>> 
>>      Hi,
>>      I'm trying to work out how to generate a report on files that are
>>      new or changed in a particular archive. I can't seem to find an easy
>>      way to do this, so I'm hoping someone can help.
>>      Here is the scenario I'm working through.
>>      1. Backup directory "x" on 1st May 2019. First time archive, all
>>      10GB are sent as expected.
>>      2. Backup directory "x" on 1st June 2019. Second time archive, 25MB
>>      are sent.
>>      How can I report on which files that 25MB of deltas are part of? In
>>      this scenario, I wasn't expecting any changes to the files over the
>>      month, so I am surprised there was anything beyond the metadata to be
>>      backed up. My understanding is that Tarsnap needs to know which
>>      files the changed blocks belong to, therefore in theory this
>>      metadata should be extractable.
>>      The closest I've found to locate this is "tarsnap -t -f 'x' -v
>>      --iso-dates", but this doesn't natively provide the details I'm
>>      after. Ideally I'd like tarsnap to be able to report which files
>>      were uploaded at the time of archive with an option similar to
>>      --print-stats.
>>      Anyone got any ideas?
>>      Cheers,
>>      Scott

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
