Hi Patrik, et al.,

Thanks for taking the time to expand on Eric's reply.  More below ...

On 2/7/26 5:31 AM, Patrik Dufresne wrote:
Hi Leland,

I want to add a couple of important points to Eric's response:

Regarding the hardware issue, this type of corruption would affect ANY backup software when the underlying hardware has memory errors. It's not specific to rdiff-backup. Any tool would write corrupted data when running on faulty RAM. This is exactly why using ECC memory on backup servers is so critical. Data integrity depends on reliable hardware, and ECC memory can detect and correct these errors before they corrupt your backups on disk.

Yes.  Just to be clear, my original post wasn't intended to point to any fault in 'rdiff-backup'.  I just wanted to make sure I was correctly understanding where things stood with regard to my backups as a result of my _server's_ problems.  And yes, _everything_ on that machine is now suspect.

Obviously, if I had it to do over again I would have gone with ECC RAM.  But the system is only for personal use (i.e. not supporting any larger community of users) and ... well ... you know ... money doesn't grow on trees.


Also, your archive is NOT entirely lost. Unlike blob-based backup software (like Borg, Restic, etc.), rdiff-backup stores the current backup as a normal mirror of readable files. This gives you several recovery advantages:

1. All uncorrupted files are immediately accessible. You can copy them
   directly using standard file system tools without needing
   rdiff-backup at all. Only the specific corrupted file(s) are
   affected, not the entire archive. But I think you already know that.

Yes.  In fact, it appears that under certain circumstances, the archive _can_ be recovered.  Specifically, if the current mirror has corrupt files, and if the originals of those files (i.e. the files of which the current mirror is a copy) have not changed, then one can copy the originals over the corrupt files in the current mirror.  At least, after doing that, 'rdiff-backup verify ...' says "All files verified successfully".  For whatever that's worth.
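
A minimal sketch of that repair step, assuming the recorded digest from the 'verify' error message is at hand. The function names are made up for illustration and are not part of rdiff-backup. The guard only copies when the original still hashes to the recorded digest, i.e. when it has not changed since that backup.

```python
import hashlib
import shutil

def sha1_hex(path):
    """Return the SHA1 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def repair_from_source(source_file, mirror_file, recorded_sha1):
    """Copy the original over the mirror copy only if the original is unchanged.

    Hypothetical helper: returns True if the mirror file matches the
    recorded digest afterwards, False if nothing was copied.
    """
    recorded = recorded_sha1.lower()
    if sha1_hex(source_file) != recorded:
        return False  # original differs from what was backed up; leave mirror alone
    shutil.copy2(source_file, mirror_file)  # copy2 also copies mtime and permission bits
    return sha1_hex(mirror_file) == recorded
```

This should only be run on a machine with known-good memory, and 'rdiff-backup verify ...' remains the real check afterwards.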


2. You can surgically remove the problematic file using
   'rdiff-backup-delete <corrupted_file_path>' to completely delete the
   history of just that corrupted file from the repository. This should
   eliminate the verification error entirely while preserving
   everything else.

Ha!  I did _not_ know about 'rdiff-backup-delete'!  Thanks for that!  That will definitely be useful.  Once I get my server fixed I can continue to use those archives.  I just have to remember that they are no longer complete.


So to answer your question directly: No, the entire archive is not lost. You can recover all non-corrupted files immediately, and you have options to repair or remove the corrupted file's history.

That said, Eric's suggestion about periodically creating a new baseline is still excellent practice for long-term backup hygiene. With good hardware, though, I can recover files from a 15-year-old backup without problems.
[...]

Yes.  I'm going to have to work that into my backup plan ... such as it is.  I had been using 'rsync' to copy all my archives to an entirely separate hard drive, assuming that hard drive failure was the most likely failure mode ... but, of course, the memory errors have also corrupted that process.  Ugh.  At least I can easily move that drive to another computer where I can work on the archives without doing further damage.
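
Once both drives are attached to a machine with known-good memory, a byte-for-byte comparison would show where the 'rsync' copy and the primary archive disagree. A rough sketch, with a hypothetical function name and no claim to be the right tool for every setup:

```python
import filecmp
import os

def differing_files(primary_root, copy_root):
    """List relative paths that are missing from the copy or differ in content."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(primary_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, primary_root)
            dst = os.path.join(copy_root, rel)
            # shallow=False forces a content comparison instead of trusting size/mtime
            if not os.path.isfile(dst) or not filecmp.cmp(src, dst, shallow=False):
                problems.append(rel)
    return sorted(problems)
```

A mismatch only says that the two copies differ, not which side is the damaged one; for that, the digests recorded by rdiff-backup are still needed.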

Thanks again for all your help.

Cheers
Leland

[...]



On 2/7/26 04:45, Eric L. wrote:
Hi Leland,

Everything you write is correct. I would have expected the backup action to detect corruption at the time of writing, but that's difficult to reproduce and test, so no guarantee (if you know which file, you could check in past backup logs). But even if that's the case, it doesn't help you anymore.

The only way to address this would be to create a new repository from time to time, to save a new baseline.

KR, Eric

On 04/02/2026 07:48, Leland C. Best wrote:
Hi All,

First, I've used 'rdiff-backup' for a long time (20 years?). I've had to use my backups to recover everything from a few accidentally deleted files to complete system restores to bare metal (although other tools are also needed to do the latter). As such, I want to thank everybody who has contributed, and is contributing, to this outstanding project.

I have a question about the integrity of a backup archive under certain conditions.

As I understand it, the current (i.e. most recent) backup is simply a "mirror" of the source directory.  The next most recent backup can then be reconstructed by applying a set of diffs (an "increment"?) to the current backup.  Another (additional) set of diffs applied to that would reconstruct the next most recent backup.  And so on.
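
The scheme described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the delta format, the function names, and the sample files are invented here, and real rdiff-backup increments are rdiff deltas plus metadata, not Python dicts.

```python
# Toy model only: real rdiff-backup increments are rdiff deltas plus metadata.

def apply_delta(basis, delta):
    """Rebuild an older file version from the newer one (the basis)."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":      # ("copy", start, end): reuse bytes from the basis
            out += basis[op[1]:op[2]]
        else:                    # ("data", payload): bytes only the old version had
            out += op[1]
    return bytes(out)

def restore(mirror, increments, steps_back):
    """Walk back `steps_back` backups from the mirror, newest increment first."""
    state = dict(mirror)
    for increment in increments[:steps_back]:
        for path, delta in increment.items():
            state[path] = apply_delta(state[path], delta)
    return state

mirror = {"notes.txt": b"hello world v3", "todo.txt": b"buy milk"}
increments = [
    {"notes.txt": [("copy", 0, 12), ("data", b"v2")]},  # mirror -> previous backup
    {"notes.txt": [("copy", 0, 12), ("data", b"v1")]},  # previous -> the one before
]

print(restore(mirror, increments, 2)["notes.txt"])  # b'hello world v1'
```

In this model, each older version of a file is rebuilt from the newer one, so it can only be as good as the mirror copy it starts from.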

Let's suppose that, somehow, the current backup (the mirror) becomes corrupted.  Given how I think things work in 'rdiff-backup', it seems to me that that would mean the _entire_ archive would be corrupted.  That is, doing a 'rdiff-backup regress' would _not_ recover the previous backup.  Is that correct?

I'm asking because my backup server has developed _very_ intermittent memory errors.  I only discovered this _because_ an 'rdiff-backup verify ...' on the most recent backup failed.  [I ultimately verified it was a memory problem via 'memtest86+'.] The error was of the form

    ERROR:   Computed SHA1 digest of file <some file>
    '4e45b5128111db53558b1135898386bbaac5c4b2' doesn't match recorded
    digest of 'a671cd065bd97e16b6c5a3cf789e37447fa13fa9'. Your backup
    repository may be corrupted!

The point being that, if I'm understanding correctly, then at this point the entire archive is now basically lost.  Again, is this correct?

Thanks in advance for any info.

Cheers
Leland




