Hi Patrik, et al,
Thanks for taking the time to expand on Eric's reply. More below ...
On 2/7/26 5:31 AM, Patrik Dufresne wrote:
Hi Leland,
I want to add a couple of important points to Eric's response:
Regarding the hardware issue, this type of corruption would affect ANY
backup software when the underlying hardware has memory errors. It's
not specific to rdiff-backup. Any tool would write corrupted data when
running on faulty RAM. This is exactly why using ECC memory on backup
servers is so critical. Data integrity depends on reliable hardware,
and ECC memory can detect and correct these errors before they corrupt
your backups on disk.
Yes. Just to be clear, my original post wasn't intended to point to any
fault in 'rdiff-backup'. I just wanted to make sure I was correctly
understanding where things stood with regard to my backups as a result
of my _server's_ problems. And yes, _everything_ on that machine is now
suspect.
Obviously, if I had it to do over again I would have gone with ECC RAM.
But the system is only for personal use (i.e. not supporting any larger
community of users) and ... well ... you know ... money doesn't grow on
trees.
Also, your archive is NOT entirely lost. Unlike blob-based backup
software (like Borg, Restic, etc.), rdiff-backup stores the current
backup as a normal mirror of readable files. This gives you several
recovery advantages:
1. All uncorrupted files are immediately accessible. You can copy them
   directly using standard file system tools without needing
   rdiff-backup at all. Only the specific corrupted file(s) are
   affected, not the entire archive. But I think you already know that.
Yes. In fact, it appears that under certain circumstances, the archive
_can_ be recovered. Specifically, if the current mirror has corrupt
files, and if the originals of those files (i.e. the files of which the
current mirror is a copy) have not changed, then one can copy the
originals over the corrupt files in the current mirror. At least, after
that 'rdiff-backup verify ...' says "All files verified successfully".
For whatever that's worth.
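
For anyone who wants to script that repair, here is a rough Python sketch.
It is only an illustration, not part of rdiff-backup: the function names
and arguments are hypothetical, and it assumes you pass in the "recorded
digest" printed in the verify error. It copies the original over the
mirror copy only if the original still hashes to that recorded digest,
i.e. only if the original really has not changed. It should of course be
run on a machine whose RAM is known to be good.

```python
import hashlib
import shutil
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repair_mirror_file(source_root: Path, mirror_root: Path,
                       rel_path: str, recorded_sha1: str) -> bool:
    """Copy source_root/rel_path over mirror_root/rel_path, but only if
    the source still matches the recorded digest (hypothetical helper).

    Returns True if the mirror copy matches the recorded digest afterwards.
    """
    source = source_root / rel_path
    target = mirror_root / rel_path
    expected = recorded_sha1.lower()
    if sha1_of_file(source) != expected:
        # The original has changed (or is itself damaged): leave the mirror alone.
        return False
    shutil.copy2(source, target)  # copy2 also carries over timestamps
    return sha1_of_file(target) == expected
```

Running 'rdiff-backup verify ...' afterwards is still the real check; the
sketch only guards against copying a file that no longer matches.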
2. You can surgically remove the problematic file using
   |rdiff-backup-delete <corrupted_file_path>| to completely delete the
   history of just that corrupted file from the repository. This should
   eliminate the verification error entirely while preserving
   everything else.
Ha! I did _not_ know about 'rdiff-backup-delete'! Thanks for that!
That will definitely be useful. Once I get my server fixed I can
continue to use those archives. I just have to remember that they are
no longer complete.
So to answer your question directly: No, the entire archive is not
lost. You can recover all non-corrupted files immediately, and you
have options to repair or remove the corrupted file's history.
That said, Eric's suggestion about creating a new baseline
periodically is still excellent practice for long-term backup hygiene,
but with good hardware, I can recover files from a 15-year-old backup
without a problem.
[...]
Yes. I'm going to have to work that into my backup plan ... such as it
is. I had been using 'rsync' to copy all my archives to an entirely
separate hard drive assuming that hard drive failure was the most likely
failure mode ... but, of course, the memory errors have also corrupted
that process. Ugh. At least I can easily move that drive to another
computer where I can work on them without doing further damage.
Thanks again for all your help.
Cheers
Leland
[...]
On 2/7/26 04:45, Eric L. wrote:
Hi Leland,
everything you write is correct. I would have expected the backup
action to detect corruption at the time of writing, but that's
difficult to reproduce and test, so there is no guarantee (if you know
which file it is, you could check the past backup logs). But even if
that's the case, it doesn't help you anymore.
The only way to address this would be to create a new repository from
time to time, to save a new baseline.
KR, Eric
On 04/02/2026 07:48, Leland C. Best wrote:
Hi All,
First, I've used 'rdiff-backup' for a long time (20 years?). I've
had to use my backups to recover everything from a few accidentally
deleted files to complete system restores to bare metal (although
other tools are also needed to do the latter). As such, I want to
thank everybody who has contributed, and is contributing, to this
outstanding project.
I have a question about the integrity of a backup archive under
certain conditions.
As I understand it, the current (i.e. most recent) backup is simply
a "mirror" of the source directory. The next most recent backup can
then be reconstructed by applying a set of diffs (an "increment"?)
to the current backup. Another (additional) set of diffs applied to
that would reconstruct the next most recent backup. And so on.
Let's suppose that, somehow, the current backup (the mirror) becomes
corrupted. Given how I think things work in 'rdiff-backup', it
seems to me that that would mean the _entire_ archive would be
corrupted. That is, doing a 'rdiff-backup regress' would _not_
recover the previous backup. Is that correct?
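
To make that picture concrete, here is a toy Python model of a mirror
plus reverse diffs for a single file. It is only a sketch of the idea as
described above: the delta format and function names are made up for
illustration (built on Python's difflib), and real rdiff-backup uses
librsync deltas rather than anything like this. It shows that an older
version rebuilt from a damaged mirror inherits the damage wherever the
diff copies bytes from the mirror.

```python
from difflib import SequenceMatcher


def make_reverse_delta(newer: bytes, older: bytes) -> list:
    """Describe `older` as byte ranges copied from `newer` plus literal data."""
    ops = []
    matcher = SequenceMatcher(None, newer, older, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # reuse bytes of the newer file
        elif tag in ("replace", "insert"):
            ops.append(("data", older[j1:j2]))    # bytes only the older file has
        # "delete": bytes only in the newer file, nothing to emit
    return ops


def apply_delta(base: bytes, delta: list) -> bytes:
    """Rebuild the older version from `base` and a reverse delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


# Three successive versions of one file; v3 is the current mirror.
v1 = b"hello world, version one\n"
v2 = b"hello world, version two\n"
v3 = b"hello world, version three\n"

inc_2 = make_reverse_delta(v3, v2)   # increment: mirror -> previous backup
inc_1 = make_reverse_delta(v2, v1)   # increment: previous -> oldest backup

# With a healthy mirror, the chain restores every older version.
restored_v2 = apply_delta(v3, inc_2)
restored_v1 = apply_delta(restored_v2, inc_1)
print(restored_v2 == v2, restored_v1 == v1)

# Damage one byte of the mirror and walk the same chain again.
bad_mirror = b"X" + v3[1:]
bad_v2 = apply_delta(bad_mirror, inc_2)
bad_v1 = apply_delta(bad_v2, inc_1)
print(bad_v2 == v2, bad_v1 == v1)
```

In this toy model the damage only affects the history of the one file
whose mirror copy is bad; other files have their own mirror copy and
their own diffs.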
I'm asking because my backup server has developed _very_
intermittent memory errors. I only discovered this _because_ an
'rdiff-backup verify ...' on the most recent backup failed. [I
ultimately verified it was a memory problem via 'memtest86+'.] The
error was of the form
ERROR: Computed SHA1 digest of file <some file>
'4e45b5128111db53558b1135898386bbaac5c4b2' doesn't match recorded
digest of 'a671cd065bd97e16b6c5a3cf789e37447fa13fa9'. Your backup
repository may be corrupted!
The point being that, if I'm understanding correctly, then at this
point the entire archive is now basically lost. Again, is this
correct?
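
As an aside, the kind of mismatch reported above is easy to reproduce in
a small Python sketch. This is not rdiff-backup's code, just an
illustration with made-up data and helper names: a digest is recorded for
the data as it should be, a single bit is then flipped (as a memory error
might do), and the recomputed SHA1 no longer matches.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Hex SHA1 digest of a byte string."""
    return hashlib.sha1(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


# Made-up file contents, standing in for a file in the mirror.
original = b"some file contents written during the backup\n"
recorded = sha1_hex(original)          # digest stored at backup time

damaged = flip_bit(original, 100)      # one flipped bit, e.g. from bad RAM
computed = sha1_hex(damaged)           # digest computed at verify time

if computed != recorded:
    print(f"Computed SHA1 digest '{computed}' doesn't match "
          f"recorded digest of '{recorded}'")
```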
Thanks in advance for any info.
Cheers
Leland