Hi Guillermo!

On 2024-08-21 12:40, Guillermo Rozas wrote:
>     4) ... but there is no means for BackupPC itself to *react* to the issue; 
> it simply warns, but there is no recovery.
> 
>     Instead, even on a new backup when the original file is still available 
> on one of the hosts, the server copy of the file is just taken to be "fine". 
> I.e., BackupPC detects that a file with the same hash already exists in 
> the pool, so it happily assumes there's no need to retransfer the file. It's 
> simply kept as is, although it was detected as broken in step 3) - this 
> information is not propagated to or used within subsequent backup jobs.
> 
>     *If* this is really correct, it's a serious resilience problem with 
> BackupPC. At least, it fails any "principle of least surprise" check.
> 
>     Of course, BackupPC can't be expected to recreate data that is lost on 
> both server and clients.
>     But *some* remedial action can and should be taken. One somewhat sane 
> reaction, IIUC, would be to move the broken file from the pool into some 
> "quarantine/damaged" area (it could still be "mostly" okay and contain useful 
> information after all, if anything else is lost).
> 
>     At the very least, the pool file should be marked as "suspicious" such 
> that, if it is found on some host during a backup again, a fresh copy will be 
> created in the pool. Or, if you are concerned about hash collisions, a new 
> copy with _1 appended should be recreated and used for the new backups from 
> this point onwards.
> 
>     The same approach should be taken with attrib files. Same logic: if the 
> folder on the host has remained unchanged, a new backup should recover any 
> information that BackupPC_fsck detected as lost on the server.
> 
>     I'd totally understand if some manual intervention is required (stopping 
> BackupPC, running some rescue commands etc.) - but from what I understand 
> from Ghislain, there's nothing to help apart from microsurgery or creating an 
> entirely new BackupPC instance, losing all history. And that's the opposite 
> of rock-solid.
>     I've yet to confirm, but my own experience from the last couple of weeks 
> seems to support this observation.
> 
> 
> From https://backuppc.github.io/backuppc/BackupPC.html:
> 
> <quote>
> "An rsync "full" backup now uses --checksum (instead of --ignore-times), 
> which is much more efficient on the server side - the server just needs to 
> check the full-file checksum computed by the client, together with the mtime, 
> nlinks, size attributes, to see if the file has changed. If you want a more 
> conservative approach, you can change it back to --ignore-times, which 
> requires the server to send block checksums to the client."
> </quote>
> 
> By default V4 uses --checksum for full backups, but that has the (slight) 
> risk of missing file corruption on the server because it trusts the hash 
> calculated the first time it got the file (that's why I wrote the script I 
> mentioned before).

Haha, thanks for 

        # Thanks
        To Alexander Kobel that originally gave me the idea of the script. To 
Craig Barratt for the great piece of software that is BackupPC.

there! ;-)
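
(For the list archives: as far as I understand it, the core idea of that 
script is simply to recompute each pool file's digest and compare it to the 
file name, since V4 names pool files after the MD5 of the uncompressed 
contents. Checking a single suspect file by hand would look roughly like the 
following - the paths are from a Debian-style install and XX/YY/<digest> is a 
placeholder, so adjust both to your setup:

        # Decompress one cpool file and recompute its MD5; the result should
        # match the file's basename (ignoring any collision suffix).
        # TopDir and bin locations are assumptions for a Debian-style install.
        sudo -u backuppc /usr/share/backuppc/bin/BackupPC_zcat \
            /var/lib/backuppc/cpool/XX/YY/<digest> | md5sum

A mismatch there is exactly the kind of corruption we are talking about.)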

> If you change it back to --ignore-times it will re-test the server files by 
> comparing block-by-block checksums with the ones on the client. If a file is 
> corrupted in the server this will detect the difference and update it as a 
> "new version" of the file from then on. However, --ignore-times is MUCH 
> slower than --checksum, so you may want to run it only in response to a 
> corruption suspicion and not regularly.


That's quite a reasonable explanation, yes. Nevertheless, I wonder whether the 
proper reaction for BackupPC_fsck would be to move an obviously invalid file 
out of the way (e.g. to a "quarantine" location) in order to force that 
re-check - after all, the corruption has already been detected and confirmed.
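
Until BackupPC_fsck does something like that itself, the manual equivalent 
would be roughly the following (this is precisely the kind of micro-surgery 
mentioned above, so stop the server first; the service name, paths and the 
"quarantine" directory are only illustrative assumptions):

        # Stop the server so nothing touches the pool in the meantime.
        sudo systemctl stop backuppc
        # Move the suspect pool file aside instead of deleting it, so its
        # (possibly mostly intact) contents are preserved.
        sudo -u backuppc mkdir -p /var/lib/backuppc/quarantine
        sudo -u backuppc mv /var/lib/backuppc/cpool/XX/YY/<digest> \
            /var/lib/backuppc/quarantine/
        sudo systemctl start backuppc
        # Expect the reference-count checks to complain about the missing
        # pool file until a new backup has recreated it.

The next full backup should then retransfer the file from the host, which is 
what the paragraph below is about.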

@ Ghislain: So the assumption is that running a backup without --checksum 
and deleting the corrupted pool files should both serve the same purpose 
(recreation from the host).
I can confirm on my installation that deleted pool files are properly 
recreated (at least the original data files; I'm not sure about attrib files); 
perhaps you can now confirm what happens for a backup without --checksum...
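
For reference, if I am reading the V4 config correctly, the knob behind 
--checksum is $Conf{RsyncFullArgsExtra} (it defaults to ['--checksum']). A 
one-off test for a single host could look roughly like this; the host name 
and the Debian-style paths are again only assumptions:

        # Per-host override: use --ignore-times instead of --checksum for
        # full backups of this host only (per-host config location varies
        # between installs).
        echo '$Conf{RsyncFullArgsExtra} = ["--ignore-times"];' \
            | sudo tee -a /etc/backuppc/pc/somehost.pl
        # Force a verbose full backup of that host and watch what gets
        # retransferred.
        sudo -u backuppc /usr/share/backuppc/bin/BackupPC_dump -f -v somehost
        # Revert the override afterwards - as quoted above, --ignore-times
        # fulls are much slower.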


Cheers,
Alex

