Hi Guillermo!

On 2024-08-21 12:40, Guillermo Rozas wrote:
> 4) ... but there is no means for BackupPC itself to *react* on the issue;
> it simply warns, but there is no recovery.
>
> Instead, even on a new backup when the original file is still available
> on one of the hosts, the server copy of the file is just taken to be "fine".
> I.e., BackupPC detects that a file with the same hash already exists in
> the pool, so it happily assumes there's no need to retransfer the file. It's
> simply kept as is, although it was detected as broken in step 3) - this
> information is not propagated to or used within subsequent backup jobs.
>
> *If* this is really correct, it's a serious resilience problem with
> BackupPC. At least, it fails any "principle of least surprise" check.
>
> Of course, BackupPC can't be expected to recreate data that is lost on
> both server and clients.
> But *some* remedial action can and should be taken. One somewhat sane
> reaction, IIUC, would be to move the broken file from the pool into some
> "quarantine/damaged" area (it could still be "mostly" okay and contain
> useful information after all, if anything else is lost).
>
> At the very least, the pool file should be marked as "suspicious" such
> that, if it is found on some host during a backup again, a fresh copy will
> be created in the pool. Or, if you are concerned about hash collisions, a
> new copy with _1 appended should be created and used for the new backups
> from this point onwards.
>
> The same approach should be taken with attrib files. Same logic: if the
> folder on the host remained unchanged, a new backup should recover any
> information that BackupPC_fsck detected as lost on the server.
>
> I'd totally understand if some manual intervention is required (stopping
> BackupPC, running some rescue commands etc.) - but from what I understand
> from Ghislain, there's nothing to help apart from microsurgery or creating
> an entirely new BackupPC instance, losing all history.
> And that's the opposite of rock-solid.
> I've yet to confirm, but my own experience from the last couple of weeks
> seems to support this observation.
>
> From https://backuppc.github.io/backuppc/BackupPC.html:
>
> <quote>
> "An rsync "full" backup now uses --checksum (instead of --ignore-times),
> which is much more efficient on the server side - the server just needs to
> check the full-file checksum computed by the client, together with the
> mtime, nlinks, size attributes, to see if the file has changed. If you want
> a more conservative approach, you can change it back to --ignore-times,
> which requires the server to send block checksums to the client."
> </quote>
>
> By default V4 uses --checksum for full backups, but that has the (slight)
> risk of missing file corruption on the server, because it trusts the hash
> calculated the first time it got the file (that's why I wrote the script I
> mentioned before).
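[Side note for the archives: the kind of pool check described above can be sketched roughly as follows. This is an illustration only, not Guillermo's actual script: it assumes an *uncompressed* V4 pool tree in which each file is stored under the hex MD5 digest of its full contents (files in cpool/ are compressed with BackupPC's own zlib framing and would have to be decompressed first).]

```python
#!/usr/bin/env python3
# Rough sketch of a pool-verification pass (illustration, not BackupPC code).
# Assumption: an uncompressed BackupPC V4 pool/ tree, where each pool file
# is stored under the hex MD5 digest of its contents. Compressed cpool/
# files would first need BackupPC's own zlib framing undone.
import hashlib
import os
import sys

def md5_of(path, bufsize=1 << 20):
    """Full-file MD5, streamed so memory stays flat on large pool files."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()

def find_corrupt(pool_root):
    """Yield pool files whose contents no longer match their digest name."""
    for dirpath, _dirnames, filenames in os.walk(pool_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # On (rare) hash collisions a suffix like _1 is appended to
            # the name; strip it before comparing against the digest.
            digest = name.split("_")[0]
            if md5_of(path) != digest:
                yield path

if __name__ == "__main__" and len(sys.argv) > 1:
    bad = list(find_corrupt(sys.argv[1]))
    for path in bad:
        print("corrupt:", path)
    sys.exit(1 if bad else 0)
```

Run against the pool root while BackupPC is idle; any path it prints is a candidate for the quarantine treatment discussed above.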
Haha, thanks for

    # Thanks
    To Alexander Kobel that originally gave me the idea of the script.
    To Craig Barratt for the great piece of software that is BackupPC.

there! ;-)

> If you change it back to --ignore-times it will re-test the server files
> by comparing block-by-block checksums with the ones on the client. If a
> file is corrupted on the server, this will detect the difference and update
> it as a "new version" of the file from then on. However, --ignore-times is
> MUCH slower than --checksum, so you may want to run it only in response to
> a corruption suspicion and not regularly.

That's a quite reasonable explanation, yes. Nevertheless, I wonder whether
the proper reaction for BackupPC_fsck would be to move an obviously invalid
file out of the way (i.e., to a "quarantine" location or so) to enforce that
check - after all, the corruption has already been detected and confirmed.

@Ghislain: So the assumption is that both running a backup without
--checksum and deleting the corrupted pool files should serve the same
purpose (recreation from the host). I can confirm with my installation that
deleted pool files are properly recreated (at least the original data files;
not sure about attrib files); perhaps now you can confirm what happens for a
backup without --checksum...

Cheers,
Alex

_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/
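[Editorial appendix to the thread: the "quarantine" idea discussed above, sketched as a hypothetical helper. The pool layout and paths are assumptions, not BackupPC code, and the BackupPC server should be stopped before anything in the pool is touched.]

```python
#!/usr/bin/env python3
# Hypothetical "quarantine" helper for the remedial action discussed in the
# thread: move a pool file that failed verification out of the pool, so the
# next backup must retransfer it from a client that still has a good copy.
# The layout is an assumption, not BackupPC code; stop BackupPC first.
import os
import shutil

def quarantine(pool_root, pool_file, quarantine_root):
    """Move pool_file out of pool_root, preserving its relative subpath."""
    rel = os.path.relpath(pool_file, pool_root)
    dest = os.path.join(quarantine_root, rel)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    # Keep the file rather than deleting it: it may still be mostly intact
    # and hold recoverable data if the client copy is also lost.
    shutil.move(pool_file, dest)
    return dest
```

Preserving the relative subpath makes it trivial to move a file back into place if the quarantine later turns out to have been a false alarm.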