Hi,

On 2024-08-21 15:31, backu...@kosowsky.org wrote:
> Alexander Kobel wrote at about 14:38:22 +0200 on Wednesday, August 21, 2024:
>  > On 2024-08-21 12:40, Guillermo Rozas wrote:
>  > > If you change it back to --ignore-times it will re-test the server files
>  > > by comparing block-by-block checksums with the one on the client. If a file
>  > > is corrupted in the server this will detect the difference and update it as a
>  > > "new version" of the file from then on. However, --ignore-times is MUCH
>  > > slower than --checksum, so you may want to run it only in response to a
>  > > corruption suspicion and not regularly.
>  > 
>  > 
>  > That's a quite reasonable explanation, yes. Nevertheless, I wonder if the
>  > proper reaction for BackupPC_fsck should be to move an obviously invalid file
>  > out of the way (i.e. to a "quarantine" location or so), to enforce that check
>  > - after all, the corruption has already been detected and confirmed.
>  > 
> 
> I don't think moving it to a "quarantine" is a good idea
> 
> [...]
> 
> Even better might be to do the following. Rename corrupted files in
> place with an additional terminal '_corrupt' suffix.
> Change the logic in BackupPC as follows:
> - If fsck finds a file to be corrupt, or any pool access to the file
>   fails to complete properly, rename the corresponding pool file in
>   place with a '_corrupt' suffix.
>   Optionally, in addition to logging such corruption, it would be good
>   to maintain a file containing a list of all such corruptions.
> 
> - If the same file needs to be backed up again, treat the '_corrupt' as a
>   hash collision and back it up with an incremented '_N' suffix as per any
>   hash collision (this will effectively create a new copy)
> 
> - If the original, corrupt file needs to be read:
>     * Change the lookup algorithm so that if the file can't be found in
>       the pool at the original attrib lookup location, say
>       '<original-hash>', then look at '<original-hash>_corrupt' instead
>     * Log a warning that a corrupted file has been referenced

This is precisely what I had in mind and simply called "quarantine" (a poor 
choice of term, admittedly).
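
Just to make sure I got the lookup-fallback part right, I imagine it roughly 
like this (pure sketch in plain Perl; the function name and pool-path handling 
are my own invention, not BackupPC's actual internals):

    use strict;
    use warnings;

    # Given the pool path derived from a file's digest, fall back to
    # "<path>_corrupt" if the original has been renamed by fsck, and warn
    # so the reference shows up in the logs.
    sub pool_path_with_corrupt_fallback {
        my ($poolPath) = @_;

        return $poolPath if -e $poolPath;          # normal case: pool file intact

        my $quarantined = $poolPath . "_corrupt";  # renamed in place by fsck
        if ( -e $quarantined ) {
            warn "referencing corrupted pool file $quarantined\n";
            return $quarantined;
        }
        return undef;                              # genuinely missing
    }

New backups of the same content would then never see the '_corrupt' name and 
simply store a fresh copy under an incremented '_N' suffix, as you describe 
for the hash-collision case.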

> The above would require only trivial changes to the code and would not
> impose any performance penalty except when referencing a corrupted file.
> 
> I also think it would be helpful to show the number of corrupt files
> in one of the GUI status pages, together with a link to a browsable
> list of the corrupted files to date.

Yep. "Trivial" would be a pleasant surprise, but I'm nowhere near familiar 
enough with Perl or BackupPC's code base to assess that.
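
Going back to Guillermo's point about --ignore-times further up: if I read the 
config docs correctly, the knob on the BackupPC side would be 
$Conf{RsyncFullArgsExtra} (quoting from memory, so please double-check):

    # usual BackupPC 4.x setting: full-file checksums on full backups
    $Conf{RsyncFullArgsExtra} = ['--checksum'];

    # temporarily, for a one-off deep verification pass (much slower,
    # since every file is compared block by block):
    # $Conf{RsyncFullArgsExtra} = ['--ignore-times'];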


Cheers,
Alex


_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/
