Guillermo,

Yes, that's an excellent point.  Actually v3 suffers from this too since,
with cached block and full-file checksums, it doesn't recheck the file
contents either.  However, v3 had a parameter
$Conf{RsyncCsumCacheVerifyProb} (default 0.01 == 1%) that caused rsync to
verify that random fraction of the file contents.  Other xfer methods
(e.g., tar and smb) always do a full-file compare during a full, so there
shouldn't be any undetected server-side corruption with those XferMethods.

Thanks for the script.  While it's helpful to check the pool, it isn't
obvious how to fix any errors.  So it's probably best to have rsync-bpc
implement the old $Conf{RsyncCsumCacheVerifyProb} setting.  It could do
that by randomly skipping the --checksum short-circuit during a full.  For
that fraction of files, it would do a full rsync check and update, which
would update the pool file if the contents are not identical.
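
Roughly, the intent is something like this (just a bash-style sketch of the
idea, not rsync-bpc code; the real change would go in rsync-bpc's C, and the
1% below simply stands in for whatever probability we settle on):

    # Sketch only: for each file in a full, occasionally skip the
    # --checksum short-circuit and force a full content compare.
    VERIFY_PROB_PERCENT=1   # plays the role of v3's RsyncCsumCacheVerifyProb

    for file in "$@"; do
        # $RANDOM is uniform over 0..32767, so this fires ~1% of the time
        if [ "$RANDOM" -lt $(( 32768 * VERIFY_PROB_PERCENT / 100 )) ]; then
            echo "verify: full block-by-block compare (and pool update) for $file"
        else
            echo "trust cached full-file checksum for $file"
        fi
    done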

If folks agree with that approach, that's what I'll implement.

Craig

On Mon, Jun 8, 2020 at 10:16 AM Guillermo Rozas <guille2...@gmail.com>
wrote:

> I've attached the script I'm using. It's very rough, so use at your own
> risk!
>
> I run it daily checking 4 folders of the pool per day, sequentially,
> so it takes 32 days to check them all. You can modify the external
> loop to change this. The last checked folder is saved in an auxiliary
> file.
>
> The checksum is done uncompressing the files in the pool using
> zlib-flate (line 25), but it can be changed to pigz or BackupPC_zcat.
> On my severely CPU-limited server (Banana Pi) both pigz and zlib-flate
> are much faster than BackupPC_zcat; they take around a quarter of the
> time to check the files (pigz is marginally faster than zlib-flate).
> On the other hand, BackupPC_zcat puts the lowest load on the CPU:
> zlib-flate's load is 30-35% higher, and pigz's is a whopping 80-100%
> higher.
>
> However, as BackupPC's compression produces slightly modified gzip
> streams, there is a (very) small chance that a BackupPC-compressed file
> is not properly uncompressed by the other two (line 28 in the script). If
> that happens, you need to re-check every zlib-flate or pigz failure
> with BackupPC_zcat before calling it a real error. I think this gets
> the best balance between load on the system and time spent checking
> the pool (at least for my server and pool...).
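
(In case it helps to see the shape of the check without opening the
attachment: the core of it is roughly the following.  This is only a
sketch, not Guillermo's script; it assumes the v4 convention that a cpool
file's name is the MD5 digest of its uncompressed contents, ignores
collision handling, and the BackupPC_zcat path below is just an example.)

    # Verify one compressed pool file; fall back to BackupPC_zcat when
    # zlib-flate can't handle BackupPC's slightly non-standard stream.
    check_pool_file() {
        f="$1"
        want=$(basename "$f")
        got=$(zlib-flate -uncompress < "$f" 2>/dev/null | md5sum | cut -d' ' -f1)
        [ "$got" = "$want" ] && return 0
        got=$(/usr/share/backuppc/bin/BackupPC_zcat "$f" 2>/dev/null | md5sum | cut -d' ' -f1)
        [ "$got" = "$want" ] || echo "possible corruption: $f"
    }
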
>
> Best regards,
> Guillermo
>
>
> On Mon, Jun 8, 2020 at 1:28 PM <backu...@kosowsky.org> wrote:
> >
> > Good point...
> > Craig - would it make sense to add a parameter to BackupPC_nightly
> > that would check a user-settable percentage of the files each night,
> > say NightlyChecksumPercent?  So if set to 3%, the pool would be checked
> > (sequentially) over a period of ~1 month.
> >
> > Guillermo Rozas wrote at about 11:12:39 -0300 on Monday, June 8, 2020:
> >  > Yes, I wouldn't worry about collisions by chance.
> >  >
> >  > However, there is a second aspect that is not covered here: if you
> >  > rely only on saved checksums in the server, it will not re-check
> >  > unmodified pool files. This risks missing file system corruption
> >  > or bit rot in the backup files that were previously caught by the V3
> >  > behaviour (which periodically checksummed the pool files).
> >  >
> >  > Two solutions:
> >  > - put the pool in a file system with checksum verification included
> >  > - use a script to periodically traverse the pool and checksum the files
> >  >
> >  > Best regards,
> >  > Guillermo
> >  >
> >  >
> >  >
> >  > On Mon, Jun 8, 2020 at 10:58 AM G.W. Haywood via BackupPC-users
> >  > <backuppc-users@lists.sourceforge.net> wrote:
> >  > >
> >  > > Hi there,
> >  > >
> >  > > On Mon, 8 Jun 2020, Jeff Kosowsky wrote:
> >  > >
> >  > > > ... presumably a very rare event ...
> >  > >
> >  > > That's putting it a little mildly.
> >  > >
> >  > > If it's really all truly random, then if you tried random collisions a
> >  > > million times per picosecond you would (probably) need of the order of
> >  > > ten trillion years to have a good chance of finding one...
> >  > >
> >  > > $ echo ' scale=2; 2^128 / 10^6 / 10^12 / 86400 / 365 / 10^12 ' | bc
> >  > > 10.79
> >  > >
> >  > > I think it's safe to say that it's not going to happen by chance.
> >  > >
> >  > > If it's truly random.
> >  > >
> >  > > --
> >  > >
> >  > > 73,
> >  > > Ged.
> >  > >
> >  > >
> >  >
> >  >
> >
> >
>
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
