Jeffrey writes:

> True - I haven't seen any mention in the documentation of any 'flag'
> that would send checksums.

There is an rsync option --checksum that will compute and send a full-file MD5 digest from the client for every file as part of the initial file list. It is there as an alternative to attribute checking (ie: mtime and size) to see if a file should be skipped or inspected further (ie: for incremental backups).

In 4.x I have implemented full-file MD5 digests (as you proposed :)), to match rsync 3.x. Although I probably won't support it initially in 4.0, my plan would be to use the --checksum option to pre-match potential files in the pool if there isn't an existing file with the same path (ie: for new or renamed files). As several people have pointed out, this isn't possible in BackupPC 3.x. The only time rsync can do an efficient transfer in 3.x is if there is an existing file for that host/share with the same path already backed up. In 4.x, --checksum would allow an efficient transfer of any new file that was already in the pool.

I would plan to use --checksum for full backups (it's too expensive on the client for incrementals). I'll probably make it a user-configured option whether a "full" does this shortcut based only on the full-file MD5 matching (ie: skips block digest checking), or whether a full also requires block digest matching even if the full-file MD5 matches. It could be a probability, so that any corruption or digest collisions (very unlikely with full-file MD5, although examples are now well known) are slowly fixed.

If you are comfortable with a full backup just comparing full-file MD5 digests (and all file attributes too), then there is a massive reduction in server load, since the MD5 digest is now stored in the attribute file (it's the path to the pool file; no hardlinks, remember). It's essentially no more effort to compare MD5 digests than it is to compare the other file attributes. Basically, the client does most of the work for a full, since it needs to read every file to compute the full-file MD5 digests.

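To make the proposed shortcut concrete, here is a minimal Python sketch of the decision a full backup could make per file. It is only an illustration of the idea described above, not BackupPC 4.x code: the function names, the action labels, and the `block_check_probability` parameter are all hypothetical, and the pool is modelled as a plain set of hex digests.

```python
import hashlib
import random


def full_file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def plan_full_backup_action(client_md5, stored_md5, attrs_match, pool_digests,
                            block_check_probability=0.0, rng=random.random):
    """Pick an action for one file in a full backup (hypothetical labels).

    client_md5    -- full-file digest sent by the client (as with --checksum)
    stored_md5    -- digest recorded for the same path in the previous
                     backup, or None if there is no file at that path
    attrs_match   -- True if the other attributes (mtime, size, ...) match
    pool_digests  -- set of digests of files already in the pool
    """
    if stored_md5 is not None:
        if client_md5 == stored_md5 and attrs_match:
            # Digest and attributes agree: normally skip, but with the
            # configured probability also verify block digests so that
            # corruption or collisions are eventually caught.
            if rng() < block_check_probability:
                return "block_check"
            return "skip"
        # Same path but changed content: delta against the existing file.
        return "delta_against_existing"
    if client_md5 in pool_digests:
        # New or renamed file whose content is already in the pool.
        return "delta_against_pool"
    return "full_transfer"
```

With `block_check_probability=0.0` an unchanged file is always skipped on the digest and attribute comparison alone; with `1.0` every unchanged file is block-checked as well.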
The server, however, has no more work to do than for an incremental if files haven't changed.

If you are more cautious you could increase the "block-digest-check" probability to, eg, 1%, 10%, or 100%. The last case would make it behave like 3.x: every file in a full does block digest checking (and consequently full-file digest checking too). However, the client load will be higher since each file will be read twice in this case.

Craig

_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/