Alex Waite (alexq...@gmail.com) wrote on 2 November 2011 20:09:
> Recently I learned that rsync does a checksum of every file
> transferred. I thought it might be interesting to record the path and
> checksum of each file in a table. On future backups, the checksum of
> a file being backed up could be looked up in the table. If there's a
> matching checksum, a hard link will be created to the match instead of
> storing a new copy. This means that the use of hard links won't be
> limited to just the immediately preceding snapshot (as is the case
> with my current setup). Instead a hard link could be created to an
> identical file located in a different machine's snapshot.
...
> Is this approach even possible, or am I missing something? I know
> my labs have a lot of duplicate data across many machines, so this
> could save me hundreds of GiBs, maybe even a TiB or two.

It is possible, but managing it all is up to you; it's not rsync's job.

> If this is possible, how can I save the resulting checksum of a
> file from rsync?

You'll need at least rsync v3 on the source machines and v3.1 on the
backup machine. Set --out-format with %C to have the MD5 written to the
log. Note that rsync only logs the MD5 when it actually transfers the
file (or when you use -c); if it creates a hard link itself, the MD5 is
not computed, so it does not appear in the log.
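
To make the "management is up to you" part concrete, here is a minimal
sketch of the bookkeeping in Python. It rests on assumptions that are not
from the thread: rsync was run with something like --out-format='%C %n',
so each log line is the MD5, one space, then the path relative to the
snapshot root; the checksum table is a plain in-memory dict; and the
names parse_log_line and dedupe are made up for illustration. It is not
a tested backup tool.

```python
import os


def parse_log_line(line):
    """Split one '%C %n' log line into (md5_hex, relative_path), or None."""
    checksum, sep, path = line.rstrip("\n").partition(" ")
    # %C is blank when rsync did not compute a checksum, so such lines
    # start with the separator and fail the length check below.
    if not sep or not path or len(checksum) != 32:
        return None
    if any(c not in "0123456789abcdef" for c in checksum):
        return None
    return checksum, path


def dedupe(log_lines, snapshot_root, index):
    """Replace files whose MD5 is already in `index` with hard links.

    `index` maps md5_hex -> path of a file already stored (assumed layout).
    Returns the number of files that were replaced by a hard link.
    """
    linked = 0
    for line in log_lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue
        checksum, rel = parsed
        new_path = os.path.join(snapshot_root, rel)
        # Only regular files are candidates; skip symlinks and directories.
        if os.path.islink(new_path) or not os.path.isfile(new_path):
            continue
        known = index.get(checksum)
        if known is None or not os.path.isfile(known):
            index[checksum] = new_path  # first copy seen: remember it
            continue
        if os.path.samefile(known, new_path):
            continue  # already the same inode
        # Cheap sanity check before trusting the checksum match.
        if os.path.getsize(known) != os.path.getsize(new_path):
            continue
        # Link under a temporary name, then rename over the new copy so
        # the path never disappears part-way through.
        tmp = new_path + ".dedupe-tmp"
        os.link(known, tmp)
        os.replace(tmp, new_path)
        linked += 1
    return linked
```

You would call dedupe() once per finished snapshot, passing the same
index each time so matches can come from any machine's snapshot. Keep in
mind that hard links only work within one filesystem, and that linked
files share a single inode, so owner, permissions and mtime will be the
same for every linked copy.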