Rsync 3.0.9 here.

I am using an rsync script like this:

"""
rsync -z --numeric-ids -a -H --inplace --delete --delete-excluded \
      --stats --progress -v --itemize-changes SOURCE DESTINATION
"""

I detected the following issue when rsyncing a bunch of Mercurial repositories. It is very dangerous, because it corrupts files.

When cloning a local repository, Mercurial uses hardlinks to save time and disk space. When one of the clones updates a file, the file is "unlinked" and replaced by a new file, so the histories can diverge gracefully.

The problem can be trivially reproduced like this:

1. Create a text file "a.txt" with a bunch of characters inside.

2. Create a hardlink to that file, called "b.txt".

3. Use "rsync -z --numeric-ids -a -H --inplace --delete --delete-excluded --stats --progress -v --itemize-changes SOURCE DESTINATION" to replicate the directory.

4. Verify that a new directory is created, with two files "a.txt" and "b.txt", hardlinked. Nice.

5. Now delete the original "b.txt" and create a new file "b.txt" with new, DIFFERENT content. You now have two different files in the source.

6. Rerun the "rsync" script.

7. In the destination directory you will have two files, "a.txt" and "b.txt". They are still the same file, hardlinked. Both will have the same content: that of either the source "a.txt" *OR* the source "b.txt".

8. Rerun the "rsync" script a few times. Each time, the destination will have two hardlinked files with the same content, alternating between the content of "a.txt" and the content of "b.txt".

So origin and destination will never synchronize (each time you rsync, the destination content alternates), and the destination is "corrupt", since files that are different in the origin are the same file in the destination. Two years of backups are spoiled because of this :-(.

I know that source and destination files can have a different link count for a variety of valid reasons, but rsync should know, when using "-H", that two files hardlinked in the destination are no longer hardlinked in the origin.
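
To make the requested check concrete, here is a minimal Python sketch of it. It is not how rsync implements "-H"; the function names (hardlink_groups, stale_links, build_demo) are made up for illustration, and build_demo only recreates the state after step 5 by hand instead of invoking rsync. The idea is to group destination files by (device, inode) and flag any group whose counterparts in the source no longer share a single inode.

```python
import os
import stat
import tempfile
from collections import defaultdict


def hardlink_groups(root):
    """Map (device, inode) -> relative paths of the regular files under root."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISREG(st.st_mode):
                groups[(st.st_dev, st.st_ino)].append(os.path.relpath(full, root))
    return groups


def stale_links(source, destination):
    """Return groups of paths that are hardlinked together in the destination
    although their counterparts in the source are separate files."""
    stale = []
    for paths in hardlink_groups(destination).values():
        if len(paths) < 2:
            continue  # not hardlinked to anything inside the destination
        source_ids = set()
        for rel in paths:
            try:
                st = os.lstat(os.path.join(source, rel))
            except FileNotFoundError:
                continue  # no counterpart in the source; ignored by this check
            source_ids.add((st.st_dev, st.st_ino))
        if len(source_ids) > 1:
            stale.append(sorted(paths))
    return sorted(stale)


def build_demo(base):
    """Hypothetical helper: recreate the state after step 5 without rsync."""
    src = os.path.join(base, "src")
    dst = os.path.join(base, "dst")
    os.mkdir(src)
    os.mkdir(dst)
    # Source: two independent files with different content.
    with open(os.path.join(src, "a.txt"), "w") as f:
        f.write("original content\n")
    with open(os.path.join(src, "b.txt"), "w") as f:
        f.write("new DIFFERENT content\n")
    # Destination: still one file with two names, as left by the first rsync run.
    with open(os.path.join(dst, "a.txt"), "w") as f:
        f.write("original content\n")
    os.link(os.path.join(dst, "a.txt"), os.path.join(dst, "b.txt"))
    return src, dst


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        src, dst = build_demo(base)
        print(stale_links(src, dst))  # prints [['a.txt', 'b.txt']]
```

A non-empty result lists the destination names whose hardlink would have to be broken before the two files can hold different content. Run against a real SOURCE and DESTINATION pair, the same check could serve as a read-only audit of an existing backup.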
That should be quite easy to detect, since rsync already tracks inodes (when using "-H") and can see that two files hardlinked inside the destination path are not hardlinked in the origin.

Even if I stop using "-H", which I would rather not do, the destination will stay corrupted UNTIL we delete it and start over again. In my particular case, not using "-H" will explode my disk usage, but using "-H" will corrupt the destination.

--
Jesus Cea Avion
j...@jcea.es - http://www.jcea.es/
jabber / xmpp:j...@jabber.org
"Things are not so easy"
"My name is Dump, Core Dump"
"Love is placing your happiness in the happiness of another" - Leibniz