Recently I was thinking about --link-dest= and whether it would be possible to use rsync to de-duplicate two nearly-identical directory structures.

Normally I would use a tool like hardlink, jdupes, or rdfind, but in this case the files are huge and numerous, so hashing them would take forever. I did a test run and these tools mostly choked to death after a few hours.

These directories were made using rsync in the first place, so I know the files are duplicates, and I'm willing to trust rsync's quick-check (same relative path/filename, mtime, and size) to assume the files are identical.
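(As a sanity check of that assumption, a dry run with itemized changes should list anything that fails the quick-check; if the trees really are duplicates, the output should be more or less empty:

    rsync -ani old/ new/

The -n keeps rsync from changing anything, and -i prints a line for each file rsync considers changed.)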

My objective is to hard-link files with the same relative path/filename, mtime, and size. Nothing more. Files which are different should not be touched. Files which exist in the destination but not the source should not be deleted. Files which exist in the source but not the destination should not be transferred.
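In case it helps to see the goal spelled out, here's a rough hand-rolled sketch of the effect I'm after, using find/stat/ln instead of rsync (bash with GNU stat assumed; untested, so I'd only run it on a scratch copy):

    cd new &&
    find . -type f -print0 | while IFS= read -r -d '' f; do
        src="../old/$f"
        [ -f "$src" ] || continue              # only in the destination: leave it
        [ "$f" -ef "$src" ] && continue        # already hard-linked
        a=$(stat -c '%s %Y' "$f")              # size and mtime of the copy
        b=$(stat -c '%s %Y' "$src")
        [ "$a" = "$b" ] && ln -f "$src" "$f"   # replace the copy with a hard link
    done

But forking stat twice per file over this many files is exactly the kind of slowness I'm trying to avoid, which is why I'd rather have rsync do it.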

The problem is that I don't want to create any new files in the destination. That's the sticking point.

I thought maybe I could do something wacky like

    rsync -a --ignore-existing --ignore-non-existing --link-dest="../new/" old/ new

but that doesn't work. The existing files get skipped before --link-dest ever gets a chance to consider them, so nothing is linked.
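For reference, here's a minimal reproduction of the no-op (the demo paths are made up); since every file is skipped by one of the two options, rsync never creates anything, and the two copies keep their separate inodes:

    mkdir -p demo/old demo/new
    echo data > demo/old/file
    cp -p demo/old/file demo/new/file              # same content, mtime, size
    (cd demo && rsync -a --ignore-existing --ignore-non-existing \
        --link-dest="../new/" old/ new)
    stat -c '%i  %n' demo/old/file demo/new/file   # inodes remain different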

Is there a way to do this with rsync?


