On Wed, Jul 01, 2015 at 02:05:50PM +0100, Simon Hobson said:
> As I read this, the default is to look at the file size/timestamp and if they match then do nothing as they are assumed to be identical. So unless you have specified this, then files which have already been copied should be ignored - the check should be quite low in CPU, at least compared to the "cost" of generating a file checksum etc.

This overlooks a problem that many rsync users never hit, because they don't abuse rsync for backups the way us idiots do! :) You have NO IDEA how long it takes to scan 100M files on a 7200 rpm disk. The scan becomes the dominant cost; CPU isn't the issue at all. (Additionally, I would think that metadata scanning could max out only 2 cores anyway: one for rsync's userland, and another for the kernel running the filesystem code that scans inodes.)

This is why throwing away all that metadata seems silly. Keeping detailed logs and parsing them before the copy would be good, but that requires an external selection script that runs before rsync starts and hands rsync a list of files to copy directly. That's unfortunate, because rsync's scan method is quite advanced, yet it doesn't avoid this pitfall.

Additionally, I don't know if Linux (or FreeBSD, or any Unix) can be told to cache metadata more aggressively than data. There's not much point in caching data on a backup server, but caching metadata would be great. I also don't know how much RAM the metadata takes per inode on typical OSes.

/kc
-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
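
For illustration, here is a minimal sketch of the kind of external selection script described above, assuming a JSON manifest of (size, mtime) per file is kept from the previous run instead of parsing rsync's own logs. The manifest format and every function name are hypothetical. The only rsync options used are -a, --from0 and --files-from. It does not handle deletions, and it still walks the source tree; the hope is only that rsync itself then touches just the listed paths.

```python
import json
import os


def load_manifest(path):
    """Return {relative_path: [size, mtime_ns]} saved by the previous run, or {}."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def save_manifest(current, path):
    """Persist the metadata for the next run (call this only after rsync succeeds)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(current, fh)


def scan_source(root):
    """Walk the source tree and record [size, mtime_ns] for every file entry."""
    current = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except OSError:
                continue  # file vanished between listing and stat
            current[os.path.relpath(full, root)] = [st.st_size, st.st_mtime_ns]
    return current


def select_changed(previous, current):
    """Paths that are new, or whose size or mtime differs from the manifest."""
    return sorted(p for p, meta in current.items() if previous.get(p) != meta)


def write_file_list(paths, list_path):
    """Write a NUL-separated list, so odd filenames survive (pairs with --from0)."""
    with open(list_path, "wb") as fh:
        for p in paths:
            fh.write(os.fsencode(p) + b"\0")


def rsync_command(src, dest, list_path):
    """Build the rsync invocation; paths in the list are relative to src."""
    return ["rsync", "-a", "--from0", f"--files-from={list_path}", src, dest]
```

A run would then be: load the manifest, scan the source, write the list from select_changed(), pass rsync_command() to subprocess.run(), and save the new manifest only if rsync exits cleanly.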