On Wed, Jul 01, 2015 at 02:05:50PM +0100, Simon Hobson said:

  >As I read this, the default is to look at the file size/timestamp and if
  they match then do nothing as they are assumed to be identical. So unless
  you have specified this, then files which have already been copied should be
  ignored - the check should be quite low in CPU, at least compared to the
  "cost" of generating a file checksum etc.

This overlooks the fact that most rsync users don't abuse rsync for backups
the way us idiots do! :) You have NO IDEA how long it takes to scan 100M files
on a 7200 rpm disk. The scan becomes the dominant cost - CPU isn't the issue at
all. (Additionally, I would think that metadata scanning could max out only 2
cores anyway - one for rsync's userland process and another for the kernel
running the fs code that scans inodes.)
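
A rough back-of-envelope sketch of why the scan dominates. Only the 100M file
count and the 7200 rpm figure come from the text above; the average seek time
and the number of seeks per file are assumptions, not measurements, and real
filesystems cluster inodes so the true cost depends heavily on layout and
cache state.

```python
# Estimate a cold-cache metadata scan on a spinning disk.
# avg_seek_ms and seeks_per_file are assumed values, not measurements.

def scan_time_seconds(n_files, rpm=7200, avg_seek_ms=8.5, seeks_per_file=1.0):
    """Estimate seconds to stat n_files when each stat costs random I/O."""
    # On average the platter must turn half a revolution before the
    # wanted sector passes under the head.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    per_seek_ms = avg_seek_ms + rotational_latency_ms
    return n_files * seeks_per_file * per_seek_ms / 1000.0

if __name__ == "__main__":
    # Sweep the assumption: from one seek per file (worst case) down to
    # one seek per hundred files (inodes well clustered on disk).
    for seeks in (1.0, 0.1, 0.01):
        secs = scan_time_seconds(100_000_000, seeks_per_file=seeks)
        print(f"{seeks:>5} seeks/file: {secs / 3600:8.1f} hours")
```

With one seek per file the estimate is on the order of two weeks, and even at
one seek per hundred files it is still hours, with the CPU mostly idle.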

This is why throwing away all that metadata seems silly. Keeping detailed logs
and parsing them before the copy would be good, but requires an external
selection script that runs before rsync starts and hands rsync a list of files
to copy directly. That's unfortunate, because rsync's scan method is quite
advanced, yet it doesn't avoid this pitfall.
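
A minimal sketch of such a selection script, under stated assumptions. The log
format (one changed path per line, relative to the source root, with '#'
comment lines) and the file name changed.list are hypothetical; the only rsync
feature relied on is the --files-from option, which makes rsync transfer just
the listed paths instead of walking the whole tree.

```python
import subprocess
import sys

def changed_paths(log_lines):
    """Return unique relative paths from a change log, in first-seen order.

    Assumed (hypothetical) log format: one path per line, relative to the
    source root; blank lines and lines starting with '#' are ignored.
    """
    seen = set()
    result = []
    for line in log_lines:
        path = line.rstrip("\n")
        if not path.strip() or path.startswith("#"):
            continue
        if path not in seen:
            seen.add(path)
            result.append(path)
    return result

def rsync_command(src_root, dest, list_file):
    """Build an rsync invocation that copies only the listed files."""
    # With --files-from, the listed paths are taken relative to src_root.
    return ["rsync", "-a", f"--files-from={list_file}", src_root, dest]

if __name__ == "__main__":
    # Usage: select.py CHANGELOG SRC_ROOT DEST
    log_file, src_root, dest = sys.argv[1:4]
    with open(log_file, encoding="utf-8", errors="surrogateescape") as fh:
        paths = changed_paths(fh)
    with open("changed.list", "w", encoding="utf-8",
              errors="surrogateescape") as out:
        for p in paths:
            out.write(p + "\n")
    subprocess.run(rsync_command(src_root, dest, "changed.list"), check=True)
```

This only helps if something other than a full scan produces the log, and it
does nothing about deletions; it is a sketch of the idea, not a replacement
for rsync's own scan.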

Additionally, I don't know if Linux (or FreeBSD or any Unix) can be told to
cache metadata more aggressively than data. Caching data has little point on a
backup server, but caching metadata would be great. I also don't know how much
RAM the metadata takes per inode on typical OSes.

/kc
-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.