If your goal is to reduce storage, and scanning inodes doesn't matter, use --link-dest for targets. However, that will keep a backup for every time you run it, each one made by link-dest'ing against yesterday's copy.
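To make the storage point concrete, here is a small sketch that uses plain ln as a stand-in for what --link-dest does to an unchanged file. The directory names are made up for the demo and no rsync is involved; it only shows that the two snapshots share a single inode, and that writing through one name changes the other.

```shell
#!/bin/sh
# Stand-in for what --link-dest does to an unchanged file: the new
# snapshot gets a second name for the same inode, not a second copy.
# The snapshot directory names below are made up for this demo.
demo=$(mktemp -d) || exit 1
mkdir "$demo/20150629" "$demo/20150630"
echo "unchanged since yesterday" > "$demo/20150629/file"

# rsync --link-dest would create this hardlink itself; ln imitates it.
ln "$demo/20150629/file" "$demo/20150630/file"

# Both names report the same inode number and a link count of 2.
ls -li "$demo/20150629/file" "$demo/20150630/file"

# The flip side: appending through one name is visible in every snapshot.
echo "edited in place" >> "$demo/20150630/file"
cat "$demo/20150629/file"
```

Remove the temporary directory afterwards with rm -rf "$demo".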
You end up with one backup tree dir per day, with files hardlinked against all the other backup dirs. My solution (and that of many others here) is:

    mv $ancientbackup $today; rsync --del --link-dest=$yest source:$dirs $today

This creates gaps in the older part of the sequence of daily backups, so I end up keeping (very roughly) backups that are 1, 2, 3, 4, 7, 10, 15, 21, 30, 45, 60, 90, 120 and 180 days old. (Of course this isn't exactly how it works: there's some binary counting going on in there, so the elimination isn't exactly like that, and every day each of those gets a day older. There are some Tower of Hanoi-like solutions to this for automated backups.) This means something twice as old has half as many backups for the same time range, so I keep the same frequency*age value for each backup time range into the past.

The result is a set of dirs named by date (20150630 in my case, for example), each of which looks exactly like the actual source tree I backed up, but only takes up the space of the files changed since yesterday.

Caveat: each tree is hardlinked against all the other backups, which is why it uses no more space on disk. HOWEVER, some server software like postfix doesn't like hardlinked files in its spool due to security concerns. So if you boot or use the backup itself without first making a plain copy (making a copy is recommended): 1) postfix et al. will yell, and 2) you will be modifying the whole set of dirs that point to the inodes you just booted/used.

My solution avoids scanning the source twice (which, in my case of backing up 5x 10M files off servers daily, is a huge cost). That's important because the scan takes longer than the backup/transfer itself (over a gigE network, a mere 20,000 changed files per 10M seems average for each of the 5 boxes). Also, it's production gear: spending as little time as possible thrashing the box (and its poor metadata cache) is important for performance. Getting the backups done during the night lull is therefore required.
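The mv-then-rsync step described above could be wrapped up roughly like this. It is only a sketch: the function name, its arguments, the DRY_RUN switch and the added -a flag are illustrative assumptions, not part of the original one-liner, and choosing which old snapshot to recycle (the binary-counting part) is left to the caller.

```shell
#!/bin/sh
# Sketch of one nightly run: recycle an old snapshot directory as today's
# target, then let rsync bring it up to date.
# Hypothetical names: rotate_and_sync, DRY_RUN, and every argument below.

rotate_and_sync() {
    src=$1      # source, e.g. server:/home/ (trailing slash = contents)
    root=$2     # absolute path of the dir holding one subdir per day
    recycle=$3  # old snapshot to reuse as today's, or "" for none
    yest=$4     # name of yesterday's snapshot
    today=$5    # name of today's snapshot, e.g. 20150630

    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = 1 ]; then run=echo; else run=; fi

    # Reusing an old tree means the receiving side does not have to
    # delete and recreate millions of directory entries every night.
    if [ -n "$recycle" ]; then
        $run mv "$root/$recycle" "$root/$today" || return 1
    fi

    # -a (an assumption here) preserves metadata, --del removes files
    # that vanished at the source, and --link-dest lets rsync hardlink
    # files that match yesterday's copy instead of storing them again.
    $run rsync -a --del --link-dest="$root/$yest" "$src" "$root/$today/"
}

# Example call, kept as a comment so that sourcing this file has no effect:
# DRY_RUN=1; rotate_and_sync server:/home/ /backups 20150101 20150629 20150630
```

Running it with DRY_RUN=1 first prints the mv and rsync commands without touching anything. The root argument should be an absolute path, because rsync interprets a relative --link-dest relative to the destination directory.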
I don't have the time (nor the disk RMA cycle patience) to delete 10M files on the receiving side just to spend 5 hours recreating them; 20,000 seems better to me.

You could also use --backup and --backup-dir, but I don't do it that way.

/kc

On Tue, Jun 30, 2015 at 10:32:31AM +0200, Dirk van Deun said:
>Hi,
>
>I used to rsync a /home with thousands of home directories every
>night, although only a hundred or so would be used on a typical day,
>and many of them have not been used for ages. This became too large a
>burden on the poor old destination server, so I switched to a script
>that uses "find -ctime -7" on the source to select recently used homes
>first, and then rsyncs only those. (A week being a more than good
>enough safety margin in case something goes wrong occasionally.)
>
>Is there a smarter way to do this, using rsync only ? I would like to
>use rsync with a cut-off time, saying "if a file is older than this,
>don't even bother checking it on the destination server (and the same
>for directories -- but without ending a recursive traversal)". Now
>I am traversing some directories twice on the source server to lighten
>the burden on the destination server (first find, then rsync).
>
>Best,
>
>Dirk van Deun
>--
>Ceterum censeo Redmond delendum
>--
>Please use reply-all for most replies to avoid omitting the mailing list.
>To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
>Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.