Re: rsync --server command line options
. is the 'current directory' notation in unix. .. is the parent directory.

/kc

On Mon, Oct 08, 2018 at 01:57:09PM -0700, Parke via rsync said:
>Hello,
>
>I ran the following commands:
>
>rsync /tmp/foo remote:
>rsync remote:/tmp/foo .
>
>On the remote computer, the following commands were executed:
>
>rsync --server -e.LsfxC . .
>rsync --server --sender -e.LsfxC . /tmp/foo
>
>Does anyone know what the meaning of the three dots/periods in the
>above two commands is? The first command ends with two dots (". .") and
>the second command has one dot (namely, the dot before /tmp/foo).
>
>(Yes, I know that --server and --sender are intended for internal use
>only. Despite that, I want to try to get two rsync children to talk
>to each other over a pipe created by a non-rsync parent.)
>
>Thank you,
>
>Parke

--
Ken Chase - Heavy Computing Inc. - Guelph Canada

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
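For what it's worth, the pipe-plumbing half of the question can be sketched without rsync at all. Below, `tr` and `cat` stand in for the two `--server` processes so the sketch runs anywhere; the real rsync invocations are shown only as comments, and a real parent would need a bidirectional channel (rsync's protocol is full duplex, so socketpair() rather than two FIFOs). All paths are throwaway.

```shell
#!/bin/sh
# Hedged sketch of a non-rsync parent wiring two children together.
# With real rsync you would exec something like:
#   rsync --server --sender -e.LsfxC . /tmp/foo    # "remote" sender
#   rsync --server -e.LsfxC . .                    # local receiver
# over one bidirectional channel; two FIFOs are used here for simplicity.

d=$(mktemp -d)
mkfifo "$d/up" "$d/down"

tr a-z A-Z < "$d/up" > "$d/down" &    # child A ("sender" stand-in)
printf 'hello\n' > "$d/up" &          # child B ("receiver" stand-in)

cat "$d/down"                         # parent observes: HELLO
wait
rm -rf "$d"
```

As best I understand the server syntax (an assumption, not documented behavior): the lone `.` right after the `-e` flags is a placeholder argument, and the remaining args are the usual source/dest paths, with `.` meaning the remote default directory when no path was given after `remote:`.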
Re: Solution for rsync overall progress data display
with --no-i-r you at least get some idea of the # of files to check.

/kc

On Sun, Nov 26, 2017 at 09:34:25PM +, Simon Hobson via rsync said:
>> I'm looking for a solution to display overall rsync progress on an LCD display as a bargraph.
>> I have found 2 parameters:
>>
>> --progress
>> This option tells rsync to print information showing the
>> progress of the transfer. This gives a bored user something to
>> watch. Implies --verbose if it wasn't already specified.
>>
>> While rsync is transferring a regular file, it updates a
>> progress line that looks like this:
>>
>> 782448 63% 110.64kB/s 0:00:04
>>
>> But they do not show the overall progress during the transfer, which is what I need.
>
>Bear in mind that until the sync is almost finished, rsync does NOT know how much is left to do. AIUI, one thread runs a compare, working down the directory tree and building a list of files that aren't up to date on the target. Another thread then takes files from this list and syncs them.
>
>So at any point in time, there is a queue of files to be synced which is NOT complete, and a process that's syncing those files one at a time. Until the first thread is done, there isn't even a complete list of files, and until the sync is running, there isn't information on how much needs to be transferred for each of those files.
>
>It's well worth reading Andrew Tridgell's PhD thesis, where the algorithm is detailed. It's quite readable and gives a good insight into how rsync works.
>https://www.samba.org/~tridge/phd_thesis.pdf

--
Ken Chase - Heavy Computing Inc. Guelph Canada
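Worth noting for the LCD use case: rsync 3.1 and later also have `--info=progress2`, which prints a single whole-transfer progress line rather than per-file lines. A sketch of pulling the percentage out of that stream follows; the sample line is a mock-up and `draw_bar` is a hypothetical LCD routine, not a real command.

```shell
# Squeeze the N% field out of rsync's carriage-return-separated progress
# stream. Only the parser runs below; the rsync pipeline is a sketch.

pct_of() {
  tr '\r' '\n' | awk '{ for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9]+%$/) { sub(/%/, "", $i); print $i } }'
}

# Intended use (not run here):
#   rsync -a --info=progress2 src/ dst/ | pct_of | while read -r p; do
#       draw_bar "$p"     # hypothetical: render p% on the LCD
#   done

printf '  1,234,567  42%%  110.64kB/s    0:00:04\r\n' | pct_of   # prints 42
```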
Re: How do you exclude a directory that is a symlink?
Considering you can't INCLUDE a directory that is a symlink... which would be really handy right now for me, to resolve a mapping of 103 -> meaningful_name for backups. Instead I'm resorting to temporary bind mounts of 103 onto meaningful_name, and when the bind mount isn't there, the --del is emptying meaningful_name accidentally at times.

I think both situations could benefit from a --resolve-cmd-line-links switch to resolve COMMAND LINE-SUPPLIED symlinks.

http://unix.stackexchange.com/questions/153262/get-rsync-to-dereference-symlinked-dirs-presented-on-cmdline-like-find-h

/kc

On Fri, Mar 03, 2017 at 07:41:10AM -0500, Steve Dondley said:
>A thousand greetings,
>
>I'm trying to rsync a directory from a server to my local machine that has
>a symbolic link to a directory I don't want to download. I have an
>"exclude" option to exclude the symlink, which works fine. However, if I add
>a --copy-links option to the command, it appears to override my "exclude"
>directive and the contents of the symlinked directory get downloaded
>anyway.
>
>I suspect I need some kind of --filter option. I read the documentation (or
>at least tried to) regarding the --filter option, but a mortal, casual user
>like me could not make heads or tails of it.
>
>Thanks.

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
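Until a switch like that exists, one userland workaround is to resolve any command-line-supplied symlink with `readlink -f` before rsync ever sees it. A sketch with throwaway paths; the rsync invocation and hostname are illustrative only:

```shell
# Stand-in for the proposed --resolve-cmd-line-links: dereference the
# cmdline symlink ourselves, then hand rsync the real path.

d=$(mktemp -d)
mkdir "$d/103"                         # real backing directory
ln -s "$d/103" "$d/meaningful_name"    # the symlink we'd pass on the cmdline

src=$(readlink -f "$d/meaningful_name")
echo "$src"                            # .../103 - the resolved target

# rsync -a --del "$src/" backuphost:/backups/meaningful_name/   # sketch
rm -rf "$d"
```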
Re: large rsync fails with assertion error - bug #11506 was #6542 not fixed
I opened a new bug for this but didn't see it come across the list. I have a file tree I can reproduce this on readily (though it's quite huge :/ ). Any way to get around this? This is a major travesty for a backup scripting situation I have; I expect others to bump into this too.

New bug is https://bugzilla.samba.org/show_bug.cgi?id=11506

I can execute specific tests if required and log output; please advise.

/kc

On Wed, Sep 09, 2015 at 01:42:20AM -0400, Ken Chase said:
>Ok, I found a bug about this:
>
>https://bugzilla.samba.org/show_bug.cgi?id=6542
>
>and it says fixed by upgrade. I found a way to upgrade. Using:
>
>rsync version 3.1.1 protocol version 31
>on the receiving side that issues the rsync command, and
>
>rsync version 3.1.1 protocol version 31
>on the remote sending side.
>
>I'm still getting the same thing:
>
>rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.
>
>/kc
>
>On Wed, Sep 09, 2015 at 12:58:30AM -0400, Ken Chase said:
> >rsyncing a tree of perhaps 30M files, getting this:
> >
> >rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.
> >
> >then a bit more output and the parent catches up to the child:
> >
> >rsync: writefd_unbuffered failed to write 8 bytes to message fd [receiver]: Broken pipe (32)
> >rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [receiver=3.0.9]
> >
> >it's from a remote system. No errors visible (kernel or otherwise) on either end.
> >Hints?
> >
> >source:
> >rsync version 3.1.1 protocol version 31
> >
> >dest, where commands are issued from:
> >rsync version 3.0.9 protocol version 30
> >
> >I'll have to try upgrading dest to 3.1.1 but it's not in wheezy-backports
> >and I don't really want to mess with this production machine too much.
> >
> >/kc
> >--
> >Ken Chase - Toronto Canada
Re: large rsync fails with assertion error
Ok, I found a bug about this:

https://bugzilla.samba.org/show_bug.cgi?id=6542

and it says fixed by upgrade. I found a way to upgrade. Using:

rsync version 3.1.1 protocol version 31
on the receiving side that issues the rsync command, and

rsync version 3.1.1 protocol version 31
on the remote sending side.

I'm still getting the same thing:

rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.

/kc

On Wed, Sep 09, 2015 at 12:58:30AM -0400, Ken Chase said:
>rsyncing a tree of perhaps 30M files, getting this:
>
>rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.
>
>then a bit more output and the parent catches up to the child:
>
>rsync: writefd_unbuffered failed to write 8 bytes to message fd [receiver]: Broken pipe (32)
>rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [receiver=3.0.9]
>
>it's from a remote system. No errors visible (kernel or otherwise) on either end.
>Hints?
>
>source:
>rsync version 3.1.1 protocol version 31
>
>dest, where commands are issued from:
>rsync version 3.0.9 protocol version 30
>
>I'll have to try upgrading dest to 3.1.1 but it's not in wheezy-backports
>and I don't really want to mess with this production machine too much.
>
>/kc
>--
>Ken Chase - Toronto Canada
large rsync fails with assertion error
rsyncing a tree of perhaps 30M files, getting this:

rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.

then a bit more output and the parent catches up to the child:

rsync: writefd_unbuffered failed to write 8 bytes to message fd [receiver]: Broken pipe (32)
rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [receiver=3.0.9]

It's from a remote system. No errors visible (kernel or otherwise) on either end. Hints?

source:
rsync version 3.1.1 protocol version 31

dest, where commands are issued from:
rsync version 3.0.9 protocol version 30

I'll have to try upgrading dest to 3.1.1 but it's not in wheezy-backports, and I don't really want to mess with this production machine too much.

/kc
--
Ken Chase - Toronto Canada
Re: [Bug 3099] Please parallelize filesystem scan
I don't understand - scanning metadata is sped up by thrashing the head all over the disk instead of mostly-sequentially scanning through? How does that work out?

/kc

On Fri, Jul 17, 2015 at 02:37:21PM +, samba-b...@samba.org said:
 https://bugzilla.samba.org/show_bug.cgi?id=3099

 --- Comment #8 from Chip Schweiss c...@innovates.com ---
 I would argue that optionally all directory scanning should be made
 parallel. Modern file systems perform best when request queues are kept
 full. The current mode of rsync scanning directories does nothing to take
 advantage of this.

 I currently use scripts to split a couple dozen or so rsync jobs into
 literally 100's of jobs. This reduces execution time from what would be
 days to a couple hours every night.

 There are lots of scripts like this appearing on the net because the
 current state of rsync is inadequate. This ticket could reasonably be
 combined with 5124.

 --
 You are receiving this mail because: You are the QA Contact for the bug.

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
Re: [Bug 3099] Please parallelize filesystem scan
Sounds to me like maintaining the metadata cache is important - and tuning the filesystem to do so would be more beneficial than caching writes, especially with a backup target where a write already written will likely never be read again (and it isn't a big deal if it is, since so few files are changed compared to the total # of inodes to scan). Your report of the minutes for the re-sync shows the unthrashed cache is highly valuable.

So all we need to do is tune the backup target (and even the operational servers themselves) to maintain more metadata. I don't know how much ram is used per inode, but I'd throw in another 4-8 GB just for metadata caching per box, or even more, if it meant scanning was sped up.

(Really, one only needs it on the backup target - if you can run all the backups in parallel, and there are N servers to back up, they can all run at 1/N speed, as long as scanning metadata on the backup target is fast enough to keep up with it all. My total data written is only 20-30GB for example, which at a reasonable speed (even 20-30MB/s, which is slow) is only 15 minutes of total writing. Even 200-300GB changed would be 150 minutes at that rate, and the rate could easily be 4x faster.)

So, tuning caches to prefer metadata seems to be key. How? As we've discussed before, letting the filesystem at it throws away precious metadata cache, so tracking your own changes (since the backup system will never be used for anything else, right? :) would be beneficial. Of course the danger is using the backup system for anything else and changing any of the target info - inconsistencies would crop up and make the backup worthless very quickly.

/kc

On Fri, Jul 17, 2015 at 03:18:02PM +, Schweiss, Chip said:
 Modern file systems have many internal queues, and service many clients
 simultaneously. They arrange their work to maximize throughput in both
 read and write operations. This is the norm on any enterprise file
 system, be it Hitachi, Oracle, Dell, HP, Isilon, etc.

 You will get significantly higher throughput if you hit it with multiple
 threads. These systems have elaborate predictive read-ahead caches and
 perform best when multiple threads hit them.

 Using the test case of a single server with a simple file system such as
 ext3/4 or xfs, no gains will be seen in multithreading rsync. Use an
 enterprise file system with 100's of TBs and the more threads you use the
 faster you will go. Metadata and data on these systems end up across
 100's of disks. Single threads end up severely bound by latency.

 This is why multi-threading should be optional. It doesn't help everyone.

 For example, one of my rsync jobs moving from a ZFS system in St. Louis,
 Missouri to a Hitachi HNAS in Minneapolis, Minnesota has over 100 million
 files. Each day 50 to 100 thousand files get added or updated. A single
 rsync job would take weeks to parse this job and send the changes. I
 split it into 120 jobs and it typically completes in 2 hours when no
 humans are using the systems. A re-sync immediately afterwards, again
 with 120 jobs, scans both ends in minutes.

 -Chip

 -----Original Message-----
 From: rsync [mailto:rsync-boun...@lists.samba.org] On Behalf Of Ken Chase
 Sent: Friday, July 17, 2015 9:51 AM
 To: samba-b...@samba.org
 Cc: rsync...@samba.org
 Subject: Re: [Bug 3099] Please parallelize filesystem scan

 I don't understand - scanning metadata is sped up by thrashing the head
 all over the disk instead of mostly-sequentially scanning through? How
 does that work out?

 /kc

 On Fri, Jul 17, 2015 at 02:37:21PM +, samba-b...@samba.org said:
 https://bugzilla.samba.org/show_bug.cgi?id=3099
 --- Comment #8 from Chip Schweiss c...@innovates.com ---
 I would argue that optionally all directory scanning should be made
 parallel. Modern file systems perform best when request queues are kept
 full. The current mode of rsync scanning directories does nothing to
 take advantage of this. I currently use scripts to split a couple dozen
 or so rsync jobs into literally 100's of jobs. This reduces execution
 time from what would be days to a couple hours every night. There are
 lots of scripts like this appearing on the net because the current state
 of rsync is inadequate. This ticket could reasonably be combined with 5124.

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
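The job-splitting scripts Chip describes usually take roughly this shape: one rsync per top-level directory, N running at a time, via `xargs -P`. In this sketch `echo` keeps it a dry run, and the backup host and destination path are made up:

```shell
# Split one big sync into per-directory jobs, 4 in flight at a time.
# Replace 'echo rsync' with 'rsync' (and real paths) to do it for real.

d=$(mktemp -d)
mkdir "$d/home" "$d/var" "$d/opt" "$d/srv"

ls -1 "$d" | xargs -P 4 -I{} echo rsync -a "$d/{}/" backuphost:/backup/{}/

rm -rf "$d"
```

One job per top-level dir only balances well when the tree is reasonably uniform; heavily skewed trees need a finer split, which is where the "literally 100's of jobs" come from.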
Re: Fwd: rsync --link-dest and --files-from lead by a change list from some file system audit tool (Was: Re: cut-off time for rsync ?)
Yeah, I read somewhere that zfs DOES have separate tuning for metadata and data cache, but I need to read up on that more.

As for heavy block duplication: daily backups of the whole system = a lot of dupe.

/kc

On Thu, Jul 16, 2015 at 05:42:32PM +, Andrew Gideon said:
 On Mon, 13 Jul 2015 17:38:35 -0400, Selva Nair wrote:

 As with any dedup solution, performance does take a hit and it's often
 not worth it unless you have a lot of duplication in the data.

 This is so only in some volumes in our case, but it appears that zfs
 permits this to be enabled/disabled on a per-volume basis. That would
 work for us.

 Is there a way to save cycles by offering zfs a hint as to where a
 previous copy of a file's blocks may be found?

 - Andrew

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
Re: rsync --link-dest and --files-from lead by a change list from some file system audit tool (Was: Re: cut-off time for rsync ?)
And what's performance like? I've heard lots of COW systems' performance drops through the floor when there are many snapshots.

/kc

On Tue, Jul 14, 2015 at 08:59:25AM +0200, Paul Slootman said:
 On Mon 13 Jul 2015, Andrew Gideon wrote:

 On the other hand, I do confess that I am sometimes miffed at the waste
 involved in a small change to a very large file. Rsync is smart about
 moving minimal data, but it still stores an entire new copy of the file.
 What's needed is a file system that can do what hard links do, but at
 the file page level. I imagine that this would work using the same Copy
 On Write logic used in managing memory pages after a fork().

 btrfs has support for this: you make a backup, then create a btrfs
 snapshot of the filesystem (or directory), then the next time you make a
 new backup with rsync, use --inplace so that just the changed parts of
 the file are written to the same blocks, and btrfs will take care of the
 copy-on-write part.

 Paul

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
Re: rsync --link-dest and --files-from lead by a change list from some file system audit tool (Was: Re: cut-off time for rsync ?)
inotifywatch or equivalent; there's FSM stuff (filesystem monitor) as well.

ConstantData had a product we used years ago - a kernel module that dumped a list of any changed files out some /proc or /dev/* device, and they had a whole toolset that ate the list (into some db) and played it out as it constantly tried to keep up with replication to a target (kinda like drbd but async). They got eaten by some large backup company and the product was later priced at 5x what we had paid for it (in the mid $x000s/y). This 2003-4 technology is certainly available in some format now.

If you only copy the changes, you're likely saving a lot of time.

/kc

On Mon, Jul 13, 2015 at 01:53:43PM +, Andrew Gideon said:
 On Mon, 13 Jul 2015 02:19:23 +0000, Andrew Gideon wrote:

 Look at tools like inotifywait, auditd, or kfsmd to see what's easily
 available to you and what best fits your needs. [Though I'd also be
 surprised if nobody has fed audit information into rsync before; your
 need doesn't seem all that unusual given ever-growing disk storage.]

 I wanted to take this a bit further. I've thought, on and off, about this
 for a while and I always get stuck.

 I use rsync with --link-dest as a backup tool. For various reasons, this
 is not something I want to give up. But, esp. for some very large file
 systems, doing something that avoids the scan would be desirable.

 I should also add that I mistrust time-stamp, and even time-stamp plus
 file-size, mechanisms for detecting changes. Checksums, on the other
 hand, are prohibitively expensive for backup of large file systems.

 These both bring me to the idea of using some file system auditing
 mechanism to drive - perhaps with an --include-from or --files-from -
 what rsync moves.

 Where I get stuck is that I cannot envision how I can provide rsync with
 a limited list of files to move that doesn't deny the benefit of
 --link-dest: a complete snapshot of the old file system via [hard] links
 into a prior snapshot for those files that are unchanged.

 Has anyone done something of this sort? I'd thought of preceding the
 rsync with a cp -Rl on the destination from the old snapshot to the new
 snapshot, but I still think that this will break in the face of hard
 links (to a file not in the --files-from list) or a change to file
 attributes (i.e. a chmod would affect the copy of a file in the old
 snapshot).

 Thanks...

 Andrew

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
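The change-list-driven idea can be simulated with plain find, no audit subsystem needed: keep a marker file from the previous run and hand only newer files to rsync via `--files-from`. The rsync line is a sketch with a made-up host, and Andrew's caveat stands - `--files-from` alone does not preserve the full-snapshot property of `--link-dest`:

```shell
# Build a change list relative to the last run's marker file.

d=$(mktemp -d); cd "$d"
mkdir src
touch marker                 # pretend this was left by the previous run
sleep 1                      # ensure mtimes land after the marker
echo new > src/changed.txt   # the only change since then

( cd src && find . -type f -newer ../marker ) > changed.list
cat changed.list             # ./changed.txt

# rsync -a --files-from=changed.list src/ backuphost:/backup/today/  # sketch
cd /; rm -rf "$d"
```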
Re: [Bug 11378] Please add a '--line-buffered' option to rsync to make logging/output more friendly with pipes/syslog/CI systems/etc.
Imagine it: all those updates when transferring large files with -P, 100,000 lines of progress PER file...

/kc

On Sat, Jul 04, 2015 at 06:56:21PM +, samba-b...@samba.org said:
 https://bugzilla.samba.org/show_bug.cgi?id=11378

 --- Comment #3 from Karl O. Pinc k...@meme.com ---
 On Sat, 04 Jul 2015 17:56:25 + samba-b...@samba.org wrote:

 --- Comment #2 from Nathan Neulinger nn...@neulinger.org ---
 Perhaps the naming is not correct on my suggested option (and I'll
 admit, I completely missed the outbuf option) - unfortunately, outbuf
 doesn't actually solve the problem. The goal is to get incremental
 progress output while running rsync through a build system or similar.

 What would happen if you piped the rsync output through tr and changed
 \r to \n?

 Karl k...@meme.com
 Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein

 --
 You are receiving this mail because: You are the QA Contact for the bug.

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
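Karl's tr suggestion, tried on canned input: `-P`/`--progress` rewrites a single line with carriage returns, so piping through `tr '\r' '\n'` turns every update into its own log line. The progress text below is a mock-up, not real rsync output:

```shell
# Each \r-separated progress update becomes one line in the log.
printf '  32768  10%%\r  65536  20%%\r  98304  30%%\n' | tr '\r' '\n'
```

That is exactly the 100,000-lines-per-file flood described above; useful in CI logs, painful on a terminal.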
Re: cut-off time for rsync ?
Yes, if rsync could keep a 'last state file' that'd be great - which would require that the target be unchanged by any other process/usage. This is however the case with many of our uses here, as a backup-only target. Then it could just load the target statefile, and only scan the source for changes vs the last-state file.

Can't think of any way around this issue with rsync alone, without some external parsing of previous logs, etc.

This is unfortunately why I never use 5400/5900 rpm disks on my backup targets, and use raid 10 not 5, for speed. A little more $ in the end, but necessary to scan 50-80M inodes per night in my ~6hr backup window.

/kc

On Thu, Jul 02, 2015 at 11:43:37AM +0200, Dirk van Deun said:
 What is taking time, scanning inodes on the destination, or recopying
 the entire backup because of either source read speed, target write
 speed or a slow interconnect between them?

 It takes hours to traverse all these directories with loads of small
 files on the backup server. That is the limiting factor. Not even
 copying: just checking the timestamp and size of the old copies.

 The source server is the actual live system, which has fast disks, so I
 can afford to move the burden to the source side, using the find utility
 to select homes that have been touched recently and using rsync only on
 these. But it would be nice if a clever invocation of rsync could remove
 the extra burden entirely.

 Dirk van Deun
 --
 Ceterum censeo Redmond delendum

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
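The 'last state file' idea can be approximated outside rsync with GNU find and comm: snapshot (path, size, mtime) per file, diff against the previous snapshot, and sync only what changed. GNU find's `-printf` is assumed; the rsync line is a sketch with a made-up host:

```shell
# Maintain a state file of the source tree; changed/new files fall out of
# the diff. Valid only if nothing else mutates the target between runs.

d=$(mktemp -d); cd "$d"
mkdir src; echo a > src/a; echo b > src/b

scan() { find src -type f -printf '%p\t%s\t%T@\n' | sort; }

scan > state.old                 # pretend this was saved by the last run
echo bb >> src/b                 # mutate one file

scan > state.new
comm -13 state.old state.new | cut -f1 > changed.list
cat changed.list                 # src/b

# rsync -a --files-from=changed.list ./ backuphost:/backup/   # sketch
cd /; rm -rf "$d"
```

Deleted files would need `comm -23` handled separately, and this trusts mtime/size the same way rsync's quick check does.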
Re: cut-off time for rsync ?
On Wed, Jul 01, 2015 at 02:05:50PM +0100, Simon Hobson said:
 As I read this, the default is to look at the file size/timestamp and if
 they match then do nothing, as they are assumed to be identical. So
 unless you have specified this, then files which have already been copied
 should be ignored - the check should be quite low in CPU, at least
 compared to the cost of generating a file checksum etc.

This belies the issue of many rsync users not sufficiently abusing rsync to do backups like us idiots do! :)

You have NO IDEA how long it takes to scan 100M files on a 7200 rpm disk. It becomes the dominant issue - CPU isn't the issue at all. (Additionally, I would think that metadata scanning could max out only 2 cores anyway - 1 for rsync's userland gobbling, another core of kernel scanning the fs inodes.)

This is why throwing away all that metadata seems silly. Keeping detailed logs and parsing them before the copy would be good, but requires an external selection script before rsync starts, the script handing rsync a list of files to copy directly. Unfortunate, because rsync's scan method is quite advanced, but it doesn't avoid this pitfall.

Additionally, I don't know if linux (or freebsd or any unix) can be told to cache metadata more aggressively than data - not much point in the latter on a backup server. The former would be great. I don't know how big metadata is in ram either, per inode, for typical OS's.

/kc
--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
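On the "cache metadata more aggressively than data" question: Linux does expose at least one knob in that direction. `vm.vfs_cache_pressure` biases page-cache reclaim for the dentry/inode (metadata) caches; values below the default of 100 make the kernel hold metadata longer at the expense of data pages. The value below is illustrative, not a tuned recommendation:

```
# /etc/sysctl.d/99-backup-metadata.conf -- illustrative value only.
# Below the default of 100, the kernel reclaims dentry/inode (metadata)
# cache less aggressively relative to data pages. On a backup target where
# written data is rarely re-read, this biases ram toward the nightly
# inode scan described above.
vm.vfs_cache_pressure = 50
```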
Re: [Bug 11378] New: Please add a '--line-buffered' option to rsync to make logging/output more friendly with pipes/syslog/CI systems/etc.
How about (via @andrewTO):

alias unbuf='stdbuf -i0 -o0 -e0'

then:

unbuf rsync ...

I have not tested this in any way.

--progress would be some interesting stuff to parse, esp. with all the screen redrawing of the K/s line, as well as background deletes and scans overwriting while --progress of the previous file occurs. Ever try to parse ANSI screendraw output?

/kc

On Thu, Jul 02, 2015 at 06:00:19PM +, samba-b...@samba.org said:
 https://bugzilla.samba.org/show_bug.cgi?id=11378

 Bug ID: 11378
 Summary: Please add a '--line-buffered' option to rsync to make
 logging/output more friendly with pipes/syslog/CI systems/etc.
 Product: rsync
 Version: 3.1.1
 Hardware: All
 OS: All
 Status: NEW
 Severity: enhancement
 Priority: P5
 Component: core
 Assignee: way...@samba.org
 Reporter: nn...@neulinger.org
 QA Contact: rsync...@samba.org

 Created attachment 11225
 -- https://bugzilla.samba.org/attachment.cgi?id=11225&action=edit
 patch to implement --line-buffered option

 The behavior change with --line-buffered would be primarily to --progress
 - which would output a newline after each percentage update instead of
 just a carriage return. During a normal operation with smaller files
 you'd never notice the difference, but with large files (say a recurrent
 sync of ISO images or similar) you'd get a MUCH more usable output trace
 in the build system and logs instead of it all being merged onto one line
 of output.

 --
 You are receiving this mail because: You are the QA Contact for the bug.

--
Ken Chase - k...@heavycomputing.ca Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
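Poking at the unbuf idea with a generic producer (still untested against rsync itself): stdbuf forces the child's stdio buffering mode, so a filter that would normally block-buffer into a pipe flushes per line. The alias above uses -o0 (fully unbuffered); -oL shown here is the gentler line-buffered form, and rsync >= 3.1 also carries its own --outbuf knob per the bug thread:

```shell
# stdbuf changes the buffering of whatever command it wraps.
unbuf() { stdbuf -i0 -oL -eL "$@"; }

# grep would block-buffer into the pipe; with stdbuf each line is flushed
# as it is produced. The content is identical either way:
printf 'one\ntwo\n' | unbuf grep -n '' | cat
```

Note stdbuf only helps programs that use stdio buffering; it cannot turn rsync's \r-overwritten progress line into separate lines - that still needs tr or the proposed --line-buffered option.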
Re: cut-off time for rsync ?
What is taking time: scanning inodes on the destination, or recopying the entire backup because of either source read speed, target write speed or a slow interconnect between them? Do you keep a full new backup every day, or are you just overwriting the target directory?

/kc

On Wed, Jul 01, 2015 at 10:06:57AM +0200, Dirk van Deun said:
 If your goal is to reduce storage, and scanning inodes doesn't matter,
 use --link-dest for targets. However, that'll keep a backup for every
 time that you run it, by link-desting yesterday's copy.

 The goal was not to reduce storage, it was to reduce work. A full rsync
 takes more than the whole night, and the destination server is almost
 unusable for anything else when it is doing its rsyncs. I am sorry if
 this was unclear.

 I just want to give rsync a hint that comparing files and directories
 that are older than one week on the source side is a waste of time and
 effort, as the rsync is done every day, so they can safely be assumed to
 be in sync already.

 Dirk van Deun
 --
 Ceterum censeo Redmond delendum

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
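The cut-off Dirk wants can be done outside rsync, much as he describes: pre-select source files touched in the last week with find and hand only those to rsync. The selection is exercised below (GNU touch/find assumed for `-d` and `-mtime`); the rsync line is a sketch with a made-up host:

```shell
# Select only files modified within the last 7 days.

d=$(mktemp -d); cd "$d"
mkdir src
touch src/fresh.txt
touch -d '30 days ago' src/stale.txt   # GNU touch

find src -type f -mtime -7 > recent.list
cat recent.list                        # src/fresh.txt only

# rsync -a --files-from=recent.list ./ backupserver:/backup/   # sketch
cd /; rm -rf "$d"
```

The known trade-off: anything older than the cut-off that was missed by a previous failed run will never be picked up, which is exactly the trust-the-mtime gamble discussed elsewhere in this thread.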
Re: cut-off time for rsync ?
If your goal is to reduce storage, and scanning inodes doesnt matter, use --link-dest for targets. However, that'll keep a backup for every time that you run it, by link-desting yesterday's copy. You end up with one backup tree dir per day, with files hardlinked against all the other backup dirs.

My (and many others' here) solution is:

mv $ancientbackup $today; rsync --del --link-dest=$yest source:$dirs $today

creating gaps in the ancient sequence of days of backups - so I end up keeping (very roughly) 1, 2, 3, 4, 7, 10, 15, 21, 30, 45, 60, 90, 120, 180 day-old backups. (Of course this isnt exactly how it works; there's some binary counting going on in there, so the elimination isnt exactly like that - every day each of those gets a day older. There are some tower-of-hanoi-like solutions to this for automated backups.) This means something twice as old has twice as few backups for the same time range, meaning I keep the same frequency*age value for each backup timerange into the past.

The result is a set of dirs dated (in my case) 20150630 for eg, which looks exactly like the actual source tree I backed up, but only taking up the space of files changed since yesterday. (Caveat: it's hardlinked against all the other backups, thus using no more space on disk. HOWEVER, some server software like postfix doesnt like hardlinked files in its spool due to security concerns - so if you boot/use the backup itself without making a plain copy (which is recommended): 1) postfix et al will yell; 2) you will be modifying the whole set of dirs that point to the inode you just booted/used.)

My solution avoids scanning the source twice (which in my case of backing up 5x 10M files off servers daily is a huge cost), important because the scan time takes longer than the backup/xfer time (gigE network; a mere 20,000 changed files per 10M seems average per box of 5). Also it's production gear - as little time as possible thrashing the box (and its poor metadata cache) is important for performance.

Getting the backups done during the night lull is therefore required. I dont have time to delete (nor the disk RMA cycle patience for) 10M files on the receiving side just to spend 5 hours recreating them; 20,000 seems better to me.

You could also use --backup and --backup-dir, but I dont do it that way.

/kc

On Tue, Jun 30, 2015 at 10:32:31AM +0200, Dirk van Deun said:

>Hi,
>
>I used to rsync a /home with thousands of home directories every night,
>although only a hundred or so would be used on a typical day, and many of
>them have not been used for ages. This became too large a burden on the
>poor old destination server, so I switched to a script that uses find
>-ctime -7 on the source to select recently used homes first, and then
>rsyncs only those. (A week being a more than good enough safety margin in
>case something goes wrong occasionally.)
>
>Is there a smarter way to do this, using rsync only? I would like to use
>rsync with a cut-off time, saying if a file is older than this, don't even
>bother checking it on the destination server (and the same for directories
>-- but without ending a recursive traversal). Now I am traversing some
>directories twice on the source server to lighten the burden on the
>destination server (first find, then rsync).
>
>Best,
>
>Dirk van Deun
>--
>Ceterum censeo Redmond delendum

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
Re: rsync very slow with large include/exclude file list
This is similar to using fuzzy / -y in a large directory: O(n^2) behaviour occurs and can be incredibly slow. No caching of md5's for the directory occurs, it would seem (or even so, there are O(n^2) comparisons).

/kc

On Mon, Jun 15, 2015 at 06:02:14PM -0500, ray vantassle said:

>I investigated the rsync code and found the reason why. For every file in
>the source, it searches the entire filter-list looking to see if that
>filename is on the exclude/include list. Most aren't, so it compares
>(350K - 72K) * 72K names (the non-listed files) plus (72K * 72K/2) names
>(the ones that are listed), for a total of about 22,608,000,000 strcmp's.
>That's 22 BILLION comparisons. (I may have left off a zero there, it
>might be 220 B.)
>
>I'm working on a fix to improve this. The first phase was to just improve
>the existing code without changing the methodology. The set I've been
>testing with is local-local machine, dry-run, 216K files in the source
>directory, 25,000 files in the exclude-from list. The original rsync
>takes 488 seconds. The improved code takes 300 seconds.
>
>The next phase was to improve the algorithm of handling large
>filter-lists: change the unsorted linear search to a sorted binary search
>(skiplist). This improved code takes 2 seconds.
>
>The original code does 4,492,304,682 strcmp's. The fully improved code
>does 6,472,564 - 98.5% fewer.
>
>I am cleaning up the code and will submit a patchfile soon.

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
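Not rsync's internals, but a shell illustration of why a sorted structure wins here: comm(1) matches two sorted lists in one linear merge pass, instead of one full scan of the filter list per source file. The file names are hypothetical.

```shell
# Intersect a source file list with an exclude list in O(n log n) total:
# two sorts, then a single merge pass, instead of
# len(files) * len(excludes) string comparisons.
sort all-files.txt > files.sorted
sort excludes.txt  > excl.sorted
comm -12 files.sorted excl.sorted > excluded-hits.txt   # lines in both
```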
feature request: rsync dereference symlinks on cmdline
This post explains most of what I want:

http://unix.stackexchange.com/questions/153262/get-rsync-to-dereference-symlinked-dirs-presented-on-cmdline-like-find-h

but basically, I'm looking for a find -H option for rsync.

The reason is so that I can hit a source (or target!) dir in rsync by making a nice dir of symlink maps. For eg openVZ names its containers with ID#s, which isnt very conducive to careful handling/recognition:

100/ 101/ 102/ 103/ etc

I'd like to create a dir of symlinks, a map (I think this would work on the target too?):

customer1 -> ../production/100
customer2 -> ../production/101
customer3 -> ../production/102

and have rsync write dirs customer1/ customer2/ customer3/ in the target.

Obviously I could do this by iterating over the source's 100 101 102 and pointing at custom target names, etc, but that gets tedious and requires manually updating the script to pick up any new sources that are added.

Obviously I dont want --copy-links, as I want only those links mentioned on the command line to be dereferenced, not those inside the tree.

/kc

--
Ken Chase - Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
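In the meantime, a minimal workaround sketch: iterate the map dir and append /. to each entry, which makes path resolution dereference the command-line symlink before rsync sees it. The map dir, target host, and paths are placeholder names.

```shell
# For each symlink in the map, sync the directory it points to under the
# link's own name; the trailing /. dereferences the link, so rsync is
# handed a real directory and links inside the tree are untouched.
for link in /backup-map/*; do
    name=$(basename "$link")
    rsync -a "$link/." "target:/backups/$name/"
done
```

New containers then only need a new symlink in the map, not a script edit.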
Re: rsync --delete
Wow, it took me a few seconds to figure out what you were trying to do. What's wrong with rm?

Also, I think trying to leverage the side effect of disqualifying all source files just to get the delete (very clever but somewhat obtuse!) risks creating a temporary file of some kind in the target at the start of the operation - and if you cant even mkdir, then that exceeds the disk quota immediately and fails.

/kc

On Thu, Apr 16, 2015 at 12:20:52PM +0300, ? ?? said:

>Hi, Rsync.
>
>I want rsync to help me delete a folder with a large number of files and
>folders. Tried this:
>
>rsync -a --no-D --delete /dev/null /home/rc-41/data/061/2015-04-01-07-04/
>skipping non-regular file null
>
>rsync -a --no-D --delete /dev/zero /home/rc-41/data/061/2015-04-01-07-04/
>skipping non-regular file zero
>
>This is how it turns out:
>
>rsync -a --delete /empty_folder/ /home/rc-41/data/061/2015-04-01-07-04/
>
>But this option is not satisfactory: if the disk is 100% filled, creating
>the empty folder does not work -
>
>mkdir /empty_folder/
>Disk quota exceeded
>
>Got an error.
>
>find /home/rc-41/data/061/2015-04-01-07-04/ -delete is not suitable, I
>know. rm -rf /home/rc-41/data/061/2015-04-01-07-04/ is also not suitable.
>
>How to do it differently?
>
>--
>Sincerely, Dugin Sergey mailto: d...@qwarta.ru QWARTA

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
Re: rsync --delete
The problem is he's trying to rsync into the target dir and have the side effect of delete. So an empty dir would necessarily need to be in the target, of course, and thus created there, triggering the quota block. He tried to avoid this by using device files and then blocking all device files, but I think rsync figures out first that there's nothing to do, so it just stops and doesnt do the delete. Wonder if --delete-before would help there, perhaps.

However, this is a REALLY obtuse way of running rm. Unless of course he's trying to inject some kinda options into a script that can only run rsync, or something wonky like that.

/kc

On Thu, Apr 16, 2015 at 11:23:59AM -0400, Kevin Korb said:

>I don't understand what is wrong with rm either. But if you must have an
>empty directory, is there a tmpfs where you can make one? Is there
>already an empty one like /var/empty?
>
>On 04/16/2015 10:13 AM, Ken Chase wrote:
>> Wow, it took me a few seconds to figure out what you were trying to do.
>> What's wrong with rm?
>>
>> Also I think trying to leverage the side effect of disqualifying all
>> source files just to get the delete (very clever but somewhat obtuse!)
>> risks creating a temporary file of some kind in the target at the start
>> of the operation, and if you cant even mkdir then that exceeds disk
>> quota immediately and fails.
>
>--
>Kevin Korb                    Phone:    (407) 252-6853
>Systems Administrator         Internet:
>FutureQuest, Inc.             ke...@futurequest.net  (work)
>Orlando, Florida              k...@sanitarium.net (personal)
>Web page:                     http://www.sanitarium.net/
>PGP public key available on web site.

--
Ken Chase - k...@heavycomputing.ca
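For the record, the trick in its usual form - a sketch that assumes /tmp (or some tmpfs) lives on a different, non-full filesystem, so the empty source dir can actually be created; the target path is the one from the thread.

```shell
# Delete everything under the target by syncing an empty dir over it
# with --delete; the empty dir is created off the quota'd filesystem.
empty=$(mktemp -d)           # typically on /tmp, a separate filesystem
rsync -a --delete "$empty/" /home/rc-41/data/061/2015-04-01-07-04/
rmdir "$empty"
```

This sidesteps the "Disk quota exceeded" mkdir failure, though as noted above it remains an obtuse substitute for rm -rf.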
Re: Recycling directories and backup performance. Was: Re: rsync --link-dest won't link even if existing file is out of date (fwd)
How do you handle snapshotting? Or do you leave that to the block/fs virtualization layer?

/kc

On Fri, Apr 17, 2015 at 01:35:27PM +1200, Henri Shustak said:

>> Our backup procedures have provision for looking back at previous
>> directories, but there is not much to be gained with recycled
>> directories. Without recycling, and after a failure, the latest
>> available backup may not have much in it.
>
>Just wanted to point out that LBackup has a number of checks in place to
>detect failures during a backup. If this happens, then that backup is not
>labeled as a successful snapshot. At present, when the next snapshot is
>started, the previous incomplete snapshot(s) are not used as a link-dest
>source.
>
>As mentioned, this is something I have been looking at for a while.
>However, there are some edge cases which need to be handled carefully if
>you use incomplete backups as a link-dest source. I am sure these
>problems are all tractable; I have simply not spent enough time on them.
>
>---
>This email is protected by LBackup, an open source backup solution.
>http://www.lbackup.org

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
Re: Can I let rsync only transfer a part of file within specific byte ranges?
rsync doesnt do that. Why not use a range GET with an HTTP server and wget client, or just ssh:

ssh remotehost 'dd if=file bs=500 count=1' > file

?

/kc

On Wed, Apr 15, 2015 at 11:02:36AM +0000, Hongyi Zhao said:

>Hi all,
>
>Suppose I have a file on the remote rsync server:
>
>rsync://path/to/myfile
>
>And I want to retrieve only a part of the file, based on a range of
>bytes, to my local host - say 0-499, meaning only transfer the first 500
>bytes of that file. Is this possible with the rsync client?
>
>Regards
>--
>.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
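A sketch of the dd route for an arbitrary range (host and path are placeholders): bs sets the range size, count=1 reads one block, and a skip= would select a starting offset for ranges not beginning at byte 0.

```shell
# Fetch bytes 0-499 of a remote file over ssh; bs is the range length.
ssh remotehost 'dd if=/path/to/myfile bs=500 count=1 2>/dev/null' > myfile.part

# If the same file is also exported over HTTP, a range request does it
# (server permitting):
wget --header='Range: bytes=0-499' -O myfile.part http://remotehost/myfile
```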
Re: rsync --link-dest won't link even if existing file is out of date
This has been a consideration. But it pains me that a tiny change/addition to the rsync option set would save much time and space for other legit use cases. We know rsync very well; we dont know ZFS very well (licensing kept the tech out of our linux-centric operations). We've been using it, but we're not experts yet.

Thanks for the suggestion.

/kc

On Mon, Apr 06, 2015 at 12:07:05PM -0400, Kevin Korb said:

>Since you are in an environment with millions of files, I highly
>recommend that you move to ZFS storage and use ZFS's subvolume snapshots
>instead of --link-dest. It is much more space efficient, rsync run time
>efficient, and the old backups can be deleted in seconds. Rsync doesn't
>have to understand anything about ZFS. You just rsync to the same
>directory every time and have ZFS do a snapshot on that directory between
>runs.
>
>On 04/06/2015 01:51 AM, Ken Chase wrote:
>> Feature request: allow --link-dest dir to be linked to even if file
>> exists in target.

--
Ken Chase - ken att heavycomputing.ca Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
rsync --link-dest won't link even if existing file is out of date
Feature request: allow a --link-dest dir to be linked to even if the file exists in the target.

This statement from the man page is adhered to too strongly, IMHO:

"This option works best when copying into an empty destination hierarchy, as rsync treats existing files as definitive (so it never looks in the link-dest dirs when a destination file already exists)."

I was surprised by this behaviour, as generally the scheme is to be efficient/save space with rsync. When the file is out of date but exists in the --l-d target, it would be great if it could be removed and linked. If an option was supplied to request this behaviour, I'd actually throw some money at making it happen. (And a further option to retain a copy if inode permissions/ownership would otherwise be changed.)

Reasoning: I back up many servers with --link-dest that have filesystems of 10M+ files on them. I do not delete old backups - which would take 60 min per tree or more - just so rsync can recreate them all in an empty target dir when 1% of files change per day (takes 3-5 hrs per backup!).

Instead, I cycle them in with mv $olddate $today, then rsync --del --link-dest over them - takes 30-60 min depending. (Yes, some malleability-of-permissions risk there; mostly interested in contents tho.)

Problem is, if a file exists AT ALL, even out of date, a new copy is put overtop of it, per the above man page decree. Thus much more disk space is used. Running this scheme with old backups moved into place to be written overtop of accumulates many copies of the exact same file over time.

Running pax -rpl over the copies before rsyncing to them works (and saves much space!), but takes a very long time, as it traverses and compares 2 large backup trees while thrashing the same device (on the order of 3-5x the rsync's time: 3-5 hrs for pax. hardlink(1) is far worse - I suspect some non-linear algorithm therein; it ran 3-5x slower than pax again).

I have detailed an example of this scenario at

http://unix.stackexchange.com/questions/193308/rsyncs-link-dest-option-does-not-link-identical-files-if-an-old-file-exists

which also indicates --delete-before and --whole-file do not help at all.

/kc

--
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
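The pre-linking pass described above, roughly sketched. Directory names are placeholders; the post invokes pax as -rpl, shown here in pax's read-write link form (-rwl), with GNU cp -al as an equivalent.

```shell
# Hardlink-copy yesterday's tree into today's slot before rsyncing over
# it, so files already present get linked rather than re-stored.
# Directory names and the source host/path are placeholders.
cd /backups
mkdir 20150630
(cd 20150629 && pax -rwl . ../20150630)
# GNU equivalent, without the pre-mkdir: cp -al 20150629 20150630
rsync -a --del --link-dest=/backups/20150629 source:/data/ 20150630/
```

As the post says, the pax traversal itself is the expensive part on trees of this size - the sketch just makes the ordering of the steps concrete.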