Re: purge-empty-dirs and max-file-size confusion
On Fri, Apr 24, 2009 at 02:19:42PM -0400, Ian! D. Allen wrote:
> There is no mention of the concept of "transfer rule" in the rsync man page.
> I offer some proposed man page wording changes, below.

Thanks. I have committed some manpage changes that clarify this unexpected behavior. At some point rsync may allow actual filtering of files by their (non-name) attributes, which would avoid this situation.

..wayne..
--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: purge-empty-dirs and max-file-size confusion
On Fri, Apr 24, 2009 at 02:19:41PM -0400, Ian! D. Allen wrote:
> On Fri, Apr 24, 2009 at 07:51:35AM -0700, Wayne Davison wrote:
> > This is because --min-size is a transfer rule, not an exclude rule.
> There is no mention of the concept of "transfer rule" in the rsync man page.

There is another oblique reference to "transfer rule" in --compare-dest for which I offer this man page clarification:

--compare-dest=DIR
    This transfer rule instructs rsync to use DIR on the destination machine as an additional hierarchy to compare destination files against doing transfers (if the files are missing in the destination directory). If a file is found in DIR that is identical to the sender's file, the file will NOT be transferred to the destination directory. This is useful for creating a sparse backup of just files that have changed from an earlier backup, though all the directories in the file-list will still be created (most of them likely empty). Unlike a filter/exclude rule, this option does not affect the file-list, so --prune-empty-dirs will not work with this option.

-m, --prune-empty-dirs
    This option tells the receiving rsync to get rid of empty directories from the file-list, including nested directories that have no non-directory children. This is useful for avoiding the creation of a bunch of useless directories when the sending rsync is recursively scanning a hierarchy of files using include/exclude/filter rules. It does not prevent the creation of empty directories that result from the use of transfer rules such as --max-size, --min-size, or --compare-dest, since transfer rules do not affect the file-list.

--
| Ian! D. Allen - idal...@idallen.ca - Ottawa, Ontario, Canada
| Home Page: http://idallen.com/ Contact Improv: http://contactimprov.ca/
| College professor (Open Source / Linux) via: http://teaching.idallen.com/
| Defend digital freedom: http://eff.org/ and have fun: http://fools.ca/
Re: purge-empty-dirs and max-file-size confusion
On Thu 23 Apr 2009, Ian! D. Allen wrote:
> In the man page it says in one place "tells the receiving rsync to get rid of empty directories from the file-list" and in another place it says "prune empty directory chains from file-list". The latter sounds like it operates on the source list, not on the receiving list, and if rsync were

Actually, to me it sounds quite like the same thing.

I don't think the intention is to actually delete empty directories at the receiving end; only to prevent them being created. So once they're created due to perhaps an earlier invocation without purge-empty-dirs, you'll have to remove them by hand.

Paul
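For the "remove them by hand" step mentioned above, a minimal sketch follows. It is not from the thread: the function name prune_empty_dirs is hypothetical, and it assumes a find that supports the -mindepth and -empty extensions (GNU and BSD find do; POSIX does not require them).

```shell
#!/bin/sh
# Hypothetical helper (not part of rsync): remove every empty directory
# below $1, including chains of nested empty directories, but keep $1 itself.
prune_empty_dirs() {
    # -depth visits children before their parents, so a directory that
    # becomes empty once its empty subdirectories are gone is removed too.
    # -mindepth 1 keeps the top-level directory even if it ends up empty.
    # -mindepth and -empty are GNU/BSD extensions (an assumption about your find).
    find "$1" -mindepth 1 -depth -type d -empty -exec rmdir {} \;
}
```

Something like `prune_empty_dirs /tmp/foo` could be run after the transfer. Note that this also removes directories that were already empty on the source and were copied deliberately, so it is only suitable when no empty directory is wanted at all.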
Re: purge-empty-dirs and max-file-size confusion
On Fri, Apr 24, 2009 at 10:23:06AM +0200, Paul Slootman wrote:
> I don't think the intention is to actually delete empty directories at the receiving end; only to prevent them being created.

I have not yet found out how to prevent empty directories from being created when using --max-size or --min-size. As I showed in my original post to this list, --prune-empty-dirs does not do it.

Either the man page is wrong/misleading/incomplete, I am misunderstanding it badly, or rsync is broken. I am fully prepared to believe that I am misunderstanding something and will happily work on a better man page wording when the truth is revealed to me. I've supplied a short script below that you can use to see the problem yourself.

> So once they're created due to perhaps an earlier invocation without purge-empty-dirs, you'll have to remove them by hand.

As my script below shows, the destination directory does not even exist. There is no previously-created content in it at all, and yet rsync creates empty directories even though I say --prune-empty-dirs. Why? How do I make --prune-empty-dirs do what the man page says it does?

#!/bin/sh -u
# start with fresh empty directories for source and destination
tmp1=/tmp/one$$
tmp2=/tmp/two$$
rm -rf "$tmp1" "$tmp2"

echo '*** create the source directory with six subdirectories'
for i in 1 2 3 4 5 6 ; do
    mkdir -p "$tmp1/dir$i"
done

echo '*** create three small files in dir1 dir2 dir3'
for i in 1 2 3 ; do
    dd bs=1M count=1 if=/dev/zero of="$tmp1/dir$i/smallfile"
done

echo '*** create three big files in dir4 dir5 dir6'
for i in 4 5 6 ; do
    dd bs=1M count=11 if=/dev/zero of="$tmp1/dir$i/BIGFILE"
done

echo '*** rsync should copy only the big files and prune all empty directories'
rsync -ai --min-size 10M --prune-empty-dirs "$tmp1" "$tmp2"

echo '*** find should show no empty directories, but there are three - why?'
find "$tmp2" -empty

echo '*** replace --min-size with an --exclude and it works fine:'
rm -r "$tmp2"
rsync -ai --exclude smallfile --prune-empty-dirs "$tmp1" "$tmp2"
find "$tmp2" -empty    # shows no output - this is correct and expected

# double quotes keep the shell from expanding *** and from tripping on the apostrophe
echo "*** Why doesn't --prune-empty-dirs work with --min-size and --max-size?"
rm -r "$tmp1" "$tmp2"

--
| Ian! D. Allen - idal...@idallen.ca - Ottawa, Ontario, Canada
Re: purge-empty-dirs and max-file-size confusion
On Wed, Apr 22, 2009 at 02:20:37AM -0400, Ian! D. Allen wrote:
> I want to use --min-size to copy just large files (and their necessary parent directories), but everything I've tried copies *all* the source directories, and creates them empty on the destination even if they don't have any big files in them. I only want the minimal directory hierarchies that contain the big files.

This is because --min-size is a transfer rule, not an exclude rule. An exclude rule would affect deletions, and --min-size just affects what is transferred out of the full set of files that are present. Thus, the dirs with smaller files are not actually empty, they just don't have any files that match the transfer rule.

There is not currently a way to include/exclude files based on size in rsync.

..wayne..
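Since rsync itself cannot include or exclude by size, one possible workaround is to do the size selection outside rsync and hand it an explicit file list. The sketch below is not something proposed in this thread: the function names list_files_at_least and copy_files_at_least are hypothetical, it relies on rsync's --files-from option (which implies --relative, so only the parent directories of the listed files should be created), and it assumes file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper (not part of rsync): print, relative to directory $1,
# every regular file of at least $2 bytes ($2 must be >= 1), one path per line.
list_files_at_least() {
    # find's "-size +Nc" matches files strictly larger than N bytes,
    # so N = min - 1 selects "at least min bytes".
    ( cd "$1" && find . -type f -size +"$(( $2 - 1 ))"c | sed 's|^\./||' )
}

# Hypothetical wrapper: copy only files of at least $3 bytes from source
# directory $1 into destination directory $2, reading the list from stdin.
# Assumption: no file name contains a newline character.
copy_files_at_least() {
    list_files_at_least "$1" "$3" | rsync -ai --files-from=- "$1"/ "$2"/
}
```

For the 10M case from this thread, a call might look like `copy_files_at_least /home/idallen/test /tmp/foo 10485760`. Because the file list then only ever names the big files, there are no extra directories for rsync to create in the first place.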
Re: purge-empty-dirs and max-file-size confusion
On Fri, Apr 24, 2009 at 07:51:35AM -0700, Wayne Davison wrote:
> This is because --min-size is a transfer rule, not an exclude rule.

There is no mention of the concept of "transfer rule" in the rsync man page. I offer some proposed man page wording changes, below.

The man page says "This option tells the receiving rsync to get rid of empty directories from the file-list" - there is no mention that there must be two *kinds* of empty directories in the file list: (1) empty directories created by filter/exclude rules and (2) empty directories created by transfer rules. Or perhaps (2) doesn't really exist, but the sending rsync simply never gets around to sending the files that it says should be in those directories and so the receiving rsync does all that directory creation work but the promised files never arrive to fill them.

> There is not currently a way to include/exclude files based on size in rsync.

That is most awkward, given that --min-size sure sounds like it behaves this way. It is an annoyingly fine distinction to say that "exclude" and "avoid transferring" are two different kinds of operations when it comes to rsync pruning empty directories. This needs to be made much clearer in the man page. I offer these slightly reworded paragraphs:

-m, --prune-empty-dirs
    This option tells the receiving rsync to get rid of empty directories from the file-list, including nested directories that have no non-directory children. This is useful for avoiding the creation of a bunch of useless directories when the sending rsync is recursively scanning a hierarchy of files using include/exclude/filter rules. It does not prevent the creation of empty directories that result from the use of transfer rules such as --max-size or --min-size, since transfer rules do not affect the file-list.

--max-size=SIZE
    This transfer rule tells rsync to avoid transferring any file that is larger than the specified SIZE. Unlike a filter/exclude rule, it does not affect the file-list, so --prune-empty-dirs will not work with this option.

--min-size=SIZE
    This transfer rule tells rsync to avoid transferring any file that is smaller than the specified SIZE, which can help in not transferring small, junk files. Unlike a filter/exclude rule, it does not affect the file-list, so --prune-empty-dirs will not work with this option.

Thanks for keeping rsync alive and kicking!

--
| Ian! D. Allen - idal...@idallen.ca - Ottawa, Ontario, Canada
Re: purge-empty-dirs and max-file-size confusion
> > $ rsync -ai --min-size 10M --prune-empty-dirs /home/idallen/test /tmp/foo
> Have you tried --no-dirs?

Why should I need it? I've explicitly told the receiving side "don't create empty directories" and that should be sufficient. I shouldn't need any other options. (In any case, I just tried --no-dirs and it didn't change the result. I still get piles of empty directories.)

Perhaps the man page lies, and --prune-empty-dirs does not operate on the receiving side at all? In the man page it says in one place "tells the receiving rsync to get rid of empty directories from the file-list" and in another place it says "prune empty directory chains from file-list". The latter sounds like it operates on the source list, not on the receiving list, and if rsync were operating on the source list it would explain the current misbehaviour.

Has nobody ever wondered about this before? I suppose I shall have to Read The Source to find out what is wrong. Please someone enlighten me about what I'm missing, before I start digging around in there...

--
| Ian! D. Allen - idal...@idallen.ca - Ottawa, Ontario, Canada
purge-empty-dirs and max-file-size confusion
I want to use --min-size to copy just large files (and their necessary parent directories), but everything I've tried copies *all* the source directories, and creates them empty on the destination even if they don't have any big files in them. I only want the minimal directory hierarchies that contain the big files.

This doesn't work:

$ rm -rf /tmp/foo
$ rsync -ai --min-size 10M --prune-empty-dirs /home/idallen/test /tmp/foo
cd+ test/
cd+ test/dir1/
cd+ test/dir2/
cd+ test/dir3/
cd+ test/dir4/
f+ test/dir4/BIGFILE
cd+ test/dir5/
f+ test/dir5/BIGFILE
cd+ test/dir6/
f+ test/dir6/BIGFILE

Wrong. I don't want all those dir1, dir2, dir3 empty directories. I don't want *any* empty directories, at any level. What am I missing?

--
| Ian! D. Allen - idal...@idallen.ca - Ottawa, Ontario, Canada