Re: purge-empty-dirs and max-file-size confusion

2009-04-27 Thread Wayne Davison
On Fri, Apr 24, 2009 at 02:19:42PM -0400, Ian! D. Allen wrote:
 There is no mention of the concept of transfer rule in the rsync
 man page.  I offer some proposed man page wording changes, below.

Thanks.  I have committed some manpage changes that clarify this
unexpected behavior.  At some point rsync may allow actual filtering of
files by their (non-name) attributes, which would avoid this situation.

..wayne..
-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: purge-empty-dirs and max-file-size confusion

2009-04-25 Thread Ian! D. Allen
On Fri, Apr 24, 2009 at 02:19:41PM -0400, Ian! D. Allen wrote:
 On Fri, Apr 24, 2009 at 07:51:35AM -0700, Wayne Davison wrote:
  This is because --min-size is a transfer rule, not an exclude rule.
 
 There is no mention of the concept of transfer rule in the rsync
 man page.

There is another oblique reference to transfer rule in --compare-dest
for which I offer this man page clarification:

--compare-dest=DIR
This transfer rule instructs rsync to use DIR on the destination
machine as an additional hierarchy to compare destination files
against doing transfers (if the files are missing in the destination
directory).  If a file is found in DIR that is identical to the
sender's file, the file will NOT be transferred to the destination
directory. This is useful for creating a sparse backup of just files
that have changed from an earlier backup, though all the directories
in the file-list will still be created (most of them likely empty).
Unlike a filter/exclude rule, this option does not affect the
file-list, so --prune-empty-dirs will not work with this option.

-m, --prune-empty-dirs
This option tells the receiving rsync to get rid of empty directories
from the file-list, including nested directories that have no
non-directory children. This is useful for avoiding the creation of
a bunch of useless directories when the sending rsync is recursively
scanning a hierarchy of files using include/exclude/filter rules. It
does not prevent the creation of empty directories that result
from the use of transfer rules such as --max-size, --min-size,
or --compare-dest, since transfer rules do not affect the file-list.

-- 
| Ian! D. Allen  -  idal...@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Open Source / Linux) via: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/


Re: purge-empty-dirs and max-file-size confusion

2009-04-24 Thread Paul Slootman
On Thu 23 Apr 2009, Ian! D. Allen wrote:
 
 In the man page it says in one place "tells the receiving rsync to get
 rid of empty directories from the file-list" and in another place it says
 "prune empty directory chains from file-list".  The latter sounds like it
 operates on the source list, not on the receiving list, and if rsync were

Actually, to me it sounds quite like the same thing.
I don't think the intention is to actually delete empty directories at
the receiving end; only to prevent them being created. So once they're
created, perhaps by an earlier invocation without --prune-empty-dirs,
you'll have to remove them by hand.
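
The manual cleanup mentioned above could look roughly like the following.
This is only a sketch: it assumes a find that supports -mindepth, -empty and
-delete (GNU and BSD find do; they are not in POSIX), and prune_empty_dirs
is a hypothetical helper name, not something rsync provides.

```shell
#!/bin/sh
# Sketch: remove empty directories left behind under a destination tree.
# prune_empty_dirs is a hypothetical helper, not an rsync feature.
prune_empty_dirs() {
    dest=$1
    # -mindepth 1 keeps the top-level directory itself.
    # -delete implies -depth, so children are visited before their parents
    # and whole chains of empty directories collapse in one pass.
    find "$dest" -mindepth 1 -type d -empty -delete
}
```

Calling prune_empty_dirs on the destination after the rsync run removes only
directories that are empty at that moment; files and non-empty directories
are left alone.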


Paul


Re: purge-empty-dirs and max-file-size confusion

2009-04-24 Thread Ian! D. Allen
On Fri, Apr 24, 2009 at 10:23:06AM +0200, Paul Slootman wrote:
 I don't think the intention is to actually delete empty directories at
 the receiving end; only to prevent them being created.

I have not yet found out how to prevent empty directories from being
created when using --max-size or --min-size.  As I showed in
my original post to this list, --prune-empty-dirs does not do it.
Either the man page is wrong/misleading/incomplete, I am misunderstanding
it badly, or rsync is broken.  I am fully prepared to believe that I
am misunderstanding something and will happily work on a better man page
wording when the truth is revealed to me.  I've supplied a short script
below that you can use to see the problem yourself.

 So once they're created due to perhaps an earlier invocation without
 --prune-empty-dirs, you'll have to remove them by hand.

As my script below shows, the destination directory does not even exist.
There is no previously-created content in it at all, and yet rsync
creates empty directories even though I say --prune-empty-dirs.  Why?
How do I make --prune-empty-dirs do what the man page says it does?

#!/bin/sh -u

# start with fresh empty directories for source and destination
tmp1=/tmp/one$$
tmp2=/tmp/two$$
rm -rf $tmp1 $tmp2

echo '*** create the source directory with six subdirectories'
for i in 1 2 3 4 5 6 ; do
    mkdir -p $tmp1/dir$i
done

echo '*** create three small files in dir1 dir2 dir3'
for i in 1 2 3 ; do
    dd bs=1M count=1 if=/dev/zero of=$tmp1/dir$i/smallfile
done

echo '*** create three big files in dir4 dir5 dir6'
for i in 4 5 6 ; do
    dd bs=1M count=11 if=/dev/zero of=$tmp1/dir$i/BIGFILE
done

echo '*** rsync should copy only the big files and prune all empty directories'
rsync -ai --min-size 10M --prune-empty-dirs $tmp1 $tmp2

echo '*** find should show no empty directories, but there are three - why?'
find $tmp2 -empty

echo '*** replace --min-size with an --exclude and it works fine:'
rm -r $tmp2
rsync -ai --exclude smallfile --prune-empty-dirs $tmp1 $tmp2
find $tmp2 -empty   # shows no output - this is correct and expected

echo "*** Why doesn't --prune-empty-dirs work with --min-size and --max-size?"

rm -r $tmp1 $tmp2



Re: purge-empty-dirs and max-file-size confusion

2009-04-24 Thread Wayne Davison
On Wed, Apr 22, 2009 at 02:20:37AM -0400, Ian! D. Allen wrote:
 I want to use --min-size to copy just large files (and their necessary
 parent directories), but everything I've tried copies *all* the source
 directories, and creates them empty on the destination even if they
 don't have any big files in them.  I only want the minimal directory
 hierarchies that contain the big files.

This is because --min-size is a transfer rule, not an exclude rule.  An
exclude rule would affect deletions, and --min-size just affects what is
transferred out of the full set of files that are present.  Thus, the
dirs with smaller files are not actually empty, they just don't have any
files that match the transfer rule.

There is not currently a way to include/exclude files based on size in
rsync.
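
One possible workaround, not suggested in this thread, is to do the size
selection outside rsync and pass the resulting list with --files-from. The
sketch below assumes a find that accepts the k suffix for -size (GNU and BSD
find do; POSIX does not define it) and file names without embedded newlines.
list_large_files and copy_large_files are hypothetical helper names. Per the
rsync man page, --files-from implies --relative, and -a does not imply -r
when it is used, so only the parent directories of the listed files should
be created.

```shell
#!/bin/sh
# Sketch: choose files by size with find, then give rsync an explicit list.
# Both function names are hypothetical helpers for illustration.

# Print paths, relative to src, of regular files larger than min_kib KiB.
# Note: find rounds sizes up to whole KiB and "+N" means strictly greater,
# so this is close to, but not identical with, rsync's --min-size.
list_large_files() {
    src=$1
    min_kib=$2
    # Run find from inside src so the printed paths are relative to it.
    ( cd "$src" && find . -type f -size +"${min_kib}"k -print )
}

# Copy only the listed files; rsync reads the list from standard input.
copy_large_files() {
    src=$1
    dest=$2
    min_kib=$3
    list_large_files "$src" "$min_kib" |
        rsync -a --files-from=- "$src"/ "$dest"/
}
```

For file names that may contain newlines, switching find to -print0 and
adding rsync's --from0 option would be the usual adjustment.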

..wayne..


Re: purge-empty-dirs and max-file-size confusion

2009-04-24 Thread Ian! D. Allen
On Fri, Apr 24, 2009 at 07:51:35AM -0700, Wayne Davison wrote:
 This is because --min-size is a transfer rule, not an exclude rule.

There is no mention of the concept of transfer rule in the rsync
man page.  I offer some proposed man page wording changes, below.

The man page says "This option tells the receiving rsync to get rid of
empty directories from the file-list" - there is no mention that there
must be two *kinds* of empty directories in the file list: (1) empty
directories created by filter/exclude rules and (2) empty directories
created by transfer rules.  Or perhaps (2) doesn't really exist, but the
sending rsync simply never gets around to sending the files that it says
should be in those directories and so the receiving rsync does all that
directory creation work but the promised files never arrive to fill them.
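
To see in advance which directories fall into the second category, one can
list the source directories whose subtree holds no file above the size
threshold. This is only a sketch built on find, not on anything rsync
reports: dirs_without_large_files is a hypothetical helper name, the k
suffix for -size is a GNU/BSD find extension, and find's size test only
approximates --min-size (it rounds up to whole KiB and matches strictly
larger files).

```shell
#!/bin/sh
# Sketch: print source directories (relative to src) that contain no regular
# file larger than min_kib KiB anywhere beneath them. These are the
# directories a size-limited transfer would be expected to leave empty.
# dirs_without_large_files is a hypothetical helper name.
dirs_without_large_files() {
    src=$1
    min_kib=$2
    (
        cd "$src" || exit 1
        find . -type d -print | while IFS= read -r d; do
            # An empty result means nothing under $d passes the size test.
            if [ -z "$(find "$d" -type f -size +"${min_kib}"k -print)" ]; then
                printf '%s\n' "$d"
            fi
        done
    )
}
```

The order of the output follows find's traversal order, so pipe it through
sort if a stable listing is needed.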

 There is not currently a way to include/exclude files based on size in rsync.

That is most awkward, given that --min-size sure sounds like it behaves
this way.  It is an annoyingly fine distinction to say that "exclude"
and "avoid transferring" are two different kinds of operations when it
comes to rsync pruning empty directories.  This needs to be made much
clearer in the man page.  I offer these slightly reworded paragraphs:

-m, --prune-empty-dirs
  This option tells the receiving rsync to get rid of empty
  directories from the file-list, including nested directories that
  have no non-directory children. This is useful for avoiding the
  creation of a bunch of useless directories when the sending
  rsync is recursively scanning a hierarchy of files using
  include/exclude/filter rules. It does not prevent the creation of
  empty directories that result from the use of transfer rules such
  as --max-size or --min-size, since transfer rules do not affect
  the file-list.

--max-size=SIZE
  This transfer rule tells rsync to avoid transferring any file that
  is larger than the specified SIZE.  Unlike a filter/exclude rule, it
  does not affect the file-list, so --prune-empty-dirs will not work
  with this option.

--min-size=SIZE
  This transfer rule tells rsync to avoid transferring any file
  that is smaller than the specified SIZE, which can help in not
  transferring small, junk files.   Unlike a filter/exclude rule, it
  does not affect the file-list, so --prune-empty-dirs will not work
  with this option.

Thanks for keeping rsync alive and kicking!



Re: purge-empty-dirs and max-file-size confusion

2009-04-23 Thread Ian! D. Allen
   $ rsync -ai --min-size 10M --prune-empty-dirs /home/idallen/test /tmp/foo
 Have you tried --no-dirs?

Why should I need it?  I've explicitly told the receiving side "don't
create empty directories" and that should be sufficient.  I shouldn't
need any other options.  (In any case, I just tried --no-dirs and it
didn't change the result.  I still get piles of empty directories.)

Perhaps the man page lies, and --prune-empty-dirs does not operate on
the receiving side at all?

In the man page it says in one place "tells the receiving rsync to get
rid of empty directories from the file-list" and in another place it says
"prune empty directory chains from file-list".  The latter sounds like it
operates on the source list, not on the receiving list, and if rsync were
operating on the source list it would explain the current misbehaviour.

Has nobody ever wondered about this before?  I suppose I shall have to
Read The Source to find out what is wrong.  Please someone enlighten me
about what I'm missing, before I start digging around in there...



purge-empty-dirs and max-file-size confusion

2009-04-22 Thread Ian! D. Allen
I want to use --min-size to copy just large files (and their necessary
parent directories), but everything I've tried copies *all* the source
directories, and creates them empty on the destination even if they
don't have any big files in them.  I only want the minimal directory
hierarchies that contain the big files.  This doesn't work:

$ rm -rf /tmp/foo
$ rsync -ai --min-size 10M --prune-empty-dirs /home/idallen/test /tmp/foo
cd+ test/
cd+ test/dir1/
cd+ test/dir2/
cd+ test/dir3/
cd+ test/dir4/
f+ test/dir4/BIGFILE
cd+ test/dir5/
f+ test/dir5/BIGFILE
cd+ test/dir6/
f+ test/dir6/BIGFILE

Wrong.  I don't want all those dir1, dir2, dir3 empty directories.
I don't want *any* empty directories, at any level.
What am I missing?
