On Tuesday 09 June 2009 17:20:49 Etaoin Shrdlu wrote:
> On Tuesday 9 June 2009, 16:36, Neil Bothwick wrote:
> > On Tue, 09 Jun 2009 16:15:21 +0200, Joerg Schilling wrote:
> > > > find -H /usr/lib /lib -type f | xargs -d'\n' qfile --orphans
> > >
> > > No, this is definitely wrong: the right way to handle this is
> > > execplus (available for 19 years now).
> >
> > If it's been around 19 years, why doesn't Google know anything about
> > it? What is it?
>
> Well, google does not know everything :)
>
> Basically, using + instead of ; after -exec lets find run the specified
> command fewer times, each time with the highest possible number of
> arguments (instead of running it once per file, which is what happens
> with ;). And yes, that's been in POSIX for a long time now. Example:
>
>
> $ ls
> file1  file2  file3  file4  file5
>
> $ find . -type f -exec sh -c 'echo "number of arguments: $#"' sh {} \;
> number of arguments: 1
> number of arguments: 1
> number of arguments: 1
> number of arguments: 1
> number of arguments: 1
>
> $ find . -type f -exec sh -c 'echo "number of arguments: $#"' sh {} +
> number of arguments: 5
>
> So when you have to run a command on a very large number of files, say 1000
> or more, with ; you spawn 1000 processes, with + you spawn just one or
> two (well, depending on the maximum command line length on the system
> anyway). This is of course much less resource intensive.
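
The difference in process count is easy to check directly. Below is a minimal sketch, assuming a POSIX sh and a scratch directory from mktemp; the directory, the file names and the count of 1000 are made up for illustration. Each run of the inner sh prints one line, so counting lines counts invocations.

```shell
#!/bin/sh
# Scratch directory with 1000 empty files (names and count are arbitrary).
dir=$(mktemp -d)
i=1
while [ "$i" -le 1000 ]; do
    : > "$dir/file$i"
    i=$((i + 1))
done

# One line of output per invocation of the inner sh, so wc -l counts runs.
# tr strips the padding some wc implementations add.
runs_semicolon=$(find "$dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
runs_plus=$(find "$dir" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

printf '%s\n' "terminated by ';': $runs_semicolon invocations"
printf '%s\n' "terminated by '+': $runs_plus invocations"

rm -rf "$dir"
```

The ';' form should report one invocation per file; how many the '+' form needs depends on the command line length limit of the system.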

Some numbers.

These are from memory; I ran these commands today while preparing a machine
for an installation of a package. The directory tree at the starting point had
about 5000 files, more than 80% with a UID not attached to an account:

chown -R <user>:<group> *
about 2 seconds

find . \( -nouser -o -nogroup \) -exec chown <user>:<group> {} +
about 30 seconds (wild guess)

find . \( -nouser -o -nogroup \) -exec chown <user>:<group> {} \;
I killed this one after 5 minutes and it was nowhere near complete

Admittedly, this was on a vmware guest with a rather poor disk configuration,
but it does illustrate that the naive "find \;" performs extremely poorly.
chown on its own is foolish, as the whole point of the exercise was to find
files meeting certain criteria, and there were definitely some that didn't.
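
One thing to watch when combining tests with -o in front of -exec: the implicit AND between a test and -exec binds tighter than -o, so "A -o B -exec ..." runs the command only for files matching B. Grouping the tests with \( \) applies it to both. Here is a small sketch of that, assuming a POSIX sh; it uses -name tests on two made-up files in place of -nouser/-nogroup so it runs without root.

```shell
#!/bin/sh
# Two hypothetical files standing in for "matches test A" and "matches test B".
dir=$(mktemp -d)
: > "$dir/a.txt"
: > "$dir/b.log"

# Ungrouped: parsed as  -name '*.txt'  -o  ( -name '*.log' -a -exec ... ),
# so only the *.log file is handed to the command.
ungrouped=$(find "$dir" -name '*.txt' -o -name '*.log' -exec sh -c 'echo "$#"' sh {} +)

# Grouped: both tests are evaluated first, then -exec applies to every match.
grouped=$(find "$dir" \( -name '*.txt' -o -name '*.log' \) -exec sh -c 'echo "$#"' sh {} +)

printf '%s\n' "ungrouped: $ungrouped file(s) passed to the command"
printf '%s\n' "grouped:   $grouped file(s) passed to the command"

rm -rf "$dir"
```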

execplus is a fine middle ground giving the best possible bang for buck.

-- 
alan dot mckinnon at gmail dot com
