Bill

> cd /etc
> ls -lR *.conf
> 
> The above won't drill through subdirectories as expected. It
> functions as though the R wasn't specified.

This should probably become an FAQ.  This is the expected behavior.
What follows is a mostly canned reply about file commands and -R.  It
was written in response to a question about rm, but it applies to most
file commands and so is pertinent to your question as well.  After
that comes a second reply along the same lines.

Bob

> Help! Am I doing something wrong, or is this a true bug? man page says it
> should recurse...

This is not a bug.  The rm command is operating correctly.  This is
the same correct behavior as other typical programs such as ls -R,
chmod -R, chown -R, etc.  Try 'ls -R *.exe', for example.

> I edited the crontabs to run a script that would clean out any *.exe
> or *.EXE files.
[...]
> if given the option to rm -Rf * or rm -rf *, will recursively remove all
> files and directories  from the point of execution in the directoy tree.
> if given the option rm -Rf *.exe or rm -rf *.exe, will remove only *.exe
> files in the current directory, but will not recurse into subdirectories.

That is correct.  Here are the pieces of information you need to
understand what is happening.

The -r (and -R) option says that if any of the files listed on the
command line is a directory then rm recurses down through that
directory.  The exact wording of the man page might be a little bit
confusing but I believe it is exactly correct.  It says "Remove the
contents of directories recursively."  Only arguments to the program
which are directories are recursively acted upon.  So any program
argument which is a directory will be removed completely, which means
recursing down that directory and removing everything below it.
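
To make that concrete, here is a small sketch you can run in a scratch
directory.  The directory and file names are made up for the
demonstration and mktemp is assumed to be available.  It shows that -r
only matters when the argument itself is a directory.

```shell
# Hypothetical scratch tree, built only for this demonstration.
demo=$(mktemp -d)
mkdir -p "$demo/top/sub"
touch "$demo/c.exe" "$demo/top/a.exe" "$demo/top/sub/b.exe"
cd "$demo" || exit 1

rm -r c.exe     # a plain file argument: nothing to recurse into
left_after_file=$(find . -name '*.exe' | sort)

rm -r top       # a directory argument: rm descends and removes it all
left_after_dir=$(find . -name '*.exe' | sort)

printf 'after rm -r c.exe:\n%s\n' "$left_after_file"
printf 'after rm -r top: %s\n' "${left_after_dir:-nothing left}"
```

The files under top survive the first rm because top was never named
on the command line.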

Here is another piece of information needed to understand the
behavior.  The shell is expanding the command line glob characters
prior to handing the arguments off to your command.  This is a simple
form of pattern matching, simpler than full regular expressions,
designed to make file name matching easier.  It provides a consistent
interface to all programs since the expansion code lives in the shell
instead of in each program.  Commands in UNIX do not see the '*.exe'
or any of the shell metacharacters.  Commands see only the expanded
names of the files in the current directory that the shell found to
match the pattern.

The '*' is the "glob" character because it matches a glob of
characters.  But it only matches files in the current directory.  It
does not go out and list files in other directories.  The shell
matches and expands glob characters and hands off the resulting
list of names to the command.

You can double check this by using the echo command.  This is built
into most command shells, for example into bash.  Try echo *.exe.  Try
echo */*.exe.  In your example the first would print out *.exe if
nothing matched but would print out all file names that did match.
The command would see the result and has no idea that you provided a
wild card to match against file names.
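
Here is a sketch of that check, using made-up file names in a scratch
directory.  It assumes a POSIX-style shell with default settings, such
as sh or bash without nullglob or failglob.  Other shells, zsh for
example, treat an unmatched pattern differently.

```shell
# Hypothetical scratch files, created only for this demonstration.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/one.exe" "$demo/two.exe" "$demo/sub/three.exe"
cd "$demo" || exit 1

matched=$(echo *.exe)     # the shell replaces *.exe before echo runs
unmatched=$(echo *.zip)   # nothing matches: the pattern is passed as is

# Count the arguments a command would really receive for *.exe.
set -- *.exe
arg_count=$#

echo "$matched"
echo "$unmatched"
echo "$arg_count"
```

Note that sub/three.exe never shows up.  The command gets two plain
file names and nothing else.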

If you want to match files in subdirectories as well then you would
need to say so explicitly with */*.exe.  The first star would match
all file names in the current directory.  Then the second *.exe would
be matching files in the subdirectories under names already matched by
the first '*' glob.
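
A short sketch of this, again with invented names in a scratch
directory, shows that each '*/' reaches down exactly one level and no
further.

```shell
# Hypothetical three-level tree, created only for this demonstration.
demo=$(mktemp -d)
mkdir -p "$demo/sub/deeper"
touch "$demo/top.exe" "$demo/sub/mid.exe" "$demo/sub/deeper/low.exe"
cd "$demo" || exit 1

one_level=$(echo */*.exe)       # only files one directory down
two_levels=$(echo */*/*.exe)    # only files two directories down

echo "$one_level"
echo "$two_levels"
```

So a glob can be made to reach a known depth, but it will not walk a
whole tree of unknown depth for you.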

All of that was to explain why things are working as they should.  But
here is what you really want to do.  If you want to search all
directories below a location, let's say your present working
directory, then you can use other UNIX commands such as find to do so.
Here is an example, untested, use at your own peril, that would do
your rm on all .exe files below your current working directory.

Works for small numbers of files:
  rm -f `find . -name '*.exe' -print`

The backticks (``) execute the find command, take the output of the
find command and place it right there on the command line in place of
the `` and hand the results off to the rm command.  Test this out with
the echo command prior to real usage!  Note that the '*.exe' is quoted
to keep the shell from expanding it.  The find command will do the
expansion itself in this case and so the '*' needs to be hidden in a
string to keep the shell from expanding it first.

  echo rm -f `find . -name '*.exe' -print`

Unfortunately this transferal of functionality from the command to the
shell comes at a cost.  There is a limited amount of argument space
available for this expansion of file names.  It is different on
different systems and is getting larger as RAM gets cheaper, but there
is almost always still a limit.  20KB was typical for a time and now
2MB is common, but it is a limit regardless, and additionally it is
usually shared with environment variable space.  The xargs command was
designed specifically to work around this limit.  If you have a HUGE
subdirectory with thousands of files the above command will fail to
execute.  Therefore a better method is to use find coupled with xargs.
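
If you are curious what the limit is on your own system, getconf will
usually tell you.  The sketch below also uses the xargs -n option with
a deliberately tiny batch size of 2, chosen only for illustration, to
show how xargs splits a long list across several command invocations.

```shell
# Ask the system for its argument space limit, in bytes.
arg_max=$(getconf ARG_MAX)
echo "argument space limit on this system: $arg_max bytes"

# xargs runs the command as many times as needed.  Here -n 2 forces a
# new echo for every two names so that the batching is visible.
batched=$(printf '%s\n' a b c d e | xargs -n 2 echo)
echo "$batched"
```

Without -n, xargs packs as many names into each invocation as the
limit allows, so in the common case the command still runs only once.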

Traditional:
  find . -name '*.exe' -print | xargs rm -f

Robust and safer, but not yet universally implemented, using the
-print0 option and zero-terminated strings instead of
newline-terminated strings:
  find . -name '*.exe' -print0 | xargs -0 rm -f
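
To see why the zero-terminated form is safer, here is a sketch with a
made-up file name containing a space.  It assumes your find and xargs
support -print0 and -0, which the GNU and BSD versions do.

```shell
# Hypothetical scratch files, one with a space in its name.
demo=$(mktemp -d)
touch "$demo/my file.exe" "$demo/plain.exe"
cd "$demo" || exit 1

# Newline-separated: xargs splits "./my file.exe" at the space, so rm
# is handed "./my" and "file.exe" and the real file survives.
find . -name '*.exe' -print | xargs rm -f
after_print=$(find . -name '*.exe' | sort)

# NUL-separated: each name arrives intact, spaces included.
find . -name '*.exe' -print0 | xargs -0 rm -f
after_print0=$(find . -name '*.exe' | sort)

echo "left after -print:  ${after_print:-nothing}"
echo "left after -print0: ${after_print0:-nothing}"
```

The split names are also a hazard in the other direction.  If a file
really named file.exe had existed in the directory where xargs ran rm,
it would have been removed by mistake.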

Or, since you expressed the need to look for both upper and lower case
EXE names:
  find . \( -name '*.exe' -o -name '*.EXE' \) -print0 | xargs -0 rm -f

You could substitute a full path in place of the 'find .' such as
something like 'find /class/home'.  Note that these are pretty much
equivalent to the following which might be easier to understand what
is going on.
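
One form that fits that description is the -exec action of find, which
runs rm once for each matching file.  It is offered here as an
assumption, not necessarily the exact command meant above.  The
scratch directory and file names are invented for the demonstration.

```shell
# Hypothetical scratch tree, created only for this demonstration.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/a.exe" "$demo/sub/b.EXE" "$demo/sub/keep.txt"
cd "$demo" || exit 1

# {} is replaced by each matching name and \; ends the rm command.
find . \( -name '*.exe' -o -name '*.EXE' \) -exec rm -f {} \;

remaining=$(find . -type f | sort)
echo "$remaining"
```

This starts one rm process per file, so it is slower than xargs on
large trees, but file names with spaces are handled correctly since
they never pass through the shell or a pipe.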

Hope this helps

Bob Proulx

The -R option says that if any of the files listed on the command line
is a directory then recurse down through that directory.  The
exact wording of the man page might be a little bit confusing.

:      -R, --recursive
:           list subdirectories recursively

Really this does not mean recursively on the current directory, but
only on arguments to the program which are directories.  This means
that both of the following behave correctly.

  chmod u+w .          # Change .
  chmod -R u+w .       # Change . and recurse down it since it is a directory

> Also if you give command like:
> chmod -R 700 *.html
> and do not have any .html files in you current directory but you have it
> in your sub directory, it complains with 'no matching files' and exits.

There are different types of regular expressions.  Command shells
which handle file name wildcards such as *.html use a type of regular
expression known as Pattern Matching Notation.  This is a simple form
of regular expression matching designed to make file name matching
easier.  The '*' is the "glob" character because it matches a glob of
characters.  But it only matches files in the current directory.  It
does not go out and list files in other directories.

If you want to match files in subdirectories as well then you would
need to say so explicitly with */*.html.  The first star would match
all file names in the current directory.  Then the second *.html would
be matching files in the subdirectories under names already matched by
the first '*' glob.

I think by now you can see that chmod is working correctly with
respect to -R and filename globbing.  You can double check this by
using the echo command.  This is built into most command shells, for
example into bash.  Try echo *.html.  Try echo */*.html.  In your
example the first would echo out *.html if nothing matched.

If you want to search all directories below a location, let's say your
present working directory, then you can use other UNIX commands such
as find to do so.  Here is an example, untested, use at your own
peril, that would do your chmod on all .html files below your current
working directory.

Traditional:
  find . -name '*.html' -print | xargs chmod 700

Robust and safer but not yet universally implemented:
  find . -name '*.html' -print0 | xargs -0 chmod 700

Hope this helps

Bob Proulx
