The following issue has been SUBMITTED. 
====================================================================== 
http://austingroupbugs.net/view.php?id=1273 
====================================================================== 
Reported By:                stephane
Assigned To:                
====================================================================== 
Project:                    1003.1(2016)/Issue7+TC2
Issue ID:                   1273
Category:                   Shell and Utilities
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Stephane Chazelas 
Organization:                
User Reference:              
Section:                    glob() 
Page Number:                1109, 1110 (in 2018 edition) 
Line Number:                35742, 35768 
Interp Status:              --- 
Final Accepted Text:         
====================================================================== 
Date Submitted:             2019-07-27 10:49 UTC
Last Modified:              2019-07-27 10:49 UTC
====================================================================== 
Summary:                    glob()'s GLOB_ERR/errfunc and non-directory files
Description: 
This concerns the XSH glob() specification.

For GLOB_ERR, the spec says:

> Cause glob() to return when it encounters a directory that it
> cannot open or read. Ordinarily, glob() continues to find
> matches.

(Note: it's not clear what "Ordinarily" means here. When errfunc
is set and returns non-zero, glob() doesn't continue either; does
that count as ordinary?)

For errfunc:

> If, during the search, a directory is encountered that cannot
> be opened or read and errfunc is not a null pointer, glob()
> calls (*errfunc()) with two arguments.
[...]
>       2. The eerrno argument is the value of errno from the
>      failure, as set by opendir(), readdir(), or stat().
>      (Other values may be used to report other errors not
>      explicitly documented for those functions.)

(Note: does that mean glob() has to call those 3 functions (as
opposed to open(O_DIRECTORY)/getdents() or any other API)? And
why stat()? Shouldn't that be lstat()?)
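
For reference, a minimal sketch of the errfunc contract quoted
above. The names log_glob_error and list_matches are made up for
illustration; only glob(), globfree() and the GLOB_* constants
come from the standard.

```c
#include <errno.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* errfunc callback: glob() passes the path that failed (epath) and the
 * errno value of the failure (eerrno). Returning 0 asks glob() to carry
 * on (unless GLOB_ERR is set); non-zero makes it stop with GLOB_ABORTED. */
int log_glob_error(const char *epath, int eerrno)
{
    fprintf(stderr, "glob: %s: %s\n", epath, strerror(eerrno));
    return 0;
}

/* Hypothetical helper: expand pattern, print the matches and
 * return glob()'s status unchanged. */
int list_matches(const char *pattern)
{
    glob_t g;
    int rc = glob(pattern, 0, log_glob_error, &g);

    if (rc == 0) {
        for (size_t i = 0; i < g.gl_pathc; i++)
            puts(g.gl_pathv[i]);
    }
    /* Partial results may be present after GLOB_ABORTED, so free then too. */
    if (rc == 0 || rc == GLOB_ABORTED)
        globfree(&g);
    return rc;
}
```

Which paths end up being passed to the callback, and with which
errno values, is exactly what this report argues is
underspecified.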

First (and that's still not the case I'm making here), it's not
obvious what /directories/ glob() will try to open.

It can be somewhat inferred from the spec, as the pathname
expansion specification refers to directories that must be
readable (which implies they are going to be read) and some that
only need to be searchable (implying they're not going to be
read).

But maybe the spec should be more explicit, as it's not obvious
for instance that in */*.c the current directory and all the
subdirs are going to be read, while in */foo.c, only the
current directory is read (and all subdirs/foo.c lstat()ed), so
if there's a non-readable subdir, only the former will fail (or
cause errfunc to be invoked).

Now, to get to the point, the spec refers to "directories" that
can't be opened.

What about a /etc/passwd/*.c glob? /etc/passwd is not a
directory, so opendir("/etc/passwd"), if called, would fail with
ENOTDIR. Does that mean glob() should not call opendir() here, or
that it should ignore opendir()'s error when errno is ENOTDIR?

What about */*.c where there's at least one non-directory
non-hidden file in the current directory? What if there's a
broken symlink or a symlink to a file that is not accessible
(so that we can't tell whether the symlink resolves to a
directory or not)?

I've done tests with the FreeBSD 12.0, Solaris 10 and GNU libc
2.27 implementations of glob() and they all differ
significantly, the Solaris one being the least compliant with what
I can infer the spec to require, and FreeBSD's the most.

On Solaris /etc/passwd/*.c glob(GLOB_ERR) fails (and calls
errfunc with /etc/passwd, ENOTDIR), same for */*.c in a
directory that contains a non-hidden regular file.

Only FreeBSD's glob(GLOB_ERR) doesn't fail on non-existent/*.c
or */*.c in a directory that contains a broken symlink. The
other two call errfunc with ENOENT.

For */*.c in a directory that contains a symlink to a
non-accessible area, they all fail (call errfunc with EACCES).
Same with */*/*.c if the current directory contains a subdir
that is readable but not searchable (note that whether glob()
could tell whether entries of that directory are directories or
not depends on whether readdir() returns that information or
not; either way, we can't tell for symlinks).
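
A probe along the following lines could be used to repeat such
comparisons. It is only a sketch, not the program used for the
tests above; probe_errfunc, glob_status_name and probe_pattern
are hypothetical names, and the 4096-byte path buffer is an
arbitrary choice.

```c
#include <errno.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* glob() offers no user-data pointer for errfunc, so this sketch keeps
 * the captured state in file-scope variables (not thread-safe). */
static int  probe_calls;
static int  probe_errno;
static char probe_path[4096];

/* Record the first path/errno pair reported by glob(). */
int probe_errfunc(const char *epath, int eerrno)
{
    if (probe_calls++ == 0) {
        probe_errno = eerrno;
        snprintf(probe_path, sizeof probe_path, "%s", epath);
    }
    return 0; /* with GLOB_ERR set, glob() stops regardless */
}

/* Map glob()'s return value to a printable name. */
const char *glob_status_name(int rc)
{
    switch (rc) {
    case 0:            return "success";
    case GLOB_NOMATCH: return "GLOB_NOMATCH";
    case GLOB_ABORTED: return "GLOB_ABORTED";
    case GLOB_NOSPACE: return "GLOB_NOSPACE";
    default:           return "unknown";
    }
}

/* Run one pattern with GLOB_ERR and report what happened. */
int probe_pattern(const char *pattern)
{
    glob_t g;
    int rc;

    probe_calls = 0;
    probe_errno = 0;
    probe_path[0] = '\0';

    rc = glob(pattern, GLOB_ERR, probe_errfunc, &g);
    printf("%-20s -> %s", pattern, glob_status_name(rc));
    if (probe_calls > 0)
        printf(" (errfunc: %s, %s)", probe_path, strerror(probe_errno));
    putchar('\n');

    if (rc == 0 || rc == GLOB_ABORTED)
        globfree(&g);
    return rc;
}
```

Calling probe_pattern("/etc/passwd/*.c"), or probe_pattern("*/*.c")
from a directory prepared with a regular file, a broken symlink
or an unsearchable subdir, shows what a given implementation
does; the output is expected to vary between systems.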

Desired Action: 
At this point, I just want to start the discussion as to how
best to fix it.

- The "ordinarily" should probably be changed to "if errfunc is
  NULL"

- I don't think we want to force implementations to literally
  call opendir()/readdir()/lstat() (in any case, that "stat()"
  is wrong). Not sure how to phrase it though.

- we should probably clarify which directories glob() is meant
  to try opening, or which files glob() is meant to invoke
  opendir() or equivalent on.

- and then what to do for non-directories or files which we
  can't tell whether they're directories or not. Either require
  the FreeBSD or GNU behaviour or allow both. The Solaris
  behaviour is not useful IMO, but it's more flexible in that
  the caller can use an errfunc that ignores ENOENT/ENOTDIR to
  emulate the GNU/FreeBSD behaviour.
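
The emulation mentioned in the last item could look roughly like
this. It is a sketch under the assumption that GLOB_ERR is left
unset, since with GLOB_ERR glob() stops whatever errfunc returns;
lenient_errfunc and glob_lenient are hypothetical names.

```c
#include <errno.h>
#include <glob.h>

/* Ignore "not there" and "not a directory"; abort on anything else. */
int lenient_errfunc(const char *epath, int eerrno)
{
    (void)epath; /* the path is not needed for the decision */
    if (eerrno == ENOENT || eerrno == ENOTDIR)
        return 0; /* tell glob() to continue */
    return 1;     /* non-zero: glob() returns GLOB_ABORTED */
}

/* Wrapper that behaves like GLOB_ERR for all other errors.
 * GLOB_ERR itself is deliberately not passed. */
int glob_lenient(const char *pattern, glob_t *pglob)
{
    return glob(pattern, 0, lenient_errfunc, pglob);
}
```

The caller would use glob_lenient() in place of a
glob(pattern, GLOB_ERR, NULL, &g) call and release the results
with globfree() as usual.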

====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2019-07-27 10:49 stephane       New Issue                                    
2019-07-27 10:49 stephane       Name                      => Stephane Chazelas
2019-07-27 10:49 stephane       Section                   => glob()          
2019-07-27 10:49 stephane       Page Number               => 1109, 1110 (in 2018 edition)
2019-07-27 10:49 stephane       Line Number               => 35742, 35768    
======================================================================