Harald van Dijk <a...@gigawatt.nl> wrote, on 25 Sep 2019:
>
> On 25/09/2019 10:22, Geoff Clare wrote:
> >Harald van Dijk <a...@gigawatt.nl> wrote, on 24 Sep 2019:
> >>
> >>>>Regardless, a single shell is not enough to say "most shells", not even if
> >>>>it is multiple versions of that single shell.
> >>>
> >>>I consider bash 4 on Linux and bash 3 on macOS to be different shells.
> >>>(Their build configuration is different.)
> >>
> >>I do not understand this logic. The build configuration does not differ in
> >>any way that is relevant to pathname expansion. Surely the NetBSD shell is
> >>not counted separately for each port listed on
> >><https://www.netbsd.org/ports/>, so why is bash different?
> >
> >Their behaviour is sufficiently different (in areas other than pathname
> >expansion) to consider them to be different shells.  The same is true
> >for ksh88 and ksh93.
> 
> So it is just that bash 3 and bash 4 are significantly different and both
> versions are still used on current versions of operating systems, it is not
> about build configuration?

There are differences due to the version number change and there are
differences due to the build configuration being different.  I only
mentioned the build configuration in order to preempt a response
claiming that the differences between bash 3 and bash 4 were not
sufficient to justify treating them as different shells.

> >Okay, I see your point now.  When putting part of a pathname in a
> >variable you have to know how it is going to be used in order to know
> >how backslash will be handled.  But this is just one aspect of a wider
> >problem - e.g. you have to know if the variable will be quoted or not
> >when used, which applies to the backslash-is-always-special behaviour
> >as well.
> 
> The shell script author does not necessarily have full control over this,
> though. In $dir/$file, how $dir is treated depends on whether $file contains
> metacharacters, and vice versa. Quoted vs. unquoted is something the shell
> script author does have full control over, and it is easy to check in
> typical scripts that all uses of $dir are quoted, or that all uses of $dir
> are unquoted.

Okay, I guess this counts as an entry in the "cons" column for the
bash2/3/4 behaviour then.  I'm sure Stephane and others will argue
that it is outweighed by the "cons" of the bash5 behaviour, and I'm
inclined to agree.
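
To illustrate the quoted-versus-unquoted distinction mentioned above, here
is a minimal sketch. The scratch directory, the file names and the variable
name pat are made up for illustration, and mktemp -d is assumed to be
available. It deliberately avoids backslashes in the pattern, since their
handling is exactly what differs between shells.

```shell
# Sketch only: directory, file names and variable names are hypothetical.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
touch a.txt b.txt

pat='*.txt'

set -- $pat            # unquoted: subject to pathname expansion
unquoted_count=$#

set -- "$pat"          # quoted: the pattern is kept as a literal string
quoted_count=$#
quoted_value=$1

printf 'unquoted words: %s, quoted words: %s (%s)\n' \
    "$unquoted_count" "$quoted_count" "$quoted_value"
```

Whether every use of a variable is quoted can be checked by reading the
script; whether another variable in the same word happens to contain
metacharacters cannot.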

> >In any case I see this as a very minor issue.  Putting a whole pattern
> >in a variable is a rare thing to do.  Putting part in a variable and
> >part direct is even more rare.  Coupled with the fact that using
> >backslash in patterns (that you want to be expanded) is also rare, the
> >likelihood of this causing problems is extremely small.
> 
> Putting a pattern in a variable is not that rare. The rest probably is, but
> see below.
> 
> >I wrote the above before I had fully thought it through, and having slept
> >on it my preference is now much stronger, and I certainly would object to
> >specifying the NetBSD sh behaviour.  The reason is because treating
> >backslash differently in different components in indirect shell patterns
> >is inconsistent with direct shell patterns, glob(), find -path, and the
> >pax pattern operand, none of which vary their treatment of backslash
> >across different components of a pattern that contains slashes.
> 
> Likewise, none of them vary their treatment of backslash according to
> whether (other) metacharacters are present. If a file named 'x' exists,
> find . -name '\x' will find it, despite '\x' not containing any
> metacharacters. The proposed resolution already treats backslashes
> differently to how they are treated in glob(), find, etc.

I see it as a separate decision whether to do matching against pathnames
or not.  If matching is done, the treatment of backslash is then the same
as in glob(), find, etc.  If matching is not done, the result is the
same as if matching had been done and no matching pathnames were found.
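
The find example quoted above can be reproduced with a short sketch. The
scratch directory is an assumption made for illustration, as is the
availability of mktemp -d. When matching is done, the backslash acts as an
escape even though the pattern contains no other special characters.

```shell
# Sketch only: the scratch directory is hypothetical.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
touch x

# The pattern \x is matched against names; the backslash escapes the x.
found=$(find . -name '\x')
printf 'find printed: %s\n' "$found"
```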

> >Personally I would prefer the backslash-is-always-special option, but
> >breaking autoconf when a %sn file exists was enough for me to accept
> >the bash2/3/4 behaviour as a compromise.
> 
> Earlier you wrote "the likelihood of this causing problems is extremely
> small". This applies here as well. How likely is it for a '%sn' file to
> exist? Other than as a deliberate attempt to cause the configure script to
> fail, that is, in which case it is doing exactly what the user wanted.

For '%sn' perhaps not very likely, but the fact that this case came
to light in a widely-used open source application means that other
similar cases are likely to exist in other open source applications
and in closed source applications, users' private scripts, etc.

> If you do think that is a problem, it is already a problem regardless of how
> backslash is handled in existing scripts, which pass URLs with query strings
> unquoted to curl or wget. That is, if a script contains
> 
>   curl https://some.site/path?name=value
> 
> you can break that script by creating a 'https:' directory, a 'some.site'
> directory in that, and a 'pathXname=value' file in that. This is not
> hypothetical, I have seen multiple scripts that did this. I have seen that
> they did this because I was experimenting with bash's failglob option, which
> of course reported it as not matching anything.
> 
> We are not changing the shell semantics to say that pathname expansion is no
> longer performed on words that look like URLs, we just accept that this is
> technically a bug in those scripts, but that it is a bug that is so unlikely
> to cause real problems that for practical purposes we can ignore it.

Those scripts can be fixed simply by adding quoting.  The autoconf
problem with bash5 can't be fixed that way.
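
The URL case and its fix can be sketched as follows. The directory layout
mirrors the one described in the quoted text; the scratch directory and the
use of set -- in place of an actual curl call are assumptions made so that
the sketch does not touch the network.

```shell
# Sketch only: the scratch directory is hypothetical; no request is sent.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir -p https:/some.site
touch 'https:/some.site/pathXname=value'

# Unquoted: the ? is a pattern character, so the word is replaced by the
# matching pathname instead of being passed through as a URL.
set -- https://some.site/path?name=value
unquoted=$1

# Quoted: the word reaches the command unchanged.
set -- 'https://some.site/path?name=value'
quoted=$1

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"
```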

> All the problems of all approaches are corner cases that are unlikely to
> cause real problems in practice.

And yet, as Stephane reports, there have been several bug reports
against bash5 because of the new behaviour.

-- 
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
