    Date:        Wed, 11 Apr 2018 10:00:27 +0100
    From:        Geoff Clare <g...@opengroup.org>
    Message-ID:  <20180411090027.GA18582@lt2.masqnet>

  | There is nothing to suggest that this does not apply to the characters
  | which, when unquoted, have a special meaning within bracket expressions
  | ('!', '-', "[.", etc.)

In file name patterns that might be correct (because file name expansion
happens before quote removal), but if bug 985 is correct, then in case
patterns the quoting would already be removed before the pattern
was examined, so given

        var=!a
        case b in
        ["$var"]) whatever;;
        esac

we expand (etc) the word first (nothing to do there) then each pattern (there 
is just one) in turn, first parameter expansion (etc), producing

        ["!a"]

then quote removal

        [!a]

and then we match ('b' is not 'a').   That the quotes used to be there is now no
longer apparent.
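
A small probe makes it easy to see which reading a given shell implements. This is only a sketch: the function name is made up for illustration, and the answer is shell-dependent, which is the whole point.

```shell
# Probe: does a '!' that arrived via a quoted expansion negate the class?
probe_quoted_bang() {
    var='!a'
    case b in
    ["$var"]) echo negation ;;   # class read as [!a]: 'b' is not 'a'
    *)        echo literal ;;    # class read as { '!', 'a' }: 'b' is absent
    esac
}

probe_quoted_bang
```

A shell that really does quote removal before matching would be expected to print "negation"; one that remembers what was quoted would print "literal".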

I suspect that the text in 985 needs to be revised to allow for this, or there
is no question but that the ksh93 interpretation is correct, and every other
shell is wrong.

In general, quoting in patterns has only ever been possible using \, and in
character classes there is no quoting at all ([\]] is traditionally a class
containing a backslash, followed by a literal ']', not a class containing a ']').

Since order in a class is irrelevant, ordering of the elements has been
used to allow any character to appear in the class without needing a
quoting mechanism.
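
As an illustration of the ordering trick (the function name is just for this example), ']' goes first, '-' goes last, and '!' goes anywhere but first:

```shell
# ']' first, '!' not first, '-' last: all three are ordinary members.
in_class() {
    case $1 in
    []!-]) echo member ;;
    *)     echo other ;;
    esac
}

in_class ']'    # member
in_class '!'    # member
in_class '-'    # member
in_class a      # other
```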

Shells have largely not been that strict, mainly because (at least for the
older shells, I don't know how more modern ones do it) the posix requirement
that the quotes in quoted words be left intact in the result from the lexer
has generally been ignored, and quoting has been indicated in other ways,
which make it easier, and faster, to tell exactly what is quoted and what is
not every time the shell later needs to know (the lexer does the scanning
once, and after that nothing ever needs to count beginning and ending
quote chars, etc).   A side effect of that is that (with quote removal not
being done - and this is why I assume the standard did not originally
specify it for case patterns) everything just works the way it is expected
(a quoted a and an unquoted a still match, but a quoted ! is not the
"not in class" character, only an unquoted ! can be that).
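
The contrast can be written out directly. The two helper names are invented for this sketch, and the comments on the quoted form describe what the shells discussed above are expected to do, not what every shell must do:

```shell
# Report whether STRING ($1) is accepted by each form of the class.
unquoted_bang() { case $1 in [!a])   echo match ;; *) echo "no match" ;; esac; }
quoted_bang()   { case $1 in ["!"a]) echo match ;; *) echo "no match" ;; esac; }

unquoted_bang b    # match: '!' negates, and 'b' is not 'a'
unquoted_bang a    # no match
quoted_bang '!'    # match in shells that treat the quoted '!' as a member
quoted_bang b      # no match in those same shells
```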

I suspect ksh93 has "fixed" all of this, and implements more what the
standard actually says.

We need to be much more precise about matching, and everything related
to it, than we currently are, and 985 doesn't help, it makes things worse
(though I fully understand, and agree with, the motivation for that defect 
report.)

Incidentally, I know that this part of the 985 new text ...

        the first argument (pattern) is the same as patt, except each character
        that was quoted in patt and is not in a bracket expression is prefixed
        by a backslash

is intended to handle this problem, except it cannot - once we have done quote
removal, what "was quoted" is lost: either we have the quotes, and know what is
quoted, or we don't, and don't.   The only way to fix this is to remove quote
removal from case patterns, and instead specify more precisely how a
(possibly quoted) string is turned into a fnmatch pattern.
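
To make the idea concrete, here is a rough sketch of the simplest case: a string that is known to be entirely quoted is turned into a pattern that can only match itself. The helper name escape_literal is hypothetical, it is not taken from 985 or from any shell, and it deliberately says nothing about the hard part (characters inside bracket expressions).

```shell
# Hypothetical helper: prefix every character of STRING ($1) with a backslash,
# which is what "was quoted" would have to become in an fnmatch() pattern.
escape_literal() {
    rest=$1 out=
    while [ -n "$rest" ]; do
        tail=${rest#?}            # everything after the first character
        first=${rest%"$tail"}     # the first character itself
        out="$out\\$first"        # one backslash, then the character
        rest=$tail
    done
    printf '%s\n' "$out"
}

escape_literal 'a*b'    # prints \a\*\b
```

The point of the sketch is only that the conversion needs to know, per character, what was quoted, which is exactly the information quote removal throws away.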

kre
