On 5/18/22 9:46 PM, Christoph Anton Mitterer via austin-group-l at The Open Group wrote:

> The above, I'm not quite sure what these tell/prove...
>
> I assume the ones with '?': that for all except bash/fnmatch '?'
> matches both, valid characters and a single byte that is no character.
>
> And the ones with bracket expression, that these also work when the BE
> has either a valid character or a byte (that is not a character) and
> vice-versa?
>
> If Chet is reading along, is the above intended in bash, or considered
> a bug?

The bash matcher falls back to C-locale-like behavior only if neither the
pattern nor the string contains any valid multibyte characters. So if,
for example, the string contains a valid multibyte character but the
pattern does not, the matcher will attempt multibyte (wide character,
really) matches.

This is why the string \243] (a valid multibyte character in Big5) does not
match [\243!]]: nothing in the bracket expression will match that
character, and that string will never match a pattern ending in `]'.


> IMO it would have been interesting to see whether ? would also match
> multiple bytes that are each for themselves and together no valid
> character...

No, it wouldn't. You can make a case for `?' matching a single byte that is
not part of a valid multibyte character (there is no such thing as a single
byte that is "no valid character" when you are matching), but you cannot
make one for `?' matching more than one byte that does not compose a valid
multibyte character.


>> The tests involving \243 are run in a Big5 environment. In Big5,
>> \243\135 is the representation of β, a single valid character, even
>> though \135 on its own is still the single character ].
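
The byte-level claim can be checked outside any shell. This is a minimal
sketch, assuming Python's built-in "big5" codec agrees with the system's
Big5 locale for these bytes; it is a separate implementation from libc, so
treat it as an illustration rather than as what any shell does.

```python
# Octal \243 is 0xA3 and octal \135 is 0x5D (ASCII ']').
lead = bytes([0o243])
bracket = bytes([0o135])

# Together the two bytes decode to exactly one character.
pair = (lead + bracket).decode("big5")

# On its own, \135 is still the single character ']'.
alone = bracket.decode("big5")

# On its own, \243 is an incomplete sequence, not a character.
try:
    lead.decode("big5")
    lead_is_character = True
except UnicodeDecodeError:
    lead_is_character = False

print(repr(pair), len(pair), repr(alone), lead_is_character)
```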

> Seems also a bit strange to me... all shells match \243 against ? ...
> i.e. ? matches a single byte that is not a character... but later on it
> doesn't work again with \243] and ?]

Because, as Harald says, \243] is a valid multibyte character in Big5
locales.
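
The behavior discussed in this message can be modelled with a toy matcher.
This is a hypothetical sketch, not bash's code or any shell's
implementation: the helper names units() and simple_match() are invented
for illustration, Python's "big5" codec stands in for the locale, and only
`?' and literal bytes are supported. It assumes that `?' matches exactly
one unit, where a unit is either one valid character or one stray byte
that does not begin a valid character.

```python
def units(data, encoding="big5"):
    """Split bytes into matchable units (hypothetical helper).

    A unit is either one valid character in `encoding` or one stray byte
    that does not start a valid character at that position.
    """
    out = []
    i = 0
    while i < len(data):
        step = 1
        if data[i] >= 0x80:
            # Big5 characters with a high lead byte are two bytes long.
            try:
                if len(data[i:i + 2].decode(encoding)) == 1:
                    step = 2
            except UnicodeDecodeError:
                pass  # stray byte: keep it as a unit of its own
        out.append(data[i:i + step])
        i += step
    return out


def simple_match(pattern, string):
    """Match a pattern made only of '?' and literal units (no '*' or '[')."""
    p, s = units(pattern), units(string)
    return len(p) == len(s) and all(a == b"?" or a == b for a, b in zip(p, s))
```

Under these assumptions, simple_match(b"?", b"\xa3") is true because the
lone byte is one unit, while simple_match(b"?]", b"\xa3]") is false because
the string is a single unit and leaves nothing for the trailing `]' to
match. Two bytes that form no character, such as b"\xff\xff", are two
units, so a single `?' does not match them.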

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
