On 5/18/22 9:46 PM, Christoph Anton Mitterer via austin-group-l at The Open
Group wrote:
> The above, I'm not quite sure what these tell/prove...
> I assume the ones with '?': that for all except bash/fnmatch '?'
> matches both valid characters and a single byte that is no character.
> And the ones with a bracket expression: that these also work when the BE
> has either a valid character or a byte (that is not a character) and
> vice-versa?
> If Chet is reading along, is the above intended in bash, or considered
> a bug?
The bash matcher falls back to C-locale-like behavior only if neither the
pattern nor the string contains any valid multibyte characters. So if,
for example, the string contains a valid multibyte character, but the
pattern does not, the matcher will attempt multibyte (wide character,
really) matches.
This is why the string \243] (a valid multibyte character in Big5) does not
match [\243!]]: nothing in the bracket expression will match that
character, and that string will never match a pattern ending in `]'.
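A rough sketch of that decision rule, in Python rather than bash's C, may
make it easier to follow. It is a hypothetical model, not bash's actual
code: the helper names has_valid_multibyte and uses_wide_matching are
invented for illustration, and Python's built-in "big5" codec is assumed to
stand in for a Big5 locale.

```python
# Hypothetical model of the fallback rule described above.
# Assumption: Python's "big5" codec approximates a Big5 locale's idea of
# which byte sequences form valid multibyte characters.

def has_valid_multibyte(data: bytes, encoding: str = "big5") -> bool:
    """Return True if data contains at least one valid two-byte character."""
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            # Plain single-byte (ASCII-range) character; keep scanning.
            i += 1
            continue
        try:
            # Try the lead byte together with the byte that follows it.
            data[i:i + 2].decode(encoding)
        except UnicodeDecodeError:
            # Not a valid character here; treat it as a stray byte.
            i += 1
            continue
        return True
    return False


def uses_wide_matching(pattern: bytes, string: bytes) -> bool:
    """Wide-character matching is attempted if either side has a valid
    multibyte character; otherwise the byte-wise fallback applies."""
    return has_valid_multibyte(pattern) or has_valid_multibyte(string)


pattern = b"[\xa3!]]"   # [\243!]]
string = b"\xa3]"       # \243]
print(has_valid_multibyte(pattern))          # pattern side
print(has_valid_multibyte(string))           # string side
print(uses_wide_matching(pattern, string))   # combined decision
```

Under these assumptions the pattern has no valid multibyte character
(\243 followed by ! is not a Big5 character) but the string does, so the
model chooses the wide-character path.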
> IMO it would have been interesting to see whether ? would also match
> multiple bytes that, neither individually nor together, form a valid
> character...
No, it wouldn't. You can make a case for `?' matching a single byte that is
not part of a valid multibyte character (there is no such thing as a single
byte that is "no valid character" when you are matching), but you cannot
make one for `?' matching more than one byte that does not compose a valid
multibyte character.
>> The tests involving \243 are run in a Big5 environment. In Big5,
>> \243\135 is the representation of β, a single valid character, even
>> though \135 on its own is still the single character ].
> This also seems a bit strange to me... all shells match \243 against ?,
> i.e. ? matches a single byte that is not a character... but later on it
> doesn't work again with \243] and ?]
Because, as Harald says, \243] is a valid multibyte character in Big5
locales.
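The byte-level claim can be checked outside any shell. The following is a
small sketch in Python, assuming its built-in "big5" codec agrees with the
Big5 locale used for the tests; the variable names are only illustrative.

```python
# Assumption: Python's "big5" codec matches the Big5 locale used in the tests.

raw = b"\xa3\x5d"               # octal \243\135
decoded = raw.decode("big5")    # the two bytes together
print(len(decoded), decoded)    # one character, not two

bracket = b"\x5d".decode("big5")  # \135 on its own
print(bracket)

# \243 on its own is an incomplete sequence, not a character.
try:
    b"\xa3".decode("big5")
    lone_is_char = True
except UnicodeDecodeError:
    lone_is_char = False
print(lone_is_char)
```

So a pattern such as ?] needs something after the one character \243] to
match its trailing `]', and there is nothing left.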
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU c...@case.edu http://tiswww.cwru.edu/~chet/