Hi Rudolf,

Rudolf Sykora wrote on Wed, May 06, 2020 at 03:25:09PM +0200:

> is this an expected behaviour?

Yes, that is expected behaviour.

> odin$ ls v?k*
> ls: v?k*: No such file or directory
> odin$ ls v??k*

[ some file names containing two-byte UTF-8 sequences ]

> odin$ locale

The locale is totally irrelevant with respect to file names.
The locale is a user preference.  Different users can set different
locales.  Even the same user can set different locales for different
instances of programs they are running.

By contrast, names of files are system-wide properties.
They are necessarily the same for all users, no matter the locale.

> It seems I have to use double '??' to mach a single character.

Your misunderstanding is that file names consist of characters.
They do not.  They consist of bytes, and to match two bytes,
you need two question marks.
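
To illustrate the point, here is a minimal Python sketch of byte-wise
pattern matching.  It is not the shell's own globbing code, only a
model of the same idea using the standard fnmatch module, and the
file name "vák.txt" is a hypothetical stand-in because the real names
from the report are not shown above.

```python
import fnmatch

# Hypothetical file name; "\u00e1" is the character 'á'.
text_name = "v\u00e1k.txt"
name = text_name.encode("utf-8")   # 'á' becomes the two bytes 0xC3 0xA1

# With bytes arguments, each '?' in the pattern matches exactly one byte.
one_mark = fnmatch.fnmatchcase(name, b"v?k*")
two_marks = fnmatch.fnmatchcase(name, b"v??k*")

print(len(text_name), len(name))   # 7 8  (characters vs. bytes)
print(one_mark, two_marks)         # False True
```

The single '?' fails because it consumes only the first byte of the
two-byte sequence, leaving 0xA1 where the pattern expects 'k'.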

You can use almost any byte in a file name; there are very few
restrictions.  For example, you cannot use a slash or a NUL byte in
a file name, and you cannot manually create a file called "." or "..".

However, it is best practice to only use printable non-whitespace
ASCII bytes in file names.  It is well known that whitespace
characters in file names are a notorious security hazard.  Using
non-printable or non-ASCII bytes usually causes confusion, so it
certainly isn't recommended either.

That said, people sometimes have to deal with files named by other,
careless or reckless people, so OpenBSD tries to handle file names
in a best-effort manner when they contain unusual bytes.  For
example, when a file name contains byte sequences that can be
interpreted as UTF-8, ls(1) in xterm(1) displays these byte sequences
with the corresponding Unicode glyphs if you have set a UTF-8 locale.
But that doesn't imply that file names suddenly use UTF-8 in any sense.
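
As a rough illustration of why display is a matter of interpretation,
the following Python sketch decodes one and the same byte string in
two ways.  It does not show how ls(1) or xterm(1) are implemented;
the byte string is a hypothetical example name.

```python
# Hypothetical file name as stored on disk: four bytes.
raw = b"v\xc3\xa1k"

# The same bytes, interpreted under two different encodings.
as_utf8 = raw.decode("utf-8")      # 'vák'  (three characters)
as_latin1 = raw.decode("latin-1")  # 'vÃ¡k' (four characters)

print(len(raw), len(as_utf8), len(as_latin1))   # 4 3 4
```

The stored name is identical in both cases; only the interpretation
chosen for display differs.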

Yours,
  Ingo
