I have a directory containing Latin, Cyrillic and Asian filenames.
After listing the directory with a German locale it took me a good while to figure out how the files were sorted:

• Latin filenames are sorted alphabetically (obviously)
• purely non-Latin filenames are sorted by their Unicode numbers (resulting in a pretty plausible order)
• in filenames containing both Latin and non-Latin characters, the non-Latin ones are completely ignored, e.g. "三丐丑D" is treated like "D"
• most confusingly, non-Latin filenames containing spaces (0x20, not non-breaking ones) are treated differently, resulting in this order:

$ LANG=de_DE.UTF-8 ls -1
«мно»
абв
абв где
вгд
где
эюя
一丁丂
七七丅
三丐丑
абв где
вгд ежз
丁丂 丆万丈
丒专 且丕
123
456
789
abc
bc de
bcd
бвC
cde
三丐丑D
def
xyz

The C.UTF-8 locale doesn’t treat the spaces differently, resulting in a less confusing list:

$ LANG=C.UTF-8 ls -1
123
456
789
abc
bc de
bcd
cde
def
xyz
«мно»
абв
абв где
абв где
бвC
вгд
вгд ежз
где
эюя
一丁丂
丁丂 丆万丈
七七丅
三丐丑
三丐丑D
丒专 且丕
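
The C.UTF-8 order above looks like a plain comparison of the UTF-8 encoded bytes, which for valid UTF-8 gives the same result as comparing code points. Here is a minimal Python sketch of that idea. `sort_bytewise` and `NAMES` are names I made up for illustration, and I am assuming that the second "абв где" entry contains a non-breaking space (U+00A0), written as an escape below so the difference is visible.

```python
# Sketch: sort names by their UTF-8 bytes, similar to what a
# strcmp()-style comparison would do. This is an illustration,
# not the code ls actually runs.

NAMES = [
    "«мно»", "абв", "абв\u00a0где", "вгд", "где", "эюя",
    "一丁丂", "七七丅", "三丐丑",
    "абв где", "вгд ежз", "丁丂 丆万丈", "丒专 且丕",
    "123", "456", "789", "abc", "bc de", "bcd",
    "бвC", "cde", "三丐丑D", "def", "xyz",
]


def sort_bytewise(names):
    """Return names ordered by their UTF-8 byte sequences."""
    return sorted(names, key=lambda name: name.encode("utf-8"))


if __name__ == "__main__":
    for name in sort_bytewise(NAMES):
        print(name)
```

With this ordering a space (0x20) is just another byte that compares lower than any letter, so names with and without spaces end up in one block.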

With a Russian locale you still get the files with mixed Chinese/Latin filenames interspersed with the Latin ones but no separate blocks of files with/without spaces:

$ LANG=ru_RU.UTF-8 ls -1
一丁丂
七七丅
三丐丑
丁丂 丆万丈
丒专 且丕
123
456
789
abc
bc de
bcd
cde
三丐丑D
def
xyz
абв
абв где
абв где
бвC
вгд
вгд ежз
где
«мно»
эюя
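
To compare several locales without changing LANG for the whole shell, the same kind of ordering can be approximated outside of ls. The following is a minimal sketch, assuming that ls orders names with the C library's locale-aware comparison for LC_COLLATE and that Python's `locale.strxfrm` produces keys that compare the same way. `sort_with_locale` is a hypothetical helper name, and the locale passed in has to be installed on the system.

```python
import locale


def sort_with_locale(names, locale_name):
    """Return names sorted by the collation rules of locale_name.

    Raises locale.Error if the locale is not available.
    """
    # Remember the current LC_COLLATE setting so it can be restored.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm turns each name into a key whose ordinary comparison
        # matches the locale's collation order.
        return sorted(names, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)
```

Calling it with "de_DE.UTF-8", "ru_RU.UTF-8" and "C.UTF-8" on the same list of names should make the differences between the three listings easier to see side by side, although I have not checked that it matches ls in every case.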

While I assume the output above is allowed or even required by a standard, I think it would be more helpful if filenames with and without spaces were sorted together instead of in separate blocks.
