Thank you for taking the time to report this issue and help to improve
Ubuntu.

The sort order you're seeing is in fact correct for the locale that
you're using. Sort order, or collation, is defined on a per-locale
basis, because languages don't all share the same alphabetization rules,
and in most locales the practice is to ignore "unknown" characters when
sorting. This behavior, while debatable, is unlikely ever to change,
because doing so would break existing software that expects the current
behavior from these locales.

You are correct both that setting LC_ALL=POSIX will fix the sorting
problem, and that it will break display of the output. The solution is
to set LC_COLLATE=C (or LC_COLLATE=POSIX, if you prefer) instead, which
lets you change the sort order independently of the character set,
output language, and other features of the locale. Note that LC_ALL, if
set, takes precedence over LC_COLLATE, so it must be left unset for
this to work.
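
As a rough illustration of the LC_COLLATE approach, here is a minimal
sketch. The function name sort_bytewise and the sample words are
hypothetical, chosen only for this example; the input is assumed to be
UTF-8. LC_ALL is unset inside a subshell because it would otherwise
override LC_COLLATE.

```shell
# Hypothetical helper: run sort with C (byte-order) collation only.
# The parentheses make the body a subshell, so unsetting LC_ALL does not
# affect the calling shell. LC_CTYPE, LANG, etc. are left untouched.
sort_bytewise() (
    unset LC_ALL
    LC_COLLATE=C sort "$@"
)

# Sample data (an assumption for illustration): one Latin word and
# three Cyrillic words.
printf '%s\n' 'яблоко' 'zebra' 'арбуз' 'банан' | sort_bytewise
```

With C collation the lines are compared byte by byte, so the ASCII word
sorts first and the Cyrillic words follow in code point order. This is
not a linguistically correct sort for any particular language, but it is
predictable and does not skip characters.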
** Changed in: coreutils (Ubuntu)
Status: New => Invalid
--
'sort' does not correctly sort non-latin utf-8 encoded text
https://bugs.launchpad.net/bugs/71386