On 10.04.2026 at 04:00, Collin Funk wrote:
Thomas Wolff <[email protected]> writes:
On 07.04.2026 at 12:28, Dan Jacobson wrote:
I hereby propose that Coreutils' sort(1) add the ability to sort Chinese
(actually CJK) numbers.
https://chinese.stackexchange.com/questions/64035/how-to-sort-chinese-numbers-with-a-computer
Chinese has the most native speakers in the world, so isn't it high
time that sort(1) dealt with the numbers, pun intended?
A suitable basis for such handling is the file Unihan_NumericValues.txt
in Unihan.zip from Unicode.org.
GNU libunistring has the uc_numeric_value function to convert Unicode
characters to numeric values.
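
As a rough illustration of what such a per-character lookup provides, here
is a minimal sketch in Python. It uses the standard library's
unicodedata.numeric as a stand-in, not libunistring's uc_numeric_value,
and the helper name numeral_value is made up for this example. It only
handles single-character numerals; a multi-character number such as 二十三
would need an actual parser, which this does not attempt.

```python
import unicodedata


def numeral_value(ch):
    """Return the Unicode numeric value of one character, or None.

    Hypothetical helper for illustration; it wraps Python's
    unicodedata.numeric, not libunistring's uc_numeric_value.
    """
    return unicodedata.numeric(ch, None)


# Single-character numerals from several scripts:
# Chinese, Devanagari (७ is seven), and ASCII.
chars = ["三", "一", "十", "二", "७", "5"]

# Sort by numeric value instead of by code point.
by_value = sorted(chars, key=numeral_value)
print(by_value)
```

Characters without a numeric value return None here, so real input would
need a fallback key before it could be sorted this way.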
I had previously considered proposing the functionality for 'numfmt'. I'm
not sure it is worth adding to 'sort'. My guess is that it will not be
used very frequently, but perhaps I am wrong.
If the feature were added, though, there would certainly be no point in
limiting things to Chinese numerals. There are many other symbols used
worldwide; see, for example, the many used in India alone [1].
But it seems at first glance that the Chinese numerals are the only ones
whose Unicode characters are not arranged in their numerical order.
Maybe the others would "sort out" implicitly?
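
A quick way to check this for a given script is to compare code point
order with numeric order. The following is a small sketch in Python, again
relying on unicodedata.numeric rather than on anything in Coreutils; the
function name and the two sample sets are chosen just for this example,
and it only looks at single digits.

```python
import unicodedata


def codepoint_order_matches_value_order(chars):
    """True if sorting by code point equals sorting by numeric value."""
    by_codepoint = sorted(chars)  # Python compares str by code point
    by_value = sorted(chars, key=unicodedata.numeric)
    return by_codepoint == by_value


# Devanagari digits occupy the contiguous range U+0966..U+096F.
devanagari = [chr(cp) for cp in range(0x0966, 0x0970)]

# The Chinese numerals for 1..9 are scattered across the CJK block.
chinese = list("一二三四五六七八九")

print(codepoint_order_matches_value_order(devanagari))
print(codepoint_order_matches_value_order(chinese))
```

For these two samples the check comes out true for the Devanagari digits
and false for the Chinese numerals. It says nothing yet about numbers
written with more than one character.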
Collin
[1] https://en.wikipedia.org/wiki/Hindu%E2%80%93Arabic_numeral_system#Symbols