Does AST sort support sorting of multibyte characters on Linux?
I tried to sort multibyte characters like in the test script below:
----------cut----------
% cat genunicodelist.sh
typeset -li16 i
rm -f x1 x2
for (( i=1 ; i < 0x3000 ; i++ )) ; do
printf "\u[${i/~(El)16#/}]\t- %s\n" "$i"
done >x1
~/bin/sort <x1 >x2
----------cut----------
If I look at the contents of file x2, I see that the sorting appears to
be based on the byte values:
----------cut----------
ჽ - 16#10fd
ჾ - 16#10fe
ჿ - 16#10ff
- 16#11
ᄀ - 16#1100
ᄁ - 16#1101
ᄂ - 16#1102
ᄃ - 16#1103
ᄄ - 16#1104
ᄅ - 16#1105
ᄆ - 16#1106
ᄇ - 16#1107
----------cut----------
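One way to test the byte-order guess would be to compare the output against a run forced into the C locale. This is only a rough sketch: the helper name is_byte_order and the SORT variable are made up for illustration, and it assumes the sort under test honors LC_ALL.

```shell
# Hypothetical helper (not part of AST sort): succeeds when file $2 equals
# a strict byte-order sort of file $1. Set SORT to test another binary;
# the default is whatever "sort" is found on PATH.
is_byte_order() {
    LC_ALL=C "${SORT:-sort}" < "$1" | cmp -s - "$2"
}

# Tiny made-up sample: in the C locale "B" (0x42) sorts before "a" (0x61).
tmp=$(mktemp -d)
printf 'b\nB\na\n' > "$tmp/in"
printf 'B\na\nb\n' > "$tmp/expected"
if is_byte_order "$tmp/in" "$tmp/expected"; then
    verdict="byte order"
else
    verdict="not byte order"
fi
echo "$verdict"
```

For the files from the script this would be something like SORT=$HOME/bin/sort is_byte_order x1 x2.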
Is this a bug in AST sort or in Linux? We are using openSUSE 12.2 on a
Lenovo ThinkPad and on Xeon servers.
AST sort version:
sort --version
version sort (AT&T Research) 2010-08-11
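Since collation depends on the locale, it may also be worth recording which collation locale a run would use. Below is a small sketch assuming the usual POSIX precedence (LC_ALL, then LC_COLLATE, then LANG); the function name effective_collate is made up, and the fallback to "C" is an assumption about the implementation default.

```shell
# Hypothetical helper: print the locale name that should govern collation.
# POSIX precedence: LC_ALL overrides LC_COLLATE, which overrides LANG.
# Unset or empty variables are skipped; "C" is assumed as the final default.
effective_collate() {
    printf '%s\n' "${LC_ALL:-${LC_COLLATE:-${LANG:-C}}}"
}

echo "collation locale: $(effective_collate)"
```

If this reports C or POSIX, ordering by byte value would be the expected behavior and not a bug.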
Ced
--
Cedric Blancher <[email protected]>
Institute Pasteur