Does AST sort support sorting of multibyte characters on Linux?
I tried to sort multibyte characters with the test script below:
----------cut----------
% cat genunicodelist.sh
typeset -li16 i

rm -f x1 x2

for (( i=1 ; i < 0x3000 ; i++ )) ; do
        printf "\u[${i/~(El)16#/}]\t- %s\n" "$i"
done >x1

~/bin/sort <x1 >x2
----------cut----------

If I look at the contents of file x2, I see that the sorting appears to
be based on the byte values:
----------cut----------
ჽ       - 16#10fd
ჾ       - 16#10fe
ჿ       - 16#10ff
        - 16#11
ᄀ      - 16#1100
ᄁ      - 16#1101
ᄂ      - 16#1102
ᄃ      - 16#1103
ᄄ      - 16#1104
ᄅ      - 16#1105
ᄆ      - 16#1106
ᄇ      - 16#1107
----------cut----------
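
A possible cross-check, as a minimal sketch only: compare the order produced
under the current locale with a forced byte-wise sort (LC_ALL=C). The
scratch directory and the file names sample, bytes and locale are
hypothetical and exist only for this sketch, and plain "sort" stands in for
whichever sort binary is under test (for example ~/bin/sort). The three
sample lines are written as UTF-8 octal escapes so the sketch does not depend
on the ksh93 \u[...] syntax. In a pure byte-wise order the U+0011 line has to
come first, because its byte 0x11 is smaller than the 0xe1 lead byte of the
other two lines.

```shell
# Hypothetical scratch directory; every file name below is made up for this sketch.
tmpdir=$(mktemp -d)

# Three sample lines in UTF-8, written with octal escapes:
#   U+10FF = e1 83 bf, U+0011 = 11, U+1100 = e1 84 80
printf '\341\203\277\t- 16#10ff\n\021\t- 16#11\n\341\204\200\t- 16#1100\n' >"$tmpdir/sample"

# Byte-wise reference order: LC_ALL=C makes sort compare raw bytes.
LC_ALL=C sort <"$tmpdir/sample" >"$tmpdir/bytes"

# Order produced under whatever locale is currently active.
# Replace "sort" with the binary under test, e.g. ~/bin/sort.
sort <"$tmpdir/sample" >"$tmpdir/locale"

if cmp -s "$tmpdir/bytes" "$tmpdir/locale" ; then
    verdict="matches byte order"
else
    verdict="differs from byte order"
fi
echo "current locale: $verdict"

# The label (second tab-separated field) of the first line in byte order.
first_label=$(head -n 1 "$tmpdir/bytes" | cut -f 2)
echo "first line in byte order: $first_label"

rm -rf "$tmpdir"
```

If the verdict is "differs from byte order", the locale's collation rules are
being applied in some form, and the question becomes what those rules do with
these characters rather than whether they are consulted at all.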

Is this a bug in AST sort or in Linux? We are using openSUSE 12.2 on a
Lenovo ThinkPad and on Xeon servers.

AST sort version:
sort --version
  version         sort (AT&T Research) 2010-08-11

Ced
-- 
Cedric Blancher <[email protected]>
Institute Pasteur
_______________________________________________
ast-users mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-users
