Roozbeh Pournader <[EMAIL PROTECTED]>:
> http://anubis.dkuug.dk/jtc1/sc22/open/n2955.pdf
Thanks for that.
I was trying:
> reorder-after <U0106>
> <U0108> <ccirc>;<CIR>;<CAP>;IGNORE % Ĉ
> reorder-after <U0107>
> <U0109> <ccirc>;<CIR>;<MIN>;IGNORE % ĉ
In fact I should have used <U0043> and <U0063> (C and c) instead of
<U0106> and <U0107> (Ć and ć), so that [c-d] in regular expressions
is equivalent to [cĉd].
As far as I can make out from a quick scan of the spec, only the
weights (here "<ccirc>;<CIR>;<CAP>;IGNORE") are used for collating
strings, whereas the order of the lines determines how character
ranges in regular expressions are interpreted.
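
A rough way to picture the range behaviour, as a sketch only: treat
[lo-hi] as "every character that collates between lo and hi in the
current LC_COLLATE locale". The helper name in_range is made up for
illustration, and comparing with strcoll is an assumption on my part;
a real regex library may consult its own tables built from the locale
source rather than call strcoll.

```python
import locale


def in_range(lo, hi, ch):
    """Hypothetical helper: True if ch collates between lo and hi
    (inclusive) under the current LC_COLLATE locale."""
    return locale.strcoll(lo, ch) <= 0 and locale.strcoll(ch, hi) <= 0


# The "C" locale is always available and compares by code point, so
# ĉ (U+0109) falls outside c..d here. Under a locale that orders ĉ
# directly after c, the same call would be expected to return True.
locale.setlocale(locale.LC_COLLATE, "C")
print(in_range("c", "d", "ĉ"))
```

Switching the setlocale call to another installed locale is the
quickest way to see whether its collation order puts ĉ inside c..d.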
Not all programs that use regular expressions are locale-sensitive in
this way. I haven't investigated why. One program that does have
locale-sensitive regular expressions is Mutt. At present a bug in
hu_HU prevents [a-z] from working in that locale, but [a-z] seems to
mean the same thing in all locales for egrep.
Edmund
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/