Roozbeh Pournader <[EMAIL PROTECTED]>:

>       http://anubis.dkuug.dk/jtc1/sc22/open/n2955.pdf

Thanks for that.

I was trying:

> reorder-after <U0106>
> <U0108> <ccirc>;<CIR>;<CAP>;IGNORE % Ĉ
> reorder-after <U0107>
> <U0109> <ccirc>;<CIR>;<MIN>;IGNORE % ĉ

In fact I should have U0043 and U0063 instead of U0106 and U0107 to
make [c-d] in regular expressions be equivalent to [cĉd].

As far as I can make out from a quick scan of the spec, only
"<ccirc>;<CIR>;<CAP>;IGNORE" is used for collating strings, but the
order of the lines matters for interpreting character ranges in
regular expressions.
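
To illustrate what I mean by the order mattering for ranges, here is a
toy sketch in Python. It assumes that a range [lo-hi] matches every
character lying between lo and hi in the locale's collation sequence.
The two orderings and the helper expand_range are made up for
illustration; this is not how any real libc implements it.

```python
# Toy model: a range [lo-hi] matches every character whose position in the
# ordering lies between lo and hi, inclusive.
# Both orderings are illustrative assumptions, not real locale data.

# Plain code point order: ĉ (U+0109) sorts after all the ASCII letters.
CODEPOINT_ORDER = sorted("abcdeĉ")

# Hypothetical locale order with ĉ placed right after c, as a
# "reorder-after <U0063>" line would be expected to do.
ESPERANTO_ORDER = list("abcĉde")


def expand_range(lo, hi, order):
    """Return the characters matched by [lo-hi] under the given ordering."""
    start, end = order.index(lo), order.index(hi)
    return order[start:end + 1]


print(expand_range("c", "d", CODEPOINT_ORDER))  # ['c', 'd']
print(expand_range("c", "d", ESPERANTO_ORDER))  # ['c', 'ĉ', 'd']
```

With ĉ placed directly after c in the sequence, [c-d] picks it up;
with ĉ left at its code point position, it does not.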

Not all programs that use regular expressions are locale-sensitive in
this way. I haven't investigated why. One program that does have
locale-sensitive regular expressions is Mutt. At present a bug in
hu_HU prevents [a-z] from working in that locale, but [a-z] seems to
mean the same thing in all locales for egrep.

Edmund
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/
