Re: uc_tolower (uc_toupper (x))

Mike Gran Thu, 10 Mar 2011 16:58:22 -0800

> From:Mark H Weaver <[email protected]>
> To:[email protected]
> Cc:
> Sent:Thursday, March 10, 2011 3:39 PM
> Subject:uc_tolower (uc_toupper (x))
> 
> I've noticed that srfi-13.c very frequently does:
> 
>   uc_tolower (uc_toupper (x))
> 
> Is there a good reason to do this instead of:
> 
>   uc_tolower (x)


Unicode defines a case folding algorithm as well as
a data table for case insensitive sorting.  Setting
things to lowercase is a decent approximation of
case folding.  But doing the upper->lower operation picks
up a few more of the corner cases, like U+03C2 GREEK
SMALL LETTER FINAL SIGMA and U+03C3 GREEK SMALL LETTER SIGMA
which are the same letter with different representations,
or U+00B5 MICRO SIGN and U+039C GREEK SMALL LETTER MU
which are supposed to have the same sort ordering.

Now that we've pulled in all of libunistring, it might
be a good idea to see if it has a complete implementation
of unicode case folding, because upper->lower is also not
completely correct.

-Mike

Re: uc_tolower (uc_toupper (x))

Reply via email to