> From: Mark H Weaver <m...@netris.org>
> To: guile-devel@gnu.org
> Cc:
> Sent: Thursday, March 10, 2011 3:39 PM
> Subject: uc_tolower (uc_toupper (x))
>
> I've noticed that srfi-13.c very frequently does:
>
>   uc_tolower (uc_toupper (x))
>
> Is there a good reason to do this instead of:
>
>   uc_tolower (x)

Unicode defines a case folding algorithm, along with a data table, for case-insensitive comparison. Converting to lowercase is a decent approximation of case folding. But doing the upper->lower operation picks up a few more of the corner cases, like U+03C2 GREEK SMALL LETTER FINAL SIGMA and U+03C3 GREEK SMALL LETTER SIGMA, which are the same letter with different representations, or U+00B5 MICRO SIGN and U+03BC GREEK SMALL LETTER MU, which are supposed to compare equal when case is ignored. Both members of each pair are already lowercase, so uc_tolower alone leaves them distinct; they only merge because they share an uppercase form.

Now that we've pulled in all of libunistring, it might be a good idea to see if it has a complete implementation of Unicode case folding, because upper->lower is also not completely correct.

-Mike