Ian wrote:
> On 8 Dec 2012, at 03:17, corvid wrote:
>
> > <...lots of useful analysis of fltk's use of strcasecmp...>
>
> So, that's good stuff, and suggests that *probably* most of our usage is
> "ASCII", or perhaps C locale based, and so therefore is fine "as is" for a
> "normal" locale?
> Does that sound about right?
>
> I guess if we do enable a Turkic locale we maybe have to ensure that these
> usages all ignore the selected locale and always use the C locale instead,
> then.

We initially wanted to do something like this with Dillo, but then
furaisanjin reported some problem in a Japanese environment.
*searches*
http://lists.auriga.wearlab.de/pipermail/dillo-dev/2011-October/009113.html
Hmm, so I had wanted to use the C locale and pretty much leave it that
way, which turned out not to be so good.

As for changing the locale around various usages instead of using an
asciified comparison, I at least don't recall finding much information
about how expensive that might be.

> Exceptions might be:
>
> >
> > fluid/factory.cxx:1262: if (!strcasecmp(name,table[i].name)) {v =
> > table[i].value; return 1;}
> >
> > So these are all about ASCII English as well. (I notice, though, that
> > lookup_symbol is called in places where the surrounding tests looked at
> > least superficially similar, and those were using strcmp(). That might
> > be right to be case-sensitive and not in others, and maybe not.)
>
> Where maybe being sensitive to the host locale may make sense? At least in
> some case...

Well, if it has "TINY_SIZE", "ITALIC_STYLE", "HIDDEN_BUTTON", etc.,
where a Turkic user wants to use them with lowercase dotless i's,
that seems like incorrect usage.

> > src/fl_utf8.cxx:186: UTF-8 aware strncasecmp - converts to lower case
> > Unicode and tests.
> > src/fl_utf8.cxx:194:int fl_utf_strncasecmp(const char *s1, const char *s2,
> > int n)
> > src/fl_utf8.cxx:215: UTF-8 aware strcasecmp - converts to Unicode and tests.
> > src/fl_utf8.cxx:221:int fl_utf_strcasecmp(const char *s1, const char *s2)
> > src/fl_utf8.cxx:223: return fl_utf_strncasecmp(s1, s2, 0x7fffffff);
> >
> > It looks like ucs_table_0041 does ASCII mappings.
> > (I note that U+0130, capital i with dot, doesn't have a lowercase entry.)
>
>
> Hmm, I guess ideally we'd want this to be self-consistent, so that converting
> a char to upper and then to lower is an identity operation?
>
> E.g. if an english or C locale was active, then i -> I -> i
>
> But if Turkic was active then i -> I(dotted) -> i and also i(non-dot) -> I ->
> i(non-dot)... ?

It sounds good when possible...

Some special cases:
http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt
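
For reference, an "asciified" comparison of the kind I mean could look
roughly like the sketch below. It is only an illustration, not code taken
from FLTK or Dillo, and the names ascii_tolower and ascii_strcasecmp are
made up here. It folds only A-Z and leaves every other byte alone, so the
result never depends on what setlocale() was given.

```cpp
// Hypothetical helper (not an FLTK function): lower-case only the ASCII
// letters A-Z and pass every other byte through unchanged.
static inline unsigned char ascii_tolower(unsigned char c) {
  return (c >= 'A' && c <= 'Z')
             ? static_cast<unsigned char>(c + ('a' - 'A'))
             : c;
}

// Hypothetical locale-independent stand-in for strcasecmp().
// Returns <0, 0 or >0 like strcmp(), comparing bytes as unsigned char.
int ascii_strcasecmp(const char *s1, const char *s2) {
  for (;; ++s1, ++s2) {
    unsigned char c1 = ascii_tolower(static_cast<unsigned char>(*s1));
    unsigned char c2 = ascii_tolower(static_cast<unsigned char>(*s2));
    if (c1 != c2) return c1 < c2 ? -1 : 1;
    if (c1 == 0) return 0;  // both strings ended at the same position
  }
}
```

With something like this, "TINY_SIZE" and "tiny_size" compare equal under
any locale, while a name typed with a dotless i (U+0131, two bytes in
UTF-8) simply does not match, which seems like the right outcome for
symbol names like those in fluid.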

