Re: [fltk.general] STR #2771 [Turkic locales and str(n)casecmp, toupper, tolower]

corvid Sat, 08 Dec 2012 14:45:58 -0800

Ian wrote:
> On 8 Dec 2012, at 03:17, corvid wrote:
> 
> 
> <...lots of useful analysis of fltk's use of strcasecmp...>
> 
> So, that's good stuff, and suggests that *probably* most of our usage is 
> "ASCII", or perhaps C locale based, and so therefore is fine "as is" for a 
> "normal" locale?
> Does that sound about right?
> 
> I guess if we do enable a Turkic locale we maybe have to ensure that these 
> usages all ignore the selected locale and always use the C locale instead, 
> then.


We initially wanted to do something like this with Dillo, but then furaisanjin
reported some problem in a Japanese environment. *searches*
http://lists.auriga.wearlab.de/pipermail/dillo-dev/2011-October/009113.html

Hmm, so I had wanted to use the C locale and pretty much leave it that way,
which turned out not to be so good.

As for changing the locale around various usages instead of using an
asciified comparison, I at least don't recall finding much information
about how expensive that might be.
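
For comparison, an asciified comparison of the kind mentioned above might
look roughly like the sketch below. The names ascii_tolower and
ascii_strcasecmp are made up for illustration and are not taken from FLTK
or Dillo; the only assumption is that the strings being compared are
ASCII keywords, possibly embedded in UTF-8 text.

```c
#include <assert.h>
#include <stddef.h>

/* Lowercase one byte using ASCII rules only. Anything outside 'A'..'Z',
   including the bytes of multibyte UTF-8 sequences, is returned unchanged. */
static int ascii_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical locale-independent stand-in for strcasecmp().
   Returns <0, 0 or >0 like strcmp(), never consulting the current locale. */
int ascii_strcasecmp(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    for (;; s1++, s2++) {
        int c1 = ascii_tolower((unsigned char)*s1);
        int c2 = ascii_tolower((unsigned char)*s2);
        if (c1 != c2 || c1 == 0)
            return c1 - c2;
    }
}
```

Since this never touches the locale, there is no switching cost to worry
about, and a dotless i (bytes 0xC4 0xB1 in UTF-8) simply fails to match an
ASCII 'I' or 'i'.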

> Exceptions might be:
> 
> 
> 
> > fluid/factory.cxx:1262:    if (!strcasecmp(name,table[i].name)) {v = table[i].value; return 1;}
> > 
> > So these are all about ASCII English as well. (I notice, though, that
> > lookup_symbol is called in places where the surrounding tests looked at
> > least superficially similar, and those were using strcmp(). Being
> > case-sensitive might be right in some places and not in others, or
> > maybe not.)
> 
> Where maybe being sensitive to the host locale may make sense? At least in 
> some cases... 

Well, the table has names like "TINY_SIZE", "ITALIC_STYLE", "HIDDEN_BUTTON",
etc. If a Turkic user wants to write them with lowercase dotless i's, that
seems like incorrect usage.

> > src/fl_utf8.cxx:186: UTF-8 aware strncasecmp - converts to lower case Unicode and tests.
> > src/fl_utf8.cxx:194:int fl_utf_strncasecmp(const char *s1, const char *s2, int n)
> > src/fl_utf8.cxx:215: UTF-8 aware strcasecmp - converts to Unicode and tests.
> > src/fl_utf8.cxx:221:int fl_utf_strcasecmp(const char *s1, const char *s2)
> > src/fl_utf8.cxx:223:  return fl_utf_strncasecmp(s1, s2, 0x7fffffff);
> > 
> > It looks like ucs_table_0041 does ASCII mappings.
> > (I note that U+0130, capital i with dot, doesn't have a lowercase entry.)
> 
> 
> Hmm, I guess ideally we'd want this to be self-consistent, so that converting 
> a char to upper and then to lower is an identity operation?
> 
> E.g. if an english or C locale was active, then i -> I -> i
> 
> But if Turkic was active then i -> I(dotted) -> i and also i(non-dot) -> I -> 
> i(non-dot)... ?

It sounds good when possible...

Some special cases:
http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt
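
To make the round trip described above concrete, here is a minimal sketch
working on Unicode code points. The function names and the "turkic" flag
are hypothetical and are not FLTK API; only ASCII letters and the two extra
Turkic letters (U+0130, U+0131) are handled. The non-Turkic branches follow
the simple case mappings in the Unicode data files, and the Turkic branches
follow the tr/az entries in SpecialCasing.txt.

```c
#include <stdbool.h>

/* Hypothetical helper: uppercase a single code point. */
unsigned simple_toupper(unsigned cp, bool turkic)
{
    if (turkic && cp == 'i')
        return 0x0130;              /* i -> capital I with dot above */
    if (cp == 0x0131)
        return 'I';                 /* dotless i -> I in either mode */
    if (cp >= 'a' && cp <= 'z')
        return cp - ('a' - 'A');
    return cp;
}

/* Hypothetical helper: lowercase a single code point. */
unsigned simple_tolower(unsigned cp, bool turkic)
{
    if (turkic && cp == 'I')
        return 0x0131;              /* I -> dotless i */
    if (cp == 0x0130)
        return 'i';                 /* capital I with dot -> i in either mode */
    if (cp >= 'A' && cp <= 'Z')
        return cp + ('a' - 'A');
    return cp;
}
```

With turkic set, both i -> U+0130 -> i and U+0131 -> I -> U+0131 come back
to where they started. Without it, only the ASCII letters round-trip:
U+0131 goes to I and then comes back as a plain i, so the identity
property cannot hold for every character in every mode.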

