Follow-up Comment #6, bug #15377 (project freeciv):
The situation is pretty bad.
The character functions in support.c are used in a lot of places. Some uses
are legitimate (like parsing of the registry files or capabilities), but
others operate on UTF-8 strings, which is very wrong.
More problematic, they are used in functions like fcstrcasecmp,
remove_leading_trailing_spaces, and so on. These in turn are used in some
legitimate locations (again, registry file parsing) but elsewhere are applied
to UTF-8 strings. For instance, fcstrcasecmp is used on player names, which
may be UTF-8, but it compares byte by byte, lower-casing each byte. This is
very, very wrong (though thankfully it won't cause a crash; only functions
that modify strings are likely to cause major problems).
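To make the comparison problem concrete, here is a minimal sketch of a
byte-by-byte case-insensitive compare. It is not the actual fcstrcasecmp from
support.c; bytewise_casecmp and ascii_tolower are hypothetical names, and the
lower-casing is assumed to be ASCII-only so the result does not depend on the
locale.

```c
#include <stddef.h>

/* ASCII-only lower-casing (assumption: stands in for tolower).
 * Bytes outside 'A'..'Z' pass through unchanged. */
static int ascii_tolower(unsigned char c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical byte-by-byte case-insensitive compare.
 * Returns 0 when the strings match after lower-casing each byte. */
int bytewise_casecmp(const char *a, const char *b)
{
  const unsigned char *pa = (const unsigned char *) a;
  const unsigned char *pb = (const unsigned char *) b;

  while (*pa != '\0' && ascii_tolower(*pa) == ascii_tolower(*pb)) {
    pa++;
    pb++;
  }
  return ascii_tolower(*pa) - ascii_tolower(*pb);
}
```

With this sketch, bytewise_casecmp("Caesar", "CAESAR") returns 0, but the
UTF-8 names "\xC3\x89" "lan" (Élan) and "\xC3\xA9" "lan" (élan) compare as
different, because the second bytes 0x89 and 0xA9 are not an ASCII case pair.
No crash, just a wrong answer.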
I really see no clear way out of this problem. tolower can't be used on
UTF-8 strings with any validity, and without it an fcstrcasecmp function
would be extremely challenging, to say the least. Elsewhere, going
function by function, it is extremely hard to know which strings are UTF-8
and which are plain ASCII, even within the registry code.
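One thing that can be checked at run time, even when the code does not say
what encoding a string has, is whether a given string happens to be plain
ASCII. UTF-8 encodes every non-ASCII character using only bytes >= 0x80, so a
string with no such bytes is safe for the byte-wise functions. The sketch
below shows that check; is_pure_ascii is a hypothetical helper, not an
existing freeciv function.

```c
#include <stdbool.h>

/* Hypothetical helper: true if every byte of s is 7-bit ASCII,
 * i.e. byte-wise character functions cannot split a UTF-8 sequence. */
bool is_pure_ascii(const char *s)
{
  const unsigned char *p = (const unsigned char *) s;

  for (; *p != '\0'; p++) {
    if (*p >= 0x80) {
      return false;   /* lead or continuation byte of a multi-byte char */
    }
  }
  return true;
}
```

A check like this does not fix anything by itself, but it could serve as a
guard or debugging assertion when fixing individual call sites.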
Unless we're willing to rework all the core code to use UCS-2 or UCS-4 as the
internal encoding, I don't think it's possible to ensure bug-free behavior.
The best we can do is fix places on a case-by-case basis when we encounter
problems.
_______________________________________________________
Reply to this item at:
<http://gna.org/bugs/?15377>