On 12 Aug 2013, at 16:56, Stefan Bidi <[email protected]> wrote:

> There are a couple of reasons to use UTF-16:
> (1) The CF/Foundation APIs assume UTF-16.  CFStringGetCharacterAtIndex() and
> CFStringGetCharacters() would be extremely inefficient for anything that
> isn't ASCII, Latin-1, or UTF-16.  Just look at what base has to do to
> support UTF-8: it traverses the whole string every time you call
> -characterAtIndex:.
> (2) Almost all ICU APIs use UTF-16.
>
> To address your concern about endianness, I don't think this is a problem at
> all.  The API to the outside world is still the same.  We store all strings
> in host endianness and export them with the BOM if isExternalRepresentation
> is specified.
>
> I can't use libc functions for almost anything except the most basic string
> operations.  Not even printf can be used, because of the %@ specifier.
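[Editorial illustration: the cost difference the quoted mail describes can be made concrete with a small C sketch.  These are hypothetical helpers, not the actual base or CoreBase code; the UTF-8 path assumes well-formed input and ignores surrogate pairs to stay short.]

#include <stddef.h>
#include <stdint.h>

/* With a UTF-16 backing store, fetching the code unit at `index'
 * is a constant-time array access. */
static uint16_t
characterAtIndexUTF16(const uint16_t *units, size_t index)
{
  return units[index];
}

/* With a UTF-8 backing store there is no fixed relationship between a
 * character index and a byte offset, so every lookup has to decode the
 * string from the beginning -- O(n) per call. */
static uint16_t
characterAtIndexUTF8(const uint8_t *bytes, size_t length, size_t index)
{
  size_t pos = 0;
  size_t count = 0;

  while (pos < length)
    {
      uint8_t  b = bytes[pos];
      size_t   seqLen;
      uint32_t cp;

      /* Decode the lead byte (assumes pos always lands on a lead byte,
       * which holds for well-formed UTF-8). */
      if (b < 0x80)      { seqLen = 1; cp = b; }
      else if (b < 0xE0) { seqLen = 2; cp = b & 0x1F; }
      else if (b < 0xF0) { seqLen = 3; cp = b & 0x0F; }
      else               { seqLen = 4; cp = b & 0x07; }

      for (size_t i = 1; i < seqLen && pos + i < length; i++)
        {
          cp = (cp << 6) | (bytes[pos + i] & 0x3F);
        }

      if (count == index)
        {
          return (uint16_t)cp;   /* truncates non-BMP code points */
        }
      count++;
      pos += seqLen;
    }
  return 0;                      /* index out of range */
}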
I would say that CoreBase should do the same as base here: hold most strings as Latin-1 to keep them small, and expand to UTF-16 when required.  I think your worry about UTF-8 is needless ... the slow path is used only for literal strings, which are almost always so short that the occasional call that has to step through them is not a performance issue.
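[Editorial illustration: a rough C sketch of the dual-store strategy described above -- a hypothetical MyString type, not GNUstep-base's real GSString classes.  Strings whose characters all fit in Latin-1 are kept as one byte per character; anything else is stored as UTF-16.]

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
  uint8_t isWide;        /* 0: Latin-1 bytes, 1: UTF-16 code units */
  size_t  length;        /* length in characters / code units */
  union
    {
      uint8_t  *narrow;
      uint16_t *wide;
    } data;
} MyString;

/* Create a string from UTF-16 input, choosing the narrow representation
 * when every code unit fits in Latin-1 (0x00 - 0xFF). */
static MyString *
MyStringCreate(const uint16_t *units, size_t length)
{
  MyString *s = malloc(sizeof *s);
  size_t    i;

  s->length = length;
  s->isWide = 0;
  for (i = 0; i < length; i++)
    {
      if (units[i] > 0xFF)
        {
          s->isWide = 1;
          break;
        }
    }

  if (s->isWide)
    {
      s->data.wide = malloc(length * sizeof(uint16_t));
      memcpy(s->data.wide, units, length * sizeof(uint16_t));
    }
  else
    {
      s->data.narrow = malloc(length);
      for (i = 0; i < length; i++)
        {
          s->data.narrow[i] = (uint8_t)units[i];
        }
    }
  return s;
}

/* Constant-time lookup in either representation: Latin-1 code points
 * 0x00 - 0xFF map directly to the same UTF-16 code units. */
static uint16_t
MyStringCharacterAtIndex(const MyString *s, size_t index)
{
  return s->isWide ? s->data.wide[index] : (uint16_t)s->data.narrow[index];
}

The point of the dual store is that -characterAtIndex: stays O(1) in both representations, while the common mostly-ASCII/Latin-1 case uses half the memory of a UTF-16 store.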
