On 12 Aug 2013, at 16:56, Stefan Bidi <[email protected]> wrote:

> There are a couple of reasons to use UTF-16:
> (1) The CF/Foundation APIs assume UTF-16.  CFStringGetCharacterAtIndex() and
> CFStringGetCharacters() would be extremely inefficient for anything that
> isn't ASCII, Latin-1, or UTF-16.  Just look at what base has to do to
> support UTF-8: it traverses the whole string every time you call
> -characterAtIndex:.
> (2) Almost all ICU APIs use UTF-16.
>
> To address your concern about endianness, I don't think this is a problem at
> all.  The API to the outside world is still the same.  We store all strings
> in host endianness and export them with the BOM if isExternalRepresentation
> is specified.
>
> I can't use libc functions for almost anything except the most basic string
> operations.  Not even printf can be used, because of the %@ specifier.
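[Editorial illustration: the cost difference the quoted mail describes can be made concrete with a small C sketch.  These are hypothetical helpers, not the actual base or CoreBase code; the UTF-8 path assumes well-formed input and ignores surrogate pairs to stay short.]

#include <stddef.h>
#include <stdint.h>

/* With a UTF-16 backing store, fetching the code unit at `index'
 * is a constant-time array access. */
static uint16_t
characterAtIndexUTF16(const uint16_t *units, size_t index)
{
  return units[index];
}

/* With a UTF-8 backing store there is no fixed relationship between a
 * character index and a byte offset, so every lookup has to decode the
 * string from the beginning -- O(n) per call. */
static uint16_t
characterAtIndexUTF8(const uint8_t *bytes, size_t length, size_t index)
{
  size_t pos = 0;
  size_t count = 0;

  while (pos < length)
    {
      uint8_t  b = bytes[pos];
      size_t   seqLen;
      uint32_t cp;

      /* Decode the lead byte (assumes pos always lands on a lead byte,
       * which holds for well-formed UTF-8). */
      if (b < 0x80)      { seqLen = 1; cp = b; }
      else if (b < 0xE0) { seqLen = 2; cp = b & 0x1F; }
      else if (b < 0xF0) { seqLen = 3; cp = b & 0x0F; }
      else               { seqLen = 4; cp = b & 0x07; }

      for (size_t i = 1; i < seqLen && pos + i < length; i++)
        {
          cp = (cp << 6) | (bytes[pos + i] & 0x3F);
        }

      if (count == index)
        {
          return (uint16_t)cp;   /* truncates non-BMP code points */
        }
      count++;
      pos += seqLen;
    }
  return 0;                      /* index out of range */
}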
I would say that CoreBase should do the same as base here: hold most strings as Latin-1 to keep them small, and expand to UTF-16 when required.  I think your worry about UTF-8 is needless ... the slow path is used only for literal strings, which are almost always so short that the occasional call that has to step through them is not a performance issue.
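[Editorial illustration: a rough C sketch of the dual-store strategy described above -- a hypothetical MyString type, not GNUstep-base's real GSString classes.  Strings whose characters all fit in Latin-1 are kept as one byte per character; anything else is stored as UTF-16.]

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
  uint8_t isWide;        /* 0: Latin-1 bytes, 1: UTF-16 code units */
  size_t  length;        /* length in characters / code units */
  union
    {
      uint8_t  *narrow;
      uint16_t *wide;
    } data;
} MyString;

/* Create a string from UTF-16 input, choosing the narrow representation
 * when every code unit fits in Latin-1 (0x00 - 0xFF). */
static MyString *
MyStringCreate(const uint16_t *units, size_t length)
{
  MyString *s = malloc(sizeof *s);
  size_t    i;

  s->length = length;
  s->isWide = 0;
  for (i = 0; i < length; i++)
    {
      if (units[i] > 0xFF)
        {
          s->isWide = 1;
          break;
        }
    }

  if (s->isWide)
    {
      s->data.wide = malloc(length * sizeof(uint16_t));
      memcpy(s->data.wide, units, length * sizeof(uint16_t));
    }
  else
    {
      s->data.narrow = malloc(length);
      for (i = 0; i < length; i++)
        {
          s->data.narrow[i] = (uint8_t)units[i];
        }
    }
  return s;
}

/* Constant-time lookup in either representation: Latin-1 code points
 * 0x00 - 0xFF map directly to the same UTF-16 code units. */
static uint16_t
MyStringCharacterAtIndex(const MyString *s, size_t index)
{
  return s->isWide ? s->data.wide[index] : (uint16_t)s->data.narrow[index];
}

The point of the dual store is that -characterAtIndex: stays O(1) in both representations, while the common mostly-ASCII/Latin-1 case uses half the memory of a UTF-16 store.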
