On 2007-09-27, Duncan Coutts <[EMAIL PROTECTED]> wrote:
> In message <[EMAIL PROTECTED]> [EMAIL PROTECTED] writes:
>> On 2007-09-27, Deborah Goldsmith <[EMAIL PROTECTED]> wrote:
>> > On Sep 26, 2007, at 11:06 AM, Aaron Denney wrote:
>> >>> UTF-16 has no advantage over UTF-8 in this respect, because of
>> >>> surrogate pairs and combining characters.
>> >>
>> >> Good point.
>> >
>> > Well, not so much. As Duncan mentioned, it's a matter of what the most  
>> > common case is. UTF-16 is effectively fixed-width for the majority of  
>> > text in the majority of languages. Combining sequences and surrogate  
>> > pairs are relatively infrequent.
>> 
>> Infrequent, but they exist, which means you can't seek x/2 bytes ahead
>> to seek x characters ahead.  All such seeking must be linear for both
>> UTF-16 *and* UTF-8.
>
> And seeking has been linear in [Char] for all these years, yet I don't hear
> people complaining. Most string processing is linear and does not need random
> access to characters.

Yeah.  I'm saying the differences between them are going to be in the
constant factors, and that these constant factors will differ between 
workloads.  
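
To make the linear-seek point concrete, here is a minimal sketch of
skipping n code points in each encoding. The function names are
hypothetical (they are not from any library), the input is modelled as a
plain list of code units, and both functions assume well-formed input.
Note that they count code points only; combining sequences would need a
further pass on top of this.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8, Word16)

-- True for the first code unit of a UTF-16 surrogate pair (0xD800..0xDBFF).
isHighSurrogate :: Word16 -> Bool
isHighSurrogate w = w >= 0xD800 && w <= 0xDBFF

-- True for the second code unit of a UTF-16 surrogate pair (0xDC00..0xDFFF).
isLowSurrogate :: Word16 -> Bool
isLowSurrogate w = w >= 0xDC00 && w <= 0xDFFF

-- Skip n code points in a UTF-16 code-unit stream (hypothetical helper).
-- Every unit must be inspected, because any of them may start a pair.
dropCharsUtf16 :: Int -> [Word16] -> [Word16]
dropCharsUtf16 n ws
  | n <= 0 = ws
dropCharsUtf16 _ [] = []
dropCharsUtf16 n (w : ws)
  | isHighSurrogate w, (w' : rest) <- ws, isLowSurrogate w' =
      dropCharsUtf16 (n - 1) rest   -- surrogate pair: two units, one code point
  | otherwise =
      dropCharsUtf16 (n - 1) ws     -- BMP character: one unit

-- Skip n code points in a UTF-8 byte stream (hypothetical helper).
-- Drops one lead byte, then any continuation bytes (bit pattern 10xxxxxx).
dropCharsUtf8 :: Int -> [Word8] -> [Word8]
dropCharsUtf8 n bs
  | n <= 0 = bs
dropCharsUtf8 _ [] = []
dropCharsUtf8 n (_ : bs) =
  dropCharsUtf8 (n - 1) (dropWhile isContinuation bs)
  where
    isContinuation b = b .&. 0xC0 == 0x80
```

Both walk the input one unit at a time; the only difference is how much
work is done per unit and how many units there are, which is exactly the
constant factor in question.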

-- 
Aaron Denney
-><-

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
