Re: perl unicode support [BACK OFF-TOPIC]

Marcin 'Qrczak' Kowalczyk Sat, 07 Apr 2007 11:22:19 -0700

On Sat, 07-04-2007 at 10:21 -0400, Rich Felker wrote:

> I hope you generate an exception (or whatever the appropriate error
> behavior is) if the string contains a NUL byte other than the
> terminator when it's passed to C functions.


I do.
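
A minimal sketch of such a check, in Python for illustration only. The helper name to_c_string is hypothetical and is not taken from any implementation discussed in this thread; it assumes the C side expects NUL-terminated UTF-8.

```python
def to_c_string(s: str) -> bytes:
    """Encode s as a NUL-terminated UTF-8 byte string for a C function.

    Hypothetical helper: it raises an error instead of letting the C side
    silently truncate the string at an embedded NUL.
    """
    if "\0" in s:
        raise ValueError("embedded NUL cannot be represented in a C string")
    # The only NUL in the result is the terminator appended here.
    return s.encode("utf-8") + b"\0"
```

With this, to_c_string("abc") yields b"abc\0", while to_c_string("a\0b") raises ValueError rather than passing "a" to C.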

> Using UTF-8 would have accomplished the same thing without
> special-casing.

Then iterating over strings and specifying string fragments could not be
done by code point indices, and it’s not obvious what a good interface
should look like. Operations like splitting on whitespace would no
longer have simple implementations based on examining successive code
points.
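
To illustrate the contrast, here is a rough sketch in Python, whose str type happens to be indexed by code point. Both function names are made up for this example. The first splits on whitespace by examining successive code points; the second shows the linear scan needed to turn a code point index into a byte offset when the storage is UTF-8 (it assumes well-formed input).

```python
def split_on_whitespace(s: str) -> list:
    """Split s into words by examining successive code points."""
    words = []
    start = None  # code point index where the current word began
    for i, ch in enumerate(s):
        if ch.isspace():
            if start is not None:
                words.append(s[start:i])
                start = None
        elif start is None:
            start = i
    if start is not None:
        words.append(s[start:])
    return words


def utf8_offset(data: bytes, n: int) -> int:
    """Byte offset of the n-th code point in well-formed UTF-8; O(n) scan."""
    count = 0
    for offset, byte in enumerate(data):
        # Bytes of the form 10xxxxxx are continuation bytes; any other
        # byte starts a new code point.
        if (byte & 0xC0) != 0x80:
            if count == n:
                return offset
            count += 1
    if count == n:
        return len(data)  # one past the last code point
    raise IndexError("code point index out of range")
```

For example, code point 3 of "zażółć" sits at byte offset 4 in its UTF-8 encoding, so slicing by code point index over UTF-8 storage needs this kind of scan or some auxiliary index.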

> > This is too small advantage to overcome the inability of storing NULs
> > and the lack of O(1) length check (which rules out bounds checking on
> > indexing), and it’s impractical with garbage collection anyway.
> 
> C strings are usually used for small strings for which O(n) is O(1)
> because n is bounded by, say, 4096 (PATH_MAX).

This still rules out bounds checking. If each s[i] among 4096 indexing
operations costs 4096-i steps, the total comes to about 8M steps, which
might become noticeable.
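
The arithmetic can be checked with a few lines of Python. This takes the per-access cost of 4096-i at face value as a cost model; it is not a measurement of any real implementation, and the function name is made up for the example.

```python
N = 4096  # bound borrowed from the PATH_MAX example in the quoted text


def checked_index_cost(i: int, n: int = N) -> int:
    """Assumed cost model: a bounds-checked s[i] costs n - i steps."""
    return n - i


# One checked access for every index 0 .. N-1.
total = sum(checked_index_cost(i) for i in range(N))
print(total)  # 8390656, which equals N * (N + 1) // 2
```

So a single pass that indexes every position comes to roughly 8.4 million steps, versus 4096 when the length is stored and each check is O(1).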

-- 
   __("<         Marcin Kowalczyk
   \__/       [EMAIL PROTECTED]
    ^^     http://qrnik.knm.org.pl/~qrczak/


--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
