>>>>> "Graham" == Graham Fawcett <[EMAIL PROTECTED]> writes:

    Graham> On Mon, Mar 17, 2008 at 11:22 AM, Kon Lovett <[EMAIL PROTECTED]> wrote:

    Graham> The Factor language borrowed from Larceny a
    Graham> clever mechanism for representing Unicode
    Graham> strings efficiently. Perhaps such a system is
    Graham> feasible for Chicken, and might eliminate some
    Graham> of these issues (at the cost of distancing our
    Graham> string type a bit more from C char arrays):

    Graham> http://factor-language.blogspot.com/2008_01_01_archive.html

    Graham> "The new representation is quite clever, and
    Graham> comes from Larceny Scheme. The idea is that
    Graham> strings are ASCII strings, but have an extra
    Graham> slot pointing to an 'auxiliary vector'.

This only adds new issues, and solves none of the old ones.
The representation itself is interesting, though it may in
fact be a pessimisation in many cases (utf8 is about the
fastest approach for parsing and regex matching, which are
the string operations where speed is the biggest issue to
begin with).
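
One property that plausibly helps here (an illustration, not
something measured in this thread): in UTF-8 every byte of a
non-ASCII character is 0x80 or above, so a parser can scan for
ASCII delimiters on the raw bytes without decoding anything.
A minimal sketch in Python, with made-up sample text:

```python
# Assumption: a typical parsing step, splitting on an ASCII delimiter.
# The sample text is made up for illustration.
text = "na\u00efve,\u65e5\u672c\u8a9e,ok"
data = text.encode("utf-8")

# Scan the raw bytes for 0x2C (","). No decoding happens during the scan.
fields = data.split(b",")

# Decode only the pieces we actually need, after splitting.
decoded = [f.decode("utf-8") for f in fields]

# Every byte of a non-ASCII character is >= 0x80, so it can never be
# mistaken for an ASCII delimiter.
non_ascii_bytes = "\u65e5\u672c\u8a9e".encode("utf-8")
all_high = all(byte >= 0x80 for byte in non_ascii_bytes)
```

The byte-level split gives the same fields as splitting the
decoded string, which is why byte-oriented scanners and regex
engines can work on UTF-8 input directly.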

The problems we're having aren't even about string
representation, though; they're about the semantics of the
string operations themselves.  Are the string indices byte
positions or character positions?  Different libraries
disagree.
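
To make the ambiguity concrete, here is a small sketch in
Python (chosen only for illustration; the helper name
char_to_byte_offset is hypothetical and not from any Chicken
library). The same search yields different indices depending
on whether the string is treated as characters or as UTF-8
bytes:

```python
# "naive" with U+00EF in place of "i", which takes 2 bytes in UTF-8.
s = "na\u00efve"
b = s.encode("utf-8")

char_index = s.index("v")    # position counted in characters
byte_index = b.index(b"v")   # position counted in UTF-8 bytes

def char_to_byte_offset(string, index):
    """Hypothetical helper: convert a character index to a byte offset."""
    return len(string[:index].encode("utf-8"))

converted = char_to_byte_offset(s, char_index)
```

Here char_index is 3 and byte_index is 4. A library that
returns one kind of index and a library that expects the other
will silently disagree on any string containing non-ASCII
characters.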

-- 
Alex


_______________________________________________
Chicken-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/chicken-users
