>>>>> "Tobia" == Tobia Conforto <[EMAIL PROTECTED]> writes:

    Tobia> Graham Fawcett wrote:
    >> Here's another thought. It seems to me that if we
    >> were to represent strings as composite values, e.g. a
    >> two-slot record whose first slot is an encoding (the
    >> symbol 'utf8, or #f for 'byte' encoding), and whose
    >> second slot contains the string data, then the
    >> various string functions could dispatch on the type,
    >> and there would be no need to monkey-patch core
    >> string functions to get the desired semantics.

    Tobia> This is more or less how other languages, such as
    Tobia> Python, solved the issue: two kinds of strings,
    Tobia> byte and unicode, with a few string operations
    Tobia> overloaded to mean something slightly different
    Tobia> on each, e.g. computing byte length vs. character
    Tobia> length.

I keep trying to say, this is *not* the issue! :)

The entire problem revolves around adding Unicode support as
an option, without modifying the core.  *If* we allow
ourselves to modify the core, then there is no problem at
all, and we can just copy the utf8 egg code over the
existing string procedures, and add in some procedures for
byte-level access.

That would just be changing the procedures used to access
strings.  Changing the fundamental string representation is
an order of magnitude more substantial, involving changes to
the core compiler and the FFI, among other things.  You're
proposing things that would cause even more problems :)
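
To illustrate the distinction, here is a minimal sketch in
Python (chosen only because it came up above).  The function
names are made up for illustration and are not the utf8 egg's
API, and the code assumes its input is valid UTF-8.  The
representation stays a plain byte string; only the access
procedures differ.

```python
def byte_length(data: bytes) -> int:
    """Byte-level access: length of the underlying representation."""
    return len(data)


def char_length(data: bytes) -> int:
    """Character-level access: count UTF-8 code points.

    Every byte that is not a continuation byte (10xxxxxx) starts
    a new code point. Assumes `data` is valid UTF-8.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def char_ref(data: bytes, index: int) -> str:
    """Return the index-th character by scanning from the start (O(n))."""
    # Offsets of every byte that begins a code point.
    starts = [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]
    if not 0 <= index < len(starts):
        raise IndexError(index)
    begin = starts[index]
    end = starts[index + 1] if index + 1 < len(starts) else len(data)
    return data[begin:end].decode("utf-8")


# Hypothetical sample: the same bytes viewed through both sets of procedures.
s = "naïve".encode("utf-8")
print(byte_length(s), char_length(s), char_ref(s, 2))
```

Nothing about how the string is stored changes here, which
is why this kind of change is cheap compared with introducing
a new tagged string type.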

-- 
Alex


_______________________________________________
Chicken-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/chicken-users
