Hi,

On Fri, Jul 6, 2012 at 10:07 PM, Sébastien Doeraene
<[email protected]> wrote:
> Hi everybody,
>
[snip]
>
> <VirtualString, aka VS> ::=
>      UnicodeString
>    | Atom, except '#' and 'nil'
>    | list of UnicodeChar (data type not implemented yet, but will come)
>    | Int (implicitly converted to decimal notation)
>    | Float (idem)
>    | #-tuple of zero to many <VS>'es (concatenation)
>    | decode(<Encoding> <VBS>)
>    | list of integers (implicitly interpreted as latin1 encoded - for
> compatibility)
>    | ByteString (implicitly interpreted as latin1 encoded - for
> compatibility)
>
> <VirtualByteString, aka VBS> ::=
>      ByteString
>    | list of integers which are bytes
>    | #-tuple of zero to many <VBS>'es
>    | encode(<Encoding> <VS>)
>
> <Encoding> ::= spec of an encoding, possible format: list of {latin1, utf8,
> utf16, utf32, littleEndian, bigEndian, bom}
>
> The mutually recursive definition of VS and VBS allows elaborate
> constructions.
>
> Given these definitions, APIs that expect textual data will always accept a
> VirtualString, and APIs that expect binary data (e.g., I/O) will accept a
> VirtualByteString.
>
> What do you think of this approach to re-unifying textual virtual strings
> and binary virtual strings?
>
> Cheers,
> Sébastien
>
> _________________________________________________________________________________
> mozart-hackers mailing list
> [email protected]
> http://www.mozart-oz.org/mailman/listinfo/mozart-hackers

I supported the idea of VirtualByteString because it is a natural
extension, like String ==> VirtualString. But I don't believe that "APIs
expecting binary data will accept VirtualByteString" would be useful.

VirtualString is useful as it provides two functions:

1. Representing numbers as strings
2. Efficient concatenation

VirtualByteString does not support (1), and the need for (2) is
actually rare. We often need to write a ByteString to a stream, but
we are rarely interested in the intermediate ByteString, or even in
getting the final output back as a ByteString. I have written a Python 3
library which operates heavily on 'bytes' (the equivalent of
ByteString), but most of my uses of 'bytes' are:

1. Extract a substring from the byte string
2. Decode the byte string into a structure (i.e. reinterpret_cast)
3. Decode the byte string into a string.

VirtualByteString helps with none of these.
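
To make the three operations concrete, here is a rough Python 3 sketch.
The 8-byte record layout (2-byte tag, 2-byte length, 4-byte payload) is
made up purely for illustration and is not taken from any real library:

```python
import struct

# Hypothetical 8-byte record: 2-byte tag, 2-byte length, 4-byte payload.
packet = b"\x00\x01\x00\x04Oz!\n"

# 1. Extract a substring from the byte string.
payload = packet[4:8]

# 2. Decode the byte string into a structure (roughly a reinterpret_cast).
#    ">HH" means two big-endian unsigned 16-bit integers.
tag, length = struct.unpack(">HH", packet[:4])

# 3. Decode the byte string into a string.
text = payload.decode("latin-1")
```

All three take a complete byte string as input; none of them involves
building one up from pieces, which is the only thing a virtual byte
string would make cheaper.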

---

Also, I really dislike treating the 'encode' and 'decode' tuples as
Virtual(Byte)Strings, because they work like a lazy function call (but
are not one) and feel totally out of place when all other kinds of
VirtualString are just data structures interpreted almost as-is. A
real function call such as {ByteString.encode +EncodingsL +VS ?BS},
as we have now, is better.
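
As a loose Python analogy (not Oz, and not the Mozart API), the
explicit-call style might look like the sketch below. The helper
`encode` is a hypothetical stand-in for ByteString.encode; it does the
conversion at the call site, so the stream only ever sees plain bytes:

```python
import io

def encode(encoding, *parts):
    """Hypothetical eager stand-in for ByteString.encode.

    Joins the textual parts (numbers are converted with str) and
    returns the encoded bytes immediately.
    """
    return "".join(str(p) for p in parts).encode(encoding)

out = io.BytesIO()

# Each piece is converted where it is produced and written straight to
# the stream; there is no deferred 'encode' node for the I/O layer to
# interpret later.
out.write(b"\x00\x01")
out.write(encode("utf-8", "n=", 42))

result = out.getvalue()
```

The pieces still reach the stream without an explicit concatenation
step, which covers the common case without the I/O API having to
accept anything other than bytes.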

-- Kenny.