On 1/4/10 3:15 PM, Julian Reschke wrote:
> But what's the alternative? Decide the encoding in each case? The encoding not being predictable seems to be worse than anything else...
Well, one non-destructive alternative is to encode JS strings as bytes by converting each 16-bit code unit into a byte pair (in LE or BE order, as desired). This has two obvious drawbacks: it stuffs null bytes into the header, and it doesn't round-trip with byte inflation.
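A rough sketch of the byte-pair idea, just to make it concrete. The helper names toBytePairs and fromBytePairs are made up for illustration and are not part of any proposed API; the typed-array calls are standard ones available in current engines.

```javascript
// Hypothetical helpers: pack each 16-bit code unit of a JS string into
// two bytes, and unpack again. Nothing here interprets the string as Unicode.
function toBytePairs(str, littleEndian = true) {
  const bytes = new Uint8Array(str.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < str.length; i++) {
    // charCodeAt returns the raw code unit, including unpaired surrogates.
    view.setUint16(i * 2, str.charCodeAt(i), littleEndian);
  }
  return bytes;
}

function fromBytePairs(bytes, littleEndian = true) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let out = "";
  for (let i = 0; i + 1 < bytes.length; i += 2) {
    out += String.fromCharCode(view.getUint16(i, littleEndian));
  }
  return out;
}

// "a" followed by an unpaired high surrogate: not valid Unicode text,
// but a perfectly legal JS string.
const sample = "a\uD800";
const packed = toBytePairs(sample);      // LE bytes: 0x61 0x00 0x00 0xD8
const unpacked = fromBytePairs(packed);  // same code units as sample
```

Note the 0x00 byte after the ASCII "a": that is the null-byte problem, and it shows up for every code unit below 0x100.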
But that's the only non-destructive alternative (well, that and variants like base64-encoding to get around the null byte thing) I see, given that JS strings are actually arrays of arbitrary 16-bit integers. In particular, conversion to UTF-8 is in fact destructive, as is any other conversion that treats the string as Unicode of some sort.
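To illustrate the destructive part: the sketch below assumes an engine that provides the TextEncoder and TextDecoder globals, which are a later addition and are used here only as a convenient stand-in for "conversion to UTF-8". The function name utf8RoundTrip is made up for the example.

```javascript
// Hypothetical helper: push a JS string through UTF-8 and back.
function utf8RoundTrip(str) {
  const bytes = new TextEncoder().encode(str); // always encodes as UTF-8
  return new TextDecoder().decode(bytes);      // decodes as UTF-8 by default
}

// Well-formed text survives the trip.
const wellFormed = "caf\u00E9";
const wellFormedBack = utf8RoundTrip(wellFormed);

// An unpaired surrogate has no UTF-8 representation, so the encoder
// substitutes U+FFFD and the original code unit is lost.
const loneSurrogate = "a\uD800b";
const loneSurrogateBack = utf8RoundTrip(loneSurrogate);
```

Here wellFormedBack equals the input, while loneSurrogateBack comes back as "a\uFFFDb": the string that went in cannot be recovered from the bytes.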
If we don't have a requirement to preserve any possible JS string via this API, then we probably have more flexibility.
-Boris
