On Thu, Oct 14, 2010 at 7:35 PM, wren ng thornton <[email protected]> wrote:

> This is definitely an issue with the proposal. But if it can be
> surmounted, I think the stranded-string proposal is a nice one---
> certainly better than settling on any particular utf-N for everything.
> UTF-8 works well for European languages and about half the content of
> Asian languages, but that doesn't convince me that the other half of
> Asian languages should get screwed, or that utf-8 is the best internal
> representation in the world.
>
> Solving this issue may take a bit of sufficient smartness however. If
> we're trying to avoid that, then the API should have ways of tweaking
> the behavior of when we switch encodings--- at the very least, it should
> have some way of saying when a (short) string should be forced to be a
> single strand, using whatever strand width is necessary. Working out the
> details of the rest of the API could be tricky though.


I agree. But note that I'm talking about a strictly internal-to-runtime
representation that is not externally exposed. I'm proposing this as a
preferred reference implementation for general-purpose systems, not as a
mandated representation. I don't believe that the library standard
*should* mandate a representation.
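
To make the discussion concrete, here is a rough Python sketch of what a
stranded representation and the "force a single strand" knob might look
like. It is an illustration only, not BitC code and not a proposed API.
It assumes strands hold code points at a fixed width of 1, 2, or 4 bytes,
and the names width_for, Strand, to_strands, and to_single_strand are all
hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


def width_for(ch: str) -> int:
    """Smallest fixed width in bytes (1, 2, or 4) that holds this code point.

    Assumption: 1-byte strands cover U+0000..U+00FF, 2-byte strands cover
    the rest of the BMP, and 4-byte strands cover everything else.
    """
    cp = ord(ch)
    if cp <= 0xFF:
        return 1
    if cp <= 0xFFFF:
        return 2
    return 4


@dataclass(frozen=True)
class Strand:
    width: int  # bytes per code point within this strand
    text: str   # the code points stored at that width


def to_strands(s: str) -> List[Strand]:
    """Split s into maximal runs whose code points need the same width."""
    return [Strand(w, "".join(run)) for w, run in groupby(s, key=width_for)]


def to_single_strand(s: str) -> Strand:
    """Force s into one strand, using the widest width any code point needs."""
    return Strand(max((width_for(c) for c in s), default=1), s)
```

With this sketch, a mostly-ASCII string containing two CJK characters
splits into three strands, while to_single_strand stores the whole string
at 2 bytes per code point. Deciding when a runtime should prefer one form
over the other is the "sufficient smartness" question raised above.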

shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
