On Thu, Oct 14, 2010 at 7:35 PM, wren ng thornton <[email protected]> wrote:
> This is definitely an issue with the proposal. But if it can be
> surmounted, I think the stranded-string proposal is a nice one---
> certainly better than settling on any particular utf-N for everything.
> UTF-8 works well for European languages and about half the content of
> Asian languages, but that doesn't convince me that the other half of
> Asian languages should get screwed, or that utf-8 is the best internal
> representation in the world.
>
> Solving this issue may take a bit of sufficient smartness however. If
> we're trying to avoid that, then the API should have ways of tweaking
> the behavior of when we switch encodings--- at the very least, it should
> have some way of saying when a (short) string should be forced to be a
> single strand, using whatever strand width is necessary. Working out the
> details of the rest of the API could be tricky though.

I agree. But note that I'm talking about a strictly internal-to-runtime
representation that is not externally exposed. I'm proposing this as a
preferred reference implementation for general-purpose systems, not as a
mandated representation. I don't believe that the library standard
*should* mandate a representation.

shap
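For concreteness, here is a rough sketch of the "force a (short) string into a single strand" operation mentioned in the quoted text. It is only an illustration under stated assumptions: strands are taken to be fixed-width with 1, 2, or 4 bytes per code point, and the names `strand_width` and `force_single_strand` are hypothetical, not part of any BitC library or of the proposal itself.

```python
def strand_width(s: str) -> int:
    """Narrowest fixed width (bytes per code point) that fits every code point in s.

    Assumption: the only strand widths available are 1, 2, and 4 bytes.
    """
    top = max(map(ord, s), default=0)  # largest code point; 0 for the empty string
    if top <= 0xFF:
        return 1
    if top <= 0xFFFF:
        return 2
    return 4


def force_single_strand(s: str):
    """Encode s as one strand: returns (width, data), with each code point
    stored in exactly `width` little-endian bytes."""
    width = strand_width(s)
    data = b"".join(ord(ch).to_bytes(width, "little") for ch in s)
    return width, data
```

With this layout a single wide character widens the whole strand, which is the cost that a multi-strand representation would try to avoid by switching encodings partway through the string.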
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
