--- In [email protected], "Sheri" <sheri...@...> wrote:
>
> --- In [email protected], "entropyreduction"
> <alancampbelllists+yahoo@> wrote:
> >
> > In the meantime, if you feel like experimenting, construct a utf8
> > string from several higher-plane characters, then do
> > unicode.from_utf8(xxxx).to_utf8 and see if the same thing comes
> > out as went in.
> >
>
> Yes, that works. Also using upper plane characters that I can actually see as
> proper looking characters on the table in Firefox, I can put them into my own
> html doc as utf8 and see them equally well. They look like box characters in
> unicode.messagebox or IE. I suppose this is a font or font script issue.

Dunno. When I have a chance I might investigate for messagebox.

There are quite a few services in the plugin that aren't right once surrogate pairs and (oh dear) combining character sequences are taken into account: length is wrong, and so is all the get/set char stuff, index, slice, etc. Dealing properly with all that is messy.

Best route is probably to get the IBM ICU stuff (http://icu-project.org) and use it. That's a major rewrite and will take a while, assuming I can get ICU to work properly. It also means unicode will depend on an ICU DLL.
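
To illustrate why length and friends go wrong, here is a rough sketch in Python. It is not the plugin's language or API, and the helper names (round_trips, utf16_units, code_points) are made up for the example. It assumes the plugin counts UTF-16 code units, which is what makes a higher-plane character look like two "characters", and it shows that a combining sequence is still two code points even though it displays as one.

```python
import unicodedata

def round_trips(s: str) -> bool:
    # Hypothetical stand-in for unicode.from_utf8(xxxx).to_utf8:
    # encode to UTF-8, decode again, compare with the original.
    return s.encode("utf-8").decode("utf-8") == s

def utf16_units(s: str) -> int:
    # Number of 16-bit code units; a surrogate pair counts as two.
    return len(s.encode("utf-16-le")) // 2

def code_points(s: str) -> int:
    # Python strings are sequences of code points.
    return len(s)

# Three characters from outside the Basic Multilingual Plane.
higher_plane = "\U0001D11E\U00010348\U0001F600"

# "e" followed by COMBINING ACUTE ACCENT: one visible character.
combining = "e\u0301"

print(round_trips(higher_plane))    # True
print(code_points(higher_plane))    # 3
print(utf16_units(higher_plane))    # 6, one surrogate pair per character
print(code_points(combining))       # 2
print(code_points(unicodedata.normalize("NFC", combining)))  # 1
```

Normalizing only helps when a precomposed form exists. Getting length, index and slice right in general needs grapheme cluster segmentation, which is the sort of thing ICU provides.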
