--- In [email protected], "entropyreduction" <alancampbelllists+ya...@...> wrote:
>
> --- In [email protected], "Sheri" <sherip99@> wrote:
> >
> > --- In [email protected], "entropyreduction"
> > <alancampbelllists+yahoo@> wrote:
> > >
> > >
> > > In the meantime, if you feel like experimenting, construct a utf8
> > > string from several higher-plane characters, then do
> > > unicode.from_utf8(xxxx).to_utf8 and see if the same thing comes
> > > out as went in.
> > >
> > Yes, that works. Also using upper plane characters that I can
> > actually see as proper looking characters on the table in
> > Firefox, I can put them into my own html doc as utf8 and see them
> > equally well. They look like box characters in unicode.messagebox
> > or IE. I suppose this is a font or font script issue.
>
> Dunno. When I have a chance might investigate for messagebox.
>
> There are quite a few services in plugin that aren't right if
> surrogate pairs and (oh dear) combining character sequences are
> taken into account. length is wrong, all the get/set char stuff,
> index, slice, etc.
>
> Dealing properly with all that is messy. Best route probably is
> to get the IBM icu stuff
>
> http://icu-project.org
>
> and use it. Major rewrite. Will take a while, assuming I can get
> icu to work properly. Means unicode will depend on an icu dll.

Are you sure there isn't something simpler to do? I see that Msft uses wchar_t, not wchar, for unicode.

<http://msdn.microsoft.com/en-us/library/dtxesf6k%28VS.71%29.aspx>

Maybe Sean can suggest something. Sean, are you here?

Regards,
Sheri
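
For anyone following along, here is a rough sketch of why length, index and slice go wrong once surrogate pairs and combining sequences are in play. It is plain Python, not the plugin's code, and the helper names (utf16_units, code_points, round_trips_utf8) are made up for illustration. It assumes the plugin stores text as 16-bit units, as wchar_t does on Windows.

```python
import unicodedata


def utf16_units(text: str) -> int:
    """Count the 16-bit code units the text occupies when stored as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Count Unicode code points (Python 3 strings index by code point)."""
    return len(text)


def round_trips_utf8(text: str) -> bool:
    """Encode to UTF-8 and decode again; True if nothing changed."""
    return text.encode("utf-8").decode("utf-8") == text


# U+1D11E MUSICAL SYMBOL G CLEF lives in plane 1, above U+FFFF.
clef = "\U0001D11E"

# "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
e_acute = "e\u0301"

if __name__ == "__main__":
    print(round_trips_utf8(clef))      # the from_utf8/to_utf8 experiment
    print(code_points(clef))           # one character...
    print(utf16_units(clef))           # ...but two 16-bit units (a surrogate pair)
    print(code_points(e_acute))        # two code points for one visible character
    # Normalizing to NFC folds this particular pair into a single code point,
    # but not every combining sequence has a precomposed form.
    print(code_points(unicodedata.normalize("NFC", e_acute)))
```

A length that counts 16-bit units reports 2 for the clef, and a slice taken between the two units would split the surrogate pair. Combining sequences are a separate problem that counting code points does not solve either, which is presumably why a library like ICU looks attractive.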
