--- In [email protected], "Sheri" <sheri...@...> wrote:
>
> --- In [email protected], "entropyreduction" 
> <alancampbelllists+yahoo@> wrote:
> >
> > 
> > In the meantime, if you feel like experimenting, construct a utf8
> > string from several higher-plane characters, then do
> > unicode.from_utf8(xxxx).to_utf8 and see if the same thing comes
> > out as went in.
> >
> 
> Yes, that works. Also using upper plane characters that I can actually see as 
> proper looking characters on the table in Firefox, I can put them into my own 
> html doc as utf8 and see them equally well. They look like box characters in 
> unicode.messagebox or IE. I suppose this is a font or font script issue.

Dunno.  When I have a chance I might investigate for messagebox.

There are quite a few services in the plugin that aren't right once surrogate 
pairs and (oh dear) combining character sequences are taken into account: length 
is wrong, and so is all the get/set char stuff, index, slice, etc.
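
To make the length problem concrete, here is a rough sketch in Python (not
plugin code; the helper names are made up for illustration).  It counts the
same string three ways: UTF-16 code units, code points, and an approximate
"visible character" count that just skips combining marks.  That last one is
only an approximation, not full grapheme cluster segmentation.

```python
import unicodedata

def utf16_units(s: str) -> int:
    # Number of 16-bit code units; a surrogate pair counts as 2.
    return len(s.encode("utf-16-le")) // 2

def code_points(s: str) -> int:
    # Python strings are sequences of code points.
    return len(s)

def approx_chars(s: str) -> int:
    # Approximation only: skip code points with a nonzero combining class.
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)

# U+1D11E MUSICAL SYMBOL G CLEF (outside the BMP), then "e" + COMBINING ACUTE ACCENT
sample = "\U0001D11E" + "e\u0301"

print(utf16_units(sample), code_points(sample), approx_chars(sample))  # 4 3 2
```

A length service that counts 16-bit units would report 4 here, while a user
would see 2 characters.  Index and slice have the same trouble, since a cut
at unit 1 lands in the middle of the surrogate pair.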

Dealing properly with all that is messy.  Best route is probably to get the IBM 
ICU stuff 

  http://icu-project.org

and use it.  Major rewrite.  Will take a while, assuming I can get ICU to work 
properly.  Means unicode will depend on an ICU DLL.


