--- In [email protected], "Sheri" <sheri...@...> wrote:
>
> --- In [email protected], "entropyreduction"
> <alancampbelllists+yahoo@> wrote:
> Well hopefully you can raise the limit and work out how to convert unicode
> code points exceeding 0xFFFF in e.g., unicode.from_num, so we can use any
> values from tables like this one: <http://www.utf8-chartable.de/>.

Good news: Yes, I can do that.

Bad news: All of the current plugin is built on the assumption that 1 unicode character == two bytes. get_char and set_char, in their many variants and guises, will fail or give incorrect results if multi-word unicode characters get in.

First thing will be to see if the current win api can cope. I'll implement append/create from number; the test will then be whether to_utf8 returns the correct result.

In the meantime, if you feel like experimenting, construct a utf8 string from several higher-plane characters, then do unicode.from_utf8(xxxx).to_utf8 and see if the same thing comes out as went in.
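
To make the "multi-word character" problem concrete, here is a rough sketch in Python (not the plugin's code, and not its script language). It shows how a code point above 0xFFFF splits into two 16-bit words in UTF-16, plus the same kind of utf8 round trip suggested above. The function names to_surrogate_pair and utf8_round_trip are made up for illustration, and the three sample code points are arbitrary picks from the higher planes.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into a UTF-16 (high, low) surrogate pair.

    Hypothetical helper for illustration; not part of the plugin.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def utf8_round_trip(utf8_bytes):
    """Go utf8 -> UTF-16 words -> utf8, as the suggested experiment does."""
    utf16 = utf8_bytes.decode("utf-8").encode("utf-16-le")
    return utf16.decode("utf-16-le").encode("utf-8")


# Three higher-plane characters (arbitrary sample values).
sample = "".join(chr(cp) for cp in (0x1D11E, 0x10348, 0x20000)).encode("utf-8")

high, low = to_surrogate_pair(0x1D11E)
print(hex(high), hex(low))

# Each character takes two 16-bit words, so "1 character == two bytes" breaks.
word_count = len(sample.decode("utf-8").encode("utf-16-le")) // 2
print(word_count, "UTF-16 words for 3 characters")

print("round trip ok:", utf8_round_trip(sample) == sample)
```

If the round trip is clean but the word count is double the character count, that is exactly the case where anything that indexes by 16-bit word can land in the middle of a character.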
