--- In [email protected], "entropyreduction" <alancampbelllists+ya...@...> wrote:
>
> --- In [email protected], "silvermoonwoman2001" <sherip99@> wrote:
> >
> > The unicode plugin's character functions (such as length) apparently are
> > dividing the number of UTF16-based bytes by 2 to get the length, which is
> > true only for the Basic Multilingual Plane. Regex/utf8 works fine tho.
>
> Yes. I use wcslen, which returns string length.
>
> As I've already said, many unicode services only work in the BMP,
> because I use standard Microsoft WCHAR services. Comparisons
> won't work right, nor will any service that relies on finding a
> position in a string (find, index, slice, etc). I'll update
> documentation to say it's so, when I get a chance. Sometime much
> later I'll try to build a new unicode plugin that piggybacks on
> someone's existing code. No way I'm gonna try to reinvent the
> wheels of surrogate pair detection, case folding, combining
> character sequence, etc.

There are apparently some differences between Win2K and XP:
<http://www.eggheadcafe.com/forumarchives/vcmfc/Dec2005/post24790482.asp>

If I show the two high code points in a unicode.messagebox, I see only two box characters. I guess on Win2K you see four.
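
To make the length mismatch concrete, here is a rough sketch in Python. It is not the plugin's code, and the helper names are made up for illustration. It compares a WCHAR-style count (UTF-16 bytes divided by 2) with a code point count that skips trailing surrogates, assuming the input is well-formed UTF-16.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")  # little-endian, no BOM
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def wchar_length(text):
    """Length as a count of 16-bit units (bytes / 2), wcslen-style on Windows."""
    return len(utf16_units(text))


def code_point_length(units):
    """Count code points by skipping low (trailing) surrogates, 0xDC00-0xDFFF.

    Assumption: the units are well-formed UTF-16, so every low surrogate
    is the second half of a pair.
    """
    return sum(1 for u in units if not 0xDC00 <= u <= 0xDFFF)


bmp = "abc"                     # all inside the Basic Multilingual Plane
high = "\U00010400\U0001D11E"   # two code points above U+FFFF

print(wchar_length(bmp), code_point_length(utf16_units(bmp)))    # 3 3
print(wchar_length(high), code_point_length(utf16_units(high)))  # 4 2
```

The two counts agree for BMP text and diverge as soon as a surrogate pair appears, which matches the two-versus-four boxes difference described above.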
