--- In [email protected], "silvermoonwoman2001" <sheri...@...> wrote:
>
> The unicode plugin's character functions (such as length) apparently are
> dividing the number of UTF16-based bytes by 2 to get the length, which is
> true only for the Basic Multilingual Plane. Regex/utf8 works fine tho.

Yes. I use wcslen, which returns the string length in WCHARs (16-bit UTF-16 code units), not in characters. As I've already said, many unicode services only work in the BMP, because I use standard Microsoft WCHAR services. Outside the BMP, comparisons won't work right, nor will any service that relies on finding a position in a string (find, index, slice, etc). I'll update the documentation to say so when I get a chance. Sometime much later I'll try to build a new unicode plugin that piggybacks on someone's existing code. No way I'm gonna try to reinvent the wheels of surrogate pair detection, case folding, combining character sequences, etc.
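
To illustrate the length discrepancy described above, here is a minimal sketch in portable C. It is not the plugin's code: the function names utf16_units and utf16_code_points are hypothetical, and uint16_t stands in for the 16-bit Microsoft WCHAR so the example behaves the same on any platform. The first function counts code units, as a wcslen-style count would; the second treats a surrogate pair as a single character.

```c
#include <stddef.h>
#include <stdint.h>

/* Length in UTF-16 code units: what a wcslen-style count reports
   when the wide character type is 16 bits. (Hypothetical helper.) */
size_t utf16_units(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Length in code points: a high surrogate (0xD800-0xDBFF) immediately
   followed by a low surrogate (0xDC00-0xDFFF) counts as one character.
   An unpaired surrogate is counted as one character on its own.
   (Hypothetical helper.) */
size_t utf16_code_points(const uint16_t *s)
{
    size_t count = 0;
    for (size_t i = 0; s[i] != 0; i++) {
        /* s[i] is nonzero here, so reading s[i + 1] stays within the
           terminated string. */
        if (s[i] >= 0xD800 && s[i] <= 0xDBFF &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF)
            i++; /* skip the low half of the pair */
        count++;
    }
    return count;
}
```

For a string holding "a" followed by U+1F600 (encoded as the pair 0xD83D 0xDE00), the first function reports 3 and the second reports 2. For strings entirely inside the BMP the two counts agree, which is why the problem only shows up with supplementary characters. This sketch covers surrogate pairs only; it does nothing about case folding or combining character sequences.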
