Hi Joe!

Joe Hildebrand wrote:
What do you mean by character?
- Glyph?
- Codepoint?

Do you have to perform some sort of canonicalization before counting?
Combining characters make this particularly difficult, which is why we settled on something easy to describe and understand in JIDs.

For JIDs, this has already been solved by stringprep, which XMPP uses (i.e. code points after stringprep normalization). For roster items: characters as they are sent by the user who creates the roster item (i.e. code points as transmitted).

I don't see where you have a problem with that. And if you did, you would have the same problem counting octets, since the octet count also varies depending on the type of normalization.
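
To illustrate the point about octets, here is a minimal Python sketch. The helper name `counts` is made up for this example, and it uses plain NFC/NFD from the standard `unicodedata` module as a stand-in, not the actual stringprep profiles XMPP applies to JIDs:

```python
import unicodedata

def counts(text, form):
    """Return (code points, UTF-8 octets) of text after normalizing to form."""
    normalized = unicodedata.normalize(form, text)
    return len(normalized), len(normalized.encode("utf-8"))

name = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# Composed form: a single code point, two octets in UTF-8.
print(counts(name, "NFC"))  # (1, 2)

# Decomposed form: "e" plus U+0301 COMBINING ACUTE ACCENT, three octets.
print(counts(name, "NFD"))  # (2, 3)
```

Both numbers change with the normalization form, so a limit expressed in octets is no more stable than one expressed in code points.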


Matthias
