On 10/24/19 9:40 PM, Kim Alvefur wrote:
We should refrain from using things like grapheme clusters in wire formats,
as those are subject to changes in upcoming Unicode versions and thus the
wire format would be understood differently depending on the Unicode version
implemented by the client.

Doesn't this also depend on the font?

If the font does not support a certain grapheme cluster, it may be rendered as multiple glyphs (for example, 🤦‍♂️ may be rendered as 🤦 and ♂️). The text rendering toolkit may be aware that this has been a single grapheme since Emoji 4.0 and may therefore treat it as a single unit when selecting (for copy and paste, i.e. not allowing the user to copy only the ♂️). If the rendering toolkit does allow selecting only part of this grapheme cluster, and the user does so and instructs the client to make the selected text a reference, things get interesting again (because by Unicode's count you'd be in the middle of a character, so it would not be possible to actually do what the user instructed). Thus the font may be relevant for various UI/UX concerns, and developers need to be aware of them when allowing the user to input text.
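To make the facepalm example concrete, here is a minimal Python sketch, not taken from any XEP. It spells the emoji out as the four code points it consists of and shows that a range counted in code points can cover either the whole sequence or only the ♂️ part. The names `FACEPALM_MAN`, `code_points` and `slice_by_code_points` are made up for illustration, and the sketch assumes that offsets are counted in code points, which is what Python's `str` indexing does.

```python
# The emoji from the example above, written as explicit code points:
# U+1F926 (face palm) + U+200D (zero width joiner) + U+2642 (male sign)
# + U+FE0F (variation selector-16).
FACEPALM_MAN = "\U0001F926\u200D\u2642\uFE0F"


def code_points(text):
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def slice_by_code_points(text, begin, end):
    """Return the part of text from begin to end, counted in code points.

    Python str indices are code point indices, so this is a plain slice.
    """
    return text[begin:end]


message = "hi " + FACEPALM_MAN + " ok"

# One user-perceived character, but four code points.
print(code_points(FACEPALM_MAN))
print(len(FACEPALM_MAN))

# A range covering the whole sequence (code points 3 to 6 of the message).
whole = slice_by_code_points(message, 3, 7)

# A range that starts after the zero width joiner: only the male sign
# and its variation selector, i.e. the part a toolkit might let the
# user select when the font renders two glyphs.
partial = slice_by_code_points(message, 5, 7)

print(whole == FACEPALM_MAN)
print(code_points(partial))
```

Counted in code points, both ranges are valid, and the result does not depend on which Unicode version's segmentation rules the client implements. Counted in grapheme clusters, the second range would have no valid start offset.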

For output, the font would not be relevant: it doesn't matter whether the reference link ends up covering a single grapheme or two graphemes because the font does not support the single grapheme from the newer Unicode version. Of course, if the toolkit wants you to give highlight instructions in displayed graphemes, you'd have to deal with that, but I hope no toolkit does that...

Does it make sense to do an Informational XEP for Unicode handling in XEPs?

Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
