On Sun, Oct 20, 2019, at 18:39, JC Brand wrote:
> You don't need tons of ways, you can just follow the instructions. If
> the sending client is buggy, then this will become clear over time.

"Following the instructions" may mean different things to different clients in this case. One might treat it as an error, one might display it and break up the flag emoji, etc. This is not ideal. > Yes, you just render the two letters separately given that this is > what's implied by the information you've been given and it's also a > legitimate use-case. Assuming this is the desired behavior and we can actually do this: Now that they've been rendered separately, what if the receiving client copies and pastes the message. The highlight is not included, or just becomes plain text, does this mean the flag emoji is rejoined and now the copy/pasted message is different from the original? This doesn't seem ideal. > > What if it's between something and a zero-width joiner that would > > join it to another glyph, does that split that and now you have a > > dangling joiner? > > This is as clearly an error as setting an offset in the middle of a > UTF-8 encoding. Perhaps. Now we just have to enumerate all the other ways that Unicode handles things like this, and make sure all clients handle them the same way. This would of course be a problem if we were using bytes, for example, too, but the point is that it's not as simple as saying "these things are errors and these aren't". There are different ways to handle these, and Unicode has a lot of edge cases we likely won't think of. > > From a code perspective does this mean that highlighting always has > > to integrate with the text rendering engine? This seems like a > > *major* downside to me, as it likely makes the code much more > > complicated, and we may or may not even have the ability to > > manipulate how the text rendering engine handles things. > > It's not clear to me why you think highlighting will necessarily > require integration with the rendering engine. It should be possible > to identify unicode codepoints in a string independent of any > rendering engine. 
How do you propose breaking up a flag emoji, for example? We have to
have a way to tell the text rendering engine "don't render this flag,
show the letters". We could probably include a zero-width space or
something between the letters, but now when someone copy/pastes the
message they are copying characters that weren't part of what the
sender actually typed, which doesn't feel great.

—Sam

-- 
Sam Whited
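
To make the flag case concrete, here is a minimal Python sketch. It is
not taken from any XEP or client; the names FLAG_US, splits_flag, and
break_flag are made up for illustration, and it assumes offsets are
counted in Unicode code points. It shows that a flag is two regional
indicator code points, that an offset can land between them, and that
forcing the letters apart with a zero-width space changes the string
that would later be copied.

```python
# A flag emoji is a pair of REGIONAL INDICATOR SYMBOL LETTER code points.
FLAG_US = "\U0001F1FA\U0001F1F8"  # letters U and S
ZWSP = "\u200b"                   # ZERO WIDTH SPACE


def is_regional_indicator(ch: str) -> bool:
    """True if ch is one of the 26 regional indicator symbols."""
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def splits_flag(text: str, offset: int) -> bool:
    """True if a code point offset falls inside a regional indicator pair.

    Indicators pair up from the start of a run, so the offset splits a
    pair when an odd number of indicators precede it and another
    indicator follows it.
    """
    if not 0 < offset < len(text):
        return False
    run = 0
    i = offset - 1
    while i >= 0 and is_regional_indicator(text[i]):
        run += 1
        i -= 1
    return run % 2 == 1 and is_regional_indicator(text[offset])


def break_flag(text: str, offset: int) -> str:
    """Hypothetical workaround: insert a zero-width space at offset."""
    return text[:offset] + ZWSP + text[offset:]


message = "hi " + FLAG_US
if splits_flag(message, 4):
    shown = break_flag(message, 4)
    # shown now contains a code point the sender never typed.
    print(shown == message, len(message), len(shown))
```

Detecting the split needs no rendering engine, but displaying the two
letters separately still needs either engine support or a change to the
text itself, which is the copy/paste problem described above.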