Hi John,

Thanks very much for your comments.
I can definitely see what you mean regarding the issues surrounding casefolding emoji -- thanks for spelling them out so clearly and providing the relevant links, I've certainly got some reading to do. Regardless of emoji, the expanded range of characters we can use thanks to adopting an IdentifierClass profile makes it more than worthwhile anyway. Given those issues, I'll likely go forward with the UsernameCaseMapped profile. If all else fails, we can always revisit this later on, if/when a consistent strategy for casefolding emoji is adopted across the industry.

I can understand your point about case conversions, particularly in newer systems with the sort of audience that'd expect that. In this particular protocol, not doing case conversion would go against many years of established behaviour for both software and users, so we'll probably stick with the folding for now. However, I appreciate the view and I'll definitely keep it in mind going forward.

Thanks again for your reply, I have a much clearer idea of where to go with this.

Regards,
Daniel Oakley

On 19 March 2017 at 06:07, John C Klensin <[email protected]> wrote:
> Daniel,
>
> (top post)
>
> Two very quick comments/suggestions:
>
> (1) You could do a lot worse than using a PRECIS IdentifierClass
> profile, if only because it would give you consistency with
> other applications based on IETF specs. Regardless of the number
> of things users and designers "want" (often including a
> plentiful supply of ponies), one thing I've learned from far
> more years of experience with users and user interfaces is that,
> in the last analysis, the dominant desire is for consistency and
> predictability. It also turns out that that consistency and
> predictability makes things more secure, so that is a win across
> the board.
>
> Where possible, I also prefer to avoid case folding in
> identifiers.
> Yes, users "want" it and we generally assume that
> it is important, but remember that Unix and closely-related
> systems don't do that and most people stopped complaining years
> ago. Avoiding it gets more important with non-ASCII strings
> because there are some language and/or locality issues that
> lead, for a small number of edge cases, to behavior that users
> who haven't studied and accepted the rules think is
> inconsistent with their expectations. If you must do case
> conversions, UsernameCaseMapped is at least as good as anything
> else you might choose and has just about the same consistency
> properties as other standardized IdentifierClass profiles.
>
> (2) As much as people want them, emoji are really not suitable
> for use in identifiers. Maybe they will be some years from now
> but, at present, the names assigned by Unicode are not generally
> accepted and used, the icons used for a given one are
> significantly inconsistent across systems, there is no agreement
> about matching rules or normalization (especially in the context
> of modifiers), and so on. If you (or the IETF) were to invent
> emoji normalization, someone else (such as the Unicode
> Consortium, but they aren't the only candidate) may come along,
> create their own version, and _really_ confuse your users.
> FWIW, the Unicode Consortium seems to agree about unsuitability
> in identifiers -- some issues with domain names notwithstanding,
> their identifier recommendations (UAX #31: Unicode Identifier
> and Pattern Syntax) do not allow emoji in identifiers.
>
> This topic has been discussed extensively during the last few
> weeks on the IDNA-Update mailing list (archives at
> http://www.alvestrand.no/pipermail/idna-update/, at least for
> now). While parts of the discussion are specific to domain
> names, you can learn a lot more there if you are curious.
>
> john
>
>
> --On Friday, March 17, 2017 11:52 +1000 Daniel Oaks
> <[email protected]> wrote:
>
> > Hey everyone,
> >
> > I do work with the IRC chat protocol. Specifically, right now
> > I'm doing work around allowing proper Unicode support, and
> > writing the casefolding specs that would be required to allow
> > that.
> >
> > My current solution is based on PRECIS, but I'm running into
> > an issue and not exactly sure how to solve it.
> >
> > Essentially, we need to casefold 'nicknames' (usernames that
> > clients are referred to by) and 'channel names' (chat
> > room names). It would be much preferred to use an
> > *IdentifierClass* profile. Using a single profile for both
> > name types is also much preferred for reasons of implementation
> > simplicity and for other protocol reasons (while we can have
> > different ones for both if it's necessary, sticking to a
> > single one would be much preferred).
> >
> > The only real profile out there which matches that description
> > right now is UsernameCaseMapped, which, while it does
> > everything we want for nicknames, disallows emoji in channel
> > names (which some services have already knowingly allowed).
> >
> > I haven't dived deep into Unicode and normalisation, but would
> > there be a way for an *IdentifierClass* profile to allow and
> > appropriately normalise emoji? If so, would the best thing for
> > us to do here be to actually create our own profile for IRC
> > (channel) names? I'm wary of doing so seeing the advice
> > against profile proliferation here
> > <https://tools.ietf.org/html/rfc7564#section-5.1>, but given
> > the restriction it's difficult for us to adopt an
> > *IdentifierClass* profile for this without creating our own.
> >
> > Any advice on what we should do here would be much
> > appreciated. Thanks for the work you've all done so far!
> >
> > Regards,
> > Daniel Oakley
_______________________________________________
precis mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/precis
