On 11/2/12 8:05 PM, "Peter Saint-Andre" <[email protected]> wrote:

>Ask, and ye shall receive.
>
>###
>
>   For comparison purposes (e.g., when a chatroom server determines if
>   two nicknames are in conflict during the authorization process), an
>   application MUST treat a nickname as follows, where the operations
>   specified MUST be completed in the order shown (in particular,
>   normalization MUST be performed before all other mapping steps and
>   validity checks, consistent with [I-D.ietf-precis-framework]):
>
>   1.  The string MUST be normalized using Unicode Normalization Form KC
>       (NFKC).  Because NFKC is more "aggressive" in finding matches
>       than other normalization forms (in the terminology of Unicode, it
>       performs both canonical and compatibility decomposition before
>       recomposing code points), this rule helps to reduce the
>       possibility of confusion by increasing the number of characters
>       that would match (e.g., U+2163 ROMAN NUMERAL FOUR would match the
>       combination of U+0049 LATIN CAPITAL LETTER I and U+0056 LATIN
>       CAPITAL LETTER V).
>
>   2.  Uppercase and titlecase characters MUST be mapped to their
>       lowercase equivalents.  In applications that prohibit conflicting
>       nicknames, this rule helps to reduce the possibility of confusion
>       by ensuring that nicknames differing only by case (e.g.,
>       "stpeter" vs. "StPeter") would not be allowed in a chatroom at
>       the same time.
>
>   3.  Non-ASCII space characters from the "N" category defined under
>       Section 6.14 of [I-D.ietf-precis-framework] MUST be mapped to
>       U+0020 SPACE.
>
>   4.  Leading and trailing whitespace (i.e., one or more instances of
>       the ASCII space character at the beginning or end of a nickname)
>       MUST be removed (e.g., "stpeter " is mapped to "stpeter").
>
>   5.  Interior sequences of more than one ASCII space character MUST be
>       mapped to a single ASCII space character (e.g., "St  Peter" is
>       mapped to "St Peter").
>
>   6.  Other mappings MAY be applied, such as those defined in
>       [I-D.yoneya-precis-mappings].  (Note that mapping of fullwidth
>       and halfwidth characters to their decomposition mappings is not
>       necessary, since those mappings are performed as part of
>       normalization using NFKC.)
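
For concreteness, here is a rough Python sketch of steps 1 through 5 as quoted above. It is an illustration, not the draft's reference implementation. The function name nickname_compare_key is made up, and treating the "non-ASCII space characters" of step 3 as Unicode general category Zs is my assumption about what the draft intends. Step 6 (optional extra mappings) is left out.

```python
import re
import unicodedata


def nickname_compare_key(nick: str) -> str:
    """Hypothetical helper: build a comparison key following steps 1-5."""
    # Step 1: NFKC normalization comes first.
    s = unicodedata.normalize("NFKC", nick)
    # Step 2: map uppercase and titlecase characters to lowercase.
    s = s.lower()
    # Step 3: map remaining non-ASCII spaces to U+0020.
    # Assumption: "non-ASCII space" means general category Zs.
    s = "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in s)
    # Step 4: remove leading and trailing ASCII spaces.
    s = s.strip(" ")
    # Step 5: collapse interior runs of ASCII spaces to a single space.
    s = re.sub(r" {2,}", " ", s)
    return s
```

With this, "StPeter" and "stpeter " produce the same key, and U+2163 ROMAN NUMERAL FOUR produces the same key as "IV". Note that NFKC already turns many space characters (U+00A0, U+3000, and so on) into U+0020, so step 3 only matters for spaces without a compatibility decomposition, such as U+1680.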

I think we should also add the confusable mapping (see:
http://www.unicode.org/reports/tr39/#Confusable_Detection) here.

This brings up another point, however.  I think that things acting as
registries (such as a chat server ensuring that no two people use the
same nickname) MUST NOT transmit these
liberally-normalized names.  That means they probably have to keep at
least two (and maybe three) versions around:

- The PRECIS-mapped version for equality checking
- The confusable-mapped version for uniqueness checking
- The original version (maybe, if the PRECIS mapping loses
information)


This way, the spaces would still exist, but they aren't an attack vector.
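
To make the "keep several versions" idea concrete, here is a minimal Python sketch of such a registry. Everything in it is hypothetical: the class and function names are invented, precis_key is a shortened stand-in for the full mapping, and the CONFUSABLES table is a four-entry illustration rather than the real UTS #39 data, which a real implementation would load from confusables.txt.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Illustrative stand-in only; real data comes from UTS #39 confusables.txt.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
}


def precis_key(nick: str) -> str:
    """Shortened stand-in for the mapping: NFKC, lowercase, tidy spaces."""
    s = unicodedata.normalize("NFKC", nick).lower()
    # Dropping empty pieces strips the ends and collapses interior runs.
    return " ".join(part for part in s.split(" ") if part)


def skeleton(nick: str) -> str:
    """Confusable-mapped form, loosely modeled on the UTS #39 skeleton."""
    s = unicodedata.normalize("NFD", precis_key(nick))
    s = "".join(CONFUSABLES.get(ch, ch) for ch in s)
    return unicodedata.normalize("NFD", s)


@dataclass
class NicknameRegistry:
    # skeleton -> (original nickname, mapped key)
    _entries: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def register(self, original: str) -> bool:
        """Store all three forms; refuse if the skeleton is already taken."""
        skel = skeleton(original)
        if skel in self._entries:
            return False
        self._entries[skel] = (original, precis_key(original))
        return True

    def display_name(self, nick: str) -> Optional[str]:
        """Return the original form, which is what gets transmitted."""
        entry = self._entries.get(skeleton(nick))
        return entry[0] if entry else None
```

The uniqueness check runs on the skeleton, while only the original string is ever handed back for display, so a nickname registered as "St  Peter" keeps its two spaces on the wire but still blocks "st peter" and a look-alike spelled with a Cyrillic "е".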

-- 
Joe Hildebrand



_______________________________________________
precis mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/precis
