The reason the precis group got a spate of questions from me today is that
I was prepping to do this review. There are a couple of issues that the
precis folk should pay more attention to.
> 1. Introduction
...
> Instead, this document builds upon the
> internationalization framework defined by the IETF's PRECIS Working
> Group [I-D.ietf-precis-framework], while attempting to ensure that
> the characters allowed in Jabber IDs under stringprep are still
> allowed and handled in the same way under PRECIS.
To me, "the same way" implies more backward compatibility than I think we
intend here.
> 3.1. Fundamentals
>
> jid = [ localpart "@" ] domainpart [ "/" resourcepart ]
> localpart = 1*1023(localpoint)
> ;
> ; a "localpoint" is a UTF-8 encoded
> ; Unicode code point that conforms to
> ; the "JIDlocalIdentifierClass" profile
> ; of the PRECIS IdentifierClass
> ;
To me, this implies 1023 code points, not 1023 bytes. The same issue
applies to ifqdn and resourcepart. RFC 6122 just had 1*; I think going back
to that would be fine, since we have a rule below that captures the max size.
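
To make the code point versus octet distinction concrete, here is a minimal
Python sketch. The helper name `lengths` is mine and purely illustrative; it
is not from the draft or from RFC 6122.

```python
def lengths(s):
    """Return (code points, UTF-8 octets) for a string."""
    # len() on a Python str counts code points; encoding first counts octets.
    return len(s), len(s.encode("utf-8"))


# A localpart of 1023 two-octet characters satisfies "1*1023(localpoint)"
# if the ABNF repeat counts code points, but is twice the 1023-octet limit.
code_points, octets = lengths("\u00e9" * 1023)
print(code_points, octets)
```

If the ABNF repeat is read as counting code points, such a string passes the
grammar and only fails the later octet-length rule.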
> 3.2. Domainpart
>
> The domainpart of a JID is that portion after the '@' character (if
> any) and before the '/' character (if any); it is the primary
I think it's often surprising to people that foo/@bar is a valid JID with
"foo" as the domainpart and "@bar" as the resourcepart. The text above,
although pulled from 6122, might be better as:
The domainpart of a JID is that portion after the first '@' character (if
any) and before the first '/' character (if any);
and possibly adding the example.
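
A rough sketch of the "first '/' then first '@'" reading, in Python. The
function name `split_jid` is hypothetical, and it does no validation or
PRECIS preparation at all; it only shows where the parts are cut.

```python
def split_jid(jid):
    """Split a JID into (localpart, domainpart, resourcepart).

    Assumption: the resourcepart begins after the FIRST '/', and the
    localpart ends at the FIRST '@' in what comes before that '/'.
    Missing optional parts are returned as None.
    """
    bare, slash, resource = jid.partition("/")
    local, at, domain = bare.partition("@")
    if not at:
        # No '@' before the first '/', so the whole bare part is the domain.
        local, domain = None, bare
    return local, domain, (resource if slash else None)


print(split_jid("foo/@bar"))
print(split_jid("juliet@example.com/balcony"))
```

With this reading, any '@' or '/' after the first '/' simply belongs to the
resourcepart.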
> In general, the content of a domainpart is an Internationalized
> Domain Name ("IDN") as described in the specifications for
> Internationalized Domain Names in Applications (commonly called
> "IDNA2008"), and a domainpart is an "IDNA-aware domain name slot" as
> defined in [RFC5890]. The following rules apply to a domainpart that
> consists of a fully-qualified domain name and MUST be applied in the
> following order:
When do these rules need to be applied? Only before comparison or routing?
> 1. The domainpart MUST contain only NR-LDH labels and U-labels as
> defined in [RFC5890] and MUST consist only of Unicode code points
> that conform to the rules specified in [RFC5892] (which includes
> Unicode normalization). This implies that the domainpart MUST
> NOT include A-labels as defined in [RFC5890]; each A-label MUST
> be converted to a U-label during preparation of a domainpart, and
> comparison MUST be performed using U-labels, not A-labels.
This seems like an always rule, including for dumb clients.
> 2. All uppercase and titlecase code points within the domainpart
> MUST be mapped to their lowercase equivalents, preferably using
> Unicode Default Case Folding as defined in Chapter 3 of the
> Unicode Standard [UNICODE].
Dumb clients might get away with this and the system would still work.
> 3. Fullwidth and halfwidth characters within the domainpart MUST be
> mapped to their decomposition mappings.
Dumb clients have no shot at this one.
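
To show what this step demands of a client, here is a minimal Python sketch
of width mapping. It assumes that "decomposition mappings" means the
`<wide>` and `<narrow>` compatibility decompositions in the Unicode
Character Database; `map_width` is my own name for it. The point is that a
client needs Unicode data tables to do this at all.

```python
import unicodedata


def map_width(s):
    """Replace fullwidth/halfwidth characters with their decomposition."""
    out = []
    for ch in s:
        decomp = unicodedata.decomposition(ch)
        # Decomposition strings look like "<wide> 0041" or "<narrow> 30AB".
        if decomp.startswith(("<wide>", "<narrow>")):
            code_points = decomp.split()[1:]
            out.append("".join(chr(int(cp, 16)) for cp in code_points))
        else:
            # Other decomposition types (e.g. <noBreak>) are left untouched.
            out.append(ch)
    return "".join(out)


print(map_width("\uff25xample"))  # FULLWIDTH LATIN CAPITAL LETTER E + "xample"
```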
> Implementation Note: The foregoing order is different from the
> order for localparts and resourceparts as described below, to
> maintain consistency with the IDNA methods in both [RFC5892] and
> [RFC5895].
>
> After any and all normalization, conversion, and mapping of code
> points,
as well as conversion to UTF-8.
> a domainpart MUST NOT be zero octets in length and MUST NOT
> be more than 1023 octets in length. (Naturally, the length limits of
> [RFC1034] apply, and nothing in this document is to be interpreted as
> overriding those more fundamental limits.)
>
> 3.3. Localpart
>
> The localpart of a JID is an optional identifier placed before the
> domainpart and separated from the latter by the '@' character.
> Typically a localpart uniquely identifies the entity requesting and
> using network access provided by a server (i.e., a local account),
> although it can also represent other kinds of entities (e.g., a chat
> room associated with a multi-user chat service [XEP-0045]). The
> entity represented by an XMPP localpart is addressed within the
> context of a specific domain (i.e., <localpart@domainpart>).
>
> A localpart MUST NOT be zero octets in length and MUST NOT be more
> than 1023 octets in length. This rule is to be enforced after any
> normalization and mapping of code points.
and conversion to UTF-8.
> A localpart MUST consist only of Unicode code points that conform to
> the "JIDlocalIdentifierClass" profile of the "IdentifierClass" base
> string class defined in [I-D.ietf-precis-framework]. The
> JIDlocalIdentifierClass profile includes all code points allowed by
> the IdentifierClass base class, with the exception of the following
> characters that are explicitly disallowed in XMPP localparts:
(special precis focus)
I would have expected this to be phrased more similarly to step 2 of
http://tools.ietf.org/html/draft-ietf-precis-framework-17#section-5, or
for section 5 to just have a step about codepoints forbidden in a given
usage of the selected precis class.
> The normalization and mapping rules for the JIDlocalIdentifierClass
> are as follows, where the operations specified MUST be completed in
> the order shown:
Again, I think we need language about when these rules are applied. The
rest of the section is about what is allowed, not about how to compare.
> 1. Fullwidth and halfwidth characters MUST be mapped to their
> decomposition mappings.
>
> 2. Uppercase and titlecase characters MUST be mapped to their
> lowercase equivalents, preferably using Unicode Default Case
> Folding as defined in Chapter 3 of the Unicode Standard
> [UNICODE].
Nothing about SpecialCasing?
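
As an illustration of why the choice matters, Python happens to expose both
behaviors: `str.lower()` applies full lowercase mappings including the
final-sigma rule from SpecialCasing, while `str.casefold()` applies full
case folding. This is only a sketch of the difference, not a claim about
what the draft intends; `compare` is a hypothetical helper.

```python
def compare(s):
    """Return (lowercased, case-folded) forms of a string."""
    return s.lower(), s.casefold()


for sample in ["\u03a3\u0391\u03a3", "Stra\u00dfe"]:  # "ΣΑΣ", "Straße"
    lowered, folded = compare(sample)
    print(repr(lowered), repr(folded), lowered == folded)
```

The two operations disagree on both samples, so "mapped to their lowercase
equivalents, preferably using Unicode Default Case Folding" can name two
different results for the same input.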
> A resourcepart MUST NOT be zero octets in length and MUST NOT be more
> than 1023 octets in length. This rule is to be enforced after any
> normalization and mapping of code points.
>
> A resourcepart MUST consist only of Unicode code points that conform
> to the "JIDresourceFreeformClass" profile of the "FreeformClass" base
> string class defined in [I-D.ietf-precis-framework].
>
> The normalization and mapping rules for the resourcepart of a JID are
> as follows, where the operations specified MUST be completed in the
> order shown:
Again, when are the rules applied?
> 1. Fullwidth and halfwidth characters MAY be mapped to their
> decomposition mappings.
(precis)
I need a hint as to when to do this. "MAY" isn't nearly enough.
> 2. Map any instances of non-ASCII space to ASCII space (U+0020).
(precis)
I was hoping either the framework doc or the mappings doc would tell me
more about which characters to map here. RFC 3454 had table C.1.2, but I
don't see any hints about what I'm supposed to do now. Is the rule "has a
compatibility mapping to U+0020"? That doesn't hit U+200B which is in
C.1.2, nor does "has category Zs". draft-ietf-precis-mappings says
"Therefore, the special mapping table should be based on a well-
defined mapping table for each protocol", which although I don't
particularly like, I can live with - but we need the table here.
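
Here is a small Python sketch of the comparison I have in mind. It checks
each code point from RFC 3454 table C.1.2 against the two candidate rules.
The `C_1_2` list is transcribed by hand from RFC 3454, `classify` is a
hypothetical helper, and the results depend on the Unicode version bundled
with the interpreter.

```python
import unicodedata

# Non-ASCII space characters from RFC 3454 table C.1.2 (hand-transcribed).
C_1_2 = [0x00A0, 0x1680, *range(0x2000, 0x200C), 0x202F, 0x205F, 0x3000]


def classify(ch):
    """Return (has category Zs, has compatibility mapping to U+0020)."""
    is_zs = unicodedata.category(ch) == "Zs"
    maps_to_space = unicodedata.normalize("NFKC", ch) == " "
    return is_zs, maps_to_space


for cp in C_1_2:
    is_zs, maps_to_space = classify(chr(cp))
    print(f"U+{cp:04X}  Zs={is_zs}  compat-to-space={maps_to_space}")
```

With current Unicode data, neither rule reproduces the table: U+200B is
category Cf with no decomposition, and U+1680 is Zs but has no compatibility
mapping to U+0020.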
> 3. So-called additional mappings MAY be applied, such as mapping of
> characters that are similar to common delimiters (such as '@',
> ':', '/', '+', '-', and '.', e.g., mapping of IDEOGRAPHIC FULL
> STOP (U+3002) to FULL STOP (U+002E)) and special handling of
> certain characters or classes of characters (e.g., mapping of
> non-ASCII spaces to ASCII space); the PRECIS mappings document
> [I-D.ietf-precis-mappings] describes such mappings in more
> detail.
>
> 4. Uppercase and titlecase characters MAY be mapped to their
> lowercase equivalents, preferably using Unicode Default Case
> Folding as defined in Chapter 3 of the Unicode Standard
> [UNICODE].
Again, I need more about the MAY here.
> 6. IANA Considerations
>
> The following completed templates provide the information necessary
> for the IANA to add 'JIDlocalIdentifierClass' and
> 'JIDresourceFreeformClass' to the PRECIS Profiles Registry.
Should we also ask them to change the status of nodeprep and resourceprep
to deprecated in the stringprep profiles registry?
> Appendix A. Differences from RFC 6122
>
> Based on consensus derived from working group discussion,
> implementation and deployment experience, and formal interoperability
> testing, the following substantive modifications were made from RFC
> 6122.
I think it might be nice to point out that this may have made
previously-valid JIDs no longer valid (or vice-versa), and that we suggest
careful testing before migrating user data.
--
Joe Hildebrand