My previous reply to this message was sent in a hurry right before the
relevant IESG telechat. This reply provides, I hope, a more measured
response.
On 4/23/14, 10:17 PM, John C Klensin wrote:
Recommendation: Hold this document until the various categories
are exercised by at least a sampling of applicable profiles.
As mentioned, we have 5 such profiles in somewhat advanced states:
1. draft-ietf-xmpp-6122bis (defines two profiles to replace two
stringprep profiles) - completed WGLC in the XMPP WG
2. draft-ietf-precis-nickname (defines one profile for use by both MSRP
and XMPP) - completed WGLC in the PRECIS WG
3. draft-ietf-precis-saslprepbis (defines one profile for usernames and
one for passwords) - awaiting WGLC but widely discussed in the PRECIS
and KITTEN working groups; IMHO it could undergo WGLC fairly soon
Make the problems with excessive profiles explicit;
Would the following text help?
###
If input strings that appear "the same" to users are programmatically
considered to be distinct in different systems, or if input strings
that appear distinct to users are programmatically considered to be
"the same" in different systems, then users can be confused. Such
confusion can have security implications, such as the false positives
and false negatives discussed in [RFC6943]. One starting goal of
work on the PRECIS framework was to limit the number of times that
users are confused (consistent with the "principle of least
astonishment"). Unfortunately, this goal has been difficult to
achieve given the large number of application protocols already in
existence, each with its own conventions regarding allowable
characters (see for example [I-D.saintandre-username-interop] with
regard to various username constructs). Despite these difficulties,
profiles should not be multiplied beyond necessity. In particular,
application protocol designers should think long and hard before
defining a new profile instead of using one that has already been
defined, and if they decide to define a new profile then they should
clearly explain their reasons for doing so.
###
Suggestions for improvement are welcome. Right now I'm of the opinion
that such text belongs in the security considerations, but perhaps it
makes sense to place it much farther up in the document, even in the
introduction.
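
To make the "appear the same / considered distinct" problem in the proposed text concrete, here is a minimal Python sketch. The sample strings and the helper names (same_raw, same_after_nfc) are mine, purely for illustration; nothing here is taken from the framework document. It shows one pair that a system comparing raw code points treats as distinct while a system applying NFC treats as the same, and one pair that stays distinct either way even though it looks identical to a user.

```python
import unicodedata

# Two encodings of the same visible string "café".
composed = "caf\u00e9"       # 'é' as one code point (U+00E9)
decomposed = "cafe\u0301"    # 'e' followed by COMBINING ACUTE ACCENT (U+0301)

# Two strings that look alike but use different scripts.
latin = "paypal"
mixed = "p\u0430ypal"        # second letter is CYRILLIC SMALL LETTER A (U+0430)


def same_raw(a: str, b: str) -> bool:
    """Compare code point by code point, as a naive system might."""
    return a == b


def same_after_nfc(a: str, b: str) -> bool:
    """Compare after applying Unicode Normalization Form C to both sides."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(same_raw(composed, decomposed))        # systems disagree on this pair
    print(same_after_nfc(composed, decomposed))
    print(same_after_nfc(latin, mixed))          # look-alikes remain distinct
```

Two systems that each pick a different one of these comparison functions will disagree about whether the first pair is "the same", which is the kind of cross-system inconsistency the proposed text warns about.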
While thinking about this further in the middle of the night, I began to
wonder if anyone has done usability studies on user astonishment in
relation to internationalized strings. Do you know of any research we
can reference here?
impose
requirements on additional profiles that include an explanation
of why they are needed given the user-level interoperability and
astonishment problems that can result from even three of them
Do you think we need text beyond that quoted above?
and require a higher level of review and consensus for creating
new ones than "expert review".
In RFC 5226 terms, are you suggesting Specification Required, RFC
Required, or IETF Review? I get the sense that you'd prefer the last.
Modify the Security
Considerations section to identify user astonishment and
consequent confusion about what rules are being applied in a
given PRECIS-User protocol as a risk and possible attack vector.
Again, would the text quoted above suffice or do we need a more detailed
description of the security implications?
[3] Not only should multiple profiles be discouraged for this
type of case by the Law of Least Astonishment, but general IETF
experience indicates that profiles are a bad idea and that
protocols that depend on them (and use a significant variety)
tend to be less successful than those that do not.
I continue to have difficulty imagining how we can do without something
like profiles, given the large number of identifiers already in the
field. One possible way to overcome part of that problem would be to
work on something like draft-saintandre-username-interop in the PRECIS
WG and to strongly encourage (force?) applications to use that, at least
for usernamey constructs.
[5] There are issues with the two classes that are defined in
this document that have not, I believe, been carefully addressed
in the WG. At a minimum, they need to be considered more
carefully in context with specific examples without preempting
the choices to be made for those examples. The following are
examples, not an exclusive list:
(i) The "Contextual Rule" categories of IDNA2008 have proven to
be extremely problematic, confusing, and controversial enough to
be one of the key issues in the UTR 46 battles that have impeded
IDNA2008 adoption in some important communities. While I still
believe that the IDNA2008 decision was the right one, inclusion
of those characters as Valid for both classes [4] should, given
those bad experiences, require significantly more explanation
and/or justification than this draft appears to provide. If the
intent of incorporating this list in PRECIS is to provide for
compatibility with IDNA2008, the list should be incorporated by
reference (either to RFC 5892 and its successors or to a
separate document that updates and replaces the list in RFC
5892 as well as being used for PRECIS), not be provided as a
list of code points that could cause the two definitions to
diverge if changes are made to one or the other. The issue may
apply more generally to the other characters listed in Section
7.6 and to other IDNA-derived elements of Section 7.
As mentioned right before Section 7.1:
The first ten categories (A-J) shown below were previously defined
for IDNA2008 and are copied directly from [RFC5892].
Those are merely referenced here. Some of them are used here and some
are not. Would it help to say that these definitions might change in the
future if RFC 5892 is updated?
(ii) Characters with compatibility equivalents are problematic,
in large measure because of a Unicode issue that has been
extensively discussed in the past. Basically the compatibility
equivalency category mixes several unrelated and incompatible
(sic) groups of things under a single label. One extreme is
occupied by characters used with East Asian scripts that are
distinguished only by their widths. If the Unicode design
principles of "characters, not glyphs" and "plain text" had been
followed, these different-width variations would never have been
assigned different code points. As a contrasting example, the
relevance of, and distinctions among, special mathematical code
points distinguished by mathematical use and specific type
styles is less clear: treating them as separate characters may
be justified under some circumstances. A more extreme example
occurs with some Chinese characters that Unicode treats as
compatibility equivalents for other characters. That treatment
is usually (and perhaps always) appropriate when the "meaning"
of the characters is intended. But some of those characters
are, or historically have been, used in personal names. To the
extent to which those names are used as identifiers, e.g., for a
person thus named, treating them as invalid is inappropriate and
applying a compatibility mapping to them only slightly less bad.
The many issues with compatibility equivalents provide strong reasons to
exclude them from the IdentifierClass, which is what the WG did in
Section 3.2.3. They are allowed in the FreeformClass, but we provide
warnings about that in Section 10.3. I'll add a forward reference to
10.3 from 3.1 and 3.3. Suggestions are welcome if you feel that the
warnings are not strong enough.
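
For readers who want to see the "several unrelated groups under a single label" point, here is a rough Python sketch that reads the compatibility tag from the Unicode Character Database via the standard unicodedata module. The helper names (compat_tag, has_compat_equivalent) are hypothetical, and checking the decomposition tag is only an approximation chosen for illustration; it is not meant as the definition used for the IdentifierClass exclusion.

```python
import unicodedata
from typing import Optional


def compat_tag(ch: str) -> Optional[str]:
    """Return the compatibility formatting tag of a character, or None.

    Compatibility decompositions in the UCD start with a tag in angle
    brackets (for example "<wide>" or "<font>"); canonical decompositions
    and characters with no decomposition have no such tag.
    """
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<"):
        return decomp[1:decomp.index(">")]
    return None


def has_compat_equivalent(s: str) -> bool:
    """True if any character in s carries a compatibility decomposition."""
    return any(compat_tag(ch) is not None for ch in s)


if __name__ == "__main__":
    samples = {
        "fullwidth A": "\uff21",          # East Asian width variant
        "math bold A": "\U0001d400",      # mathematical type-style variant
        "plain A": "A",
    }
    for label, ch in samples.items():
        print(label, compat_tag(ch), repr(unicodedata.normalize("NFKC", ch)))
```

The width variant and the mathematical variant both fold to plain "A" under NFKC even though they carry different tags, which is the mixing of unrelated cases described above.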
(iii) Experience with IDNA strongly suggests that
"DefaultIgnorable" is a bad idea.
I don't recall discussion about this in the PRECIS WG, so thanks for
raising the issue now.
Forbidding some (or perhaps
all) of these characters may be reasonable; treating them as if
they were not present impedes possibly-reasonable future
extensions and provides an attack vector for phishing and other
types of unpleasant behavior that do not appear to be called out
in Security Considerations.
These characters are DISALLOWED, thus forbidden rather than treated as
not present.
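
The difference between "forbidden" and "treated as not present" can be sketched as follows. This is an illustration under assumptions: IGNORABLE_SAMPLE is a tiny hand-picked subset of default-ignorable code points (Python's unicodedata module does not expose the full property), and the function names ignore_policy and disallow_policy are hypothetical.

```python
# Illustrative subset only: SOFT HYPHEN, ZERO WIDTH SPACE, WORD JOINER.
# A real implementation would take the set from the Unicode
# Default_Ignorable_Code_Point property rather than a hard-coded list.
IGNORABLE_SAMPLE = {"\u00ad", "\u200b", "\u2060"}


def ignore_policy(s: str) -> str:
    """Silently drop ignorable code points ("treated as not present")."""
    return "".join(ch for ch in s if ch not in IGNORABLE_SAMPLE)


def disallow_policy(s: str) -> str:
    """Reject any string containing an ignorable code point."""
    for ch in s:
        if ch in IGNORABLE_SAMPLE:
            raise ValueError(f"disallowed code point U+{ord(ch):04X}")
    return s


if __name__ == "__main__":
    spoof = "pay\u00adpal"   # contains an invisible SOFT HYPHEN
    # Under the ignore policy the spoofed string collapses onto "paypal".
    print(ignore_policy(spoof) == "paypal")
    # Under the disallow policy the same input is refused outright.
    try:
        disallow_policy(spoof)
    except ValueError as err:
        print("rejected:", err)
```

Under the ignore policy two visually identical inputs silently map to one identifier; under the disallow policy the suspicious input never gets that far.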
It is also worth noting that one
difference between IDNA2008 and this specification is that the
former is not profiled (except by registries providing their own
subsetting rules) while the latter is intended as a base for
profiling. While the order in which categorization rules are
applied in Section 8 provides some protection (perhaps
sufficient protection if the spec is never updated or used as
the basis for a different profile),
Profiles are not allowed to override the order of rule-checking in the
algorithm used to determine derived properties.
it is worth noting that some
code points appear in more than one category (e.g., the
notorious ZWJ and ZWNJ in both JoinControl (Section 7.8) and
PrecisIgnorableProperties (Section 7.13)) and that this should
at least be called out explicitly.
There's no harm in doing so, for sure.
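
A small sketch of why the fixed order of rule-checking matters when a code point sits in more than one category. Everything here is a simplification made up for illustration: the two category tables are heavily reduced, the result labels (CONTEXTJ, DISALLOWED) and the relative order of the two rules are my assumptions, and the names RULES, derive, and categories_of are hypothetical.

```python
ZWNJ, ZWJ = "\u200c", "\u200d"

# Hypothetical, heavily reduced stand-ins for the real category tables.
JOIN_CONTROL = {ZWNJ, ZWJ}
PRECIS_IGNORABLE = {ZWNJ, ZWJ, "\u00ad", "\u200b"}

# Ordered rule list: the first matching rule decides the derived value.
# The order and the result labels are assumptions for this sketch.
RULES = [
    ("JoinControl", lambda cp: cp in JOIN_CONTROL, "CONTEXTJ"),
    ("PrecisIgnorableProperties", lambda cp: cp in PRECIS_IGNORABLE, "DISALLOWED"),
]


def derive(cp, rules=RULES):
    """Return (category, value) from the first rule that matches cp."""
    for name, matches, value in rules:
        if matches(cp):
            return name, value
    return None, "UNDECIDED"


def categories_of(cp, rules=RULES):
    """List every category that matches cp, ignoring rule order."""
    return [name for name, matches, _ in rules if matches(cp)]


if __name__ == "__main__":
    print(categories_of(ZWJ))                  # ZWJ matches both categories
    print(derive(ZWJ))                         # first match wins
    print(derive(ZWJ, list(reversed(RULES))))  # a reordering changes the result
```

Because the outcome for an overlapping code point depends entirely on which rule is checked first, calling out the overlap explicitly helps anyone who later updates the tables or the order.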
Peter