Re: [precis] Last Call: (PRECIS Framework: Preparation and Comparison of Internationalized Strings in Application Protocols) to Proposed Standard

Peter Saint-Andre Wed, 21 May 2014 15:08:23 -0700

My previous reply to this message was sent in a hurry right before the relevant IESG telechat. This reply provides, I hope, a more measured response.


On 4/23/14, 10:17 PM, John C Klensin wrote:

Recommendation: Hold this document until the various categories
are exercised by at least a sampling of applicable profiles.


As mentioned, we have 5 such profiles in somewhat advanced states:

1. draft-ietf-xmpp-6122bis (defines two profiles to replace two stringprep profiles) - completed WGLC in the XMPP WG

2. draft-ietf-precis-nickname (defines one profile for use by both MSRP and XMPP) - completed WGLC in the PRECIS WG

3. draft-ietf-precis-saslprepbis (defines one profile for usernames and one for passwords) - awaiting WGLC but widely discussed in the PRECIS and KITTEN working groups; IMHO it could undergo WGLC fairly soon

Make the problems with excessive profiles explicit;


Would the following text help?

###

   If input strings that appear "the same" to users are programmatically
   considered to be distinct in different systems, or if input strings
   that appear distinct to users are programmatically considered to be
   "the same" in different systems, then users can be confused.  Such
   confusion can have security implications, such as the false positives
   and false negatives discussed in [RFC6943].  One starting goal of
   work on the PRECIS framework was to limit the number of times that
   users are confused (consistent with the "principle of least
   astonishment").  Unfortunately, this goal has been difficult to
   achieve given the large number of application protocols already in
   existence, each with its own conventions regarding allowable
   characters (see for example [I-D.saintandre-username-interop] with
   regard to various username constructs).  Despite these difficulties,
   profiles should not be multiplied beyond necessity.  In particular,
   application protocol designers should think long and hard before
   defining a new profile instead of using one that has already been
   defined, and if they decide to define a new profile then they should
   clearly explain their reasons for doing so.

###
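To make the "appear the same but are programmatically distinct" case concrete, here is a minimal sketch using Python's standard unicodedata module. The sample strings are my own illustration, and NFC is used only as one example of a comparison rule; it is not meant to stand in for any particular PRECIS profile.

```python
import unicodedata

# Two renderings of the same visible word: one uses the precomposed
# character U+00E9, the other uses "e" followed by U+0301 COMBINING ACUTE.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

# A system that compares code points directly treats them as distinct.
raw_equal = composed == decomposed

# A system that normalizes to NFC before comparing treats them as the same.
nfc_equal = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))

print(raw_equal, nfc_equal)
```

Two systems that differ only in whether they apply that normalization step will disagree about whether these are the same string, which is the kind of divergence the proposed text warns about.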

Suggestions for improvement are welcome. Right now I'm of the opinion that such text belongs in the security considerations, but perhaps it makes sense to place it much farther up in the document, even in the introduction.

While thinking about this further in the middle of the night, I began to wonder if anyone has done usability studies on user astonishment in relation to internationalized strings. Do you know of any research we can reference here?

impose
requirements on additional profiles that include an explanation
of why they are needed given the user-level interoperability and
astonishment problems that can result from even three of them


Do you think we need text beyond that quoted above?

and require a higher level of review and consensus for creating
new ones than "expert review".

In RFC 5226 terms, are you suggesting Specification Required, RFC Required, or IETF Review? I get the sense that you'd prefer the last of these.

Modify the Security
Considerations section to identify user astonishment and
consequent confusion about what rules are being applied in a
given PRECIS-User protocol as a risk and possible attack vector.

Again, would the text quoted above suffice or do we need a more detailed description of the security implications?

[3] Not only should multiple profiles be discouraged for this
type of case by the Law of Least Astonishment, but general IETF
experience indicates that profiles are a bad idea and that
protocols that depend on them (and use a significant variety)
tend to be less successful than those that do not.

I continue to have difficulty imagining how we can do without something like profiles, given the large number of identifiers already in the field. One possible way to overcome part of that problem would be to work on something like draft-saintandre-username-interop in the PRECIS WG and to strongly encourage (force?) applications to use that, at least for usernamey constructs.

[5] There are issues with the two classes that are defined in
this document that have not, I believe, been carefully addressed
in the WG.  At a minimum, they need to be considered more
carefully in context with specific examples without preempting
the choices to be made for those examples.  The following are
examples, not an exclusive list:

(i) The "Contextual Rule" categories of IDNA2008 have proven to
be extremely problematic, confusing, and controversial enough to
be one of the key issues in the UTR 46 battles that have impeded
IDNA2008 adoption in some important communities.  While I still
believe that the IDNA2008 decision was the right one, inclusion
of those characters as Valid for both classes [4] should, given
those bad experiences, require significantly more explanation
and/or justification than this draft appears to provide.  If the
intent of incorporating this list in PRECIS is to provide for
compatibility with IDNA2008, the list should be incorporated by
reference (either to RFC 5892 and its successors or to a
separate document that updates and replaces the list in RFC
5892 as well as being used for PRECIS),  not be provided as a
list of code points that could cause the two definitions to
diverge if changes are made to one or the other.  The issue may
apply more generally to the other characters listed in Section
7.6 and to other IDNA-derived elements of Section 7.


As mentioned right before Section 7.1:

   The first ten categories (A-J) shown below were previously defined
   for IDNA2008 and are copied directly from [RFC5892].

Those are merely referenced here. Some of them are used here and some are not. Would it help to say that these definitions might change in the future if RFC 5892 is updated?

(ii) Characters with compatibility equivalents are problematic,
in large measure because of a Unicode issue that has been
extensively discussed in the past.  Basically the compatibility
equivalency category mixes several unrelated and incompatible
(sic) groups of things under a single label.  One extreme is
occupied by characters used with East Asian scripts that are
distinguished only by their widths.  If the Unicode design
principles of "characters, not glyphs" and "plain text" had been
followed, these different-width variations would never have been
assigned different code points.  As a contrasting example, the
relevance of, and distinctions among, special mathematical code
points distinguished by mathematical use and specific type
styles is less clear: treating them as separate characters may
be justified under some circumstances.  A more extreme example
occurs with some Chinese characters that Unicode treats as
compatibility equivalents for other characters.  That treatment
is usually (and perhaps always) appropriate when the "meaning"
of the characters is intended.  But some of those characters
are, or historically have been, used in personal names.  To the
extent to which those names are used as identifiers, e.g., for a
person thus named, treating them as invalid is inappropriate and
applying a compatibility mapping to them only slightly less bad.

The many issues with compatibility equivalents provide strong reasons to exclude them from the IdentifierClass, which is what the WG did in Section 3.2.3. They are allowed in the FreeformClass, but we provide warnings about that in Section 10.3. I'll add a forward reference to 10.3 from 3.1 and 3.3. Suggestions are welcome if you feel that the warnings are not strong enough.
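For readers who want to see what "compatibility equivalent" means at the code point level, here is a rough sketch using Python's unicodedata module. The helper name has_compat_decomposition is hypothetical, and checking for a tagged decomposition is only an approximation of my own; it is not the definition used in the framework document.

```python
import unicodedata

def has_compat_decomposition(ch):
    # In the Unicode Character Database, compatibility decompositions carry
    # a formatting tag such as <wide> or <font>; canonical ones have no tag.
    return unicodedata.decomposition(ch).startswith("<")

samples = {
    "fullwidth A": "\uff21",      # width variant used with East Asian scripts
    "math bold A": "\U0001d400",  # mathematical alphanumeric symbol
    "plain A": "A",               # no decomposition at all
    "e acute": "\u00e9",          # canonical decomposition only
}

for label, ch in samples.items():
    print(label, has_compat_decomposition(ch),
          repr(unicodedata.normalize("NFKC", ch)))
```

The width variant and the mathematical letter both fold to plain "A" under NFKC, which shows how the single "compatibility" label covers quite different kinds of characters.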

(iii) Experience with IDNA strongly suggests that
"DefaultIgnorable" is a bad idea.

I don't recall discussion about this in the PRECIS WG, so thanks for raising the issue now.

Forbidding some (or perhaps
all) of these characters may be reasonable; treating them as if
they were not present impedes possibly-reasonable future
extensions and provides an attack vector for phishing and other
types of unpleasant behavior that do not appear to be called out
in Security Considerations.

These characters are DISALLOWED, thus forbidden rather than treated as not present.

It is also worth noting that one
difference between IDNA2008 and this specification is that the
former is not profiled (except by registries providing their own
subsetting rules) while the latter is intended as a base for
profiling.  While the order in which categorization rules are
applied in Section 8 provides some protection (perhaps
sufficient protection if the spec is never updated or used as
the basis for a different profile),

Profiles are not allowed to override the order of rule-checking in the algorithm used to determine derived properties.

it is worth noting that some
code points appear in more than one category (e.g., the
notorious ZWJ and ZWNJ in both JoinControl (Section 7.8) and
PrecisIgnorableProperties (Section 7.13)) and that this should
at least be called out explicitly.


There's no harm in doing so, for sure.
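As a toy illustration of why the order of rule-checking matters when a code point sits in more than one category, consider the sketch below. The category sets are hypothetical and heavily truncated, the function first_match is my own name, and neither ordering shown is claimed to be the one in Section 8; the point is only that the first matching rule decides the outcome.

```python
# Hypothetical, truncated category sets for illustration only.
JOIN_CONTROL = {0x200C, 0x200D}              # ZWNJ, ZWJ
PRECIS_IGNORABLE = {0x00AD, 0x200C, 0x200D}  # SOFT HYPHEN plus the join controls

def first_match(cp, ordered_rules):
    # Walk the rules in order and return the name of the first category
    # that contains the code point; later categories are never consulted.
    for name, members in ordered_rules:
        if cp in members:
            return name
    return "Other"

order_a = [("JoinControl", JOIN_CONTROL),
           ("PrecisIgnorableProperties", PRECIS_IGNORABLE)]
order_b = list(reversed(order_a))

zwj = 0x200D
result_a = first_match(zwj, order_a)
result_b = first_match(zwj, order_b)
print(result_a, result_b)
```

The same code point lands in a different category depending on which rule is checked first, which is why a fixed order that profiles cannot override is doing real work here.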

Peter


